Continue is a coding assistant that lives inside of a VS Code extension. It supports chatting with your codebase, making edits, and a lot more.
We have found that the Qwen-1.5-7B-Chat-Hybrid model is the best Hybrid model available for coding. It is good at chatting with a few files at a time in your codebase to learn more about them. It can also make simple code editing suggestions pertaining to a few lines of code at a time.
However, we do not recommend using this model for analyzing large codebases at once or making large or complex file edits.
Note: Continue provides its own installation instructions here.

Install the Continue extension from the VS Code Extensions Marketplace. This will add a Continue tab to your VS Code Activity Bar.
Note: The following instructions are based on instructions from Continue found here.
1. Open the Continue tab in your VS Code Activity Bar and click the chat box; some buttons will appear at the bottom of the box, including `Select model`.
2. Click `Select model`, then `+ Add Chat model` to open the new model dialog box.
3. Click the `config file` link at the very bottom of the dialog to open `config.yaml`.
4. Update the `models:` section of `config.yaml` with the following and save:

```yaml
models:
  - name: Lemonade
    provider: openai
    model: Qwen-1.5-7B-Chat-Hybrid
    apiBase: http://localhost:8000/api/v1
    apiKey: none
```
5. Close the dialog box and open a new chat. You should now see `Lemonade` where you used to see `Select model`. Ready!

Note: see the Continue user guide to learn about all of their features.
Here are some examples for trying out Continue. These examples assume you have cloned this repo and allowed Continue to index it.
Open the Continue tab in your VS Code Activity Bar, and in the “Ask anything” box, type a question about your code. Use the @ symbol to specify a file or tool.
For example, you can ask a question about @getting_started.md, or try: “@README.md what do I need to do to set up for @api_oga_hybrid_streaming.py?”

Open a file, select some code, and push Ctrl+I to start a chat about editing that code.
For example:

1. Open `examples/lemonade/api_basic.py`.
2. Select the `print(...)` line at the bottom and press Ctrl+I.
3. Start a new chat and prompt: “write a script in the style of @api_basic.py that uses the microsoft/Phi-4-mini-instruct model on GPU”
Here’s what we got:
```python
# Import necessary modules
from lemonade.api import from_pretrained

# Load the Phi-4-mini-instruct model with the hf-cpu recipe
model, tokenizer = from_pretrained("microsoft/Phi-4-mini-instruct", recipe="hf-cpu")

# Define your prompt
prompt = "This is a sample prompt for the Phi-4-mini-instruct model"

# Tokenize the prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Generate the response using the model
response = model.generate(input_ids, max_new_tokens=100)  # Adjust max_new_tokens as needed

# Decode the generated response
generated_text = tokenizer.decode(response[0])

# Print the response
print(generated_text)
```
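Note that the model chose the `hf-cpu` recipe even though the prompt asked for GPU. As with any generated code, review the output before running it; consult the Lemonade API documentation for the `recipe` value that matches your hardware.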