
I start the llama-cpp-python server with the command:

python -m llama_cpp.server --model D:\Mistral-7B-Instruct-v0.3.Q4_K_M.gguf --n_ctx 8192 --chat_format functionary

Then I run my Python script which looks like this:

from openai import OpenAI
import json
import requests

try:
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-xxx")
    response = client.chat.completions.create(
        model="mistralai--Mistral-7B-Instruct-v0.3",
        messages=[
            {"role": "user", "content": "hi"},
        ],
    )
    # Extract the assistant's reply
    response_message = response.choices[0].message
    print(response_message)
except Exception as e:
    error_msg = str(e)
    print(f"Exception type: {type(e)}")

However, I don’t know how to set the top_k value to 1.

I tried changing my code to:

from openai import OpenAI
import json
import requests

try:
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-xxx")
    response = client.chat.completions.create(
        model="mistralai--Mistral-7B-Instruct-v0.3",
        messages=[
            {"role": "user", "content": "hi"},
        ],
        top_k=1,
    )
    # Extract the assistant's reply
    response_message = response.choices[0].message
    print(response_message)
except Exception as e:
    error_msg = str(e)
    print(f"Exception type: {type(e)}")
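Another variant I'm considering but haven't confirmed works is passing the option through the client's extra_body argument. This is just a rough sketch, assuming the OpenAI Python client merges extra_body fields into the request JSON and that llama_cpp.server reads a top_k field from the body:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-xxx")

# top_k is not a named parameter of chat.completions.create, so pass it via
# extra_body; the client should merge it into the JSON request body.
response = client.chat.completions.create(
    model="mistralai--Mistral-7B-Instruct-v0.3",
    messages=[{"role": "user", "content": "hi"}],
    extra_body={"top_k": 1},  # assumes the server honors this field
)
print(response.choices[0].message)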

I also tried adding a top_k value when starting the server, like this:

python -m llama_cpp.server --model D:\Mistral-7B-Instruct-v0.3.Q4_K_M.gguf --top-k 1 --n_ctx 8192 --chat_format functionary

But neither seems to work. Can anyone help?
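For reference, here is the raw request I'd try next as a sanity check, to rule out the client library. This is a sketch based on my assumption that the server accepts a top_k field directly in the JSON body of /v1/chat/completions:

import requests

# Call the chat completions endpoint directly, bypassing the OpenAI client.
payload = {
    "model": "mistralai--Mistral-7B-Instruct-v0.3",
    "messages": [{"role": "user", "content": "hi"}],
    "top_k": 1,  # not part of the OpenAI spec; assumed to be read by llama_cpp.server
}
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-xxx"},
    json=payload,
)
print(resp.json()["choices"][0]["message"])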

  • I'm looking for this too. Commented Dec 30, 2024 at 8:30
  • Seems like top_k isn't supported. Commented Jan 25 at 7:27
