
Chatting with models without sessions

You can send independent requests to your knowledge model (KM) without the need to create or maintain a session.

Below is an example script demonstrating how to do it.

Define the constants:

python
import requests

KM_ID = '{YOUR_KM_ID}'
API_KEY = '{YOUR_API_KEY}'
API_URL = 'https://constructor.app/api/platform-kmapi/v1'

Define the headers:

python
headers = {
    'X-KM-AccessKey': f'Bearer {API_KEY}'
}

To interact with the knowledge model without creating a session, pass the history of previous messages directly to the model. The model generates the next message from that history, so the context is preserved even though no session is created. Any of the available LLMs can be used with this method.

You can retrieve the list of available models as shown below. Note that in later requests the model is specified by its alias, for example, "gpt-4o-2024-08-06".

python
print(requests.get(f'{API_URL}/language_models', headers=headers).json())

Example response:

json
{
  "results": [
    {
      "id": "83db90a3c67648e0ba43b48c2f2c39b6",
      "name": "GPT-4o",
      "description": "Most advanced and efficient model, excelling in non-English languages",
      "hosted_by": {"name": "OpenAI"},
      "code": "gpt"
    },
    {
      "id": "fc5f80c1b7fb49ffb0b9a059d2cf6522",
      "name": "Claude 3.5 Sonnet",
      "description": "Best combination of performance and speed for efficient, high-throughput tasks.",
      "hosted_by": {"name": "Anthropic"},
      "code": "claude"
    },
    {
      "id": "a2300dee72954806aad332aa826f8dc1",
      "name": "Claude 3.5 Haiku",
      "description": "Fastest model that can execute lightweight actions, with industry-leading speed.",
      "hosted_by": {"name": "Anthropic"},
      "code": "claude"
    },
    {
      "id": "fa4138f0e3454efda84767c338a06f33",
      "name": "Claude 3 Opus",
      "description": "Highest-performing model, which can handle complex analysis, longer tasks with many steps, and higher-order math and coding tasks.",
      "hosted_by": {"name": "Anthropic"},
      "code": "claude"
    },
    {
      "id": "13c40686125844f590567d0c99d879d7",
      "name": "Gemini 1.5 Pro",
      "description": "Complex reasoning tasks such as code and text generation, text editing, problem solving, data extraction and generation.",
      "hosted_by": {"name": "Google"},
      "code": "gemini"
    },
    {
      "id": "41350d36e6944876bede32d82a9dad01",
      "name": "Gemini 1.0 Pro",
      "description": "Natural language tasks, multi-turn text and code chat, and code generation.",
      "hosted_by": {"name": "Google"},
      "code": "gemini"
    },
    {
      "id": "5ae60bf998ee4bc8a1136078dc9ca367",
      "name": "Gemini 1.5 Flash",
      "description": "Fast and versatile performance across a diverse variety of tasks.",
      "hosted_by": {"name": "Google"},
      "code": "gemini"
    },
    {
      "id": "9d38cf09f0ff4fe8ab7b2e714adb55b2",
      "name": "GPT-4o Mini",
      "description": "The most cost-efficient small model",
      "hosted_by": {"name": "OpenAI"},
      "code": "gpt"
    },
    {
      "id": "643282cc66eb4875ab51213438a56a2a",
      "name": "GPT-o1",
      "description": "o1 model performs complex reasoning and think before you answer.",
      "hosted_by": {"name": "OpenAI"},
      "code": "gpt"
    },
    {
      "id": "aaf4df69643e4c728aff9773c7039879",
      "name": "GPT-o1 Mini",
      "description": "Light version of the o1 model with twice the context window size and response speed.",
      "hosted_by": {"name": "OpenAI"},
      "code": "gpt"
    }
  ],
  "total": 10
}
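
Each entry in the listing sits under `results` and carries a `name` and a `hosted_by.name`, so a small helper can condense it for display. The following is only a sketch: `summarize_models` is a hypothetical name, not part of the API, and it runs on a hard-coded excerpt of the example response rather than on a live call.

```python
def summarize_models(payload):
    """Return (name, provider) pairs from a language_models response body."""
    return [
        (entry["name"], entry["hosted_by"]["name"])
        for entry in payload.get("results", [])
    ]


# Excerpt of the example response (two of the ten entries); not a live call.
sample = {
    "results": [
        {"id": "83db90a3c67648e0ba43b48c2f2c39b6", "name": "GPT-4o",
         "hosted_by": {"name": "OpenAI"}, "code": "gpt"},
        {"id": "fc5f80c1b7fb49ffb0b9a059d2cf6522", "name": "Claude 3.5 Sonnet",
         "hosted_by": {"name": "Anthropic"}, "code": "claude"},
    ],
    "total": 2,
}

for name, provider in summarize_models(sample):
    print(f"{name} ({provider})")
```

With a live key, the same helper could presumably be applied to `requests.get(f'{API_URL}/language_models', headers=headers).json()`.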
Define a function that sends the message history to the model:

python
def send_message_with_context(model, messages, stream=False):
    # Send the full message history to the KM; no session is created.
    # Reuses the module-level `headers` defined above.
    data = {"model": model, "messages": messages, "stream": stream}
    response = requests.post(
        f'{API_URL}/knowledge-models/{KM_ID}/chat/completions',
        headers=headers,
        json=data,
    )
    return response.json()

Initialize the context:

Add the messages that make up the chat history to send to the model. This should be a list of message objects, each with a "role" of "system", "user", or "assistant" and the message "content".
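
Because the history is assembled by hand, it can help to check it before sending. The sketch below is one possible local check, not something the API requires: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and treating `content` as a plain string is an assumption based on the example in this guide.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if a message has an unknown role or non-string content."""
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unexpected role {msg.get('role')!r}")
        # Assumption: content is plain text, as in this guide's example.
        if not isinstance(msg.get("content"), str):
            raise ValueError(f"message {i}: 'content' must be a string")
    return messages


validate_messages([
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "how can I solve 8x + 7 = -23"},
])
```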

python
messages = [
    {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step.", "name": "Math tutor"},
    {"role": "user", "content": "how can I solve 8x + 7 = -23", "name": "Student"}
]
response = send_message_with_context("gpt-4o-2024-08-06", messages) 
print(response)

Example response:

json
{
  "id": "1bc5225c9305430d9a668b5961d9bf4f",
  "object": "chat.completion",
  "created": 1736433952,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To solve the equation \\(8x + 7 = -23\\), follow these steps:\n\n1. Subtract 7 from both sides of the equation to isolate the term with \\(x\\):\n   \\[\n   8x + 7 - 7 = -23 - 7\n   \\]\n   This simplifies to:\n   \\[\n   8x = -30\n   \\]\n\n2. Divide both sides by 8 to solve for \\(x\\):\n   \\[\n   x = \\frac{-30}{8}\n   \\]\n   Simplifying the fraction gives:\n   \\[\n   x = -\\frac{15}{4}\n   \\]\n\nSo, the solution to the equation is \\(x = -\\frac{15}{4}\\).",
        "tool_calls": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}
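
Since no session is kept on the server side, continuing the conversation means appending the assistant's reply and the next user message to the history yourself, then sending the whole list again. A minimal sketch of that bookkeeping follows; `extend_history` is a hypothetical helper, and `sample_response` is a trimmed stand-in for the response above, so no request is sent here.

```python
def extend_history(messages, response, next_user_content):
    """Return a new history with the assistant reply and the next user turn appended."""
    reply = response["choices"][0]["message"]
    return messages + [
        {"role": reply["role"], "content": reply["content"]},
        {"role": "user", "content": next_user_content},
    ]


# Trimmed stand-in for a chat completion response; not a live call.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "x = -15/4", "tool_calls": None},
            "finish_reason": "stop",
        }
    ]
}

history = [{"role": "user", "content": "how can I solve 8x + 7 = -23"}]
history = extend_history(history, sample_response, "Can you check that answer?")
print([m["role"] for m in history])  # ['user', 'assistant', 'user']
```

The extended list can then be passed back to `send_message_with_context` as the `messages` argument for the next turn.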