JSON Mode

When using the Avian API, a common approach is to instruct the model to always return a JSON object that aligns with your use case by specifying this in the system message. While this approach often works, the model can occasionally produce output that does not parse as valid JSON.

To mitigate these issues and enhance model performance, you can set response_format to { "type": "json_object" } when using models like Meta-Llama-3.1-405B-Instruct. Enabling JSON mode restricts the model to generating strings that parse into valid JSON objects.

Important Notes:

  1. Explicit JSON Instruction: Always instruct the model to produce JSON through a message in the conversation, such as the system message. Without an explicit instruction to generate JSON, the model may produce an unending stream of whitespace, causing the request to run continuously until it hits the token limit. To prevent this, the API throws an error if the string "JSON" does not appear somewhere in the context.

  2. Handling Partial JSON Responses: The JSON in the model's response may be partial if finish_reason is length, indicating that the generation exceeded max_tokens or the conversation exceeded the token limit. Always check finish_reason before parsing the response to handle this scenario.

  3. Valid JSON Output: JSON mode ensures that the output is valid and parses without errors but does not guarantee that the output matches a specific schema.
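The truncation check from note 2 can be sketched as a small helper. This is an illustrative pattern, not part of the Avian API; the function name safe_parse_json is hypothetical:

```python
import json

def safe_parse_json(content, finish_reason):
    """Parse a JSON-mode response only when generation completed.

    Returns the parsed object, or None when the response was truncated
    (finish_reason == "length"), since the JSON may be cut off mid-object.
    """
    if finish_reason == "length":
        return None  # generation hit max_tokens; content may be partial JSON
    return json.loads(content)

# A completed response parses normally; a truncated one is rejected.
print(safe_parse_json('{"winner": "Los Angeles Dodgers"}', "stop"))
print(safe_parse_json('{"winner": "Los Angel', "length"))
```

In a real call, `content` and `finish_reason` would come from `response.choices[0].message.content` and `response.choices[0].finish_reason` respectively.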

Example Code

import openai

# Initialize the OpenAI client with the Avian API
client = openai.OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.avian.io/v1"
)

# Define the function to call the API with JSON mode enabled
def get_response():
    response = client.chat.completions.create(
        model="Meta-Llama-3.1-405B-Instruct",
        response_format={ "type": "json_object" },
        messages=[
            {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
            {"role": "user", "content": "Who won the world series in 2020?"}
        ]
    )

    # Print the JSON response
    print(response.choices[0].message.content)

# Example usage
get_response()

In this example, the response includes a JSON object similar to the following:

"content": "{\"winner\": \"Los Angeles Dodgers\"}"

Note

JSON mode is always enabled when the model generates arguments as part of function calling. This ensures that the arguments are returned as valid JSON objects, improving reliability and consistency in API responses.
