OpenAI Harmony

0.0.8 · active · verified Thu Apr 09

OpenAI Harmony is a Python library that renders and parses OpenAI's 'Harmony' response format, the prompt format used by the gpt-oss open-weight model series. It supports structured conversations, reasoning output, and function calls in a shape similar to the OpenAI Responses API. The library has a Rust core with Python bindings, providing consistent formatting, efficient processing, and first-class Python support, including typed stubs. It is essential for developers building inference solutions for gpt-oss models, as these models are trained on the Harmony format and require it for correct operation.
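Each Harmony message is framed by special tokens, as seen in the quickstart below. Here is a minimal pure-Python sketch of the textual shape only; the real library renders straight to token IDs, and the actual rules (channels, tool headers, stop tokens) are more involved than this illustration:

```python
# Illustrative sketch only: Harmony frames each turn as
# <|start|>{role}<|message|>{content}<|end|>, and a prompt rendered for
# completion ends with an open "<|start|>assistant" for the model to continue.
def render_turn(role: str, text: str) -> str:
    return f"<|start|>{role}<|message|>{text}<|end|>"

prompt = render_turn("user", "Arrr, how be you?") + "<|start|>assistant"
print(prompt)
# → <|start|>user<|message|>Arrr, how be you?<|end|><|start|>assistant
```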

Warnings

Install
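Assuming the standard PyPI distribution (package name `openai-harmony`, matching the `openai_harmony` import used below):

```shell
pip install openai-harmony
```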

Imports

Quickstart

This quickstart demonstrates how to load the Harmony encoding, construct a conversation using various roles, render it into the token format expected by gpt-oss models, and then parse a simulated model's completion back into structured messages.

from openai_harmony import (
    load_harmony_encoding,
    HarmonyEncodingName,
    Role,
    Message,
    Conversation,
    DeveloperContent,
    SystemContent,
)

# Load the Harmony encoding for GPT-OSS models
enc = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)

# Create a conversation with system, developer instructions, and a user message
convo = Conversation.from_messages([
    Message.from_role_and_content(
        Role.SYSTEM,
        SystemContent.new(),
    ),
    Message.from_role_and_content(
        Role.DEVELOPER,
        DeveloperContent.new().with_instructions("Talk like a pirate!")
    ),
    Message.from_role_and_content(Role.USER, "Arrr, how be you?"),
])

# Render the conversation into token IDs for completion. This produces the
# prompt to send to a gpt-oss model (a list of ints, not a string).
tokens_for_model = enc.render_conversation_for_completion(convo, Role.ASSISTANT)
print("Prompt tokens (to send to model):", tokens_for_model)

# --- Simulate a model response ---
# In a real scenario, `tokens_for_model` would be sent to a gpt-oss model and
# the model would generate completion tokens. Since `tokens_for_model` is a
# list of token IDs, a simulated completion must also be tokens; concatenating
# a string onto it would raise a TypeError. The rendered prompt already ends
# with `<|start|>assistant`, so the completion begins at the channel marker.
# (`allowed_special="all"` is assumed here so the special tokens are encoded.)
completion_text = "<|channel|>final<|message|>Ahoy there, matey! I be shipshape and Bristol fashion.<|end|>"
completion_tokens = enc.encode(completion_text, allowed_special="all")

# Parse the model's completion tokens back into structured messages. Only the
# completion tokens are passed here, not the prompt.
parsed_messages = enc.parse_messages_from_completion_tokens(completion_tokens, Role.ASSISTANT)

print("\nParsed messages (from the simulated completion):")
for msg in parsed_messages:
    # A message's content is a list of content parts; text parts expose `.text`.
    for part in msg.content:
        text = part.text if hasattr(part, "text") else str(part)
        print(f"Role: {msg.role}, Content: {text}")
