OpenAI Harmony
OpenAI Harmony is a Python library providing a renderer for OpenAI's 'Harmony' response format, specifically designed for its open-weight model series, gpt-oss. It enables structured conversations, reasoning output, and function calls, mimicking the OpenAI Responses API. The library, with a Rust core and Python bindings, ensures consistent formatting, efficient processing, and first-class Python support, including typed stubs. It is crucial for developers building inference solutions for gpt-oss models, as these models are trained on and require the Harmony format for correct operation.
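To make the format concrete, here is an illustrative, self-contained sketch of how a Harmony-style prompt is laid out with special tokens. This is an approximation for orientation only — the real token sequence (including channel headers on assistant turns) is produced by the library's renderer, and the helper below is hypothetical:

```python
# Illustrative sketch of the Harmony text layout (approximate; the real
# renderer in openai-harmony produces the exact token sequence).
def render_message(role: str, text: str) -> str:
    # Each message is delimited by <|start|> ... <|message|> ... <|end|>
    return f"<|start|>{role}<|message|>{text}<|end|>"

prompt = (
    render_message("developer", "Talk like a pirate!")
    + render_message("user", "Arrr, how be you?")
    + "<|start|>assistant"  # trailing header cues the model to generate
)
print(prompt)
```

In practice you never build this string by hand; the renderer in the Quickstart below emits token IDs directly.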
Warnings
- breaking The `openai-harmony` format is mandatory for OpenAI's gpt-oss series models. These models were specifically trained on this format and will not function correctly or reliably if it is not used.
- gotcha The 'analysis' channel, used by gpt-oss models for internal chain-of-thought reasoning, is not safety-filtered. Content from this channel should *never* be shown directly to end-users as it may contain harmful or unrefined outputs.
- gotcha Common mistakes include incorrect role mapping (in Harmony, developer-style instructions belong in a `Role.DEVELOPER` message built with `DeveloperContent`, while the system message carries model configuration via `SystemContent`) and missing essential imports (`SystemContent`, `Message`, etc.) when constructing Harmony templates, leading to runtime errors or unexpected model behavior.
- breaking When using Harmony-based GPT-5 models, some users have reported malformed JSON output (multiple JSON objects concatenated together) with OpenAI SDK versions >= 1.100.2, particularly when using `client.beta.chat.completions.parse` with `response_format`.
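Given the analysis-channel warning above, output should be filtered by channel before anything reaches an end-user. A minimal sketch, using plain dicts as stand-ins for parsed Harmony messages (the channel names `analysis`, `commentary`, and `final` come from the Harmony format; the helper name is illustrative):

```python
# Sketch: keep only 'final'-channel content for display to end users.
# Plain dicts stand in for parsed Harmony messages here; the real library
# returns structured Message objects.
def user_visible(messages: list[dict]) -> list[str]:
    # 'analysis' (chain-of-thought) and 'commentary' are never shown directly
    return [m["content"] for m in messages if m.get("channel") == "final"]

parsed = [
    {"role": "assistant", "channel": "analysis", "content": "chain-of-thought..."},
    {"role": "assistant", "channel": "final", "content": "Ahoy there, matey!"},
]
print(user_visible(parsed))  # only the 'final' message survives
```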
Install
pip install openai-harmony
Imports
- load_harmony_encoding
from openai_harmony import load_harmony_encoding
- HarmonyEncodingName
from openai_harmony import HarmonyEncodingName
- Role
from openai_harmony import Role
- Message
from openai_harmony import Message
- Conversation
from openai_harmony import Conversation
- DeveloperContent
from openai_harmony import DeveloperContent
- SystemContent
from openai_harmony import SystemContent
Quickstart
from openai_harmony import (
load_harmony_encoding,
HarmonyEncodingName,
Role,
Message,
Conversation,
DeveloperContent,
SystemContent,
)
# Load the Harmony encoding for GPT-OSS models
enc = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
# Create a conversation with system, developer instructions, and a user message
convo = Conversation.from_messages([
Message.from_role_and_content(
Role.SYSTEM,
SystemContent.new(),
),
Message.from_role_and_content(
Role.DEVELOPER,
DeveloperContent.new().with_instructions("Talk like a pirate!")
),
Message.from_role_and_content(Role.USER, "Arrr, how be you?"),
])
# Render the conversation into tokens for completion. This generates the prompt.
tokens_for_model = enc.render_conversation_for_completion(convo, Role.ASSISTANT)
print("Prompt Tokens (to send to model):", tokens_for_model)
# --- Simulate a model response ---
# In a real scenario, `tokens_for_model` (a list of token IDs) would be sent
# to a gpt-oss model, which would return completion tokens. For this
# quickstart, we encode a sample assistant reply in the Harmony format instead.
sample_completion = "<|channel|>final<|message|>Ahoy there, matey! I be shipshape and Bristol fashion.<|end|>"
completion_tokens = enc.encode(sample_completion, allowed_special="all")
# Parse only the completion tokens (not the prompt) back into structured messages
parsed_messages = enc.parse_messages_from_completion_tokens(completion_tokens, role=Role.ASSISTANT)
print("\nParsed Messages (simulated assistant response):")
for msg in parsed_messages:
    # Message objects carry a role, a channel, and a list of content items
    print(f"Role: {msg.role}, Content: {msg.content}")
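For streaming use cases, the library also documents token-by-token parsing (via a `StreamableParser`). As a self-contained illustration of why channel-aware parsing matters, here is a sketch that splits a Harmony-format completion *string* into (channel, text) pairs — the real library parses token IDs directly, so this string-level helper is illustrative only:

```python
import re

# Sketch: split a Harmony-format completion string into (channel, text)
# pairs. Messages open with <|channel|>...<|message|> and close with
# <|end|> or <|return|> (the latter terminating the final response).
def split_channels(completion: str) -> list[tuple[str, str]]:
    pattern = re.compile(
        r"<\|channel\|>(\w+)<\|message\|>(.*?)(?:<\|end\|>|<\|return\|>|$)",
        re.S,
    )
    return [(m.group(1), m.group(2)) for m in pattern.finditer(completion)]

completion = (
    "<|channel|>analysis<|message|>User greets in pirate speak; respond in kind.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>Ahoy there, matey!<|return|>"
)
for channel, text in split_channels(completion):
    print(channel, "->", text)
```

This makes the earlier warning actionable: only `final`-channel text is safe to surface, while `analysis` content stays internal.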