ChatML 與 Harmony：瞭解 OpenAI 的新格式 🔍

社群文章釋出於 2025 年 8 月 9 日

贊

Jisoo Kim

kuotient

OpenAI 剛剛釋出了他們的 **Harmony 格式**，與 gpt-oss 模型一起，引入了一種與我們一直用於 Qwen3 等模型的 ChatML 格式完全不同的結構化推理和工具呼叫的方法。

如果您正在使用推理模型或構建推理基礎設施，瞭解這些格式至關重要。今天，我們將深入探討這兩種格式，進行並排比較，以幫助您瞭解發生了什麼變化以及為什麼它很重要！

什麼是 ChatML？📝

ChatML（聊天標記語言）一直是許多開源模型（特別是 Qwen 系列）的首選格式。它本質上是一種受 XML 啟發的方式來構建對話，它是在需要清晰區分對話不同部分的需求下演變而來的。

主要特點

使用特殊標記 <|im_start|> 和 <|im_end|> 來標記訊息邊界
簡單的基於角色的結構（系統、使用者、助手）
思考/推理包裹在 <think>...</think> 塊中
工具定義和呼叫使用 XML 風格的標籤
經過驗證、久經考驗的格式，在許多模型中得到應用

將 ChatML 視為“經典”方法——直接、類似 XML，並注重清晰度。

什麼是 Harmony？🎭

Harmony 是 OpenAI 專門為 gpt-oss 模型設計的新響應格式。它不僅僅是一種提示格式，更是對模型如何構建輸出（尤其是複雜推理和工具使用）的徹底重新思考。

主要創新

多通道架構：訊息可以標記為 analysis（推理）、commentary（工具前言）或 final（面向使用者）
角色層次結構：system > developer > user > assistant > tool（用於處理指令衝突）
訊息路由：使用 to= 語法在元件之間定向訊息
TypeScript 風格的工具定義：比 JSON 模式更具表達性和簡潔性
每個通道獨立的安全標準：analysis 通道不像 final 那樣經過安全過濾

Harmony 將模型的輸出視為多執行緒對話，其中不同型別的內容透過不同的通道流動。

Harmony 的 TypeScript 風格語法：為何重要？🚀

Harmony 最有趣的設計選擇之一是使用 TypeScript 風格的工具定義，而不是 JSON 模式。這不僅僅是語法糖——其背後有充分的理由。

基於程式碼的定義的力量

正如最近對程式碼代理的研究所示，以程式碼表示動作具有以下幾個優點：

簡潔性：程式碼操作比 JSON 緊湊約 30%
並行性：需要 4 個並行流的 5 個操作？在 JSON 中，那是 20 個獨立的資料塊；在程式碼中，它是一個表示式
變數管理：輕鬆儲存和引用結果 (rock_image = generate_image("rock"))
可讀性：代理日誌變得更加清晰
LLM 流暢性：模型訓練資料中有大量的程式碼

工具定義比較

ChatML（JSON 模式）

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
      },
      "required": ["city"]
    }
  }
}

Harmony（TypeScript 風格）

namespace functions {
  // Get weather for a city
  type get_weather = (_: {
    city: string,
    unit?: "celsius" | "fahrenheit", // default: celsius
  }) => any;
}

TypeScript 版本不僅更簡潔，而且開發人員更熟悉，LLM 也更容易處理。

深入示例 🏊‍♂️

讓我們看看這些格式如何處理真實世界的場景，從簡單的聊天到使用工具的複雜多步推理。

1. 基本對話（無思考）

最簡單的情況——只是提問，沒有任何推理步驟。

ChatML (Qwen3)

<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
<think>

</think>

The capital of France is Paris.<|im_end|>

OpenAI Harmony

<|start|>user<|message|>What's the capital of France?<|end|>
<|start|>assistant<|channel|>final<|message|>The capital of France is Paris.<|return|>

正在發生什麼： 即使停用了思考，Qwen3 也會包含空的 <think></think> 標籤作為結構提示。Harmony 則更直接——它直接跳轉到 final 通道以處理面向使用者的內容。

2. 帶思考的對話

這裡是哲學差異變得清晰的地方。

ChatML (Qwen3)

<|im_start|>user
If I have 3 apples and buy 5 more, then give away 2, how many do I have?<|im_end|>
<|im_start|>assistant
<think>
Starting with 3 apples
Adding 5 more: 3 + 5 = 8
Giving away 2: 8 - 2 = 6
So the final answer is 6 apples.
</think>

You would have 6 apples. 

Here's the breakdown: You start with 3 apples, buy 5 more (3 + 5 = 8), then give away 2 (8 - 2 = 6).<|im_end|>

OpenAI Harmony

<|start|>user<|message|>If I have 3 apples and buy 5 more, then give away 2, how many do I have?<|end|>
<|start|>assistant<|channel|>analysis<|message|>User starts with 3 apples. Buys 5 more: 3 + 5 = 8. Gives away 2: 8 - 2 = 6. Final count is 6 apples.<|end|>
<|start|>assistant<|channel|>final<|message|>You would have 6 apples.

Here's the breakdown: You start with 3 apples, buy 5 more (3 + 5 = 8), then give away 2 (8 - 2 = 6).<|return|>

有什麼不同

ChatML 將所有內容保留在一條訊息中，並內聯思考
Harmony 將推理（analysis 通道）和答案（final 通道）分成單獨的訊息
Harmony 的方法使得程式化地從面向使用者的內容中過濾推理變得更容易

3. 帶思考的多輪對話

這揭示了每種格式如何處理對話歷史。

ChatML (Qwen3)

<|im_start|>user
What's 15% of 80?<|im_end|>
<|im_start|>assistant
15% of 80 is 12.

To calculate: 80 × 0.15 = 12<|im_end|>
<|im_start|>user
Now add 25 to that result<|im_end|>
<|im_start|>assistant
<think>
The previous result was 12.
Adding 25: 12 + 25 = 37
</think>

Adding 25 to 12 gives us 37.<|im_end|>

重要： Qwen3 刪除了之前輪次的思考以節省標記！

OpenAI Harmony

<|start|>user<|message|>What's 15% of 80?<|end|>
<|start|>assistant<|channel|>analysis<|message|>Calculate 15% of 80: 80 × 0.15 = 12<|end|>
<|start|>assistant<|channel|>final<|message|>15% of 80 is 12.<|return|>
<|start|>user<|message|>Now add 25 to that result<|end|>
<|start|>assistant<|channel|>analysis<|message|>Previous result was 12. Adding 25: 12 + 25 = 37<|end|>
<|start|>assistant<|channel|>final<|message|>Adding 25 to 12 gives us 37.<|return|>

歷史管理： 兩種格式都修剪歷史推理，但 Harmony 的通道系統使其更明確和可控。

4. 函式呼叫

在工具處理方面，兩種格式存在顯著差異。

ChatML (Qwen3)

<|im_start|>system
# Tools

You may call one or more functions to assist with the user query.

<tools>
{"type": "function", "function": {"name": "get_weather", "description": "Get weather for a city", "parameters": {"type": "object", "properties": {"city": {"type": "string"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}}, "required": ["city"]}}}
</tools>

For each function call, return a json object within <tool_call></tool_call> XML tags.<|im_end|>
<|im_start|>user
What's the weather in Tokyo?<|im_end|>
<|im_start|>assistant
<think>
User wants Tokyo weather. Need to call get_weather function.
</think>

I'll check the current weather in Tokyo for you.

<tool_call>
{"name": "get_weather", "arguments": {"city": "Tokyo", "unit": "celsius"}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
{"temperature": 22, "condition": "partly cloudy", "humidity": 65}
</tool_response><|im_end|>
<|im_start|>assistant
The current weather in Tokyo is 22°C with partly cloudy skies and 65% humidity.<|im_end|>

OpenAI Harmony

<|start|>developer<|message|># Tools
## functions
namespace functions {
// Get weather for a city
type get_weather = (_: {
  city: string,
  unit?: "celsius" | "fahrenheit", // default: celsius
}) => any;
} // namespace functions<|end|>
<|start|>user<|message|>What's the weather in Tokyo?<|end|>
<|start|>assistant<|channel|>commentary<|message|>I'll check the current weather in Tokyo for you.<|end|>
<|start|>assistant<|channel|>commentary to=functions.get_weather<|constrain|>json<|message|>{"city": "Tokyo", "unit": "celsius"}<|call|>
<|start|>functions.get_weather to=assistant<|channel|>commentary<|message|>{"temperature": 22, "condition": "partly cloudy", "humidity": 65}<|end|>
<|start|>assistant<|channel|>final<|message|>The current weather in Tokyo is 22°C with partly cloudy skies and 65% humidity.<|return|>

主要區別

工具定義：JSON 模式 vs TypeScript 風格型別
工具響應：ChatML 將其包裝為使用者訊息；Harmony 使用專用工具角色
訊息路由：Harmony 的 to= 語法明確顯示了元件之間的流向
通道：Harmony 將工具互動（commentary）與最終答案分開

5. 帶思維鏈的函式呼叫

複雜的場景真正展示了架構上的差異。

ChatML (Qwen3)

<|im_start|>system
# Tools
<tools>
{"type": "function", "function": {"name": "get_weather", "description": "Get weather", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}
{"type": "function", "function": {"name": "get_flight_status", "description": "Check flight status", "parameters": {"type": "object", "properties": {"flight_number": {"type": "string"}}}}}
</tools><|im_end|>
<|im_start|>user
Should I pack an umbrella for my UA902 flight to Tokyo?<|im_end|>
<|im_start|>assistant
<think>
User is asking about packing an umbrella for flight UA902 to Tokyo.
I need to:
1. Check the flight status to confirm it's going to Tokyo
2. Check Tokyo's weather to see if rain is expected
Let me start with the flight status.
</think>

I'll help you decide if you need an umbrella. Let me check your flight details and the weather in Tokyo.

<tool_call>
{"name": "get_flight_status", "arguments": {"flight_number": "UA902"}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
{"flight": "UA902", "destination": "Tokyo (NRT)", "status": "On time", "departure": "14:30"}
</tool_response><|im_end|>
<|im_start|>assistant
<think>
Flight confirmed to Tokyo, departing at 14:30. Now checking weather.
</think>

<tool_call>
{"name": "get_weather", "arguments": {"city": "Tokyo"}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
{"temperature": 18, "condition": "rainy", "precipitation": "80%"}
</tool_response><|im_end|>
<|im_start|>assistant
<think>
Tokyo weather shows rainy conditions with 80% precipitation. Definitely need an umbrella.
</think>

Yes, you should definitely pack an umbrella! Your UA902 flight to Tokyo is on time (departing at 14:30), and Tokyo is currently experiencing rainy weather with an 80% chance of precipitation. Better to be prepared! ☔<|im_end|>

OpenAI Harmony

<|start|>developer<|message|># Tools
## functions
namespace functions {
type get_weather = (_: {city: string}) => any;
type get_flight_status = (_: {flight_number: string}) => any;
} // namespace functions<|end|>
<|start|>user<|message|>Should I pack an umbrella for my UA902 flight to Tokyo?<|end|>
<|start|>assistant<|channel|>analysis<|message|>User asking about packing umbrella for UA902 to Tokyo. Need to: 1) Verify flight destination 2) Check Tokyo weather. Starting with flight status.<|end|>
<|start|>assistant<|channel|>commentary<|message|>I'll help you decide if you need an umbrella. Let me check your flight details and the weather in Tokyo.<|end|>
<|start|>assistant<|channel|>commentary to=functions.get_flight_status<|constrain|>json<|message|>{"flight_number": "UA902"}<|call|>
<|start|>functions.get_flight_status to=assistant<|channel|>commentary<|message|>{"flight": "UA902", "destination": "Tokyo (NRT)", "status": "On time", "departure": "14:30"}<|end|>
<|start|>assistant<|channel|>analysis<|message|>Flight confirmed to Tokyo, on time at 14:30. Now checking weather conditions.<|end|>
<|start|>assistant<|channel|>commentary to=functions.get_weather<|constrain|>json<|message|>{"city": "Tokyo"}<|call|>
<|start|>functions.get_weather to=assistant<|channel|>commentary<|message|>{"temperature": 18, "condition": "rainy", "precipitation": "80%"}<|end|>
<|start|>assistant<|channel|>analysis<|message|>Tokyo has rainy conditions with 80% precipitation. Clear recommendation to pack umbrella.<|end|>
<|start|>assistant<|channel|>final<|message|>Yes, you should definitely pack an umbrella! Your UA902 flight to Tokyo is on time (departing at 14:30), and Tokyo is currently experiencing rainy weather with an 80% chance of precipitation. Better to be prepared! ☔<|return|>

在複雜場景中

ChatML 在同一對話回合中保留工具呼叫之間的思考
Harmony 透過單獨的 analysis 訊息明確跟蹤每個推理步驟
Harmony 的 commentary 通道可以在工具執行前包含面向使用者的“引言”
Harmony 中的訊息路由（to=）建立了清晰的執行跟蹤

這對生態系統意味著什麼 🌍

Harmony 的引入代表了我們對模型輸出思考方式的重大演變

基於通道的安全性：不同通道的不同安全標準對於推理模型來說是一個顛覆性變化
更好的可觀察性：明確的路由和通道使得除錯複雜的代理行為更容易
程式碼優先工具：TypeScript 語法可能成為工具定義的新標準
結構化推理：在格式層面而不是僅僅約定俗成地將思考與輸出分離

主要收穫 📌

ChatML 更簡單，更成熟，並獲得廣泛的生態系統支援
Harmony 引入了強大的新概念，如通道和訊息路由
兩種格式都處理推理和工具，但採用的理念截然不同
Harmony 的 TypeScript 風格定義更簡潔，更適合開發人員
Harmony 中的通道系統能夠對使用者看到的內容進行精細控制

隨著生態系統的發展，瞭解這兩種格式至關重要。雖然我們無法選擇模型使用的格式（只有在訓練基礎模型時才能選擇），但瞭解它們的工作原理有助於我們構建更好的推理基礎設施，並充分利用這些強大的推理模型。

社群

透過拖放到文字輸入框、貼上或點選此處上傳圖片、音訊和影片。

點選或貼上此處以上傳圖片

· 註冊或登入發表評論

贊