Trajectory Format

Hermes Agent trajectories are stored as standard JSONL (one JSON object per line) and are used for training-data generation and evaluation.

Overview

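Trajectory data contains:

Full conversation history: the message sequence with normalized roles
Tool call statistics: success/failure counts for each tool
Reasoning coverage: whether <think> blocks are present
Metadata: model, timestamp, and batch information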

Trajectory Entry Format

Each JSONL line is one complete trajectory entry:

{
  "prompt_index": 0,
  "conversations": [
    {
      "from": "system",
      "value": "You are a helpful assistant...."
    },
    {
      "from": "human",
      "value": "What is Python?"
    },
    {
      "from": "gpt",
      "value": "Python is a programming language...",
      "tool_calls": [
        {
          "name": "terminal",
          "arguments": {"command": "python3 --version"}
        }
      ]
    },
    {
      "from": "tool",
      "value": "<tool_response>\n{\"tool_call_id\": \"call_abc123\", \"name\": \"terminal\", \"content\": \"Python 3.11.6\"}\n</tool_response>"
    },
    {
      "from": "gpt",
      "value": "<think>\nGot the version. I can now answer the user.\n</think>\nPython 3.11.6 is installed on this system."
    }
  ],
  "timestamp": "2026-03-30T14:22:31.456789",
  "model": "anthropic/claude-sonnet-4.6",
  "completed": true
}

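Field Reference

conversations

Array of conversation messages in ShareGPT format:

Field	Type	Description
`from`	string	Role: `system`, `human`, `gpt`, or `tool`
`value`	string	Message content
`tool_calls`	array	(Optional) Tool calls made in a `gpt` message
`role`	string	(Internal) Original role mapping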
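
The field table above can be turned into a quick sanity check when reading trajectories. The following is a minimal sketch, not part of Hermes Agent: `validate_entry` and `ALLOWED_ROLES` are hypothetical names, and the rule that `tool_calls` only appears on `gpt` turns is inferred from the table.

```python
# Hypothetical helper: checks one trajectory entry against the field table.
ALLOWED_ROLES = {"system", "human", "gpt", "tool"}


def validate_entry(entry: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    conversations = entry.get("conversations")
    if not isinstance(conversations, list):
        return ["conversations must be a list"]

    problems = []
    for i, turn in enumerate(conversations):
        role = turn.get("from")
        if role not in ALLOWED_ROLES:
            problems.append(f"turn {i}: unexpected role {role!r}")
        if not isinstance(turn.get("value"), str):
            problems.append(f"turn {i}: value must be a string")
        # Assumption: tool_calls is only meaningful on gpt turns.
        if "tool_calls" in turn and role != "gpt":
            problems.append(f"turn {i}: tool_calls on a non-gpt turn")
    return problems


entry = {
    "conversations": [
        {"from": "human", "value": "What is Python?"},
        {"from": "gpt", "value": "Python is a programming language..."},
    ]
}
print(validate_entry(entry))
```

Returning a list of problems rather than raising keeps the check usable as a filter over a large JSONL file.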

tool_calls

Array of tool calls made in a gpt message:

{
  "name": "terminal",
  "arguments": {"command": "ls -la"}
}

tool_stats

Tool usage statistics (normalized schema):

{
  "terminal": {"count": 2, "success": 2, "failure": 0},
  "read_file": {"count": 1, "success": 1, "failure": 0},
  "web_search": {"count": 0, "success": 0, "failure": 0}
}

Every tool is included, even unused ones. The zero-valued defaults keep the schema consistent for HuggingFace datasets.
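
Zero-filling of this kind might look like the sketch below. It is an illustration only: `normalize_tool_stats` and the `ALL_TOOLS` list are hypothetical names, and the real converter may obtain the full tool list differently.

```python
from typing import Dict, Iterable

# Hypothetical list of every tool the agent knows about.
ALL_TOOLS = ["terminal", "read_file", "web_search"]


def normalize_tool_stats(
    raw_stats: Dict[str, Dict[str, int]], all_tools: Iterable[str]
) -> Dict[str, Dict[str, int]]:
    """Return stats with one entry per known tool, defaulting every counter to 0."""
    normalized = {}
    for tool in all_tools:
        stats = raw_stats.get(tool, {})
        normalized[tool] = {
            "count": stats.get("count", 0),
            "success": stats.get("success", 0),
            "failure": stats.get("failure", 0),
        }
    return normalized


# Only "terminal" was used in this run; the other tools are zero-filled.
stats = normalize_tool_stats(
    {"terminal": {"count": 2, "success": 2, "failure": 0}}, ALL_TOOLS
)
print(stats)
```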

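Normalization Rules

Reasoning Content Markup

The trajectory converter normalizes all reasoning into <think> tags, regardless of how the model originally produced it:

Native thinking tokens (the msg["reasoning"] field from providers such as Anthropic and the OpenAI o-series): wrapped as <think>\n{reasoning}\n</think> and prepended to the content.
REASONING_SCRATCHPAD XML (used when native thinking is disabled and the model reasons in XML as instructed by the system prompt): converted to <think> tags by convert_scratchpad_to_think().
Empty think blocks: every gpt turn is guaranteed to have a <think> block. If no reasoning was produced, an empty block is inserted: <think>\n </think>\n. This keeps the training data format consistent.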
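
The native-reasoning and empty-block rules can be sketched as follows. This is an assumption-laden illustration, not the converter's code: `wrap_reasoning` is a hypothetical name, the scratchpad conversion is left out, and the newline between </think> and the content is inferred from the sample entry.

```python
# Empty block literal copied from the rule above; adjust it if the real
# converter emits different whitespace.
EMPTY_THINK = "<think>\n </think>\n"


def wrap_reasoning(msg: dict) -> str:
    """Return the message content with a leading <think> block (hypothetical helper)."""
    content = msg.get("content") or ""
    reasoning = msg.get("reasoning")
    if reasoning:
        # Native thinking tokens: wrap them and prepend to the content.
        return f"<think>\n{reasoning}\n</think>\n{content}"
    if "<think>" in content:
        # Already normalized (for example, by a scratchpad conversion step).
        return content
    # No reasoning at all: insert the empty block so every gpt turn has one.
    return EMPTY_THINK + content


print(wrap_reasoning({"content": "Python 3.11.6 is installed.", "reasoning": "Got the version."}))
print(wrap_reasoning({"content": "Hello!"}))
```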

Tool Call Normalization

Tool calls in API format (with a tool_call_id, a function name, and arguments as a JSON string) are converted to XML-wrapped JSON:

<tool_call>
{"name": "terminal", "arguments": {"command": "ls -la"}}
</tool_call>

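Arguments are parsed from the JSON string back into an object (not double-encoded)
If JSON parsing fails (which should not happen, since arguments are validated during the conversation), an empty {} is used and a warning is logged
Multiple tool calls in one assistant message produce multiple <tool_call> blocks in a single gpt message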
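
A minimal sketch of these three rules, assuming the OpenAI-style tool call shape ({"id": ..., "function": {"name": ..., "arguments": "<JSON string>"}}). The function name `format_tool_calls` is hypothetical, and joining blocks with a newline is an assumption.

```python
import json
import logging

logger = logging.getLogger(__name__)


def format_tool_calls(tool_calls: list) -> str:
    """Render API-format tool calls as <tool_call> blocks (hypothetical helper)."""
    blocks = []
    for call in tool_calls:
        function = call.get("function", {})
        raw_args = function.get("arguments", "{}")
        try:
            # Parse the JSON string back into an object to avoid double encoding.
            arguments = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
        except json.JSONDecodeError:
            logger.warning("Could not parse arguments for tool call %s", call.get("id"))
            arguments = {}
        payload = {"name": function.get("name"), "arguments": arguments}
        blocks.append(f"<tool_call>\n{json.dumps(payload)}\n</tool_call>")
    # One block per call, all inside the same gpt message.
    return "\n".join(blocks)


calls = [
    {
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "terminal", "arguments": "{\"command\": \"ls -la\"}"},
    }
]
print(format_tool_calls(calls))
```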

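Tool Response Normalization

All tool results that follow an assistant message are grouped into a single tool turn, each rendered as an XML-wrapped JSON response: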

<tool_response>
{"tool_call_id": "call_abc123", "name": "terminal", "content": "output here"}
</tool_response>

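If the tool content looks like JSON (starts with { or [), it is parsed so that the content field holds a JSON object or array rather than a string
Multiple tool results in one message are joined with newlines
Tool names are matched by position against the parent assistant message's tool_calls array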
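
The grouping rules above could be sketched like this. `format_tool_responses` is a hypothetical name, the input shapes follow the OpenAI-style API format, and falling back to the raw string when parsing fails is an assumption that the text does not confirm.

```python
import json


def format_tool_responses(tool_messages: list, parent_tool_calls: list) -> str:
    """Group tool results into the value of a single tool turn (hypothetical helper)."""
    blocks = []
    for index, msg in enumerate(tool_messages):
        # Match the tool name by position in the parent assistant's tool_calls.
        name = None
        if index < len(parent_tool_calls):
            name = parent_tool_calls[index].get("function", {}).get("name")

        content = msg.get("content", "")
        if isinstance(content, str) and content.startswith(("{", "[")):
            try:
                content = json.loads(content)  # store an object/array, not a string
            except json.JSONDecodeError:
                pass  # assumption: keep the raw string if it is not valid JSON

        payload = {
            "tool_call_id": msg.get("tool_call_id"),
            "name": name,
            "content": content,
        }
        blocks.append(f"<tool_response>\n{json.dumps(payload)}\n</tool_response>")
    return "\n".join(blocks)


parent_calls = [{"id": "call_abc123", "function": {"name": "terminal", "arguments": "{}"}}]
results = [{"role": "tool", "tool_call_id": "call_abc123", "content": "Python 3.11.6"}]
print(format_tool_responses(results, parent_calls))
```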

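System Message

The system message is generated at save time (it is not taken from the conversation). It follows the Hermes function-calling prompt template and contains:

A preamble explaining the function-calling protocol
A <tools> XML block containing the JSON tool definitions
A schema reference for FunctionCall objects
A <tool_call> example

Tool definitions contain name, description, parameters, and required (set to null to match the canonical format).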
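
As a rough illustration of the tool-definition shape only, the sketch below builds a <tools> block from a list of tool specs. `build_tools_block` is a hypothetical name, and serializing the definitions as one JSON array is an assumption; the preamble, schema reference, and example from the actual template are not reproduced here.

```python
import json


def build_tools_block(tools: list) -> str:
    """Render tool definitions inside a <tools> block (hypothetical helper)."""
    definitions = [
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("parameters", {}),
            "required": None,  # serialized as JSON null, per the text above
        }
        for tool in tools
    ]
    return f"<tools>\n{json.dumps(definitions)}\n</tools>"


block = build_tools_block(
    [
        {
            "name": "terminal",
            "description": "Run a shell command",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
            },
        }
    ]
)
print(block)
```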

Loading Trajectories

Trajectories are standard JSONL, so any JSON-lines reader can load them:

import json

def load_trajectories(path: str):
    """Load trajectory entries from a JSONL file."""
    entries = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries

# Keep only trajectories that completed successfully
successful = [e for e in load_trajectories("trajectory_samples.jsonl")
              if e.get("completed")]

# Extract the conversations for training
training_data = [e["conversations"] for e in successful]

Loading into HuggingFace Datasets

from datasets import load_dataset

ds = load_dataset("json", data_files="trajectory_samples.jsonl")

The normalized tool_stats schema ensures that all entries have the same columns, which prevents Arrow schema mismatch errors during dataset loading.

Controlling Trajectory Saving

In the CLI, trajectory saving is controlled by:

# config.yaml
agent:
  save_trajectories: true  # default: false

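Alternatively, pass the --save-trajectories flag. When the Agent is initialized with save_trajectories=True, the _save_trajectory() method is called at the end of each conversation turn.

The batch runner always saves trajectories (that is its main purpose).

Samples with zero reasoning across all turns are automatically discarded by the batch runner, to avoid polluting the training data with non-reasoning examples.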
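
A reader-side check for the same condition might look like the sketch below. It is not the batch runner's implementation: `has_reasoning` is a hypothetical name, and "zero reasoning" is interpreted here as every <think> block in every gpt turn being empty or whitespace-only.

```python
import re

# Matches the body of each <think>...</think> block, across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def has_reasoning(entry: dict) -> bool:
    """Return True if any gpt turn contains a non-empty <think> block."""
    for turn in entry.get("conversations", []):
        if turn.get("from") != "gpt":
            continue
        for body in THINK_RE.findall(turn.get("value", "")):
            if body.strip():
                return True
    return False


with_reasoning = {
    "conversations": [
        {"from": "human", "value": "What is Python?"},
        {"from": "gpt", "value": "<think>\nRecall the basics.\n</think>\nA language."},
    ]
}
without_reasoning = {
    "conversations": [
        {"from": "human", "value": "Hi"},
        {"from": "gpt", "value": "<think>\n</think>\nHello!"},
    ]
}
kept = [e for e in (with_reasoning, without_reasoning) if has_reasoning(e)]
print(len(kept))
```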
