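Chapter 7: Structured Output, or Getting the Agent to Speak in a Fixed Format

Tutorial series: OpenAI Agents SDK, from Basics to Practice

Chapter goal: master output_type, dynamic instructions, and nested models, so that the Agent returns structured data your program can use directly.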

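Why do we need structured output?

Chapter 3 gave us a first look at structured output. That was only an appetizer; in this chapter we push the capability as far as it goes.

First, recall the pain point. Suppose the AI returns free text:

This movie is brilliant, I give it 8.5 points and strongly recommend it! The director's storytelling is very distinctive...

Want to pull the rating out of that? Or find out whether it is a recommendation? You have to write regular expressions, do string matching, and hope the AI does not phrase things differently next time.

But suppose the AI returns structured data:

MovieReview(title="Inception", rating=8.5, summary="Distinctive storytelling", recommend=True)

The program reads the rating from .rating and checks .recommend to see whether the movie is recommended. Stable, reliable, no guessing.

In one sentence: free text is for people to read, structured data is for programs to use. If you want the Agent to fit into your system, structured output is a required course.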
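To make the contrast concrete, here is a minimal sketch using only the standard library. The two reply strings and the regular expression are invented for illustration, and the dataclass merely stands in for the structured object; none of it comes from the SDK.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Two hypothetical free-text replies that say the same thing in different words.
reply_a = "A great film. I give it 8.5 points, highly recommended!"
reply_b = "Highly recommended. Rating: 8.5/10."

# A pattern written against the first phrasing only.
score_pattern = re.compile(r"I give it (\d+(?:\.\d+)?) points")


def extract_rating(text: str) -> Optional[float]:
    """Return the rating if the text uses the expected phrasing, else None."""
    match = score_pattern.search(text)
    return float(match.group(1)) if match else None


@dataclass
class MovieReview:
    # Plain dataclass standing in for the structured result.
    title: str
    rating: float
    summary: str
    recommend: bool


review = MovieReview(
    title="Inception",
    rating=8.5,
    summary="Distinctive storytelling",
    recommend=True,
)

print(extract_rating(reply_a))  # the pattern matches this phrasing
print(extract_rating(reply_b))  # same meaning, but the pattern misses it
print(review.rating, review.recommend)  # attribute access needs no parsing
```

The regular expression silently returns None as soon as the wording changes, while attribute access on the structured object does not depend on wording at all.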

Basic usage of output_type

The core idea is simple:

Define the data structure you want with a Pydantic BaseModel
Pass it to the Agent's output_type parameter
result.final_output automatically becomes a Pydantic object

import asyncio
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, set_tracing_disabled

# Disable tracing
set_tracing_disabled(True)

# Configure the model (replace with your own API endpoint and model)
client = AsyncOpenAI(
    base_url="http://localhost:8317/v1",
    api_key="sk-12345678",
)
model = OpenAIChatCompletionsModel(model="gpt-5.2", openai_client=client)


# Define the output structure
class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(description="Rating, 1-10")
    summary: str = Field(description="One-sentence review")
    recommend: bool = Field(description="Whether to recommend it")


# Create the Agent and specify output_type
agent = Agent(
    name="Movie Review Agent",
    instructions="You are a professional film critic. Give a structured review of the movie the user mentions. Output strictly in the specified JSON format.",
    model=model,
    output_type=MovieReview,
)


async def main():
    result = await Runner.run(agent, input="Tell me about Inception")

    # result.final_output is a MovieReview object, not a string
    review = result.final_output
    print(f"Movie: {review.title}")
    print(f"Rating: {review.rating}")
    print(f"Review: {review.summary}")
    print(f"Recommended: {'yes' if review.recommend else 'no'}")


if __name__ == "__main__":
    asyncio.run(main())

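After running it you will see something like:

Movie: Inception
Rating: 9.2
Review: A mind-bending masterpiece about dreams and reality, with a story that unfolds layer by layer
Recommended: yes

Key point: the type of result.final_output follows output_type. Without output_type it is a str; with it, it is an instance of the Pydantic class you defined.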
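It can help to see what Pydantic itself does with such a class, independent of any Agent. The sketch below assumes Pydantic v2; the raw JSON strings are made-up stand-ins for a model reply, and how the SDK uses the schema internally is not shown here.

```python
from pydantic import BaseModel, Field, ValidationError


class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(description="Rating, 1-10")
    summary: str = Field(description="One-sentence review")
    recommend: bool = Field(description="Whether to recommend it")


# The Field descriptions end up in the JSON schema generated from the class.
schema = MovieReview.model_json_schema()
print(schema["properties"]["rating"]["description"])

# A hypothetical raw reply, validated into a typed object.
raw_reply = (
    '{"title": "Inception", "rating": 9.2, '
    '"summary": "A layered story about dreams", "recommend": true}'
)
review = MovieReview.model_validate_json(raw_reply)
print(type(review).__name__, review.rating)

# A reply with missing fields is rejected instead of being passed along.
missing = []
try:
    MovieReview.model_validate_json('{"title": "Inception"}')
except ValidationError as exc:
    missing = sorted(str(err["loc"][0]) for err in exc.errors())
print(missing)
```

This is why the descriptions are worth writing carefully: they travel with the schema, and validation catches replies that do not fit it.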

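final_output_as: type-safe conversion

In multi-Agent scenarios the type of result.final_output may be less certain (for example, after a handoff the Agent that produces the final output may not be the one you expected). This is where final_output_as() comes in.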

import asyncio
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, set_tracing_disabled

set_tracing_disabled(True)

client = AsyncOpenAI(
    base_url="http://localhost:8317/v1",
    api_key="sk-12345678",
)
model = OpenAIChatCompletionsModel(model="gpt-5.2", openai_client=client)


class WeatherInfo(BaseModel):
    city: str = Field(description="City name")
    temperature: float = Field(description="Temperature in degrees Celsius")
    condition: str = Field(description="Weather condition, e.g. sunny, cloudy, rain")


agent = Agent(
    name="Weather Agent",
    instructions="You are a weather forecaster. Give weather information for the city the user mentions (made-up data is fine). Output strictly in JSON format.",
    model=model,
    output_type=WeatherInfo,
)


async def main():
    result = await Runner.run(agent, input="What is the weather like in Beijing today?")

    # Option 1: use final_output directly (the static type hint may not be precise)
    print(type(result.final_output))  # e.g. <class '__main__.WeatherInfo'>

    # Option 2: explicit conversion with final_output_as (friendlier to type checkers)
    weather = result.final_output_as(WeatherInfo)
    print(f"{weather.city}: {weather.temperature} degrees, {weather.condition}")

    # Option 3: add a runtime type check that raises if the type is wrong
    weather = result.final_output_as(WeatherInfo, raise_if_incorrect_type=True)
    print(f"{weather.city}: {weather.temperature} degrees, {weather.condition}")


if __name__ == "__main__":
    asyncio.run(main())

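final_output_as has two modes:

Default behavior: it only tells type checkers (mypy, IDEs) what type the value is and performs no check at runtime; it is essentially a cast
raise_if_incorrect_type=True: it checks with isinstance at runtime and raises TypeError if the type is wrong

Which one should you use, and when?

Scenario	Recommendation
Single Agent, output_type is known	Use `result.final_output` directly
Multiple Agents cooperating, result may come from different Agents	Use `final_output_as`
Production, defensive programming required	Use `final_output_as(cls, raise_if_incorrect_type=True)`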
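The two modes can be imitated in a few lines of standard-library Python. The function below is a simplified stand-in written for this explanation, not the SDK's source; final_output_as_sketch and the sample values are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, TypeVar, cast

T = TypeVar("T")


def final_output_as_sketch(
    value: Any, cls: type[T], raise_if_incorrect_type: bool = False
) -> T:
    """Stand-in that mirrors the two behaviours described above."""
    if raise_if_incorrect_type and not isinstance(value, cls):
        raise TypeError(f"Final output is not of type {cls.__name__}")
    # cast() does nothing at runtime; it only informs the type checker.
    return cast(T, value)


@dataclass
class WeatherInfo:
    city: str
    temperature: float
    condition: str


# Pretend a handoff ended at an Agent that returned plain text instead.
unexpected = "Sorry, I can only talk about movies."

# Default mode: no runtime check, so the str passes through unchanged.
loose = final_output_as_sketch(unexpected, WeatherInfo)
print(type(loose).__name__)

# Strict mode: the mismatch is reported immediately.
strict_error = None
try:
    final_output_as_sketch(unexpected, WeatherInfo, raise_if_incorrect_type=True)
except TypeError as exc:
    strict_error = str(exc)
print(strict_error)

# A correctly typed value passes the strict check.
good = final_output_as_sketch(
    WeatherInfo("Beijing", 21.0, "sunny"), WeatherInfo, raise_if_incorrect_type=True
)
print(good.city)
```

In default mode a wrong value only fails later, at the first attribute access, which is why the strict mode is the safer choice when several Agents can produce the final output.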

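Complex output types: nested models

Real-world data structures are rarely flat. The analysis of an article, for example, may contain several paragraph-level assessments, several keyword tags, and so on. Pydantic supports nesting natively, and the Agent handles it without trouble.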

import asyncio
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, set_tracing_disabled

set_tracing_disabled(True)

client = AsyncOpenAI(
    base_url="http://localhost:8317/v1",
    api_key="sk-12345678",
)
model = OpenAIChatCompletionsModel(model="gpt-5.2", openai_client=client)


# Sub-model: review of a single character
class CharacterReview(BaseModel):
    name: str = Field(description="Character name")
    actor: str = Field(description="Actor who plays the character")
    rating: float = Field(description="Characterization rating, 1-10")
    comment: str = Field(description="Short comment")


# Sub-model: technical review
class TechnicalReview(BaseModel):
    cinematography: float = Field(description="Cinematography rating, 1-10")
    soundtrack: float = Field(description="Soundtrack rating, 1-10")
    editing: float = Field(description="Editing rating, 1-10")


# Main model: the full review, which nests the two sub-models above
class DetailedMovieReview(BaseModel):
    title: str = Field(description="Movie title")
    director: str = Field(description="Director")
    year: int = Field(description="Release year")
    overall_rating: float = Field(description="Overall rating, 1-10")
    summary: str = Field(description="Overall assessment, two or three sentences")
    characters: list[CharacterReview] = Field(description="Reviews of the main characters, at least 2")
    technical: TechnicalReview = Field(description="Technical assessment")
    tags: list[str] = Field(description="Tags, e.g. sci-fi, action, mind-bending")


agent = Agent(
    name="In-Depth Movie Review Agent",
    instructions=(
        "You are a veteran film critic. Analyze the movie the user mentions in depth. "
        "Assess it along several dimensions: characters, technique, and the film as a whole. "
        "Output strictly in the specified JSON format."
    ),
    model=model,
    output_type=DetailedMovieReview,
)


async def main():
    result = await Runner.run(agent, input="Please give an in-depth analysis of Interstellar")
    review = result.final_output

    # Basic information
    print(f"{review.title} ({review.year})")
    print(f"Director: {review.director}")
    print(f"Overall rating: {review.overall_rating}/10")
    print(f"Assessment: {review.summary}")
    print(f"Tags: {', '.join(review.tags)}")

    # Character reviews (nested list)
    print("\n--- Character reviews ---")
    for char in review.characters:
        print(f"  {char.name} ({char.actor}): {char.rating}/10 - {char.comment}")

    # Technical review (nested object)
    print("\n--- Technical review ---")
    print(f"  Cinematography: {review.technical.cinematography}/10")
    print(f"  Soundtrack: {review.technical.soundtrack}/10")
    print(f"  Editing: {review.technical.editing}/10")


if __name__ == "__main__":
    asyncio.run(main())

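The expected output looks something like:

Interstellar (2014)
Director: Christopher Nolan
Overall rating: 9.3/10
Assessment: An epic that blends hard science fiction with human emotion, exploring love and time within a grand cosmic narrative.
Tags: sci-fi, space, mind-bending, family

--- Character reviews ---
  Cooper (Matthew McConaughey): 9.5/10 - Fuses fatherly love with the spirit of exploration
  Murph (Jessica Chastain): 9.0/10 - A moving journey from stubborn girl to determined scientist

--- Technical review ---
  Cinematography: 9.5/10
  Soundtrack: 9.8/10
  Editing: 9.2/10

Key points about nested models:

A sub-model is also a BaseModel and can be referenced by the main model
list[SubModel] is supported; the AI returns a list of objects
Pydantic itself puts no limit on nesting depth, but model ability is limited (and the provider's structured-output feature may impose its own cap), so keeping it to 3 levels or fewer is advisable
The more complex the structure, the more clearly Field(description=...) needs to be written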
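The points above can be checked without calling any model. This sketch assumes Pydantic v2 and uses a trimmed-down version of the review models; the payload dictionary is invented and stands in for the parsed model reply.

```python
from pydantic import BaseModel, Field


class CharacterReview(BaseModel):
    name: str = Field(description="Character name")
    rating: float = Field(description="Characterization rating, 1-10")


class TechnicalReview(BaseModel):
    soundtrack: float = Field(description="Soundtrack rating, 1-10")


class DetailedMovieReview(BaseModel):
    title: str
    characters: list[CharacterReview]
    technical: TechnicalReview
    tags: list[str]


# Hypothetical data shaped like a model reply.
payload = {
    "title": "Interstellar",
    "characters": [
        {"name": "Cooper", "rating": 9.5},
        {"name": "Murph", "rating": 9.0},
    ],
    "technical": {"soundtrack": 9.8},
    "tags": ["sci-fi", "space"],
}

review = DetailedMovieReview.model_validate(payload)

# Nested dicts have become typed sub-model instances.
print(type(review.characters[0]).__name__)
print(review.technical.soundtrack)

# Ordinary Python works on the nested list.
best = max(review.characters, key=lambda c: c.rating)
print(best.name)

# Each sub-model is defined once in the schema and referenced from the main model.
defs = sorted(DetailedMovieReview.model_json_schema()["$defs"])
print(defs)
```

Every extra level of nesting adds another entry to that schema, which is one practical reason to keep the hierarchy shallow.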

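Dynamic instructions: a function instead of a string for instructions

So far our instructions have all been hard-coded strings. In real scenarios, though, instructions often need to be generated dynamically, for example to adjust the Agent's behavior to the current user's identity, the time, or their preferences.

The SDK lets you set instructions to a function. The function receives two arguments, a RunContextWrapper and the Agent, and returns a string.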

import asyncio
from dataclasses import dataclass
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, RunContextWrapper, set_tracing_disabled

set_tracing_disabled(True)

client = AsyncOpenAI(
    base_url="http://localhost:8317/v1",
    api_key="sk-12345678",
)
model = OpenAIChatCompletionsModel(model="gpt-5.2", openai_client=client)


# Custom context: holds user-related information
@dataclass
class UserContext:
    user_name: str
    language: str
    style: str  # "formal" or "casual"


# Output structure
class Greeting(BaseModel):
    message: str = Field(description="Greeting message")
    tip_of_the_day: str = Field(description="Tip of the day")


# Dynamic instructions function: builds a different system prompt depending on the context
def dynamic_instructions(ctx: RunContextWrapper[UserContext], agent: Agent) -> str:
    user = ctx.context
    style_desc = "formal and polite" if user.style == "formal" else "relaxed and lively"
    return (
        f"You are a greeting assistant. The current user is {user.user_name}. "
        f"Reply in {user.language}, in a {style_desc} style. "
        "Output strictly in the specified JSON format."
    )


agent = Agent[UserContext](
    name="Dynamic Greeting Agent",
    instructions=dynamic_instructions,  # pass the function, not a string
    model=model,
    output_type=Greeting,
)


async def main():
    # Scenario 1: formal style
    formal_ctx = UserContext(user_name="Mr. Zhang", language="Chinese", style="formal")
    result = await Runner.run(
        agent,
        input="Hello",
        context=formal_ctx,
    )
    greeting = result.final_output
    print(f"[formal] {greeting.message}")
    print(f"[tip] {greeting.tip_of_the_day}\n")

    # Scenario 2: casual style
    casual_ctx = UserContext(user_name="Xiao Ming", language="Chinese", style="casual")
    result = await Runner.run(
        agent,
        input="Hey",
        context=casual_ctx,
    )
    greeting = result.final_output
    print(f"[casual] {greeting.message}")
    print(f"[tip] {greeting.tip_of_the_day}")


if __name__ == "__main__":
    asyncio.run(main())

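The expected output looks something like this (shown in translation; with language="Chinese" the Agent replies in Chinese):

[formal] Hello Mr. Zhang, it is a pleasure to assist you.
[tip] Reading for 30 minutes every day helps deepen your thinking.

[casual] Hey Xiao Ming! How is your day going?
[tip] When you get tired, stand up and move around; your neck will thank you!

Key points about dynamic instructions:

The function signature is (RunContextWrapper[T], Agent) -> str; an async version is also supported
RunContextWrapper wraps your custom context object, which you access through .context
The context object is passed in through the context parameter of Runner.run()
The Agent's generic parameter, Agent[UserContext], tells the type checker what type the context is
Dynamic instructions and structured output can be combined without conflict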
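Because an instructions function is ordinary Python, it can be exercised without running an Agent at all. The sketch below shows the async variant mentioned in the list; FakeWrapper is a hypothetical stand-in that exposes only .context, and the asyncio.sleep(0) call is a placeholder for whatever lookup a real function might await.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class UserContext:
    user_name: str
    language: str
    style: str  # "formal" or "casual"


@dataclass
class FakeWrapper:
    """Hypothetical stand-in exposing only .context, like RunContextWrapper."""
    context: UserContext


async def async_instructions(ctx: FakeWrapper, agent: Any) -> str:
    # Placeholder for an awaited lookup, e.g. user preferences from a database.
    await asyncio.sleep(0)
    user = ctx.context
    style_desc = "formal and polite" if user.style == "formal" else "relaxed and lively"
    return (
        f"You are a greeting assistant. The current user is {user.user_name}. "
        f"Reply in {user.language}, in a {style_desc} style."
    )


# The agent argument is unused by this function, so None is enough here.
formal_prompt = asyncio.run(
    async_instructions(FakeWrapper(UserContext("Mr. Zhang", "Chinese", "formal")), None)
)
casual_prompt = asyncio.run(
    async_instructions(FakeWrapper(UserContext("Xiao Ming", "Chinese", "casual")), None)
)

print(formal_prompt)
print(casual_prompt)
```

Checking the generated prompt this way is cheap and catches mistakes in the branching logic before any tokens are spent.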

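Complete runnable example: a smart resume parser

Let's tie together what we have learned so far: nested models + dynamic instructions + structured output.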

import asyncio
from dataclasses import dataclass
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, RunContextWrapper, set_tracing_disabled

set_tracing_disabled(True)

client = AsyncOpenAI(
    base_url="http://localhost:8317/v1",
    api_key="sk-12345678",
)
model = OpenAIChatCompletionsModel(model="gpt-5.2", openai_client=client)


# ===== Define the nested output structure =====

class WorkExperience(BaseModel):
    company: str = Field(description="Company name")
    position: str = Field(description="Position")
    duration: str = Field(description="Period of employment, e.g. 2020.03-2023.06")
    highlights: list[str] = Field(description="Work highlights, 1-3 items")


class Education(BaseModel):
    school: str = Field(description="School name")
    major: str = Field(description="Major")
    degree: str = Field(description="Degree, e.g. bachelor's, master's, doctorate")
    graduation_year: int = Field(description="Graduation year")


class ResumeAnalysis(BaseModel):
    name: str = Field(description="Candidate name")
    years_of_experience: int = Field(description="Years of work experience")
    skills: list[str] = Field(description="List of core skills")
    education: list[Education] = Field(description="Education history")
    experience: list[WorkExperience] = Field(description="Work history")
    overall_assessment: str = Field(description="Overall assessment, two or three sentences")
    match_score: float = Field(description="Match with the target position, 0-100")


# ===== Custom context =====

@dataclass
class RecruitContext:
    target_position: str  # target position
    required_skills: list[str]  # required skills
    min_experience: int  # minimum years of experience


# ===== Dynamic instructions =====

def recruiter_instructions(ctx: RunContextWrapper[RecruitContext], agent: Agent) -> str:
    req = ctx.context
    skills_str = ", ".join(req.required_skills)
    return (
        "You are a professional HR resume analyst. "
        f"The position currently being filled is: {req.target_position}. "
        f"Required skills: {skills_str}. "
        f"Minimum work experience: {req.min_experience} years. "
        "Analyze the candidate's resume against these requirements and give a match score. "
        "Output strictly in the specified JSON format."
    )


# ===== Create the Agent =====

resume_agent = Agent[RecruitContext](
    name="Resume Analysis Agent",
    instructions=recruiter_instructions,
    model=model,
    output_type=ResumeAnalysis,
)


# ===== Run =====

async def main():
    # Simulated hiring requirements
    context = RecruitContext(
        target_position="Senior Python Backend Engineer",
        required_skills=["Python", "FastAPI", "PostgreSQL", "Redis", "Docker"],
        min_experience=5,
    )

    # Simulated resume text
    resume_text = """
    Zhang Wei, 8 years of Python development experience.

    Education:
    - Zhejiang University, Computer Science and Technology, master's degree, graduated 2016

    Work experience:
    1. ByteDance (2020.03 - present): Senior Backend Engineer
       - Designed the backend architecture of the recommendation system, handling 100M+ requests per day
       - Led the microservice split, breaking a monolith into 20+ microservices
       - Replaced Flask with FastAPI, cutting API response time by 40%

    2. Meituan (2016.07 - 2020.02): Backend Engineer
       - Developed and maintained the merchant-side order system
       - Optimized slow PostgreSQL queries, cutting P99 latency of core APIs by 60%
       - Built a Redis-based distributed caching solution

    Skills: Python, FastAPI, Django, PostgreSQL, Redis, Docker, Kubernetes, gRPC
    """

    result = await Runner.run(
        resume_agent,
        input=resume_text,
        context=context,
    )

    analysis = result.final_output

    # Print the analysis
    print(f"Candidate: {analysis.name}")
    print(f"Years of experience: {analysis.years_of_experience}")
    print(f"Core skills: {', '.join(analysis.skills)}")
    print(f"Match score: {analysis.match_score}/100")
    print(f"\nOverall assessment: {analysis.overall_assessment}")

    print("\n--- Education ---")
    for edu in analysis.education:
        print(f"  {edu.school} | {edu.major} | {edu.degree} | graduated {edu.graduation_year}")

    print("\n--- Work experience ---")
    for exp in analysis.experience:
        print(f"  {exp.company} | {exp.position} | {exp.duration}")
        for h in exp.highlights:
            print(f"    - {h}")


if __name__ == "__main__":
    asyncio.run(main())

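This example uses all of the following at once:

Nested models: ResumeAnalysis nests lists of Education and WorkExperience
Dynamic instructions: the prompt is generated from the hiring context (target position, required skills, minimum experience)
Structured output: the result the AI returns can be used directly as a Python object

Summary

In this chapter you learned the advanced uses of structured output:

output_type + BaseModel: make the Agent return a structured object instead of free text
final_output_as(): convert safely when the type is uncertain, with optional runtime type checking
Nested models: build complex data structures from sub-BaseModels, including nested lists
Dynamic instructions: instructions can be a function that generates the system prompt from context
Combined use: dynamic instructions + nested models + structured output, for real business scenarios

In one sentence: output_type turns the Agent from "chatting" into "getting work done", and dynamic instructions turn it from "following a fixed script" into "adapting to the situation".

Coming up next

The Agent can now produce formatted output and adjust its behavior dynamically. But as Agent systems grow more complex, how do you track down problems when something goes wrong?

In the next chapter we cover tracing and debugging (Tracing & Debug): giving the Agent a health check and finding out what it actually did.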
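Once the result is a typed object, ordinary program logic can consume it. As one possible follow-up, the sketch below cross-checks a candidate against the hiring requirements in plain Python rather than trusting match_score alone. The screen function and ScreeningResult class are hypothetical helpers, and the sample values mirror the resume used in this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ScreeningResult:
    missing_skills: list[str] = field(default_factory=list)
    meets_experience: bool = True

    @property
    def passed(self) -> bool:
        # Pass only when nothing is missing and the experience bar is met.
        return self.meets_experience and not self.missing_skills


def screen(
    skills: list[str],
    years_of_experience: int,
    required_skills: list[str],
    min_experience: int,
) -> ScreeningResult:
    """Compare extracted resume fields with the requirements, ignoring case."""
    have = {s.lower() for s in skills}
    missing = [s for s in required_skills if s.lower() not in have]
    return ScreeningResult(missing, years_of_experience >= min_experience)


required = ["Python", "FastAPI", "PostgreSQL", "Redis", "Docker"]

# Values as they might appear in analysis.skills and analysis.years_of_experience.
strong = screen(
    ["Python", "FastAPI", "Django", "PostgreSQL", "Redis", "Docker", "Kubernetes", "gRPC"],
    8,
    required,
    5,
)
weak = screen(["Python", "Flask", "Redis"], 3, required, 5)

print(strong.passed, strong.missing_skills)
print(weak.passed, weak.missing_skills)
```

A deterministic check like this complements the model's own score: the Agent extracts the fields, and the program decides what to do with them.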
