create_openai_data_generator raises OutputParserException: the result must contain function_call
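When using LangChain with OpenAI's ChatGPT models (such as GPT-3.5/4/4.1) and with DeepSeek to generate structured data, I ran into the error KeyError: 'function_call'. The error indicates that when the with_structured_output method was called, the model did not return a function_call field, so structured data generation failed. Even after trying several models and configurations the success rate stayed low, possibly because of langchain…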
KeyError: 'function_call'
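The result is expected to contain a function_call field.
Failure cases
I tried gpt 3.5 / 4 / 4.1 and deepseek. All of them had the problem, and the success rate was very low.
My impression is that the experimental package is unstable.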
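To make the error message easier to read, here is a minimal sketch of how a KeyError on 'function_call' can turn into a parser exception. It assumes the parser looks up a legacy function_call entry in the extra fields of the model's reply; I have not confirmed that this is exactly what the experimental generator does. The names OutputParserError and parse_function_call are hypothetical stand-ins that use plain dicts instead of LangChain message objects.

```python
import copy
import json


class OutputParserError(Exception):
    """Hypothetical stand-in for a parser exception; not LangChain's own class."""


def parse_function_call(additional_kwargs: dict) -> dict:
    """Return the decoded arguments of a legacy function_call entry."""
    try:
        call = copy.deepcopy(additional_kwargs["function_call"])
    except KeyError as exc:
        # str(KeyError('function_call')) is "'function_call'", quotes included.
        raise OutputParserError(f"Could not parse function call: {exc}") from exc
    return json.loads(call["arguments"])


# A reply in the legacy function-calling shape parses fine.
legacy_reply = {
    "function_call": {"name": "MedicalBilling", "arguments": '{"patient_id": 123456}'}
}
print(parse_function_call(legacy_reply))

# A reply that only carries tool_calls (or plain text) has no function_call key.
tool_reply = {"tool_calls": [{"function": {"name": "MedicalBilling", "arguments": "{}"}}]}
try:
    parse_function_call(tool_reply)
except OutputParserError as err:
    print(err)
```

If this assumption holds, any model or endpoint that answers with tool_calls or with plain text would fail in this way, which would fit the low success rate across models.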
Below is my own rewrite, which drops create_openai_data_generator from the langchain_experimental package.
import os
from typing import List

from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate
from langchain_experimental.tabular_synthetic_data.prompts import SYNTHETIC_FEW_SHOT_PREFIX, SYNTHETIC_FEW_SHOT_SUFFIX
from langchain_openai import ChatOpenAI
from pydantic.v1 import Field, BaseModel

# Local proxy settings
os.environ['http_proxy'] = '127.0.0.1:7777'
os.environ['https_proxy'] = '127.0.0.1:7777'

# LangSmith tracing settings
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "LangchainDemo"
os.environ["LANGCHAIN_API_KEY"] = ''

# Chatbot example
# Create the model
model = ChatOpenAI(model='deepseek-r1', api_key="sk-",
                   base_url="", temperature=1)


# Schema for a single billing record
class MedicalBilling(BaseModel):
    patient_id: int = Field(..., description="Patient ID, 6 digits")
    patient_name: str = Field(..., description="Patient name, a Chinese name")
    diagnosis_code: str = Field(..., description="Diagnosis code, in ICD-10 format")
    procedure_code: str = Field(..., description="Medical procedure code, 5 digits")
    total_charge: float = Field(..., description="Total charge (USD), 1 decimal place")
    insurance_claim_amount: float = Field(..., description="Insurance claim amount (USD), 1 decimal place")


# Wrapper so the model returns a list of records in one call
class MedicalBillingList(BaseModel):
    medical_billings: List[MedicalBilling]


examples = [
    {
        "example": "Patient ID: 123456, Patient Name: 张娜, Diagnosis Code: J20.9, Procedure Code: 99203, Total Charge: $500, Insurance Claim Amount: $350"
    },
    {
        "example": "Patient ID: 789012, Patient Name: 王兴鹏, Diagnosis Code: M54.5, Procedure Code: 99213, Total Charge: $150, Insurance Claim Amount: $120"
    },
    {
        "example": "Patient ID: 345678, Patient Name: 刘晓辉, Diagnosis Code: E11.9, Procedure Code: 99214, Total Charge: $300, Insurance Claim Amount: $250"
    },
]

# Compose the prompt templates
example_prompt = PromptTemplate(
    input_variables=['example'],
    template="{example}"
)
prompt_template = FewShotPromptTemplate(
    prefix=SYNTHETIC_FEW_SHOT_PREFIX,
    suffix=SYNTHETIC_FEW_SHOT_SUFFIX,
    examples=examples,
    example_prompt=example_prompt,
    input_variables=['subject', 'extra']
)

# Then use with_structured_output and explicitly request the function_calling method
chain = prompt_template | model.with_structured_output(MedicalBillingList, method="function_calling")

for i in range(2):
    result: MedicalBillingList = chain.invoke(
        {'extra': "Follow the examples to produce a list of simulated records; the list length is 10",
         'subject': "hospital billing table"})
    print(result)

Check the output
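The field descriptions (6-digit ID, ICD-10 code, 5-digit procedure code) are only hints to the model and nothing enforces them, so it can help to check the returned records in code as well as by eye. The following is a rough sketch of such a check, not part of the rewrite above. BillingRecord, ICD10_PATTERN and find_problems are hypothetical names, and the ICD-10 pattern is a simplified assumption (one letter, two digits, optional decimal part) rather than a full validator.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class BillingRecord:
    """Hypothetical plain-Python mirror of the MedicalBilling fields."""
    patient_id: int
    patient_name: str
    diagnosis_code: str
    procedure_code: str
    total_charge: float
    insurance_claim_amount: float


# Simplified assumption of the ICD-10 shape, e.g. J20.9 or M54.5.
ICD10_PATTERN = re.compile(r"[A-Z]\d{2}(\.\d{1,4})?")


def find_problems(record) -> List[str]:
    """Return the format rules from the field descriptions that a record breaks."""
    problems = []
    if not 100000 <= record.patient_id <= 999999:
        problems.append("patient_id is not 6 digits")
    if not ICD10_PATTERN.fullmatch(record.diagnosis_code):
        problems.append("diagnosis_code does not look like ICD-10")
    if not re.fullmatch(r"\d{5}", record.procedure_code):
        problems.append("procedure_code is not 5 digits")
    return problems


good = BillingRecord(123456, "张娜", "J20.9", "99203", 500.0, 350.0)
bad = BillingRecord(42, "王兴鹏", "20J", "9920", 150.0, 120.0)
print(find_problems(good))
print(find_problems(bad))
```

Because find_problems only reads attributes, it should also accept each item of result.medical_billings from the chain above.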
