After AI Healthcare, the Next Life-Science Track May Be Medical World Models

damopa · Published 2026-05-21 18:28:12

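Using an engineering framework to break down the system architecture from risk prediction to intervention simulation

Over the past decade, the main thread of medical AI has been "seeing disease."

Imaging AI reads lung nodules, fundus images, and lesions; clinical risk models predict cardiovascular events, diabetes, and readmission probability; large language models have begun helping clinicians summarize medical records, explain reports, and generate medical text.

These capabilities matter, but most of them still operate at one level:

judging what the situation is now, or predicting what may happen in the future.

In the past few years, AI drug discovery has become the main arena for capital and industry attention. AI is used for target discovery, molecule generation, protein structure prediction, drug screening, and clinical trial optimization.

This takes AI from "diagnosing disease" a step further into "discovering drugs."

Looking further ahead, though, the next-generation infrastructure for life-science AI may be neither just disease recognition nor just molecule discovery. It may be a system that can represent an individual's state, encode intervention actions, simulate state changes, record the evidence chain, and keep calibrating against feedback.

That direction is the medical world model.

In other words, the next step for medical AI is not only to answer:

What is this person's condition right now?

How high is the future risk?

but to go on and answer:

If a given intervention is taken, how might this person's state change? What is the basis for that projection? How will it be verified and calibrated afterward?

This article uses an engineering framework to break the "medical world model" down into a system structure that developers, medical AI practitioners, and life-science investors can all follow.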

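1. The first stage of medical AI: recognition and prediction

Many medical AI systems are essentially doing three kinds of tasks:

Recognition
- Is this image abnormal?
- Does this lesion warrant further examination?
Classification
- Which type does this case belong to?
- Which disease subtype does this molecular signature most resemble?
Prediction
- How high is this person's risk of a given outcome?
- What is this patient's readmission probability?

A typical risk prediction model can be abstracted as: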

risk = predict_risk(patient_state)

For example:

patient_state = {
    "age": 52,
    "bmi": 29.1,
    "fasting_glucose": 6.2,
    "hba1c": 6.0,
    "blood_pressure": "138/86",
    "family_history": ["type_2_diabetes"],
    "sleep_duration": 5.8
}

risk = predict_diabetes_risk(patient_state)

The output might be:

{
  "risk_level": "high",
  "estimated_5y_risk": 0.32
}

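This model answers:

Is this person's future risk high?

That is valuable, but medical management does not stop there.

The real questions come next:

What should be done first?
Which takes priority: diet, exercise, sleep, medication, or follow-up?
Which intervention best fits the current understanding of the mechanism?
Which markers should be checked, and after how many weeks?
If there is no improvement, is the problem in how the action was carried out, or in the mechanistic judgment?

At this point an ordinary prediction model is no longer enough,

because medicine has moved on to a different question:

Action: the intervention.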

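2. What problem does a medical world model solve?

A medical world model is not a bigger medical chatbot, nor a system that automatically writes treatment plans.

It is closer to an auditable simulation system organized around five objects:

State       the individual's current state
Action      a well-defined intervention action
Transition  the hypothesized state transition after the action
Evidence    the evidence chain supporting the simulation
Feedback    real-world feedback and the next round of updates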

A prediction model is typically:

state -> outcome

A medical world model is instead:

state + action + evidence -> transition hypothesis -> feedback update

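That is:

A prediction model asks "what may happen in the future";

a medical world model asks "if a given action is taken, how might the future state change."

This is the key leap from risk prediction to intervention simulation.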

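3. State: representing the individual's state

The first step in a medical world model is not training a bigger model but defining the state.

A simplified PatientState could look like this: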

from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class PatientState:
    demographics: Dict
    clinical_markers: Dict
    symptoms: List[str]
    lifestyle: Dict
    medications: List[str]
    history: Dict
    omics: Optional[Dict] = None
    wearable: Optional[Dict] = None

Example:

patient_state = PatientState(
    demographics={
        "age": 52,
        "sex": "unspecified"
    },
    clinical_markers={
        "bmi": 29.1,
        "fasting_glucose": 6.2,
        "hba1c": 6.0,
        "triglycerides": 2.1,
        "hdl_c": 0.95,
        "blood_pressure": "138/86"
    },
    symptoms=[
        "fatigue",
        "post_meal_sleepiness"
    ],
    lifestyle={
        "sleep_hours": 5.8,
        "exercise_frequency_per_week": 1,
        "diet_pattern": "high_refined_carbohydrate",
        "stress_level": "high"
    },
    medications=[],
    history={
        "family_history": ["type_2_diabetes"],
        "previous_diagnosis": []
    }
)

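The point here is not to have as many fields as possible, but for the state to support the action and feedback steps that follow.

A state representation that no intervention can reference and no feedback can update is of limited use to a medical world model.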
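
One way to make this concrete is to check that every marker an action plans to monitor already has a baseline value in the state. The sketch below is only an illustration: `missing_baselines` is a hypothetical helper that does not appear in the framework above, and it takes plain dicts and lists instead of the PatientState class so that it runs on its own.

```python
from typing import Dict, List


def missing_baselines(clinical_markers: Dict[str, object],
                      monitoring_markers: List[str]) -> List[str]:
    """Return the monitored markers that have no baseline value in the state.

    Hypothetical helper for illustration: a marker without a baseline
    cannot be compared against follow-up feedback later.
    """
    return [m for m in monitoring_markers if clinical_markers.get(m) is None]


# Baseline values taken from the example state; weight and waist are absent there.
baseline = {"bmi": 29.1, "fasting_glucose": 6.2, "hba1c": 6.0}
to_monitor = ["fasting_glucose", "hba1c", "weight", "waist_circumference"]

gaps = missing_baselines(baseline, to_monitor)
print(gaps)  # ['weight', 'waist_circumference']
```

A non-empty result suggests the state schema needs extending before the action can be evaluated with feedback.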

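4. Action: encoding interventions as objects

Prediction models usually do not need an action.

A medical world model, however, must define actions explicitly.

For example, "improve your lifestyle" is not a valid action: it is too vague to execute, record, or review.

A better approach is to write the action as a structured object: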

@dataclass
class InterventionAction:
    action_id: str
    category: str
    description: str
    target_mechanism: List[str]
    intensity: str
    duration_weeks: int
    monitoring_markers: List[str]
    safety_notes: List[str]

Example:

action = InterventionAction(
    action_id="nutrition_low_glycemic_8w",
    category="nutrition",
    description="8-week low-glycemic dietary adjustment with reduced refined carbohydrates",
    target_mechanism=[
        "postprandial_glucose_variability",
        "insulin_resistance",
        "weight_management"
    ],
    intensity="moderate",
    duration_weeks=8,
    monitoring_markers=[
        "fasting_glucose",
        "hba1c",
        "weight",
        "waist_circumference",
        "postprandial_glucose"
    ],
    safety_notes=[
        "not a medical prescription",
        "review with clinician if diabetes medication is used",
        "monitor hypoglycemia risk when relevant"
    ]
)

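This step is critical.

A medical world model does not simply "generate advice"; it makes every intervention action:

describable;
executable;
recordable;
auditable;
open to feedback.

This is also what fundamentally separates it from ordinary medical Q&A AI.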
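
The properties listed above can be turned into a simple structural check. The following is a minimal sketch, assuming actions arrive as plain dicts with the same field names as InterventionAction; `action_problems` and its rules are hypothetical and are not a clinical standard.

```python
from typing import Dict, List


def action_problems(action: Dict) -> List[str]:
    """List reasons an action is too vague to execute, record, or review.

    The rules are illustrative assumptions, not a validated checklist.
    """
    problems = []
    if not action.get("action_id"):
        problems.append("missing action_id: cannot be recorded or audited")
    duration = action.get("duration_weeks")
    if not isinstance(duration, int) or duration <= 0:
        problems.append("missing duration: cannot be scheduled or reviewed")
    if not action.get("monitoring_markers"):
        problems.append("no monitoring markers: cannot receive feedback")
    if not action.get("target_mechanism"):
        problems.append("no target mechanism: rationale cannot be checked")
    return problems


vague = {"description": "improve lifestyle"}
specific = {
    "action_id": "nutrition_low_glycemic_8w",
    "duration_weeks": 8,
    "monitoring_markers": ["fasting_glucose", "hba1c"],
    "target_mechanism": ["insulin_resistance"],
}

print(len(action_problems(vague)))   # 4
print(action_problems(specific))     # []
```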

5. Transition: a state-transition hypothesis, not an efficacy promise

In an engineering implementation, it is natural to write:

next_state = model.predict_next_state(state, action)

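In medicine, though, this is easy to misread,

because it seems to say:

the model can predict individual treatment efficacy.

That is unsound both scientifically and from a compliance standpoint.

A more appropriate name is: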

transition_hypothesis = estimate_transition_tendency(state, action)

that is:

estimating the tendency of a state transition.

A TransitionHypothesis could be designed like this:

@dataclass
class TransitionHypothesis:
    expected_direction: Dict
    mechanism_rationale: List[str]
    uncertainty_level: str
    time_window_weeks: int
    assumptions: List[str]

Example:

transition = TransitionHypothesis(
    expected_direction={
        "fasting_glucose": "decrease_possible",
        "postprandial_glucose": "decrease_possible",
        "weight": "slight_decrease_possible",
        "energy_level": "may_improve"
    },
    mechanism_rationale=[
        "lower refined carbohydrate intake may reduce postprandial glucose excursion",
        "weight reduction may improve insulin sensitivity",
        "improved dietary pattern may reduce metabolic stress"
    ],
    uncertainty_level="moderate",
    time_window_weeks=8,
    assumptions=[
        "adequate adherence",
        "no major medication change",
        "baseline data quality is acceptable",
        "no unrecognized endocrine disorder"
    ]
)

Note that it does not say:

will decrease
will cure
will reverse

but rather:

decrease_possible
may_improve
transition tendency

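This is a very important safety boundary for a medical world model.

What a medical world model produces is not a deterministic efficacy promise, but:

a state-transition hypothesis constrained by mechanism.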
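
This wording boundary can be enforced mechanically before a hypothesis is stored. The sketch below assumes direction labels are snake_case strings like the ones in the example; `overconfident_entries` and the term list are hypothetical, and a real system would need a reviewed vocabulary rather than this short list.

```python
from typing import Dict, List

# Illustrative term list (an assumption), matched as word prefixes.
DETERMINISTIC_TERMS = ("will", "cure", "reverse", "guarantee")


def overconfident_entries(expected_direction: Dict[str, str]) -> List[str]:
    """Return the markers whose direction label reads like a promise."""
    flagged = []
    for marker, label in expected_direction.items():
        # "will_decrease" -> ["will", "decrease"]
        words = label.lower().replace("_", " ").split()
        if any(w.startswith(t) for w in words for t in DETERMINISTIC_TERMS):
            flagged.append(marker)
    return flagged


draft = {
    "fasting_glucose": "decrease_possible",
    "weight": "will_decrease",
    "energy_level": "may_improve",
    "insulin_resistance": "reversed",
}

print(overconfident_entries(draft))  # ['weight', 'insulin_resistance']
```

Prefix matching is crude (it would also flag a word like "willing"), so the result is best treated as a prompt for human rewording rather than an automatic rejection.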

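6. Evidence: the evidence-chain object

A medical world model cannot output only a transition; it must also explain why.

A simple evidence-chain object could be designed like this: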

@dataclass
class EvidenceItem:
    source_type: str
    description: str
    strength: str
    url_or_reference: Optional[str] = None

@dataclass
class EvidenceChain:
    items: List[EvidenceItem]
    overall_strength: str
    limitations: List[str]

Example:

evidence_chain = EvidenceChain(
    items=[
        EvidenceItem(
            source_type="clinical_guideline",
            description="Lifestyle modification is commonly recommended for metabolic risk management.",
            strength="high"
        ),
        EvidenceItem(
            source_type="mechanistic_evidence",
            description="Reduced refined carbohydrate intake may lower postprandial glucose excursions.",
            strength="moderate"
        ),
        EvidenceItem(
            source_type="individual_context",
            description="Patient reports high refined carbohydrate intake and low exercise frequency.",
            strength="contextual"
        )
    ],
    overall_strength="moderate",
    limitations=[
        "individual response may vary",
        "adherence is uncertain",
        "not a substitute for clinical evaluation"
    ]
)

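This step corresponds to the auditability of a medical world model.

If the model cannot explain:

where the evidence comes from;
how strong the evidence is;
where it applies;
what its limitations are;

then it should not be treated as a medical simulation system.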
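
These four questions can be checked structurally before an evidence chain is accepted. The sketch below redefines EvidenceItem so that it runs on its own; `evidence_gaps` is a hypothetical helper, and it only checks that the fields are filled in, not that the evidence itself is sound.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvidenceItem:
    source_type: str
    description: str
    strength: str
    url_or_reference: Optional[str] = None


def evidence_gaps(items: List[EvidenceItem], limitations: List[str]) -> List[str]:
    """Return the audit questions that the evidence chain cannot answer."""
    gaps = []
    if not items:
        gaps.append("no evidence items: source is unknown")
    for i, item in enumerate(items):
        if not item.source_type:
            gaps.append(f"item {i}: missing source_type")
        if not item.strength:
            gaps.append(f"item {i}: missing strength")
    if not limitations:
        gaps.append("no limitations stated: applicability is unclear")
    return gaps


items = [
    EvidenceItem("clinical_guideline", "Lifestyle modification is commonly recommended.", "high"),
    EvidenceItem("", "Unattributed claim.", ""),
]

print(evidence_gaps(items, limitations=[]))
```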

7. Feedback: retesting and long-term updates

A medical world model is not a one-shot output.

It must allow feedback updates.

@dataclass
class FollowUpFeedback:
    timepoint_weeks: int
    observed_markers: Dict
    adherence: Dict
    symptoms_change: Dict
    adverse_events: List[str]

Example:

feedback = FollowUpFeedback(
    timepoint_weeks=8,
    observed_markers={
        "fasting_glucose": 5.8,
        "hba1c": 5.8,
        # weight and waist_circumference are changes from baseline, not absolute values
        "weight": -2.1,
        "waist_circumference": -3.0
    },
    adherence={
        "diet": "medium",
        "exercise": "low",
        "sleep": "unchanged"
    },
    symptoms_change={
        "fatigue": "slightly_improved",
        "post_meal_sleepiness": "improved"
    },
    adverse_events=[]
)

Then update the record with the feedback:

def update_state_with_feedback(
    previous_state: PatientState,
    action: InterventionAction,
    transition: TransitionHypothesis,
    feedback: FollowUpFeedback
) -> Dict:
    # Keep what was expected next to what was observed so the comparison can be audited
    audit_log = {
        "previous_state": previous_state,
        "action": action,
        "expected_transition": transition,
        "observed_feedback": feedback,
        "interpretation": None,
        "next_step": None
    }

    # Illustrative rule only: partial adherence limits what the result says about the action
    if feedback.adherence.get("diet") == "medium":
        audit_log["interpretation"] = "Partial improvement observed; adherence may limit effect size."
        audit_log["next_step"] = "Review action intensity and adherence barriers."
    else:
        audit_log["interpretation"] = "Feedback should be interpreted with caution."
        audit_log["next_step"] = "Collect more context before updating intervention plan."

    return audit_log

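The key to a medical world model is not getting a single prediction right, but whether it can keep calibrating:

observe -> propose actions -> simulate -> review -> execute -> monitor -> update

This is also what investors should watch:

The medical AI that holds real value in the future is not a one-off tool but a platform that can form a long-term feedback loop.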
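
Calibration starts with comparing what the hypothesis expected against what was observed. The sketch below is one possible shape for that comparison; `compare_with_hypothesis` is a hypothetical helper, it assumes absolute marker values at baseline and follow-up, and it only understands labels containing "decrease".

```python
from typing import Dict


def compare_with_hypothesis(expected_direction: Dict[str, str],
                            baseline: Dict[str, float],
                            observed: Dict[str, float]) -> Dict[str, str]:
    """Label each marker as consistent, inconsistent, or not_assessable."""
    result = {}
    for marker, label in expected_direction.items():
        if marker not in baseline or marker not in observed:
            # Without both measurements there is nothing to compare.
            result[marker] = "not_assessable"
            continue
        delta = observed[marker] - baseline[marker]
        # Assumption: a label containing "decrease" expects a negative change.
        if "decrease" in label:
            result[marker] = "consistent" if delta < 0 else "inconsistent"
        else:
            result[marker] = "not_assessable"
    return result


expected = {
    "fasting_glucose": "decrease_possible",
    "postprandial_glucose": "decrease_possible",
}
comparison = compare_with_hypothesis(
    expected,
    baseline={"fasting_glucose": 6.2},
    observed={"fasting_glucose": 5.8},
)
print(comparison)
# {'fasting_glucose': 'consistent', 'postprandial_glucose': 'not_assessable'}
```

A "consistent" label does not show that the action caused the change; it only says the observation does not contradict the hypothesis.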

8. A minimal system workflow

The objects above can be combined into a minimal workflow:

def medical_world_model_loop(patient_id: str):
    # 1. Observe state
    state = observe_patient_state(patient_id)

    # 2. Generate candidate actions
    candidate_actions = generate_candidate_actions(state)

    # 3. Safety filter
    safe_actions = []
    for action in candidate_actions:
        if pass_safety_gate(state, action):
            safe_actions.append(action)

    # 4. Estimate transitions
    transition_candidates = []
    for action in safe_actions:
        transition = estimate_transition_tendency(state, action)
        evidence = build_evidence_chain(state, action, transition)

        transition_candidates.append({
            "action": action,
            "transition": transition,
            "evidence": evidence
        })

    # 5. Human-in-the-loop review: returns the chosen candidate dict
    #    (keys: "action", "transition", "evidence")
    selected_action = clinician_or_expert_review(transition_candidates)

    # 6. Execute and monitor
    feedback = collect_follow_up_feedback(patient_id, selected_action["action"])

    # 7. Update state and audit log
    updated_record = update_state_with_feedback(
        previous_state=state,
        action=selected_action["action"],
        transition=selected_action["transition"],
        feedback=feedback
    )

    return updated_record

Note step 5:

selected_action = clinician_or_expert_review(transition_candidates)

A medical world model should not bypass professionals and hand out treatment decisions directly.

A sounder positioning for the system is:

hypothesis generation + decision support + audit trail
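
To show what "the human decides" can look like in code, here is a minimal sketch of a review step. It is an assumption, not the workflow's actual review function: `review_candidates` is a hypothetical name, and the `choose` callback stands in for a real reviewer interface. The point is that the system never picks a candidate by itself and that "reject everything" is a valid outcome.

```python
from typing import Callable, Dict, List, Optional

Candidate = Dict[str, object]


def review_candidates(
    candidates: List[Candidate],
    choose: Callable[[List[Candidate]], Optional[int]],
) -> Optional[Candidate]:
    """Return the candidate the human reviewer picked, or None if all were rejected."""
    if not candidates:
        return None
    index = choose(candidates)  # the human decision happens here
    if index is None:
        return None
    if not 0 <= index < len(candidates):
        raise ValueError("reviewer selected an index that does not exist")
    return candidates[index]


candidates = [
    {"action": "nutrition_low_glycemic_8w", "transition": "T1", "evidence": "E1"},
    {"action": "sleep_management_8w", "transition": "T2", "evidence": "E2"},
]

approved = review_candidates(candidates, choose=lambda c: 0)
rejected = review_candidates(candidates, choose=lambda c: None)
print(approved["action"], rejected)  # nutrition_low_glycemic_8w None
```

Callers of such a function would need to handle the None case, for example by collecting more context instead of proceeding.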

9. Safety Gate: safety boundaries must come first

In medical settings, safety checks cannot be left to the end.

The system needs a safety_gate:

def pass_safety_gate(state: PatientState, action: InterventionAction) -> bool:
    # Example checks only. Not medical advice.
    contraindications = detect_contraindications(state, action)
    medication_conflicts = check_medication_conflicts(state, action)
    red_flags = detect_red_flags(state)

    if red_flags:
        return False

    if contraindications:
        return False

    if medication_conflicts:
        return False

    return True

Example:

def detect_red_flags(state: PatientState) -> List[str]:
    red_flags = []

    # Glucose is assumed to be in mmol/L, as in the example state; the cutoff is illustrative
    if state.clinical_markers.get("fasting_glucose", 0) > 13.9:
        red_flags.append("very_high_glucose_requires_clinical_evaluation")

    if "chest_pain" in state.symptoms:
        red_flags.append("chest_pain_requires_urgent_evaluation")

    return red_flags

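The core idea here is:

A model is not better the more automated it is; it is better the more controllable, auditable, and interruptible it is.

This is also an important way medical world models differ from ordinary generative AI applications.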
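
A gate that returns only True or False is controllable but not very auditable. One possible variation, sketched below, returns the reasons along with the verdict so that a blocked action can be explained later. `safety_gate_report` is a hypothetical helper; it assumes the three detector functions have already produced their lists of findings.

```python
from typing import Dict, List


def safety_gate_report(red_flags: List[str],
                       contraindications: List[str],
                       medication_conflicts: List[str]) -> Dict[str, object]:
    """Combine the three checks into a verdict plus the reasons behind it."""
    reasons = (
        [f"red_flag: {r}" for r in red_flags]
        + [f"contraindication: {c}" for c in contraindications]
        + [f"medication_conflict: {m}" for m in medication_conflicts]
    )
    # The gate passes only when no check produced a finding.
    return {"passed": not reasons, "reasons": reasons}


report = safety_gate_report(
    red_flags=["chest_pain_requires_urgent_evaluation"],
    contraindications=[],
    medication_conflicts=[],
)
print(report["passed"])   # False
print(report["reasons"])  # ['red_flag: chest_pain_requires_urgent_evaluation']
```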

10. Audit Log: a medical world model must leave a trace of its reasoning

A medical world model should leave an audit record for every simulation run:

@dataclass
class AuditLog:
    patient_id: str
    state_snapshot_id: str
    action_id: str
    transition_id: str
    evidence_chain_id: str
    reviewer: str
    decision: str
    uncertainty_level: str
    safety_notes: List[str]
    timestamp: str

Example:

audit_log = AuditLog(
    patient_id="P001",
    state_snapshot_id="S20260521",
    action_id="nutrition_low_glycemic_8w",
    transition_id="T20260521_001",
    evidence_chain_id="E20260521_001",
    reviewer="human_expert",
    decision="approved_for_health_management_context",
    uncertainty_level="moderate",
    safety_notes=[
        "not medical diagnosis",
        "not treatment prescription",
        "clinical review required if symptoms worsen"
    ],
    timestamp="2026-05-21T17:00:00+08:00"
)

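Without an audit log, the system will struggle to answer:

Why was this action recommended at the time?
Where did the evidence come from?
Which assumptions were later falsified?
Which feedback changed the next round of judgment?
When a deviation occurs, where should it be traced back to?

For medical AI to enter long-term health management, precision medicine, and longevity medicine, it cannot aim only for answers that "sound right"; it must also aim for a reasoning process that can be traced.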
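
For the trace to stay trustworthy, entries should be added but never edited. The sketch below shows one simple way to approach that with the standard library: each entry is serialized to one JSON line and appended. `AuditEntry` is a trimmed-down, hypothetical version of the AuditLog class, and `append_entry` works on an in-memory list; real storage, access control, and tamper protection are out of scope here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    patient_id: str
    action_id: str
    decision: str
    safety_notes: List[str]
    timestamp: str


def to_json_line(entry: AuditEntry) -> str:
    """Serialize one entry as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True, ensure_ascii=False)


def append_entry(log_lines: List[str], entry: AuditEntry) -> List[str]:
    """Return a new log with the entry appended; earlier lines are left untouched."""
    return log_lines + [to_json_line(entry)]


entry = AuditEntry(
    patient_id="P001",
    action_id="nutrition_low_glycemic_8w",
    decision="approved_for_health_management_context",
    safety_notes=["not medical diagnosis"],
    timestamp="2026-05-21T17:00:00+08:00",
)

log = append_entry([], entry)
print(json.loads(log[0])["action_id"])  # nutrition_low_glycemic_8w
```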

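11. Steerable world models: not controlling the body, but defining direction, actions, and feedback

An ordinary world model can simulate the future.

But what medicine really cares about is:

How, under the constraints of evidence and feedback, can a living system be moved in a better direction?

This is what a "steerable world model" means.

From an engineering standpoint, steerable does not mean "controlling the human body"; it means giving the system the following interface: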

@dataclass
class SteeringInterface:
    objective: Dict
    allowed_actions: List[str]  # names of the permitted action types
    safety_constraints: List[str]
    feedback_metrics: List[str]
    stop_conditions: List[str]

Example:

steering = SteeringInterface(
    objective={
        "primary": "improve_metabolic_resilience",
        "secondary": ["reduce_glucose_variability", "improve_energy_level"]
    },
    allowed_actions=[
        "nutrition_adjustment",
        "exercise_adjustment",
        "sleep_management",
        "clinical_referral_when_needed"
    ],
    safety_constraints=[
        "no medication change without clinician",
        "stop if red flags appear",
        "avoid unsupported intervention claims"
    ],
    feedback_metrics=[
        "fasting_glucose",
        "postprandial_glucose",
        "weight",
        "waist_circumference",
        "symptom_score"
    ],
    stop_conditions=[
        "adverse_event",
        "red_flag_symptom",
        "data_quality_insufficient"
    ]
)

This is the kind of "steerability" medical AI needs:

an objective;
actions;
boundaries;
feedback;
stop conditions;
human review.
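
Two of these elements, boundaries and stop conditions, are easy to express as checks. The sketch below assumes the allowed actions and stop conditions are plain strings as in the example; `is_allowed` and `triggered_stops` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Set


def is_allowed(action_type: str, allowed_actions: List[str]) -> bool:
    """Boundary check: only action types that were explicitly allowed may be proposed."""
    return action_type in allowed_actions


def triggered_stops(stop_conditions: List[str], observed_events: Set[str]) -> List[str]:
    """Return the stop conditions that match something observed in this round."""
    return [c for c in stop_conditions if c in observed_events]


allowed = ["nutrition_adjustment", "exercise_adjustment", "sleep_management"]
stops = ["adverse_event", "red_flag_symptom", "data_quality_insufficient"]

print(is_allowed("medication_change", allowed))          # False
print(triggered_stops(stops, {"red_flag_symptom"}))      # ['red_flag_symptom']
```

Any non-empty result from `triggered_stops` would pause the loop and hand the case to a human, rather than letting the system pick another action.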

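12. Why might this become the next life-science track?

From an investment perspective, medical world models deserve attention not because they coin yet another term, but because they may connect several directions already under development:

1. Connecting medical AI

Medical AI has so far handled recognition, classification, and prediction. Medical world models push it on to intervention simulation.

2. Connecting AI drug discovery

AI drug discovery focuses on molecules, targets, and drug development. Medical world models focus on how an individual's state may change once a drug or non-drug intervention is applied.

3. Connecting precision medicine

Precision medicine requires individual stratification and individualized decisions. Medical world models supply a structured framework of State, Action, Transition, Evidence, and Feedback.

4. Connecting longevity medicine

Longevity medicine is not a one-time diagnosis but long-term state management, which suits a long-running state-action-transition-feedback loop very well.

5. Connecting health management platforms

Future health management will not be a single test report but long-term tracking, retesting, feedback, and dynamic adjustment. Medical world models have a chance to become the underlying architecture of such platforms.

Their potential value is therefore not as a point tool but as platform-level infrastructure.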

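13. Why is longevity medicine especially suited to this architecture?

Longevity medicine is not a single diagnosis but long-term state management.

The problems it faces include:

multi-system aging;
chronic low-grade inflammation;
metabolic and immune changes;
sleep, stress, exercise, and nutrition;
individual differences;
combinations of multiple interventions;
long-term retesting;
N-of-1 feedback.

This is neither a simple classification task nor a single-point risk prediction task.

It is more like a long-running loop: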

while health_management_active:
    state = observe_longitudinal_state(user)
    actions = generate_intervention_candidates(state)
    transitions = estimate_transition_tendencies(state, actions)
    reviewed_plan = human_review(transitions)
    feedback = collect_longitudinal_feedback(reviewed_plan)
    update_model_state(state, reviewed_plan, feedback)

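In other words, longevity medicine is by nature a

longitudinal state-action-transition-feedback problem

This is also why medical world models may first find application in longevity technology, precision health management, functional medicine, and premium health management.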
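
In a longitudinal setting, each round of feedback adds a point to a per-person time series, and the simplest N-of-1 summary is the direction of change since baseline. The sketch below is a deliberately naive illustration: `marker_trend` is a hypothetical helper, the readings are made-up values, and the tolerance is an arbitrary assumption rather than a clinically meaningful threshold.

```python
from typing import List, Tuple


def marker_trend(series: List[Tuple[int, float]], tolerance: float = 0.0) -> str:
    """Compare the latest reading with the earliest one.

    series holds (week, value) pairs; changes within the tolerance count as stable.
    """
    if len(series) < 2:
        return "insufficient_data"
    ordered = sorted(series)  # sort by week
    change = ordered[-1][1] - ordered[0][1]
    if change > tolerance:
        return "increasing"
    if change < -tolerance:
        return "decreasing"
    return "stable"


# Hypothetical fasting glucose readings at weeks 0, 8, and 16.
readings = [(0, 6.2), (8, 5.8), (16, 5.9)]

print(marker_trend(readings, tolerance=0.1))   # decreasing
print(marker_trend(readings[:1]))              # insufficient_data
```

Comparing only the first and last points ignores noise and everything in between, so a real system would need a more careful longitudinal model.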

14. A complete JSON example

Below is a simplified JSON record representing one medical world model simulation run:

{
  "patient_state": {
    "state_id": "S20260521",
    "clinical_markers": {
      "bmi": 29.1,
      "fasting_glucose": 6.2,
      "hba1c": 6.0,
      "triglycerides": 2.1
    },
    "lifestyle": {
      "sleep_hours": 5.8,
      "exercise_frequency_per_week": 1,
      "diet_pattern": "high_refined_carbohydrate"
    },
    "risk_context": [
      "family_history_type_2_diabetes",
      "possible_insulin_resistance"
    ]
  },
  "candidate_action": {
    "action_id": "nutrition_low_glycemic_8w",
    "category": "nutrition",
    "duration_weeks": 8,
    "target_mechanism": [
      "postprandial_glucose_variability",
      "insulin_resistance"
    ],
    "monitoring_markers": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference"
    ]
  },
  "transition_hypothesis": {
    "expected_direction": {
      "fasting_glucose": "decrease_possible",
      "postprandial_glucose": "decrease_possible",
      "weight": "slight_decrease_possible"
    },
    "uncertainty_level": "moderate",
    "time_window_weeks": 8
  },
  "evidence_chain": {
    "overall_strength": "moderate",
    "limitations": [
      "individual_response_varies",
      "adherence_uncertain",
      "not_a_treatment_prescription"
    ]
  },
  "safety_gate": {
    "requires_clinician_review": false,
    "red_flags": [],
    "notes": [
      "health_management_context_only",
      "not_medical_diagnosis"
    ]
  },
  "feedback_plan": {
    "timepoint_weeks": 8,
    "metrics": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference",
      "symptom_score"
    ]
  }
}

The point of this JSON is not the fields themselves, but that it breaks one medical simulation run into inspectable objects.
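
Being inspectable also means a record can be checked automatically before it is stored. The sketch below verifies that a record contains the six top-level sections used in the JSON above; `missing_sections` is a hypothetical helper, and a production system would more likely use a full schema validator.

```python
import json
from typing import List

# Top-level sections taken from the example record above.
REQUIRED_SECTIONS = (
    "patient_state",
    "candidate_action",
    "transition_hypothesis",
    "evidence_chain",
    "safety_gate",
    "feedback_plan",
)


def missing_sections(record_json: str) -> List[str]:
    """Return the required top-level sections that are absent from the record."""
    record = json.loads(record_json)
    return [name for name in REQUIRED_SECTIONS if name not in record]


partial = json.dumps({
    "patient_state": {"state_id": "S20260521"},
    "candidate_action": {"action_id": "nutrition_low_glycemic_8w"},
    "transition_hypothesis": {"uncertainty_level": "moderate"},
})

print(missing_sections(partial))
# ['evidence_chain', 'safety_gate', 'feedback_plan']
```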

15. Principles for developers

Principle 1: do not start from a chatbot

The first step of a medical world model is not:

answer = llm.chat(user_question)

but:

state = define_state_schema()
action = define_action_schema()
transition = define_transition_schema()
evidence = define_evidence_schema()
feedback = define_feedback_schema()

Define the objects first, then talk about intelligence.

Principle 2: do not write the transition as an efficacy prediction

Avoid:

effect = predict_treatment_effect(state, action)

Prefer:

hypothesis = estimate_transition_tendency(state, action)

Principle 3: there must be an evidence object

Do not generate only a recommendation:

recommendation = generate_recommendation(state)

Output the evidence chain as well:

recommendation = {
    "action": action,
    "transition_hypothesis": transition,
    "evidence_chain": evidence,
    "uncertainty": uncertainty,
    "safety_notes": safety_notes
}

Principle 4: human-in-the-loop is mandatory

A medical world model is not an automated treatment system.

decision = human_expert_review(model_output)

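should be a core part of the flow, not an option.

Principle 5: feedback updates must be supported

Without feedback updates it is not a world model, just a one-shot advice generator.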

model_state = update_with_feedback(model_state, observed_feedback)

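16. Summary: from tool to infrastructure

The first stage of medical AI was recognition and prediction.

The current focus of AI drug discovery is molecules and targets.

Medical world models may represent the next step for life-science AI: intervention simulation, long-term feedback, and individualized state management.

From an engineering standpoint, this is not a simple LLM application but a structured system: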

State
  + Action
  + Evidence
  -> Transition Hypothesis
  -> Feedback
  -> Calibration

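The key is not "automatically producing a plan" but making medical reasoning:

representable;
auditable;
traceable;
open to feedback;
calibratable;
reviewable by human experts.

For long-term state management settings such as longevity medicine, functional medicine, and precision health management, this architecture matters especially,

because what these settings really need is not a one-off prediction but a long-term state-action-transition-feedback loop.

If the first wave of value in medical AI lay in "seeing disease," and the value of AI drug discovery lies in "discovering molecules," then the next value of medical world models may lie in bringing AI into a more central part of medical decision-making:

simulating interventions, tracking feedback, and continuously calibrating the individual's state.

This is why it may become an important direction for the next life-science track after AI healthcare.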
