MedOS

AI-XR-Cobot World Model


Stanford University Princeton University

System Architecture

Graphical Abstract

Gemini understands language.
Sora understands video.
MedOS understands Medical Reality.

MedOS constructs an Agentic World Model that fundamentally understands surgical physics, with the State-Action-Transition loop:

  • Perception (St): not just input, but a deep understanding of the environment.
  • Intervention (At): actions that actively change the physical world.
  • Simulation (St+1): the predicted future state, not just output.
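The perception-intervention-simulation loop above can be sketched as a minimal interface. Every name here (`WorldModelStep`, `perceive`, `policy`, `simulate`) is a hypothetical stand-in for illustration, not MedOS's actual API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class WorldModelStep:
    """One pass through the state-action-transition loop.

    All components are illustrative stand-ins: `perceive` maps a raw
    observation to a state S_t, `policy` chooses an intervention A_t,
    and `simulate` predicts the next state S_{t+1} before acting.
    """
    perceive: Callable[[object], object]          # observation -> S_t
    policy: Callable[[object], object]            # S_t -> A_t
    simulate: Callable[[object, object], object]  # (S_t, A_t) -> predicted S_{t+1}

    def step(self, observation):
        s_t = self.perceive(observation)
        a_t = self.policy(s_t)
        s_next = self.simulate(s_t, a_t)
        return s_t, a_t, s_next

# Toy instantiation: the state is a number, the action nudges it toward a target.
loop = WorldModelStep(
    perceive=lambda obs: float(obs),
    policy=lambda s: 1.0 if s < 10 else 0.0,
    simulate=lambda s, a: s + a,
)
print(loop.step(3))  # (3.0, 1.0, 4.0)
```

The point of the loop is that the model predicts S(t+1) before committing to A(t), so a real system would compare the predicted and observed next states to refine itself.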

Dual-System Architecture

Mimicking the training path of human surgeons, we employ a "hindsight-driven" distillation paradigm.

System 2: Reasoning

Slow

  • Spatiotemporal Reasoning (Future Frames)
  • Clinical Chain-of-Thought

System 1: Real-Time

Fast

  • Real-Time Edge Inference
  • Internalized Reasoning
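A common way to distill a slow reasoning model (System 2) into a fast one (System 1) is to train the student against temperature-softened teacher outputs. The sketch below shows a generic KL-divergence distillation loss; it assumes nothing about MedOS's actual objective, and all names are illustrative:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    A generic distillation objective: the slow System-2 model provides
    soft targets that the fast System-1 model learns to match, thereby
    "internalizing" the teacher's reasoning.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher has zero loss.
print(distill_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # ~0.0
```

The "hindsight-driven" twist described above would supply the teacher with future frames the student never sees, which is why the student must internalize, rather than copy, the reasoning.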

New SOTA on the MedQA and GPQA benchmarks.

Autonomous Clinical Discovery

Clinical Case Study

Autonomously Investigating Semaglutide's Immune Effect

From a single patient observation, MedOS autonomously queries the FAERS database to analyze adverse events and performs meta-analysis on inflammatory factors (TNF-α, IL-6).

Agentic Insight

Agentic Workflow: Observation → FAERS Query → Meta-Analysis → Hypothesis Verification
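The "FAERS Query" step could, for example, target the public openFDA adverse-event endpoint. The helper below only constructs such a query URL; the function name and parameter choices are illustrative assumptions, since MedOS's actual tooling is not described here:

```python
from urllib.parse import urlencode

# Public openFDA endpoint serving FAERS adverse-event reports.
FAERS_EVENT_ENDPOINT = "https://api.fda.gov/drug/event.json"

def faers_reaction_count_url(drug_name: str, limit: int = 10) -> str:
    """Build an openFDA query counting the most-reported adverse
    reactions for a given drug (the 'FAERS Query' step of the workflow).

    `faers_reaction_count_url` is a hypothetical helper, not part of MedOS.
    """
    params = {
        "search": f'patient.drug.openfda.generic_name:"{drug_name}"',
        "count": "patient.reaction.reactionmeddrapt.exact",
        "limit": limit,
    }
    return f"{FAERS_EVENT_ENDPOINT}?{urlencode(params)}"

print(faers_reaction_count_url("semaglutide"))
```

An agent would fetch this URL, aggregate the reaction counts, and only then move to the meta-analysis and hypothesis-verification stages.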

MedSuperVision: The Largest Medical Vision Dataset

We present MedSuperVision, the largest open-source surgical video dataset to date. Curated by board-certified surgeons, it features 85,398 hours of diverse procedures and high-fidelity annotations.

Dataset Overview Part 1

Dataset Overview Part 2

Dataset Analysis

Comprehensive analysis of procedure types and data distribution.

Democratizing Expertise

One of the most impactful findings is the system's ability to level the playing field. In human-AI collaboration studies, MedOS enabled nurses and medical students to achieve diagnostic precision comparable to that of attending physicians. It also acted as a cognitive safety net, raising the performance of post-call doctors above their well-rested baseline.


Overcoming Fatigue Limits

Mastering Unfamiliar Domains

Eliminating Educational Disparities

XR-Robot-Human Collaboration

Robot Human Collaboration 1

Stabilization metrics: Static Jitter (pixels), Drift (pixels), and Horizon Tilt (degrees).

Eliminating human tremor and drift for surgical precision.
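Tremor suppression of this kind is commonly done with a low-pass filter over the stream of tool positions. The sketch below uses a simple exponential moving average purely as an illustration; the smoothing factor `alpha` and the interface are assumptions, not the MedOS stabilization pipeline:

```python
def stabilize(positions, alpha=0.2):
    """Exponential moving average over a stream of 2-D tool positions.

    An illustrative low-pass filter: high-frequency tremor is attenuated
    while slow, deliberate motion passes through. (`alpha` is a made-up
    smoothing factor; lower values mean heavier smoothing.)
    """
    smoothed = []
    state = None
    for x, y in positions:
        if state is None:
            state = (x, y)
        else:
            state = (alpha * x + (1 - alpha) * state[0],
                     alpha * y + (1 - alpha) * state[1])
        smoothed.append(state)
    return smoothed

# A jittery hold around (10, 10) is pulled toward a steady point.
print(stabilize([(10, 10), (10.8, 9.4), (9.3, 10.5)]))
```

The trade-off in any such filter is latency versus smoothness, which is why a real system would likely adapt the filtering to the motion rather than use a fixed constant.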

Robot Human Collaboration 2

Procedure time (s) across three simulated procedures: laparoscopic cholecystectomy, urethrovesical anastomosis, and salpingostomy (tube surgery).

Immersive Human-AI Symbiosis. The setup (Left) allows the operator to control the robotic arm with intuitive XR feedback, enabling the human-AI team to outperform traditional manual surgery in speed and fluidity.

Towards An Agentic Copilot for Doctors

Powered by MedOS, our autonomous agent can execute complex surgical tasks with precision and adaptability in simulated environments.

Autonomous Case Study 1

Autonomous Case Study 2

Citation

@article{MedOS,
  title={MedOS: AI-XR-Cobot World Model for Clinical Perception and Action},
  author={Yingcheng Charles Wu and Ming Yin and Baiyu Shi and Zaixi Zhang and ... and Mengdi Wang* and Le Cong*},
  journal={Preprint},
  year={2026}
}

Team & Founding Partners

Principal Investigators

Le Cong
Stanford
Mengdi Wang
Princeton

Contributors include:
Yingcheng Charles Wu

Contact: Le Cong: congle@stanford.edu | Mengdi Wang: mengdiw@princeton.edu | Yingcheng Charles Wu: wuyc@stanford.edu

Founding Partners

NVIDIA
Nebius
VITURE
AI4Science Catalyst Institute

Partner with MedOS

Request Early Access for Research & Clinical Trials

Whether you are an AI researcher, a clinician, or a roboticist, tell us your use case.