Reaching 100,000 Messages: What We Learned Building an AI Clinical Assistant

Today, we are excited to announce that internist.ai has surpassed 100,000 messages exchanged between clinicians and our AI assistant. This milestone is more than a number: it represents thousands of clinical interactions in which our models supported real medical decision-making in hospital settings.

From Research to Real-World Impact

When we started internist.ai, our goal was clear: build AI that genuinely helps clinicians in their daily work. Not AI that passes benchmarks. Not AI that impresses in demos. AI that earns the trust of doctors who have patients waiting.

Our early research showed that large language models lack essential metacognition for reliable medical reasoning. That finding did not discourage us — it gave us a roadmap. We knew exactly what we had to fix before deploying in clinical settings: the model needed to know what it does not know.

What 100,000 Messages Taught Us

Clinicians value honesty over confidence

The single most appreciated feature of our system is its ability to say “I’m not sure.” In a field where overconfident AI can be dangerous, our focus on metacognitive calibration has proven to be the right bet. Clinicians tell us they trust the system more because it sometimes declines to answer.
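
To make the idea concrete, here is a minimal sketch of confidence-gated abstention. It illustrates the general technique only and is not a description of our production system: the names `Draft`, `answer_or_abstain`, and `ABSTAIN_THRESHOLD`, the 0.75 cutoff, and the assumption that the model exposes a calibrated confidence score are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff; a real system would tune this on held-out clinical data.
ABSTAIN_THRESHOLD = 0.75


@dataclass
class Draft:
    """A candidate answer plus an (assumed) calibrated confidence in [0, 1]."""
    text: str
    confidence: float


def answer_or_abstain(draft: Draft, threshold: float = ABSTAIN_THRESHOLD) -> str:
    """Return the draft answer only when its confidence clears the threshold."""
    if not 0.0 <= draft.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if draft.confidence >= threshold:
        return draft.text
    # Below the threshold, decline rather than guess.
    return "I'm not sure."


if __name__ == "__main__":
    print(answer_or_abstain(Draft("Consider checking serum potassium.", 0.91)))
    print(answer_or_abstain(Draft("Likely diagnosis X.", 0.40)))
```

The gate is only as good as the confidence score behind it, which is why calibration has to come first: a threshold applied to an overconfident model will rarely trigger.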

Integration matters more than intelligence

A brilliant model that sits outside the clinical workflow will not get used. Our work on implementing LLMs directly into electronic health records was critical. When the AI is available right where the clinician is already working — inside the EHR — adoption follows naturally. Over 80% of our messages come from within the hospital’s existing systems, not from a separate interface.

Data quality beats data quantity

Our research on high-quality, mixed-domain training data directly informed how we built our production models. Curating better data — not just more data — led to measurable improvements in clinical accuracy and relevance. Every message processed reinforces this lesson.

The physician must stay in the loop

We designed our system around the physician-in-the-loop paradigm. The AI suggests, the clinician decides. This is not a limitation — it is a feature. After 100,000 messages, we have seen that this approach leads to better outcomes and higher trust than fully autonomous systems.

By the Numbers

100,000+ messages processed
Deployed at UCLouvain Saint-Luc university hospital
11 peer-reviewed publications supporting the system
133 citations on our most-referenced paper
No patient data stored in the cloud: fully on-premises deployment

What Comes Next

Reaching 100,000 messages is a milestone, but the work is far from done. We are focused on:

Expanding deployment to additional hospital departments and institutions
Releasing new model versions with improved clinical reasoning and multilingual support
Deepening our evaluation methodology to ensure safety scales with capability
Growing the open-source ecosystem so other teams can build on our work

We believe that AI in healthcare should be open, rigorously validated, and clinician-centered. Every one of those 100,000 messages strengthens that conviction.

Thank you to the clinicians, researchers, and institutions who made this milestone possible.