ChatGPT Will Not Find Your Next Cancer Drug
Everyone in pharma is talking about AI, but nobody is asking the right question.
The question is not whether LLMs can read papers. They can. Thousands per second. They find connections no human would catch: a kinase studied in oncology that turns out to be relevant in neurodegeneration, a mechanism buried in a 2003 paper that explains a clinical failure twenty years later.
LLMs synthesize, hypothesize, and reason about mechanisms.
But they cannot do experiments. And experiments are where drugs come from.
This is the part the pitch decks skip. In software, the loop closes naturally. An LLM can write code, run it, see the output, and iterate. The experiment is computational. The whole cycle lives inside the computer.
In drug discovery, the loop is not that easy.
A hypothesis about binding affinity cannot be resolved by generating more text. A prediction about off-target toxicity is not validated by writing a better paragraph. You need new data. Data that does not exist yet. Data that comes from physics-based simulations or wet lab assays. Not from autocomplete.
Without that, an LLM is just a very confident librarian that occasionally makes things up. In a chatbot, hallucination is embarrassing. In your pipeline, hallucination is half a year of work and ten million dollars lost chasing a fabricated result that looked exactly like a real one.
So the real challenge is not AI that reads. It is AI that experiments.
Some teams are solving this with robotic lab automation. Real metal, real plates, real assays. That matters.
We are solving a different part of the puzzle. At PAULING.AI we connect LLM reasoning to computational chemistry (molecular docking, molecular dynamics, ADMET prediction, etc.) so the AI does not just hypothesize. It tests. It generates data that did not exist before the query.
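The reason-then-test loop described above can be sketched in a few lines. This is a minimal illustration, not PAULING.AI's actual system: the tool names (`run_docking`, `predict_admet`) and the dispatch structure are hypothetical stand-ins for real computational chemistry backends.

```python
# Hypothetical sketch of a reason -> experiment loop: an LLM's hypothesis
# is resolved by generating NEW data from a computational experiment,
# not by generating more text. Tool names are illustrative placeholders.

def run_docking(smiles: str, target: str) -> float:
    """Stand-in for a physics-based docking run (e.g. a real engine
    would be called here); returns a binding score in kcal/mol."""
    return -7.5  # placeholder value for illustration

def predict_admet(smiles: str) -> dict:
    """Stand-in for an ADMET prediction model."""
    return {"herg_risk": 0.2}  # placeholder value for illustration

def experiment_step(hypothesis: dict) -> dict:
    """Dispatch a hypothesis to the experiment that can falsify it."""
    if hypothesis["type"] == "binding":
        score = run_docking(hypothesis["smiles"], hypothesis["target"])
        return {"affinity_kcal_mol": score}
    if hypothesis["type"] == "toxicity":
        return predict_admet(hypothesis["smiles"])
    raise ValueError(f"unknown hypothesis type: {hypothesis['type']}")

result = experiment_step(
    {"type": "binding", "smiles": "CCO", "target": "EGFR"}
)
```

The point of the structure, not the stubs: every branch ends in a computation that produces data the model did not have before the query.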
And here is what nobody talks about: the hard part is not building a cool demo. It is reliability. A drug discovery workflow chains dozens of sequential steps. If each step runs at 90 percent reliability, your end-to-end success rate is under 10 percent. That is a hackathon project, not a product.
The real engineering challenge is driving step reliability to 99 percent and above. Handling edge cases, bad inputs, ambiguous outputs, fully autonomously. No human babysitting every transition. Because that is what makes the end-to-end workflow reliable and useful to real drug discovery scientists.
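The compounding math is worth checking for yourself. Reliability multiplies across sequential steps, so end-to-end success is the per-step rate raised to the number of steps. The 22-step count below is an illustrative stand-in for "dozens":

```python
# End-to-end success of a chain of sequential steps:
# each step must succeed, so rates multiply.
def end_to_end(per_step: float, steps: int) -> float:
    return per_step ** steps

steps = 22  # illustrative stand-in for "dozens" of steps
print(f"90% per step -> {end_to_end(0.90, steps):.1%} end to end")  # under 10%
print(f"99% per step -> {end_to_end(0.99, steps):.1%} end to end")  # over 80%
```

That gap, roughly 10 percent versus 80 percent, is the entire difference between a demo and a product.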
That is the difference between AI that impresses a boardroom and AI that actually moves a drug discovery program forward.
Your next cancer drug will not come from a prompt. It will come from a system that reasons, experiments, and iterates. Autonomously. Reliably. Every time.