Phil built onepot CORE:
a 3.4B-compound space you can actually execute
Talk to anyone doing early-stage drug discovery, and they will tell you the same thing: DMTA (design-make-test-analyze) is still limited by "make." We can design faster than we can synthesize. We can run assays quickly once they're set up. We can analyze data in hours. But getting the next set of molecules still takes weeks or months, blocking the project.
Enumerated chemical spaces were supposed to help. The idea is straightforward: define a region of chemical space that is actually synthesizable from real building blocks using reliable reactions, enumerate it, and let teams search it like a supplier catalog. In practice, though, most teams still hit the same constraint: lead times are too long. We believe a "space" is only useful if it maps cleanly to a pipeline that can deliver real molecules, reliably, on short, predictable timelines.
At onepot we built the system around that mapping, and Phil is the reason it works.
Phil is our AI chemist: a decision-making layer that designs, runs, and analyzes experiments on our automated synthesis platform. While other approaches treat synthesis automation as a robotics problem, we understand that chemistry requires constant choices about conditions, reagent handling, workup, and purification strategy. These choices demand both deliberate decision-making and planning: what to try next when something fails, and how to improve a protocol so it works across diverse substrates. Most automation efforts leave the reasoning to humans, which is why "automation" often becomes just a faster way to execute human-bottlenecked work.
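To make that concrete, here is the shape of the loop Phil closes, reduced to a minimal sketch. Everything in it is illustrative: `propose_conditions` and `run_on_platform` are hypothetical stand-ins for the platform's real interfaces, and the base-ladder fallback is a toy policy, not our actual decision logic.

```python
from dataclasses import dataclass, field

@dataclass
class Attempt:
    conditions: dict   # e.g. {"solvent": "DMF", "base": "DIPEA", "temp_c": 25}
    outcome: str       # "success", "no_conversion", "messy_mixture", ...

@dataclass
class SynthesisTask:
    product_smiles: str
    history: list[Attempt] = field(default_factory=list)

def propose_conditions(smiles: str, history: list[Attempt]) -> dict:
    """Toy policy: walk a base ladder, skipping anything already tried.
    In the real system, this choice is where the reasoning lives."""
    tried = {a.conditions["base"] for a in history}
    for base in ("DIPEA", "Et3N", "K2CO3"):
        if base not in tried:
            return {"solvent": "DMF", "base": base, "temp_c": 25}
    return {"solvent": "NMP", "base": "DIPEA", "temp_c": 60}

def run_on_platform(smiles: str, conditions: dict) -> str:
    """Placeholder for executing on the liquid handlers; returns an outcome label."""
    return "success"

def run_task(task: SynthesisTask, max_attempts: int = 3) -> bool:
    for _ in range(max_attempts):
        # Choose conditions in light of everything tried so far, not a fixed recipe.
        conditions = propose_conditions(task.product_smiles, task.history)
        outcome = run_on_platform(task.product_smiles, conditions)
        task.history.append(Attempt(conditions, outcome))
        if outcome == "success":
            return True
        # Failures are kept: they are exactly the data that improves protocols.
    return False
```

The point of the sketch is the structure, not the policy: every attempt, including the failed ones, stays in the history that conditions the next decision.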
We're publishing a preprint describing onepot CORE (Compounds On-demand via Robotic Execution): an enumerated chemical space plus the operational pipeline that executes it. CORE v1 contains 3.4 billion products supported by seven reactions that are widely used in medicinal chemistry. We construct CORE by selecting a reaction set, sourcing and curating building blocks from supplier catalogs with real-time availability constraints, enumerating candidate products at large scale, and then applying ML-based feasibility assessment so the space reflects what is likely to work under our protocols—not just what matches a template on paper.
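A toy version of the enumeration step looks like this, using RDKit and a single amide-coupling template. The building blocks and the template are illustrative; CORE layers supplier availability curation and feasibility scoring on top of this basic cross-product.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Amide coupling template (acid + amine -> amide), written as reaction SMARTS.
amide = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])-[OD1].[N!H0:3]>>[C:1](=[O:2])[N:3]"
)

acids = [Chem.MolFromSmiles(s) for s in ("OC(=O)c1ccccc1", "OC(=O)CC1CC1")]
amines = [Chem.MolFromSmiles(s) for s in ("NCc1ccncc1", "C1CCNCC1")]

products = set()
for acid in acids:
    for amine in amines:
        for (prod,) in amide.RunReactants((acid, amine)):
            try:
                Chem.SanitizeMol(prod)  # reject chemically invalid outputs
                products.add(Chem.MolToSmiles(prod))
            except Exception:
                pass

print(len(products), sorted(products))
```

With seven reaction templates and curated building-block lists on each side, the same loop scales to billions of candidates; the hard part is everything this sketch omits.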
That feasibility step is the difference between a "large virtual library" and something you can actually run. Historically, spaces have relied heavily on hand-written filters ("exclude these groups," "avoid these motifs"), which creates the usual tradeoff: loose filters let failures through, strict filters remove valid chemistry. Our approach is data-driven: we train feasibility scoring on multiple tiers of execution data, from high-throughput screens up to full synthesis runs.
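The shape of that scoring, sketched with stand-in choices: Morgan fingerprints plus a gradient-boosted classifier, trained on a two-row toy dataset. This is not the model described in the preprint, only an illustration of learning feasibility from labeled execution outcomes.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import GradientBoostingClassifier

def featurize(reactants: list[str], product: str) -> np.ndarray:
    """Concatenated Morgan fingerprints of both reactants and the product."""
    mols = [Chem.MolFromSmiles(s) for s in reactants + [product]]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=1024) for m in mols]
    return np.concatenate([np.array(fp) for fp in fps])

# Label 1 = product isolated under the protocol, 0 = failed run.
# Real training data spans tiers, from HTS plates to full synthesis runs.
labeled_runs = [
    (["OC(=O)c1ccccc1", "NCc1ccncc1"], "O=C(NCc1ccncc1)c1ccccc1", 1),
    (["OC(=O)CC1CC1", "C1CCNCC1"],     "O=C(CC1CC1)N1CCCCC1",     0),
]
X = np.stack([featurize(r, p) for r, p, _ in labeled_runs])
y = np.array([label for _, _, label in labeled_runs])
model = GradientBoostingClassifier().fit(X, y)

# Score an enumerated candidate before it enters the searchable space.
x = featurize(["OC(=O)c1ccccc1", "C1CCNCC1"], "O=C(c1ccccc1)N1CCCCC1")
p_feasible = model.predict_proba(x.reshape(1, -1))[0, 1]
```

Unlike a hand-written filter, a scored threshold can be tuned per reaction and per protocol, and it moves as new execution data arrives.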
CORE is paired with an end-to-end automated workflow. When a compound is selected, we evaluate available routes, source missing building blocks from suppliers, register materials into a standardized storage format, and prepare stock solutions. Once all necessary reagents are in hand, Phil determines optimal conditions and runs the reactions on automated liquid handlers (including glovebox-compatible workflows when required). After reaction and workup, the crude product is purified by semi-prep HPLC with mass-based fraction collection, then reanalyzed by LC/MS with UV-based purity reporting. Compounds ship either as dry solids in barcoded vials or as plated DMSO solutions.
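Read as a state machine, the lifecycle of a single compound order looks roughly like this. The stage names mirror the workflow above; the schema itself is our illustration, not onepot's internal data model.

```python
from enum import Enum, auto

class Stage(Enum):
    ROUTE_SELECTION = auto()  # evaluate available routes for the product
    SOURCING = auto()         # order missing building blocks from suppliers
    REGISTRATION = auto()     # register materials into standardized storage
    STOCK_PREP = auto()       # prepare stock solutions
    SYNTHESIS = auto()        # Phil picks conditions, liquid handlers execute
    PURIFICATION = auto()     # semi-prep HPLC, mass-based fraction collection
    QC = auto()               # LC/MS reanalysis with UV-based purity
    SHIPPING = auto()         # dry solids in vials or plated DMSO

PIPELINE = list(Stage)

def advance(stage: Stage) -> Stage | None:
    """Next stage, or None once the compound has shipped."""
    i = PIPELINE.index(stage)
    return PIPELINE[i + 1] if i + 1 < len(PIPELINE) else None
```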
We know chemical synthesis automation is only as good as the quality of its outputs; here, identity and purity are paramount. The preprint includes a validation set in which we synthesized 24 diverse compounds from CORE and confirmed both identity and purity by 1H NMR. We also include an "assay suitability" check using a series of DPP4 inhibitors: we resynthesized known actives and ran a standard fluorescence assay to verify that synthesized compounds behave in biology the way you'd expect. The pipeline produces molecules of high enough quality for real biological assays, while remaining fast enough for efficient SAR iteration.
What matters to us is not publishing a static catalog. It's building a dynamic space that improves as the system runs, while also responding to external factors: compound availability, shipment delays, and more. As Phil executes more chemistry, the protocols get better, the feasibility models improve, and adding reactions becomes an operational process rather than a long manual project. That is the model: tight integration from "search a space" to "receive a compound," and a feedback loop that makes the space larger, more reliable, and faster over time.
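In code terms, the loop is simple even though the operations behind it are not: runs become labeled examples, the feasibility model is refit, and the space is rescored. This reuses the toy `featurize` from the earlier sketch and is, again, illustrative rather than our production logic.

```python
import numpy as np

def feedback_cycle(model, dataset, new_runs, candidates, featurize, threshold=0.5):
    """One turn of the loop: fold fresh execution outcomes into the training
    set, refit the feasibility model, and rescore the enumerated candidates."""
    dataset.extend(new_runs)  # (reactants, product, succeeded) triples
    X = np.stack([featurize(r, p) for r, p, _ in dataset])
    y = np.array([label for _, _, label in dataset])
    model.fit(X, y)
    # The searchable space shifts as the model learns what actually runs.
    return [(r, p) for r, p in candidates
            if model.predict_proba(featurize(r, p).reshape(1, -1))[0, 1] >= threshold]
```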
This is CORE v1. We're going to expand reaction scope, add one-pot sequences, and move toward multi-step chemistry. But even at v1, the thesis is already clear: if you can compress synthesis lead times and increase reliability, you don't just speed up DMTA; you enable projects previously thought impractical.
If you're doing hit expansion or SAR around a lead series, or you're blocked by synthesis throughput, we want to talk: reach out to hello@onepot.ai.