An end-to-end project that connects patients with relevant trials in seconds
A few months ago, I came across a staggering fact: nearly 80% of clinical trials struggle to recruit enough participants, and many fail entirely because they can’t find eligible patients in time.
It made me wonder: in an era where AI can write poetry and detect diseases, why is matching patients to trials still so manual and slow?
That thought sparked what started as a weekend project… and turned into weeks of building, testing and deploying something I now call the Clinical Trial Matchmaker.
This is an AI-powered web app that pulls live data from ClinicalTrials.gov, processes it using natural language models and finds the best trial matches for a patient in seconds.
If you have ever tried searching for a clinical trial online, you know how overwhelming it is. You might search for “type 2 diabetes” and get hundreds of results full of dense medical jargon, multi-paragraph eligibility criteria and unclear recruitment details.
For doctors, this is time-consuming. For patients, it’s almost unusable.
Matching is a multi-step process. You have to find trials that are still recruiting. Then you have to read detailed eligibility rules. You then cross-check those rules with a patient’s profile. Finally, you narrow it down to the most promising trials.
It’s slow, it’s tedious and it’s not friendly for someone who just wants answers quickly.
The idea was simple. What if we could teach AI to understand both patient profiles and trial descriptions and then match them based on meaning, not just exact keywords?
To make that work, I needed three things:
First, a reliable source of trial data, which I got from the ClinicalTrials.gov API.
Second, a way to represent text in a form AI can understand, so I used sentence embeddings.
And third, a ranking system to score how relevant each trial is for a given patient.
Step one is fetching trials from ClinicalTrials.gov. The app can pull live trial data or use cached trial datasets. For my default presets, I focus on conditions like diabetes, heart failure and cancer, and I only fetch trials that are currently recruiting.
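As a rough illustration of the fetch step (not the app's actual code), the sketch below builds a query URL for recruiting trials for each preset condition. The base URL and the parameter names `query.cond`, `filter.overallStatus` and `pageSize` are assumptions based on version 2 of the ClinicalTrials.gov API and should be checked against the official documentation. `build_trials_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL for version 2 of the ClinicalTrials.gov API.
BASE_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_trials_url(condition: str, page_size: int = 50) -> str:
    """Build a query URL for recruiting trials that match one condition."""
    params = {
        "query.cond": condition,               # condition search term (assumed name)
        "filter.overallStatus": "RECRUITING",  # only trials still recruiting (assumed name)
        "pageSize": page_size,                 # results per page (assumed name)
    }
    return f"{BASE_URL}?{urlencode(params)}"


# The three default presets mentioned in the article.
for preset in ("diabetes", "heart failure", "cancer"):
    print(build_trials_url(preset))
```

The resulting URL could then be requested with any HTTP client, and the JSON response cached to disk for the offline mode.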
Step two is embedding the trial descriptions. This is where the AI model comes in. I use the all-MiniLM-L6-v2 model from Sentence Transformers, which turns each trial’s description into a vector, a list of numbers that captures the meaning of the text. This means “lung cancer” and “pulmonary carcinoma” will be recognized as similar even though the words are different.
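A minimal sketch of the embedding step could look like the following. The field names (`title`, `conditions`, `eligibility`), the helper names and the default batch size are hypothetical. Only the model name comes from the article. The library is imported inside the function so the text helper can be used on its own.

```python
def trial_to_text(trial: dict) -> str:
    """Join the fields worth embedding into one string, skipping empty ones."""
    parts = [
        trial.get("title", ""),
        trial.get("conditions", ""),
        trial.get("eligibility", ""),
    ]
    return ". ".join(p.strip() for p in parts if p and p.strip())


def embed_texts(texts: list[str], batch_size: int = 16):
    """Encode texts with all-MiniLM-L6-v2 (one 384-dimensional vector per text)."""
    # Imported lazily so trial_to_text works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    # normalize_embeddings=True returns unit-length vectors.
    return model.encode(texts, batch_size=batch_size, normalize_embeddings=True)
```

Calling `embed_texts([trial_to_text(t) for t in trials])` would download the model on first use and return one vector per trial.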
Step three is processing the patient profile. In this demo, I use synthetic patient data stored in JSON files. These contain details like age and medical conditions. The patient data is also embedded into a vector.
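To make the patient step concrete, here is a small sketch that parses a synthetic profile and flattens it into a sentence ready for embedding. The JSON schema (`id`, `age`, `conditions`) and the `patient_to_text` helper are assumptions for illustration. The article only says the files contain details like age and medical conditions.

```python
import json

# A made-up synthetic profile; the real demo files may use a different schema.
SAMPLE_PATIENT_JSON = """
{"id": "demo-001", "age": 58, "conditions": ["type 2 diabetes", "hypertension"]}
"""


def patient_to_text(patient: dict) -> str:
    """Flatten a patient profile into one sentence for the embedding model."""
    conditions = ", ".join(patient.get("conditions", []))
    return f"{patient['age']}-year-old patient with {conditions}"


patient = json.loads(SAMPLE_PATIENT_JSON)
print(patient_to_text(patient))
# prints: 58-year-old patient with type 2 diabetes, hypertension
```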
Step four is matching and ranking. The app compares the patient vector to each trial vector using cosine similarity, which measures how close the meanings are. The trials with the highest similarity scores are shown first, along with reasons for the match, such as “age within range” or “condition match.”
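The ranking logic can be sketched in plain Python. The three-dimensional vectors, trial IDs and field names below are made up for illustration (real embeddings from all-MiniLM-L6-v2 have 384 dimensions), and the rule-based reasons are one possible way to produce explanations like the ones described, not necessarily how the app does it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_reasons(patient, trial):
    """Simple rule checks that explain a match in plain language."""
    reasons = []
    if trial["min_age"] <= patient["age"] <= trial["max_age"]:
        reasons.append("age within range")
    if any(c.lower() in trial["conditions"].lower() for c in patient["conditions"]):
        reasons.append("condition match")
    return reasons


def rank_trials(patient, patient_vec, trials):
    """Score every trial against the patient vector, highest score first."""
    scored = [
        {
            "trial_id": t["trial_id"],
            "score": cosine_similarity(patient_vec, t["vector"]),
            "reasons": match_reasons(patient, t),
        }
        for t in trials
    ]
    return sorted(scored, key=lambda r: r["score"], reverse=True)


# Toy data: hypothetical trials with hand-written 3-dimensional vectors.
patient = {"age": 58, "conditions": ["type 2 diabetes"]}
trials = [
    {"trial_id": "TRIAL-A", "min_age": 18, "max_age": 65,
     "conditions": "Heart Failure", "vector": [0.1, 0.9, 0.2]},
    {"trial_id": "TRIAL-B", "min_age": 40, "max_age": 75,
     "conditions": "Type 2 Diabetes", "vector": [0.9, 0.1, 0.3]},
]
ranked = rank_trials(patient, [1.0, 0.0, 0.2], trials)
for row in ranked:
    print(row["trial_id"], round(row["score"], 3), row["reasons"])
```

With these toy vectors, TRIAL-B ranks first because its vector points in nearly the same direction as the patient's.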
Step five is reviewing or exporting. You can browse the results in the web app or download them as a CSV for later.
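One way to produce the CSV export is sketched below with Python's standard `csv` module. The column names and the `results_to_csv` helper are hypothetical, and the app itself may rely on Pandas for this step instead.

```python
import csv
import io


def results_to_csv(results):
    """Serialize ranked results to CSV text with one row per trial."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["trial_id", "title", "score", "reasons"]
    )
    writer.writeheader()
    for row in results:
        writer.writerow({
            "trial_id": row["trial_id"],
            "title": row["title"],
            "score": f"{row['score']:.3f}",       # three decimals is enough for display
            "reasons": "; ".join(row["reasons"]),  # keep all reasons in one cell
        })
    return buffer.getvalue()


# Made-up result row for illustration.
csv_text = results_to_csv([
    {"trial_id": "TRIAL-B", "title": "Example diabetes study",
     "score": 0.9871, "reasons": ["age within range", "condition match"]},
])
print(csv_text)
```

The returned string could be written to a file or sent back from a download route in the web app.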
I built the backend in Python using Flask. For AI, I used Sentence Transformers. The trial data comes from the ClinicalTrials.gov API. For deployment, I chose Hugging Face Spaces and ran everything inside a Docker container. Pandas handled the data processing, Transformers powered the model loading and Gunicorn ran the production server.
The interface is kept intentionally simple. You select a demo patient, click “Fetch Live Trials,” and then click “Match” to see the ranked results. Each result shows the trial title, recruitment status, a direct link to the ClinicalTrials.gov page and a short explanation of why it matches.
Deploying this wasn’t as simple as pushing code. I had to solve model caching issues because Hugging Face Spaces doesn’t allow writing to certain directories. I worked around this by changing the cache location in the Dockerfile. I also had to reduce the batch size when building embeddings so the app would run within free hosting memory limits. And since this was my first time pushing to Hugging Face Spaces, I had to figure out Git authentication as well.
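The two workarounds can also be expressed in Python, as in the sketch below. This is an alternative to the Dockerfile change described above, not a copy of it. `HF_HOME` is the environment variable Hugging Face libraries read for their cache root, while the temp-directory path and the `iter_batches` helper are assumptions chosen for illustration.

```python
import os
import tempfile

# Point the Hugging Face cache at a writable directory. This has to run
# before sentence_transformers or transformers is imported.
# setdefault keeps any value already provided by the container.
os.environ.setdefault("HF_HOME", os.path.join(tempfile.gettempdir(), "hf_cache"))


def iter_batches(items, batch_size):
    """Yield consecutive slices so only one small batch is embedded at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Example: 5 trial texts embedded 2 at a time would be processed as 3 batches.
texts = ["t1", "t2", "t3", "t4", "t5"]
print([len(batch) for batch in iter_batches(texts, 2)])
# prints: [2, 2, 1]
```

A smaller batch size trades some speed for a lower peak memory footprint, which is the constraint that matters on free hosting tiers.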
I want to add filters for trial phase and recruitment location. I also want to let users enter their own patient profiles instead of just using demo data. And eventually, I’d like to expand the list of supported medical conditions so the tool can be useful for more people.
Although this is just a demo, it’s a small example of how AI can make healthcare processes faster and more accessible. With the right tools, we can take complex medical information and turn it into something that helps people make better decisions.
Live App on Hugging Face Spaces: Clinical Trial Matchmaker Live Demo
GitHub Repository: Clinical Trial Matchmaker Code