    Text-based clinical outcome predictions | by Heloisa Oss Boll | Feb, 2025

    Team_AIBS News, February 6, 2025


    Electronic health records (EHRs) are an important data resource that contains a wealth of patient information, such as images, vitals, and text. Clinical notes, usually associated with hospital visits, describe a patient’s journey from admission to discharge.

    Despite their richness, clinical notes are hard to use for predicting patient outcomes: they are long and full of medical abbreviations. Even so, note-based prediction is promising and keeps improving thanks to recent developments in natural language processing and large language models.

    But how can this information be used in practice? In this article, we’ll explore the paper Revisiting Clinical Outcome Prediction for MIMIC-IV by Röhr et al., 2024, which details text-based outcome predictions based on the notes from one of the most popular EHR datasets, MIMIC.

    Background on MIMIC

    MIMIC is a large, freely available database of deidentified health data from over 40,000 patients in critical care at Beth Israel Deaconess Medical Center.

    Its latest version is MIMIC-IV. In contrast to MIMIC-III, it features several updates. For example, it now incorporates Emergency Department (ED) data in addition to Intensive Care Unit (ICU) data. It also adopts a newer medical coding standard, ICD-10. This increases the dataset’s complexity, as MIMIC-IV now encompasses both ICD-9 and ICD-10.

    • ICD stands for International Classification of Diseases and organizes diseases and procedures through a coding system; for instance, a patient with diabetes may be assigned a code such as E0800 (ICD-10) or 250.0x (ICD-9).

    Finally, the anonymization scheme has changed from the earlier HIPAA-compliant approach, based on random identifiers, to censoring markers such as “___”.
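
    Because both coding standards coexist, it can help to separate codes by ICD version before building label sets. The following is a minimal sketch on toy rows; the field names `hadm_id`, `icd_code`, and `icd_version` are assumptions made for illustration, and the example codes are not taken from the paper.

```python
from collections import defaultdict

# Toy rows mimicking a diagnosis table. The field names are assumptions
# made for this sketch, not a guaranteed schema.
rows = [
    {"hadm_id": 1, "icd_code": "25000", "icd_version": 9},
    {"hadm_id": 2, "icd_code": "E0800", "icd_version": 10},
    {"hadm_id": 2, "icd_code": "I10", "icd_version": 10},
]


def split_by_icd_version(rows):
    """Group ICD codes per admission, separately for each ICD version."""
    by_version = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_version[row["icd_version"]][row["hadm_id"]].append(row["icd_code"])
    # Convert the nested defaultdicts to plain dicts before returning.
    return {version: dict(adm) for version, adm in by_version.items()}


codes = split_by_icd_version(rows)
print(codes[10])  # codes of admissions recorded with ICD-10
print(codes[9])   # codes of admissions recorded with ICD-9
```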

    Preparing admission notes

    When patients are admitted to the hospital, doctors often want to assess various factors, such as their risk of mortality and likely diagnoses. When notes are used as input for such predictions, they must be preprocessed carefully to prevent data leakage.

    • For instance, MIMIC-IV contains discharge notes that document the entire patient journey in the hospital.
    • To obtain admission notes, the discharge notes must be filtered to include only the sections available at admission time, specifically: “Chief complaint, (History of) Present illness, Medical history, Admission medications, Allergies, Physical examination, Family history, and Social history.”
    • The admission notes are used as input for all the tasks.
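
    The section filtering described above can be sketched as follows. This is a simplified illustration, not the authors’ preprocessing code: the header spellings, the rule for detecting a header line, and the toy note are all assumptions.

```python
import re

# Headers treated as admission-time sections. These spellings are assumptions
# based on the list above; real notes may spell or order them differently.
ADMISSION_HEADERS = {
    "chief complaint",
    "history of present illness",
    "present illness",
    "past medical history",
    "medical history",
    "medications on admission",
    "admission medications",
    "allergies",
    "physical exam",
    "physical examination",
    "family history",
    "social history",
}

# Assumed header rule: a line that starts with letters and contains a colon
# right after the header name, optionally followed by text on the same line.
HEADER_RE = re.compile(r"^([A-Za-z][A-Za-z ()/]*):[ \t]*(.*)$")


def extract_admission_note(discharge_note: str) -> str:
    """Keep only the sections whose header is in ADMISSION_HEADERS."""
    kept, keep_current = [], False
    for raw_line in discharge_note.splitlines():
        line = raw_line.strip()
        match = HEADER_RE.match(line)
        if match:
            # A new section starts here: decide whether to keep it.
            keep_current = match.group(1).strip().lower() in ADMISSION_HEADERS
        if keep_current:
            kept.append(line)
    return "\n".join(kept)


# Toy discharge note, written for this example only.
note = """Chief Complaint: chest pain
History of Present Illness:
___ year old male with chest pain for two days.
Brief Hospital Course:
Patient was treated and improved.
Discharge Diagnosis: unstable angina
Social History:
___
"""

admission_note = extract_admission_note(note)
print(admission_note)
```

    One limitation of this header rule is that any body line of the form “Word: text” is treated as a new section, so a real pipeline would need a more careful header list.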

    In this paper, the admission notes were encoded with encoder-only Transformer models. These models transform text into numerical, vectorized representations that can be used for downstream predictions.

    An example of how to transform text into vectors. In our case, “text” is an admission note. Source: Text and Image Embeddings in Transformers.

    Given the encoded admission notes, models are then trained to predict the following patient outcomes:

    Patient routing (PR)

    This task predicts the hospital unit to which a patient is transferred from the Emergency Department (ED). It uses routing logs and aims to classify patients into one of 18 possible hospital units, such as surgery or obstetrics. It reflects real-world ED operations, where timely and accurate patient transfers are crucial. However, models struggle to predict rare destination units.

    Diagnoses (DIA)

    This task aims to map clinical notes to ICD-10 codes. It is multi-label, since patients may have multiple diagnoses. MIMIC-IV has over 1,600 unique ICD-10 diagnosis codes, and the imbalanced label distribution remains a significant challenge, as most diagnoses are infrequent.
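
    In a multi-label setup, each admission’s set of codes is typically turned into a 0/1 target vector with one position per code. Below is a minimal sketch of that encoding; the helper names and the toy codes are hypothetical and not taken from the paper.

```python
def build_label_index(all_code_lists):
    """Map each distinct code to a column index, in sorted order."""
    distinct = sorted({code for code_list in all_code_lists for code in code_list})
    return {code: i for i, code in enumerate(distinct)}


def to_multi_hot(codes, label_index):
    """Encode one admission's codes as a 0/1 vector over the label set."""
    vector = [0] * len(label_index)
    for code in codes:
        if code in label_index:  # ignore codes unseen when building the index
            vector[label_index[code]] = 1
    return vector


# Toy admissions, each with one or more diagnosis codes.
admissions = [["E0800", "I10"], ["I10"], ["J189", "I10"]]
label_index = build_label_index(admissions)
targets = [to_multi_hot(codes, label_index) for codes in admissions]

print(label_index)
print(targets)
```

    With more than 1,600 codes, most columns of such a target matrix are almost entirely zero, which is the imbalance the paragraph above refers to.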

    Procedures (PRO)

    Like diagnoses, this task involves predicting medical procedures based on ICD-10 codes, with over 4,000 procedure labels. The large label space, together with sparse data for many procedures, makes rare procedures especially hard to predict.

    Length-of-stay (LOS)

    This task aims to predict the duration of a patient’s stay in the ICU across four categories: ≤3 days, 3–7 days, 7–14 days, and >14 days. LOS differs from the other tasks in that it does not include ED data and focuses only on ICU admissions. The authors highlight that factors unrelated to patient health, such as hospital capacity and administrative regulations, may affect length of stay, which adds complexity to the task.
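
    The four LOS categories can be expressed as a small binning function. This sketch assumes upper-inclusive boundaries (exactly 7 days falls in “3–7 days”, exactly 14 days in “7–14 days”); the text above does not specify how boundary values are assigned.

```python
LOS_LABELS = ["<=3 days", "3-7 days", "7-14 days", ">14 days"]


def los_category(days: float) -> int:
    """Return the index of the LOS class for a stay of `days` days.

    Boundaries are treated as upper-inclusive, which is an assumption.
    """
    if days < 0:
        raise ValueError("length of stay cannot be negative")
    if days <= 3:
        return 0
    if days <= 7:
        return 1
    if days <= 14:
        return 2
    return 3


for stay in (1.5, 3, 7, 10, 20):
    print(stay, "->", LOS_LABELS[los_category(stay)])
```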

    In-hospital mortality

    While this task was included in previous studies, the authors decided to exclude it for MIMIC-IV, mainly because of the complexities involved in preprocessing. Even after filtering the notes, they still observed mentions of death, which leak the label into the input and ultimately result in overly confident models.
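
    One simple way to look for this kind of leakage is to scan the filtered notes for death-related terms. The keyword list below is hypothetical (the paper’s exact checks are not described here), and the notes are toy examples.

```python
import re

# Hypothetical keyword list, chosen for illustration only.
DEATH_TERMS = ["expired", "deceased", "passed away", "death", "died"]
DEATH_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in DEATH_TERMS) + r")\b",
    re.IGNORECASE,
)


def find_death_mentions(note: str) -> list:
    """Return the death-related terms found in a note, lowercased."""
    return [match.group(0).lower() for match in DEATH_RE.finditer(note)]


# Toy admission notes keyed by a made-up note id.
notes = {
    101: "Chief Complaint: dyspnea. Family History: father died of MI.",
    102: "Chief Complaint: fall. Social History: lives alone.",
    103: "Physical Exam: patient expired shortly after arrival.",
}

flagged = {}
for note_id, text in notes.items():
    hits = find_death_mentions(text)
    if hits:
        flagged[note_id] = hits

print(flagged)
```

    Note 101 is flagged although the mention concerns a relative, not the patient. Keyword matching therefore yields candidates for review rather than confirmed leakage, which hints at why this preprocessing is hard to get right.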

    Results

    The authors observed that models pre-trained on MIMIC-III data did not generalize well to MIMIC-IV. PubMedBERT demonstrated superior performance across tasks due to its domain-specific tokenization and pre-training on biomedical text.

    All models struggled with rare labels, particularly in the DIA and PRO tasks. MIMIC-IV has a severe long-tail distribution: only about 6% of labels (around 100 codes) account for 67% of the data, while the remaining 94% (1,517 codes) are sparse and account for only 33%.
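
    The percentages follow directly from the code counts: 100 head codes out of 100 + 1,517 labels is roughly 6%. The sketch below checks that arithmetic and shows, on toy counts, how the size of the “head” can be measured for any label distribution. The function name and toy counts are illustrative assumptions.

```python
def head_size(label_counts, coverage):
    """Smallest number of most frequent labels whose counts reach
    `coverage` (a fraction between 0 and 1) of all label occurrences."""
    total = sum(label_counts.values())
    running = 0
    ordered = sorted(label_counts.values(), reverse=True)
    for n, count in enumerate(ordered, start=1):
        running += count
        if running >= coverage * total:
            return n
    return len(label_counts)


# Arithmetic from the figures quoted above.
head_codes, tail_codes = 100, 1517
head_share = head_codes / (head_codes + tail_codes)
print(f"head labels: {100 * head_share:.1f}% of all labels")

# Toy label counts with a long tail (not real MIMIC-IV counts).
toy_counts = {"A": 50, "B": 20, "C": 10, "D": 8, "E": 5, "F": 4, "G": 3}
print(head_size(toy_counts, 0.67), "of", len(toy_counts), "labels cover 67%")
```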

    Even PubMedBERT has difficulty achieving a high PR-AUC for these rare labels; its gains come mainly from the more prevalent head labels.
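
    PR-AUC (area under the precision-recall curve) is commonly estimated per label with average precision. The sketch below implements that estimator in plain Python and assumes no tied scores; whether the paper uses this exact estimator is an assumption.

```python
def average_precision(y_true, y_score):
    """Average precision: the mean of the precision values measured at the
    rank of each positive example, with examples sorted by descending score."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    positives_seen = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            positives_seen += 1
            precision_sum += positives_seen / rank
    if positives_seen == 0:
        raise ValueError("average precision is undefined without positives")
    return precision_sum / positives_seen


# A frequent label: two positives, ranked first and third.
ap_head = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])

# A rare label: a single positive, ranked third out of four.
ap_rare = average_precision([0, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])

print(ap_head, ap_rare)
```

    With a single positive example, one ranking mistake moves the score sharply, which is one reason rare labels tend to show low and unstable PR-AUC values.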

    Conclusion

    Despite the richness of clinical notes, challenges related to data imbalance, annotation quality, and task complexity remain for text-based prediction. For more details on results and future research, refer to the original article.


