AIBS News
Machine Learning

How I Built a Smarter Question Answering System Using RAG | by Taj Elkatawneh | Jun, 2025

By Team_AIBS News | June 25, 2025 | 4 Mins Read


In an age where language models can generate fluent responses on almost any topic, the challenge is no longer just getting answers. The real question is: are those answers accurate and useful? That is what led me to build my own Retrieval-Augmented Generation (RAG) system: not as a chatbot clone, but as a focused, document-aware question answering system. It lets a language model answer using the most relevant pieces of context instead of just guessing. Over the past few weeks I have been implementing this, and it feels genuinely useful. I built the RAG system from scratch, not by following a tutorial word for word, but by figuring things out step by step. I started with a simple goal: to build a system that could answer user questions more intelligently.

The Problem I Wanted to Solve

Large Language Models (LLMs) are incredibly good at sounding correct. But they don't actually know what's true unless they're given reliable context. I wanted to fix that. Instead of relying on pretraining alone, I set out to build a system that answers questions using information drawn directly from documents I provide. I didn't want just any answer; I wanted answers grounded in real documents, filtered for relevance, with a way to see exactly where each answer came from. That meant combining retrieval and generation, which is what RAG is all about.

Getting the Documents Ready

The first step was loading documents in PDF format. For this, I used PyPDFLoader, which extracts the text while preserving metadata such as filename and page number. To make this text usable for retrieval, I then split it into semantically meaningful chunks using RecursiveCharacterTextSplitter:

    def split_documents(documents):
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=512,
            chunk_overlap=100,
            length_function=len,
            is_separator_regex=True,
        )
        chunks = []
        for doc in documents:
            split_texts = text_splitter.split_text(doc["content"])
            for i, chunk_content in enumerate(split_texts):
                # Carry the source metadata forward and tag each chunk with its index.
                chunks.append({
                    "content": chunk_content,
                    "metadata": {**doc["metadata"], "chunk_id": i},
                })
        return chunks

This ensured that when a relevant passage is retrieved later, I would know exactly where it came from, an important part of building trust in the answer.

    Semantic Embeddings and Vector Storage

Next, I embedded the text using the all-mpnet-base-v2 model from SentenceTransformers. I chose it because its 768-dimensional embeddings strike a good balance, neither too small nor overkill, while still capturing semantics well.

    def get_embedding(text):
        # Encode the text and convert the numpy vector to a plain list.
        embedding = embedding_model.encode(text).tolist()
        return embedding

These embeddings were stored in Pinecone, a vector database that supports real-time similarity search. This meant that when a user asked a question, my system could quickly identify the most relevant chunks.


    def upsert_chunks_to_pinecone(index, chunks, batch_size=100):
        vectors = []
        for i, chunk in enumerate(chunks):
            content = chunk["content"]
            metadata = chunk.get("metadata", {})
            metadata["text"] = content  # keep the raw text alongside the vector

            embedding = get_embedding(content)
            vector_id = str(uuid4())
            vectors.append((vector_id, embedding, metadata))

            # Flush a full batch, or the final partial batch on the last chunk.
            if len(vectors) == batch_size or i == len(chunks) - 1:
                index.upsert(vectors=vectors)
                print(f"Upserted batch ending at chunk {i + 1}")
                vectors = []
        print(f"All {len(chunks)} vectors upserted to Pinecone.")
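On the query side, Pinecone handles the nearest-neighbor search itself, but the underlying idea is plain cosine similarity over embeddings. Here is a minimal local sketch of that idea; it is an in-memory stand-in, not the actual Pinecone `index.query` call, and `retrieve_top_k` and `stored` are illustrative names of my own:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_embedding, stored, k=3):
    # stored: list of (vector, metadata) pairs, mirroring what was upserted above.
    scored = [(cosine_similarity(query_embedding, vec), meta) for vec, meta in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In production the vector database does exactly this ranking, just over millions of vectors with an approximate index instead of a linear scan.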

Adding a Layer of Safety and Relevance Filtering

Before doing that, though, I implemented a small filtering layer. I didn't want the system to answer unsafe or out-of-scope questions. I wrote a function that checks for things like violence, hate, or explicit content. For domain relevance, I used a language model to decide whether the question had anything to do with data science, AI, or linear algebra, the domain I built the system for.
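A minimal sketch of what such a safety gate might look like, assuming a simple keyword screen. The actual function and term lists are not shown in the article, and the real domain-relevance check uses an LLM rather than keywords:

```python
# Hypothetical term list; the real system's list is not shown in the article.
UNSAFE_TERMS = {"violence", "hate", "explicit"}

def is_safe(question):
    # Reject questions containing any flagged term. This is a crude sketch;
    # a production filter would typically call a moderation model instead.
    lowered = question.lower()
    return not any(term in lowered for term in UNSAFE_TERMS)
```

Keyword matching like this is cheap and runs before any model call, which is why it makes sense as the first gate even when an LLM handles the subtler relevance judgment.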

Answer Generation with Groq and LLaMA 3

Then comes the final step: generating an actual answer. For that, I used Groq's API with the Llama 3.3 70B model. It's fast, accurate, and doesn't waste time. I pass in the user's question, and it returns an answer that is relevant to the material provided. It also shows the exact chunks it pulled from, so I can see where the answer came from, not hallucinated nonsense.
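The article does not show the generation code, so the sketch below is my reconstruction of what the call might look like with the Groq Python SDK. The prompt wording, the function names, and the `llama-3.3-70b-versatile` model id are assumptions, and the API call itself needs a `GROQ_API_KEY` in the environment:

```python
def build_prompt(question, context_chunks):
    # context_chunks: the retrieved passage texts, joined into one context block.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def generate_answer(question, context_chunks, model="llama-3.3-70b-versatile"):
    # Requires the `groq` package and a GROQ_API_KEY environment variable.
    from groq import Groq
    client = Groq()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(question, context_chunks)}],
    )
    return response.choices[0].message.content
```

Keeping the prompt builder separate from the API call makes the grounding easy to inspect: you can print exactly which chunks the model was allowed to use for any given question.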

What's Next

This project is still a work in progress. I've already extended it by adding a lightweight web interface with Gradio. I'm also looking into multi-hop retrieval for more complex, multi-part questions. The core idea, though, will stay the same: answering questions with confidence because the answers are grounded in context. Building this RAG system has been one of the most practical and eye-opening projects I've worked on so far.

