    How I Built a Smarter Question Answering System Using RAG | by Taj Elkatawneh | Jun, 2025

    By Team_AIBS News · June 25, 2025 · 4 Mins Read


    In an age where language models can generate fluent responses on almost any topic, the challenge isn't just getting answers. The real question is: are those answers accurate, and are they useful? That's what led me to build my own Retrieval-Augmented Generation (RAG) system. Not as a chatbot clone, but as a focused, document-aware question answering system. It lets a language model answer using the most relevant pieces of context instead of just guessing. Over the past few weeks, I've been working on implementing this, and it feels useful. I built a RAG system from scratch, not by following a tutorial word for word, but by figuring things out step by step. I started with a simple goal: to build a system that could answer user questions more intelligently.

    The Problem I Wanted to Solve

    Large Language Models (LLMs) are incredibly good at sounding correct. But they don't really know what's true unless they're given reliable context. I wanted to fix that. Instead of relying on pretraining alone, I set out to build a system that answers questions using information drawn directly from documents I provide. I didn't want just any answer; I wanted answers based on real documents, filtered for relevance, with a way to see exactly where each answer came from. That meant combining retrieval and generation, which is what RAG is all about.

    Getting the Documents Ready

    The first step was loading documents in PDF format. For this, I used PyPDFLoader, which extracts the text while preserving metadata such as filename and page number. To make this text usable for retrieval, I then split it into semantically meaningful chunks using RecursiveCharacterTextSplitter:

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    def split_documents(documents):
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=512,
            chunk_overlap=100,
            length_function=len,
            is_separator_regex=True,
        )
        chunks = []
        for doc in documents:
            # Split each document's text into overlapping chunks
            split_texts = text_splitter.split_text(doc["content"])
            for i, chunk_content in enumerate(split_texts):
                # Carry the original metadata forward and tag each chunk with its index
                chunks.append({
                    "content": chunk_content,
                    "metadata": {**doc["metadata"], "chunk_id": i}
                })
        return chunks

    This ensured that when a relevant passage is retrieved later, I'd know exactly where it came from, an important part of building trust in the answer.

    Semantic Embeddings and Vector Storage

    Next, I embedded the text using the all-mpnet-base-v2 model from SentenceTransformers. I went with this one because it has a good embedding dimension, not too small and not overkill, and still gives good semantic embeddings.

    from sentence_transformers import SentenceTransformer

    embedding_model = SentenceTransformer("all-mpnet-base-v2")

    def get_embedding(text):
        # Encode the text into a dense vector and return it as a plain Python list
        return embedding_model.encode(text).tolist()

    These embeddings were stored in Pinecone, a vector database that supports real-time similarity search. This meant that when a user asked a question, my system could quickly identify the most relevant chunks.


    from uuid import uuid4

    def upsert_chunks_to_pinecone(index, chunks, batch_size=100):
        vectors = []
        for i, chunk in enumerate(chunks):
            content = chunk["content"]
            metadata = chunk.get("metadata", {})
            # Store the chunk text in the metadata so it can be returned at query time
            metadata["text"] = content

            embedding = get_embedding(content)
            vector_id = str(uuid4())
            vectors.append((vector_id, embedding, metadata))

            # Flush a full batch, or the final partial batch, to Pinecone
            if len(vectors) == batch_size or i == len(chunks) - 1:
                index.upsert(vectors=vectors)
                print(f"Upserted batch ending at chunk {i + 1}")
                vectors = []
        print(f"All {len(chunks)} vectors upserted to Pinecone.")

    Adding a Layer of Safety and Relevance Filtering

    Before doing that, though, I implemented a small filtering layer. I didn't want the system to answer unsafe or out-of-scope questions. I wrote a function that checks for things like violence, hate, or explicit content. And for domain relevance, I used a language model to decide whether the question had anything to do with data science, AI, or linear algebra, which is the domain I trained it for. A rough sketch of the idea follows below.
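
    This is only a minimal sketch of the filtering idea; the keyword list, the allowed-domains string, and the llm_client.complete helper are illustrative assumptions rather than the exact code I use:

    BLOCKED_TERMS = ["violence", "hate", "explicit"]  # illustrative keyword list, not exhaustive
    ALLOWED_DOMAINS = "data science, AI, or linear algebra"

    def is_safe(question):
        # Crude keyword screen; a real safety check would be more robust
        return not any(term in question.lower() for term in BLOCKED_TERMS)

    def is_in_scope(question, llm_client):
        # Ask a language model for a yes/no judgment on domain relevance
        prompt = (
            f"Answer only 'yes' or 'no'. Is the following question about {ALLOWED_DOMAINS}?\n"
            f"Question: {question}"
        )
        reply = llm_client.complete(prompt)  # hypothetical helper around whichever LLM API is used
        return reply.strip().lower().startswith("yes")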

    Answer Generation with Groq and LLaMA 3

    Then comes the final step: generating an actual answer. For that, I used Groq's API with the Llama 3.3 70B model. It's fast, accurate, and doesn't waste time. I pass in the user's question, and it returns an answer that's relevant to the material provided. It also shows the exact chunks it pulled from, so I can see where the answer is coming from. Not hallucinated nonsense.
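
    The generation call itself can be sketched roughly as below, using Groq's chat completions client; the model id, the system prompt wording, and the generate_answer helper are assumptions on my part:

    from groq import Groq

    client = Groq()  # reads GROQ_API_KEY from the environment

    def generate_answer(question, context_chunks):
        # Join the retrieved chunks into one context block for the prompt
        context = "\n\n".join(context_chunks)
        response = client.chat.completions.create(
            model="llama-3.3-70b-versatile",  # assumed Groq model id for Llama 3.3 70B
            messages=[
                {"role": "system", "content": "Answer using only the provided context. "
                                              "If the context is not enough, say so."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content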

    What's Next

    This project is still a work in progress. I've already extended it by adding a lightweight web interface with Gradio. I'm also looking into multi-hop retrieval for more complex, multi-part questions. The core idea, though, will remain the same: to answer questions with confidence because the answers are grounded in context. Building this RAG system has been one of the most practical and eye-opening projects I've worked on so far.
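
    For reference, the Gradio layer can be as small as the sketch below, which simply wires the retrieval and generation helpers from the earlier sketches together; the function names match those sketches, not necessarily the deployed code:

    import gradio as gr

    def answer_fn(question):
        # Retrieve the most relevant chunks, then generate a grounded answer from them
        chunks = retrieve_relevant_chunks(index, question)
        return generate_answer(question, chunks)

    demo = gr.Interface(fn=answer_fn, inputs="text", outputs="text",
                        title="Document-aware question answering")
    demo.launch()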

