    Hitchhiker’s Guide to RAG with ChatGPT API and LangChain

    By Team_AIBS News | June 27, 2025 | 7 min read


    LLMs can generate endless words and responses based on general knowledge, but what happens when we need answers requiring accurate and specific knowledge? Purely generative models frequently struggle to answer domain-specific questions for a bunch of reasons; maybe the data they were trained on is now outdated, maybe what we're asking for is truly specific and specialized, maybe we want responses that take into account personal or corporate data that just isn't public… 🤷‍♀️ the list goes on.

    So, how can we leverage generative AI while keeping our responses accurate, relevant, and down-to-earth? A good answer to this question is the Retrieval-Augmented Generation (RAG) framework. RAG is a framework that consists of two key components: retrieval and generation (duh!). Unlike purely generative models that are pre-trained on specific data, RAG incorporates an extra retrieval step that allows us to push additional information into the model from an external source, such as a database or a document. To put it differently, a RAG pipeline allows for coherent and natural responses (provided by the generation step) that are also factually accurate and grounded in a knowledge base of our choice (provided by the retrieval step).

    In this way, RAG can be an extremely valuable tool for applications where highly specialized data is needed, such as customer support, legal advice, or technical documentation. One typical example of a RAG application is a customer support chatbot answering customer questions based on a company's database of support documents and FAQs. Another example would be complex software or technical products with extensive troubleshooting guides. Yet another would be legal advice — a RAG model would access and retrieve custom data from law libraries, previous cases, or firm guidelines. The examples are practically endless; in all these cases, access to external, specific, and contextually relevant data allows the model to provide more precise and accurate responses.

    So, in this post, I walk you through building a simple RAG pipeline in Python, using the ChatGPT API, LangChain, and FAISS.

    What about RAG?

    From a more technical perspective, RAG is a technique used to enhance an LLM's responses by injecting additional, domain-specific information into it. In essence, RAG allows a model to also take into account extra external information — like a recipe book, a technical manual, or a company's internal knowledge base — while forming its responses.

    This is important because it allows us to mitigate a bunch of problems inherent to LLMs, such as:

    • Hallucinations — making things up
    • Outdated information — if the model wasn't trained on recent data
    • Lack of transparency — not knowing where responses are coming from

    To make this work, the external documents are first processed into vector embeddings and stored in a vector database. Then, when we submit a prompt to the LLM, any relevant data is retrieved from the vector database and passed to the LLM together with our prompt. As a result, the LLM's response is formed by considering both our prompt and any relevant information present in the vector database in the background. Such a vector database can be hosted locally or in the cloud, using a service like Pinecone or Weaviate.

    Image by author

    What about ChatGPT API, LangChain, and FAISS?

    The main component for building a RAG pipeline is the LLM that will generate the responses. This can be any LLM, like Gemini or Claude, but in this post, I will be using OpenAI's ChatGPT models via their API platform. In order to use their API, we need to sign up and obtain an API key. We also need to make sure the respective Python library is installed:

    pip install openai

    The other major component of building a RAG is processing external data — generating embeddings from documents and storing them in a vector database. The most popular framework for performing such tasks is LangChain. Specifically, LangChain allows us to:

    • Load and extract text from various document types (PDFs, DOCX, TXT, etc.)
    • Split the text into chunks suitable for generating embeddings (a sketch of this step follows after the install command below)
    • Generate vector embeddings (in this post, with the help of OpenAI's API)
    • Store and search embeddings via vector databases like FAISS, Chroma, and Pinecone

    We can easily install the required LangChain packages with:

    pip install langchain langchain-community langchain-openai
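    The pipeline later in this post loads each text file whole, which is fine for small files; for longer documents you would typically add a splitting step before embedding. Here is a minimal sketch of that step, assuming LangChain's RecursiveCharacterTextSplitter, illustrative chunk sizes, and a hypothetical file name:

    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.document_loaders import TextLoader
    
    # load one file (hypothetical path) and split it into overlapping chunks
    docs = TextLoader("rag_files/about_me.txt").load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)  # smaller Document objects, ready to embed

    The overlap keeps a bit of shared context between neighboring chunks, so a retrieved chunk is less likely to start or end mid-thought.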

    In this post, I'll be using LangChain together with FAISS, a local vector database developed by Facebook AI Research. FAISS is a very lightweight package and is thus appropriate for building a simple/small RAG pipeline. It can be easily installed with:

    pip install faiss-cpu

    Putting everything together

    So, in summary, I'll use:

    • ChatGPT models via OpenAI's API as the LLM
    • LangChain, together with OpenAI's API, to load the external files, process them, and generate the vector embeddings
    • FAISS to generate a local vector database

    The file that I will be feeding into the RAG pipeline for this post is a text file with some facts about me, located in the folder 'rag_files'.

    Now we're all set up, and we can start by specifying our API key and initializing our model:

    from langchain_openai import ChatOpenAI
    
    # ChatGPT API key
    api_key = "your key"
    
    # initialize LLM
    llm = ChatOpenAI(openai_api_key=api_key, model="gpt-4o-mini", temperature=0.3)
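    Before wiring up retrieval, a quick sanity check (optional, and not part of the original pipeline) confirms that the key and model work — assuming the key above is valid:

    # optional sanity check: the model should answer without any retrieval involved
    print(llm.invoke("Say hello in one short sentence.").content)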

    Then we can load the files we want to use for RAG, generate the embeddings, and store them in a vector database as follows:

    import os
    
    from langchain_community.document_loaders import TextLoader
    from langchain_community.vectorstores import FAISS
    from langchain_openai import OpenAIEmbeddings
    
    # load documents to be used for RAG
    text_folder = "rag_files"
    
    all_documents = []
    for filename in os.listdir(text_folder):
        if filename.lower().endswith(".txt"):
            file_path = os.path.join(text_folder, filename)
            loader = TextLoader(file_path)
            all_documents.extend(loader.load())
    
    # generate embeddings
    embeddings = OpenAIEmbeddings(openai_api_key=api_key)
    
    # create vector database with FAISS
    vector_store = FAISS.from_documents(all_documents, embeddings)
    retriever = vector_store.as_retriever()
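    Two optional tweaks worth knowing at this point — a sketch under the same setup, not part of the original code: as_retriever accepts search parameters such as how many chunks to return (the default is 4), and the FAISS index can be saved to disk so the embeddings don't have to be regenerated on every run. The "faiss_index" folder name below is just an example:

    # return only the 2 most similar chunks instead of the default 4
    retriever = vector_store.as_retriever(search_kwargs={"k": 2})
    
    # persist the index locally (hypothetical folder name) and reload it later;
    # reloading requires opting in to pickle deserialization of your own files
    vector_store.save_local("faiss_index")
    restored = FAISS.load_local(
        "faiss_index", embeddings, allow_dangerous_deserialization=True
    )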

    Finally, we can wrap everything in a simple executable Python file:

    def main():
        print("Welcome to the RAG Assistant. Type 'exit' to quit.\n")
    
        while True:
            user_input = input("You: ").strip()
            if user_input.lower() == "exit":
                print("Exiting…")
                break
    
            # get relevant documents
            relevant_docs = retriever.get_relevant_documents(user_input)
            retrieved_context = "\n\n".join([doc.page_content for doc in relevant_docs])
    
            # system prompt
            system_prompt = (
                "You are a helpful assistant. "
                "Use ONLY the following knowledge base context to answer the user. "
                "If the answer is not in the context, say you don't know.\n\n"
                f"Context:\n{retrieved_context}"
            )
    
            # messages for the LLM
            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}
            ]
    
            # generate response
            response = llm.invoke(messages)
            assistant_message = response.content.strip()
            print(f"\nAssistant: {assistant_message}\n")
    
    if __name__ == "__main__":
        main()

    Notice how the system prompt is defined. Essentially, a system prompt is an instruction given to the LLM that sets the behavior, tone, or constraints of the assistant before the user interacts with it. For example, we could set the system prompt to make the LLM respond as if talking to a 4-year-old or to a rocket scientist — here we ask it to provide responses based only on the external data we provided, the 'Maria facts'.
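    To make that concrete, here is a sketch of an alternative system prompt — a hypothetical persona swap; the retrieval logic and the rest of the loop stay untouched:

    # hypothetical alternative persona: same retrieved context, different tone
    system_prompt = (
        "You are a patient teacher explaining things to a four-year-old. "
        "Use ONLY the following knowledge base context to answer the user. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{retrieved_context}"
    )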

    So, let's see what we've cooked! 🍳

    First, I ask a question that is irrelevant to the provided external data source, to make sure that the model only uses the provided data source when forming its responses, and not general knowledge.


    … and then I asked some questions specifically about the file I provided…

    ✨✨✨✨

    On my mind

    Admittedly, this is a very simplistic example of a RAG setup — there is much more to consider when implementing it in a real business environment, such as security concerns around how data is handled, or performance issues when dealing with a larger, more realistic knowledge corpus and increased token usage. Nonetheless, I believe OpenAI's API is truly impressive and offers immense, untapped potential for building custom, context-specific AI applications.


    Loved this post? Let's be friends! Join me on

    📰Substack 💌 Medium 💼LinkedIn ☕Buy me a coffee!


