
    AI Agents Processing Time Series and Large Dataframes

    By Team_AIBS News | April 22, 2025 | 13 min read


    Agents are AI systems, powered by LLMs, that can reason about their goals and take actions to achieve a final objective. They’re designed not just to answer queries, but to orchestrate a sequence of operations, including processing data (i.e. dataframes and time series). This ability unlocks numerous real-world applications for democratizing access to data analysis, such as automating reporting, no-code queries, and help with data cleaning and manipulation.

    Agents can interact with dataframes in two different ways:

    • with natural language — the LLM reads the table as a string and tries to make sense of it based on its knowledge base
    • by generating and executing code — the Agent activates tools to process the dataset as an object.

    So, by combining the power of NLP with the precision of code execution, AI Agents enable a broader range of users to interact with complex datasets and derive insights.

    In this tutorial, I’m going to show how to process dataframes and time series with AI Agents. I’ll present some useful Python code that can be easily applied in other similar cases (just copy, paste, run) and walk through every line of code with comments so that you can replicate this example (link to the full code at the end of the article).

    Setup

    Let’s start by setting up Ollama (pip install ollama==0.4.7), a library that allows users to run open-source LLMs locally, without needing cloud-based services, giving more control over data privacy and performance. Since it runs locally, any conversation data does not leave your machine.

    First of all, you need to download Ollama from the website.

    Then, on the prompt shell of your laptop, use the command `ollama pull qwen2.5` to download the selected LLM. I’m going with Alibaba’s Qwen, as it’s both smart and lightweight.

    After the download is completed, you can move on to Python and start writing code.

    import ollama
    llm = "qwen2.5"

    Let’s check the LLM:

    stream = ollama.generate(model=llm, prompt='''what time is it?''', stream=True)
    for chunk in stream:
        print(chunk['response'], end='', flush=True)

    Time Series

    A time series is a sequence of data points measured over time, often used for analysis and forecasting. It allows us to see how variables change over time, and it’s used to identify trends and seasonal patterns.
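    To make trends and seasonality concrete, here is a small sketch (not part of the original tutorial) that builds a toy monthly series and uses a 12-month centered rolling mean to separate the trend from the seasonal cycle:

    ```python
    import numpy as np
    import pandas as pd

    # toy monthly series: upward trend + yearly seasonality + a little noise
    rng = np.random.default_rng(0)
    t = np.arange(48)
    y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 48)
    ts = pd.Series(y, index=pd.date_range("2020-01-01", periods=48, freq="MS"))

    # a 12-month centered rolling mean averages out the seasonal cycle,
    # leaving (approximately) just the trend
    trend = ts.rolling(window=12, center=True).mean()
    print(trend.dropna().round(1).head())
    ```

    The rolling window here matches the seasonal period, which is why the oscillation cancels out; with a different seasonality you would change the window accordingly.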

    I’m going to generate a fake time series dataset to use as an example.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
    ## create data
    np.random.seed(1) #<--for reproducibility
    length = 30
    ts = pd.DataFrame(data=np.random.randint(low=0, high=15, size=length),
                      columns=['y'],
                      index=pd.date_range(start='2023-01-01', freq='MS', periods=length).strftime('%Y-%m'))
    
    ## plot
    ts.plot(kind="bar", figsize=(10,3), legend=False, color="black").grid(axis='y')

    Usually, time series datasets have a really simple structure, with the main variable as a column and the time as the index.

    Before transforming it into a string, I want to make sure that everything is placed under a column, so that we don’t lose any piece of information.

    dtf = ts.reset_index().rename(columns={"index":"date"})
    dtf.head()

    Then, I shall change the data type from dataframe to dictionary.

    data = dtf.to_dict(orient='records')
    data[0:5]

    Lastly, from dictionary to string.

    str_data = "\n".join([str(row) for row in data])
    str_data

    Now that we have a string, it can be included in a prompt that any language model is able to process. When you paste a dataset into a prompt, the LLM reads the data as plain text, but can still understand the structure and meaning based on patterns seen during training.

    prompt = f'''
    Analyze this dataset, it contains monthly sales data of an online retail product:
    {str_data}
    '''

    We can easily start a chat with the LLM. Please note that, right now, this is not an Agent, as it doesn’t have any Tool; we’re just using the language model. While it doesn’t process numbers like a computer, the LLM can recognize column names, time-based patterns, trends, and outliers, especially with smaller datasets. It can simulate analysis and explain findings, but it won’t perform precise calculations independently, as it’s not executing code like an Agent.

    messages = [{"role":"system", "content":prompt}]
    
    while True:
        ## User
        q = input('🙂 >')
        if q == "quit":
            break
        messages.append( {"role":"user", "content":q} )

        ## Model
        agent_res = ollama.chat(model=llm, messages=messages, tools=[])
        res = agent_res["message"]["content"]

        ## Response
        print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
        messages.append( {"role":"assistant", "content":res} )

    The LLM recognizes numbers and understands the general context, the same way it might understand a recipe or a line of code. 

    As you can see, using LLMs to analyze time series is great for quick and conversational insights.

    Agent

    LLMs are good for brainstorming and light exploration, while an Agent can run code. Therefore, it can handle more complex tasks like plotting, forecasting, and anomaly detection. So, let’s create the Tools.

    Sometimes, it can be more effective to treat the “final answer” as a Tool. For example, if the Agent does multiple actions to generate intermediate results, the final answer can be thought of as the Tool that integrates all of this information into a cohesive response. By designing it this way, you have more customization and control over the results.

    def final_answer(text:str) -> str:
        return text
    
    tool_final_answer = {'type':'function', 'function':{
      'name': 'final_answer',
      'description': 'Returns a natural language response to the user',
      'parameters': {'type': 'object',
                    'required': ['text'],
                    'properties': {'text': {'type':'str', 'description':'natural language response'}}
    }}}
    
    final_answer(text="hi")

    Then, the coding Tool.

    import io
    import contextlib
    
    def code_exec(code:str) -> str:
        output = io.StringIO()
        with contextlib.redirect_stdout(output):
            try:
                exec(code)
            except Exception as e:
                print(f"Error: {e}")
        return output.getvalue()
    
    tool_code_exec = {'type':'function', 'function':{
      'name': 'code_exec',
      'description': 'Execute Python code. Always use the function print() to get the output.',
      'parameters': {'type': 'object',
                    'required': ['code'],
                    'properties': {
                        'code': {'type':'str', 'description':'code to execute'},
    }}}}
    
    code_exec("from datetime import datetime; print(datetime.now().strftime('%H:%M'))")

    Furthermore, I shall add a couple of utility functions for Tool usage and to run the Agent.

    dic_tools = {"final_answer":final_answer, "code_exec":code_exec}
    
    # Utils
    def use_tool(agent_res:dict, dic_tools:dict) -> dict:
        ## use tool
        if "tool_calls" in agent_res["message"].keys():
            for tool in agent_res["message"]["tool_calls"]:
                t_name, t_inputs = tool["function"]["name"], tool["function"]["arguments"]
                if f := dic_tools.get(t_name):
                    ### calling tool
                    print('🔧 >', f"\x1b[1;31m{t_name} -> Inputs: {t_inputs}\x1b[0m")
                    ### tool output
                    t_output = f(**tool["function"]["arguments"])
                    print(t_output)
                    ### final res
                    res = t_output
                else:
                    print('🤬 >', f"\x1b[1;31m{t_name} -> Not Found\x1b[0m")
        ## don't use tool
        if agent_res['message']['content'] != '':
            res = agent_res["message"]["content"]
            t_name, t_inputs = '', ''
        return {'res':res, 'tool_used':t_name, 'inputs_used':t_inputs}
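    The dispatch logic can be exercised without calling Ollama at all. The snippet below is a self-contained sketch that applies the same pattern to a hand-built response in the shape `ollama.chat` returns for tool calls (the mock response is an assumption for illustration):

    ```python
    # a hand-built response that mimics the shape of ollama.chat's output
    # when the model decides to call a tool (mock for illustration only)
    def final_answer(text: str) -> str:
        return text

    dic_tools = {"final_answer": final_answer}

    agent_res = {"message": {
        "content": "",
        "tool_calls": [{"function": {"name": "final_answer",
                                     "arguments": {"text": "hi"}}}]
    }}

    # same dispatch pattern as use_tool: look the function up by name
    # and unpack the model-supplied arguments
    res = None
    for tool in agent_res["message"]["tool_calls"]:
        t_name = tool["function"]["name"]
        if f := dic_tools.get(t_name):
            res = f(**tool["function"]["arguments"])
    print(res)  # -> hi
    ```

    Because the model, not your code, chooses the arguments, the `dic_tools.get` lookup and the `**` unpacking are the two places where a malformed tool call would surface.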

    When the Agent is trying to solve a task, I want it to keep track of the Tools that have been used, the inputs it tried, and the outputs it gets. The iteration should stop only when the model is ready to give the final answer.

    def run_agent(llm, messages, available_tools):
        tool_used, local_memory = '', ''
        while tool_used != 'final_answer':
            ### use tools
            try:
                agent_res = ollama.chat(model=llm,
                                        messages=messages,
                                        tools=[v for v in available_tools.values()])
                dic_res = use_tool(agent_res, dic_tools)
                res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
            ### error
            except Exception as e:
                print("⚠️ >", e)
                res = f"I tried to use {tool_used} but it didn't work. I will try something else."
                print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
                messages.append( {"role":"assistant", "content":res} )
            ### update memory
            if tool_used not in ['','final_answer']:
                local_memory += f"\nTool used: {tool_used}.\nInput used: {inputs_used}.\nOutput: {res}"
                messages.append( {"role":"assistant", "content":local_memory} )
                available_tools.pop(tool_used)
                if len(available_tools) == 1:
                    messages.append( {"role":"user", "content":"now activate the tool final_answer."} )
            ### tools not used
            if tool_used == '':
                break
        return res

    Regarding the coding Tool, I’ve noticed that Agents tend to recreate the dataframe at every step, so I’ll use a memory reinforcement to remind the model that the dataset already exists. This is a trick commonly used to get the desired behaviour. Ultimately, memory reinforcements help you get more meaningful and effective interactions.

    # Start a chat
    messages = [{"role":"system", "content":prompt}]
    memory = '''
    The dataset already exists and it's called 'dtf', don't create a new one.
    '''
    while True:
        ## User
        q = input('🙂 >')
        if q == "quit":
            break
        messages.append( {"role":"user", "content":q} )

        ## Memory
        messages.append( {"role":"user", "content":memory} )

        ## Model
        available_tools = {"final_answer":tool_final_answer, "code_exec":tool_code_exec}
        res = run_agent(llm, messages, available_tools)

        ## Response
        print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
        messages.append( {"role":"assistant", "content":res} )

    Creating a plot is something that the LLM alone can’t do. But keep in mind that even if Agents can create images, they can’t see them, because after all, the engine is still a language model. So the user is the only one who visualises the plot.

    The Agent is using the library statsmodels to train a model and forecast the time series. 
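    The exact code the Agent generates varies from run to run, but a plausible sketch of what it typically produces, fitting statsmodels’ Holt-Winters implementation on the same toy series, looks like this (the model hyperparameters are illustrative assumptions, not the Agent’s actual output):

    ```python
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # rebuild the toy monthly series from the beginning of the tutorial
    np.random.seed(1)
    length = 30
    ts = pd.Series(np.random.randint(low=0, high=15, size=length),
                   index=pd.date_range(start='2023-01-01', freq='MS', periods=length))

    # Holt-Winters exponential smoothing with additive trend and yearly
    # seasonality, then a 6-month-ahead forecast
    model = ExponentialSmoothing(ts, trend="add", seasonal="add",
                                 seasonal_periods=12).fit()
    forecast = model.forecast(6)
    print(forecast.round(1))
    ```

    Note that seasonal fitting needs at least two full cycles of data, which the 30 monthly points here just barely provide.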

    Large Dataframes

    LLMs have limited memory, which restricts how much information they can process at once; even the most advanced models have token limits (a few hundred pages of text). Additionally, LLMs don’t retain memory across sessions unless a retrieval system is integrated. In practice, to effectively work with large dataframes, developers often use strategies like chunking, RAG, vector databases, and summarizing content before feeding it into the model.
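    As a minimal sketch of the chunking idea (the helper below is hypothetical, not part of the tutorial’s code): split the dataframe into prompt-sized slices, compress each into a short text summary, and pass only the summaries to the model.

    ```python
    import numpy as np
    import pandas as pd

    def summarize_chunks(dtf: pd.DataFrame, chunk_size: int = 250) -> list[str]:
        """Compress each slice of the dataframe into a short text summary
        that fits comfortably inside an LLM prompt (hypothetical helper)."""
        summaries = []
        for start in range(0, len(dtf), chunk_size):
            chunk = dtf.iloc[start:start + chunk_size]
            summaries.append(
                f"rows {start}-{start + len(chunk) - 1}: "
                f"mean Score={chunk['Score'].mean():.1f}, "
                f"Active={int((chunk['Status'] == 'Active').sum())}"
            )
        return summaries

    # toy dataframe with the same columns as the tutorial's 'dtf'
    np.random.seed(0)
    demo = pd.DataFrame({
        "Score": np.random.uniform(50, 100, size=1000).round(1),
        "Status": np.random.choice(["Active", "Inactive", "Pending"], size=1000),
    })
    summaries = summarize_chunks(demo)
    print(len(summaries))  # 1000 rows / 250 per chunk -> 4 summaries
    ```

    Which statistics go into each summary depends on the questions you expect; the point is that four short strings fit in a prompt where a thousand rows would not.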

    Let’s create a big dataset to play with.

    import random
    import string
    
    length = 1000
    
    dtf = pd.DataFrame(data={
        'Id': [''.join(random.choices(string.ascii_letters, k=5)) for _ in range(length)],
        'Age': np.random.randint(low=18, high=80, size=length),
        'Score': np.random.uniform(low=50, high=100, size=length).round(1),
        'Status': np.random.choice(['Active','Inactive','Pending'], size=length)
    })
    
    dtf.tail()

    I’ll add a web-searching Tool so that, with the ability to execute Python code and search the internet, a general-purpose AI gains access to all the available information and can make data-driven decisions.

    In Python, the easiest way to create a web-searching Tool is with the well-known privacy-focused search engine DuckDuckGo (pip install duckduckgo-search==6.3.5). You can directly use the original library or import the LangChain wrapper (pip install langchain-community==0.3.17).

    from langchain_community.tools import DuckDuckGoSearchResults

    def search_web(query:str) -> str:
      return DuckDuckGoSearchResults(backend="news").run(query)

    tool_search_web = {'type':'function', 'function':{
      'name': 'search_web',
      'description': 'Search the web',
      'parameters': {'type': 'object',
                    'required': ['query'],
                    'properties': {
                        'query': {'type':'str', 'description':'the topic or subject to search on the web'},
    }}}}

    search_web(query="nvidia")

    In total, the Agent now has 3 Tools.

    dic_tools = {'final_answer':final_answer,
                 'search_web':search_web,
                 'code_exec':code_exec}

    Since I can’t put the full dataframe in the prompt, I shall feed in only the first 10 rows so that the LLM can understand the general context of the dataset. Additionally, I’ll specify where to find the full dataset.

    str_data = "\n".join([str(row) for row in dtf.head(10).to_dict(orient='records')])

    prompt = f'''
    You are a Data Analyst, you will be given a task to solve as best you can.
    You have access to the following tools:
    - tool 'final_answer' to return a text response.
    - tool 'code_exec' to execute Python code.
    - tool 'search_web' to search for information on the internet.

    If you use the 'code_exec' tool, remember to always use the function print() to get the output.
    The dataset already exists and it's called 'dtf', don't create a new one.

    This dataset contains the credit score of each customer of the bank. Here are the first rows:
    {str_data}
    '''

    Finally, we can run the Agent.

    messages = [{"role":"system", "content":prompt}]
    memory = '''
    The dataset already exists and it's called 'dtf', don't create a new one.
    '''
    while True:
        ## User
        q = input('🙂 >')
        if q == "quit":
            break
        messages.append( {"role":"user", "content":q} )

        ## Memory
        messages.append( {"role":"user", "content":memory} )

        ## Model
        available_tools = {"final_answer":tool_final_answer, "code_exec":tool_code_exec, "search_web":tool_search_web}
        res = run_agent(llm, messages, available_tools)

        ## Response
        print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
        messages.append( {"role":"assistant", "content":res} )

    In this interaction, the Agent used the coding Tool properly. Now, I want to make it utilize the other tool as well.

    Lastly, I want the Agent to put together all the pieces of information obtained so far in this chat.

    Conclusion

    This article has been a tutorial demonstrating how to build, from scratch, Agents that process time series and large dataframes. We covered both ways that models can interact with data: through natural language, where the LLM interprets the table as a string using its knowledge base, and by generating and executing code, leveraging tools to process the dataset as an object.

    Full code for this article: GitHub

    I hope you enjoyed it! Feel free to contact me for questions and feedback, or just to share your interesting projects.

    👉 Let’s Connect 👈


