    Prompts for LLMs — An Exercise in Brainstorming | by Borabastab | May, 2025

Diagram 2 — Shows the overall performance comparison
• 2–3 disks: All three nail it.
• 4 disks: System 1 stumbles (~70% correct); Systems 1.5 and 2 stay reliable.
• 5–6 disks: System 1 is a liability; System 2 is flawless; System 1.5 is the practical hero (98%+ accuracy).
# Code to validate moves and check correctness
def validate_moves(moves, num_disks):
    pegs = {"A": list(range(num_disks, 0, -1)), "B": [], "C": []}
    illegal_moves = 0
    for move in moves:
        # Parse a move like "Move disk 1 from A to C"
        parts = move.split()
        disk, src, tgt = int(parts[2]), parts[4], parts[6]
        # Check whether the move is legal
        if not pegs[src] or pegs[src][-1] != disk:
            illegal_moves += 1
            continue
        if pegs[tgt] and pegs[tgt][-1] < disk:
            illegal_moves += 1
            continue
        # Execute the move
        pegs[src].pop()
        pegs[tgt].append(disk)
    # Check whether the puzzle is solved
    is_complete = pegs["C"] == list(range(num_disks, 0, -1))
    return is_complete and illegal_moves == 0, illegal_moves

You’d think “fast” always wins. But:

• System 1 uses few tokens — until you factor in re-asks, manual checks and bug-fix traffic.
• System 2 invests more tokens up front, then sails home error-free.
# Simplified token-efficiency calculation
def calculate_total_cost(system, num_disks, retry_probability):
    # Initial request
    prompt = generate_prompt(system, num_disks)
    prompt_tokens = estimate_tokens(prompt)
    completion_tokens = estimate_completion_tokens(system, num_disks)
    # Base cost
    cost = estimate_cost(prompt_tokens, completion_tokens)
    # Add retry costs
    expected_retries = 0
    if system == "system1":
        # System 1 has higher retry rates as complexity increases
        expected_retries = retry_probability * (num_disks - 1)
    elif system == "system1.5":
        # System 1.5 has lower retry rates
        expected_retries = retry_probability * (num_disks - 3) if num_disks > 3 else 0
    # System 2 has near-zero retries
    # Total cost including retries
    total_cost = cost * (1 + expected_retries)
    return total_cost
Diagram 3 — shows the basic token usage for each system across different disk counts
Diagram 4 — shows where System 2 becomes cheaper than System 1 at higher complexity (disk 6). It’s the “Crossover Point” mentioned earlier, where System 2 becomes more economical than System 1 despite higher initial token usage.
Diagram 5 — shows how many tokens are required for each correct move

Insight: It’s cheaper to “pay” for rigorous reasoning once than to pay the price of repeated failures.
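
To make that crossover concrete, here is a minimal sketch that sweeps disk counts with the calculate_total_cost function above. The retry probabilities are illustrative placeholders, not the benchmark's measured values, and the helper functions are assumed to exist as in the earlier snippet:

# Hypothetical sweep: expected total cost per system and disk count
# (retry probabilities are illustrative, not measured benchmark values)
retry_probabilities = {"system1": 0.4, "system1.5": 0.15, "system2": 0.0}

for num_disks in range(3, 7):
    for system, p_retry in retry_probabilities.items():
        total = calculate_total_cost(system, num_disks, p_retry)
        print(f"{system:<9} disks={num_disks} expected cost={total:.4f}")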

Let us look at the actual data from our benchmarks:

Notice that as complexity increases, System 2’s initially higher token cost becomes justified by its perfect accuracy.

Diagram 6 — A set of heat maps showing the performance metrics across different systems and disk counts, making it easier to visualise the patterns.
Diagram 7 — formatted table with our benchmark data. Rows with valid solutions are highlighted in light green, while invalid solutions are in light red.

    System 1:

Move disk 1 from A to C
Move disk 2 from A to B
Move disk 1 from C to B
Move disk 3 from A to C
Move disk 1 from B to A
Move disk 2 from B to C
    ...
    [stops at step 20, missing 11 moves]

    System 1.5:

1. Move disk 1 from A to C [clear path]
2. Move disk 2 from A to B [free disk 3]
3. Move disk 1 from C to B [group small disks]
4. Move disk 3 from A to C [continue clearing larger disks]
...
31. Move disk 5 from A to C [final move]

    System 2:

Strategy:
- For n disks, we need (2^n - 1) moves
- For 5 disks, we need 31 moves
- We'll use recursion:
1. Move 4 disks from A to B
2. Move disk 5 from A to C
3. Move 4 disks from B to C
Moves:
Move disk 1 from A to C
Move disk 2 from A to B
...
Move disk 5 from A to C
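
For reference, the recursive strategy System 2 describes maps directly onto the classic textbook solution. A minimal sketch (the function name solve_hanoi is mine, not from the benchmark repo) that emits moves in the same string format validate_moves expects:

# Classic recursive Tower of Hanoi solver (reference sketch)
def solve_hanoi(n, src="A", aux="B", tgt="C", moves=None):
    if moves is None:
        moves = []
    if n == 0:
        return moves
    # Move the top n-1 disks out of the way, move the largest disk, then stack the rest on top
    solve_hanoi(n - 1, src, tgt, aux, moves)
    moves.append(f"Move disk {n} from {src} to {tgt}")
    solve_hanoi(n - 1, aux, src, tgt, moves)
    return moves

print(len(solve_hanoi(5)))  # 31 moves, i.e. 2^5 - 1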

Traditional prompt engineering = crafting the perfect one-liner. Flow engineering = designing a multi-stage workflow:

# Flow-engineering example with multi-step prompting
def flow_engineering_approach(num_disks):
    # Step 1: Ask the model to understand the problem
    understand_prompt = f"Explain the Tower of Hanoi puzzle with {num_disks} disks and its rules."
    understanding = call_llm(understand_prompt)
    # Step 2: Ask for a solution strategy
    strategy_prompt = f"Based on this understanding, outline a strategy to solve the Tower of Hanoi with {num_disks} disks."
    strategy = call_llm(strategy_prompt)
    # Step 3: Ask for execution
    execution_prompt = f"Using this strategy, list the precise moves to solve the Tower of Hanoi with {num_disks} disks."
    moves = call_llm(execution_prompt)
    # Step 4: Validate the solution
    validation_prompt = f"Verify whether these moves correctly solve the Tower of Hanoi with {num_disks} disks according to the rules."
    validation = call_llm(validation_prompt)
    return {
        "understanding": understanding,
        "strategy": strategy,
        "moves": moves,
        "validation": validation,
    }
1. Comprehend: “Explain the rules.”
2. Plan: “Outline your recursive strategy.”
3. Execute: “List the moves.”
4. Validate: “Check legality; fix errors.”

Those extra “plan” and “validate” steps are your System 1.5 in action: a tiny workflow that unlocks reliability.

• Code generation: quick scaffolds vs. tested modules vs. full architectural plans
# Example of a System 2 approach for code generation
def generate_code_with_system2(requirements):
    prompt = f"""You are an expert software engineer. Plan carefully before writing code.
    Requirements: {requirements}

    First, outline the architecture and main components.
    Then, identify potential edge cases and how to handle them.
    Finally, write clean, well-documented code that meets these requirements.
    Include error handling, tests, and comments explaining key decisions.
    """
    return call_llm(prompt)
    • Medical diagnostics: symptom matching vs. differential reasoning vs. care-path design
• Legal due diligence: clause lookup vs. risk assessment vs. contract drafting
• Start simple (S1): for basic queries, keep prompts tight.
• Escalate smart (S1.5): chain-of-thought for medium-complexity tasks.
• Lock in (S2): when failure isn’t an option, build a flow with planning and validation.
# Adaptive system selection based on error rates
def adaptive_prompt_system(task, complexity):
    # Start with System 1
    system = "system1"
    max_attempts = 3
    attempts = 0
    while attempts < max_attempts:
        prompt = generate_prompt(system, task, complexity)
        response = call_llm(prompt)
        # Validate the response
        valid, errors = validate_response(response, task)
        if valid:
            return response
        # Escalate to more robust systems after failures
        attempts += 1
        if attempts == 1:
            system = "system1.5"
        elif attempts == 2:
            system = "system2"
    # Return best effort with System 2
    return response

Tip: Instrument your LLM calls. Track errors, retries and token usage. Automatically switch systems when accuracy dips below your threshold.
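
A minimal instrumentation sketch of that tip. The metrics structure and the choose_system helper are my assumptions, and call_llm, validate_response and estimate_tokens are the assumed helpers from the earlier examples, not code from the benchmark repo:

# Sketch: track errors, retries and token usage per call,
# then pick the cheapest system whose recent accuracy clears a threshold
import time

METRICS = []  # one record per call: system, tokens, valid, latency

def instrumented_call(system, prompt, task):
    start = time.time()
    response = call_llm(prompt)                         # assumed helper
    valid, errors = validate_response(response, task)   # assumed helper
    METRICS.append({
        "system": system,
        "tokens": estimate_tokens(prompt),               # assumed helper
        "valid": valid,
        "latency_s": time.time() - start,
    })
    return response, valid

def recent_accuracy(system, window=20):
    records = [m for m in METRICS if m["system"] == system][-window:]
    return sum(m["valid"] for m in records) / len(records) if records else 1.0

def choose_system(threshold=0.9):
    # Escalate from the cheapest system whose recent accuracy still clears the threshold
    for system in ("system1", "system1.5", "system2"):
        if recent_accuracy(system) >= threshold:
            return system
    return "system2"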

Flow engineering is the seed of agentic AI — small agents executing micro-flows under human oversight, collaborating like a symphony.

# Simple agent-based workflow
class Agent:
    def __init__(self, name, system, specialization):
        self.name = name
        self.system = system
        self.specialization = specialization

    def process(self, task, context):
        prompt = self.generate_prompt(task, context)
        return call_llm(prompt)

def agentic_workflow(task):
    # Create specialised agents
    planner = Agent("Planner", "system2", "breaking down complex tasks")
    researcher = Agent("Researcher", "system1", "retrieving information")
    reasoner = Agent("Reasoner", "system1.5", "analyzing information")
    validator = Agent("Validator", "system2", "checking correctness")
    # Coordination workflow (assumes the planner and validator return structured, dict-like output)
    plan = planner.process(task, {})
    research_results = researcher.process(plan["research_tasks"], {})
    analysis = reasoner.process(plan["analysis_tasks"], research_results)
    validation = validator.process(plan["validation_tasks"], analysis)
    return {
        "plan": plan,
        "research": research_results,
        "analysis": analysis,
        "validation": validation,
        "final_result": analysis if validation["is_valid"] else None,
    }

One agent handles data ingestion, another reasons, another validates, all under a conductor (you or a master agent).

Diagram 8 — The visualisation illustrates the key insight of our article: while System 2 uses more tokens initially, it becomes more efficient for complex problems (5–6 disks) because of its higher accuracy and lack of need for retries.

Want to see these results firsthand? Follow these steps:

    1. Clone the repository: git clone https://github.com/Bora-Bastab/tower-of-hanoi-llm-benchmark.git
2. Install dependencies: pip install -r requirements.txt
    3. Run the benchmark: python hanoi_benchmark.py

The script will generate CSV results and all of the visualisations mentioned in this article. Feel free to modify the parameters in hanoi_benchmark.py to test different scenarios.

While the Tower of Hanoi is a classic mathematical puzzle, its underlying principles have found practical applications across various domains:

1. Computer Science and Algorithms

The recursive nature of the Tower of Hanoi makes it an excellent tool for teaching and understanding recursion in computer science. It’s often used to illustrate how complex problems can be broken down into simpler sub-problems.

2. Data Backup Strategies

In the realm of data management, the Tower of Hanoi algorithm has inspired backup rotation schemes. These schemes optimise the frequency and retention of backups, ensuring data integrity while minimising storage requirements.
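
As a concrete illustration (my own sketch, not from the article or its repo): in the Tower of Hanoi rotation scheme, the backup set used on day n follows the same "ruler" pattern as the disk moved on move n of the puzzle, which can be read off the lowest set bit of n:

# Sketch: which backup set to overwrite on day n under Tower of Hanoi rotation.
# Set 0 is reused every 2nd day, set 1 every 4th, set 2 every 8th, and so on,
# mirroring how often each disk moves in the puzzle.
def backup_set_for_day(day: int) -> int:
    # Index of the lowest set bit of `day` (= number of trailing zero bits)
    return (day & -day).bit_length() - 1

print([backup_set_for_day(d) for d in range(1, 9)])  # [0, 1, 0, 2, 0, 1, 0, 3]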

    3. Neuropsychological Assessments

Psychologists use the Tower of Hanoi task to assess executive functions, such as planning and problem-solving abilities. It’s particularly useful in evaluating individuals with frontal lobe impairments or cognitive disorders.

4. Artificial Intelligence and Machine Learning

The problem-solving strategies derived from the Tower of Hanoi have influenced AI algorithms, especially in areas requiring planning and sequential decision making. It serves as a benchmark for evaluating the efficiency of AI models in handling recursive tasks.

To effectively apply the insights from the Tower of Hanoi and the prompting strategies discussed:

• Understand the Complexity: Recognise the nature of the problem at hand. Simple tasks might benefit from intuitive approaches (System 1), while complex problems may require deliberate, step-by-step strategies (System 2).
• Balance Efficiency and Accuracy: Aim for a hybrid approach (System 1.5) when appropriate, combining the speed of intuition with the accuracy of analytical thinking.
• Leverage Recursive Thinking: Break down complex problems into smaller, manageable parts, similar to the recursive steps in the Tower of Hanoi, to simplify and solve them effectively.
• Apply in Various Fields: Consider how these strategies can be adapted beyond theoretical exercises, such as in software development, data management, cognitive assessments and AI planning.

By integrating these approaches, we can enhance problem-solving capabilities and optimise decision-making processes across various applications. We need to remember that prompting is not just a hack; it’s a design. Our prompt is the interface between human intention and machine cognition.

• Experiment: Try 3–6 disks with each system.
• Measure: Track accuracy, cost and developer hours saved.
• Share: Post your results, riff on the workflow and join the conversation.

Remember: Sometimes the slow, deliberate path is the fastest way home. So next time, before you hit “send” on that one-liner, ask yourself: “Which system do I really need?”


