Thursday, June 19, 2025

How Database Agents Reduce Costs by 90 % : A Beginner's Guide

 In the world of AI-powered applications, querying databases using AI models like GPT can be expensive. Every word, number, or symbol sent to the AI (known as tokens) contributes to costs. This is where database agents come to the rescue. They act as intelligent intermediaries between users and databases, significantly reducing costs by generating optimized SQL queries and minimizing the data sent to the AI.

In this blog, we’ll explore how database agents work, how they save costs, and why they’re a game-changer for businesses. If you're a novice, don't worry—this guide is beginner-friendly.


What Is a Database Agent?

A database agent is an AI-powered tool designed to interact with databases intelligently. It bridges the gap between human users and databases by:

  1. Understanding natural language questions (e.g., "Show me the top 10 products sold this month").
  2. Generating efficient SQL queries based on the database schema.
  3. Executing the SQL query to retrieve results.
  4. Providing meaningful responses, such as summaries, charts, or detailed explanations.

By leveraging database agents, users with no knowledge of SQL can still query databases effectively.


How Do Database Agents Reduce Costs?

1. Generating SQL Queries Instead of Processing Raw Data

Without a database agent, you would need to send large amounts of raw data (e.g., an entire table or dataset) to the AI for processing. This uses a lot of tokens, which increases costs.

With a database agent:

  • The AI generates an SQL query based on your question.
  • The query is executed directly on the database, and only the results are processed further.

Example:

  • Without Agent: Send 10,000 rows of sales data to the AI for analysis.
  • With Agent: Send a query like SELECT product_name, SUM(sales) FROM sales WHERE date > '2025-06-01' GROUP BY product_name; which retrieves just a few rows.

This transformation alone can reduce token usage by 90% or more.


2. Preprocessing Data at the Database Level

Database agents offload the heavy lifting of data filtering, grouping, and aggregation to the database itself. Instead of sending raw data to the AI, the database performs these operations efficiently.

Example:

  • Instead of sending all transaction logs to the AI, the database agent might run:
    SELECT COUNT(*) FROM transactions WHERE status = 'failed' AND date = CURRENT_DATE;
    
    Then, it sends the result (e.g., "There were 15 failed transactions today") to the AI for further reasoning.

This means fewer tokens are used, and the AI focuses on high-level analysis rather than data processing.


3. Schema-Aware Query Generation

A database agent understands the structure of your database (schema), including:

  • Tables and their relationships.
  • Columns and their data types.

Using this schema knowledge, the agent generates precise queries without trial-and-error attempts. This avoids unnecessary back-and-forth interactions with the AI, saving both time and tokens.

Example:
If a user asks, "What are the top 5 products by revenue?", the agent:

  1. Looks at the schema to find the products and sales tables.
  2. Automatically generates a query like:
    SELECT product_name, SUM(revenue) AS total_revenue
    FROM sales
    GROUP BY product_name
    ORDER BY total_revenue DESC
    LIMIT 5;
    

No need to send irrelevant data or guesswork—this efficiency translates to lower costs.


4. Caching and Reusing Results

Database agents can cache frequently used queries and their results. If a user asks the same question multiple times (e.g., "What were last month’s sales?"), the agent retrieves the cached result instead of querying the database again or generating a new SQL query.

Benefits:

  • No additional token usage.
  • Faster response times for repeated queries.

5. Handling Errors Without Wasting Resources

When users write SQL manually, errors like syntax mistakes or invalid table references are common. Fixing these errors can lead to multiple attempts, increasing costs.

Database agents validate and optimize queries before execution. They handle potential issues like:

  • SQL injection attacks.
  • Syntax errors.
  • Null values.

This proactive approach reduces unnecessary retries and ensures efficient token usage.


How Database Agents Work in Practice

Imagine you’re running a business and want to analyze your sales data. Here’s how a database agent would help:

  1. User Input:
    You type: "What are the top 3 products sold this month?"

  2. SQL Query Generation:
    The database agent generates a query based on your database schema:

    SELECT product_name, COUNT(*) AS sales_count
    FROM sales
    WHERE sale_date >= DATE_TRUNC('month', CURRENT_DATE)
    GROUP BY product_name
    ORDER BY sales_count DESC
    LIMIT 3;
    
  3. Query Execution:
    The query is executed on the database, and the result (e.g., 3 rows) is retrieved.

  4. AI Processing:
    The result is sent to the AI for further reasoning, such as generating a summary:

    • "The top 3 products sold this month are Product A (500 units), Product B (450 units), and Product C (300 units)."
  5. Response to User:
    The user sees the summarized result without worrying about SQL or raw data.


Why Is This Approach Cost-Efficient?

  • Minimized Token Usage: Only the relevant query result (e.g., 3 rows) is sent to the AI, not the entire dataset.
  • Efficient Querying: SQL queries are optimized for performance, reducing unnecessary database loads.
  • Error Handling: The database agent avoids costly retries by validating queries upfront.
  • Reusable Insights: Cached results save time and money for repeated questions.

Key Benefits of Using Database Agents

  1. Cost Savings:
    • By reducing token usage, businesses save on AI-related costs.
  2. Ease of Use:
    • Users don’t need to know SQL; they can ask questions in plain English.
  3. Faster Insights:
    • Queries are executed directly on the database, reducing response times.
  4. Scalability:
    • Even large datasets are handled efficiently, as most processing happens at the database level.

A Quick Example with Code

Here’s a simplified Python implementation of a database agent:

import os
from sqlalchemy import create_engine
import pandas as pd

# Connect to the database
DATABASE_URL = "postgresql+psycopg2://user:password@localhost:5432/mydatabase"
engine = create_engine(DATABASE_URL)

# User Question
user_input = "What are the top 3 products sold this month?"

# Simplified SQL Generation (based on schema)
sql_query = """
SELECT product_name, COUNT(*) AS sales_count
FROM sales
WHERE sale_date >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY product_name
ORDER BY sales_count DESC
LIMIT 3;
"""

# Execute SQL Query
result = pd.read_sql_query(sql_query, engine)

# Display Results
print("Top 3 Products Sold This Month:")
print(result)

Conclusion

Database agents are revolutionizing how businesses interact with their data. By generating SQL queries based on a database schema and executing them efficiently, these agents significantly reduce token usage and costs. They also make data querying accessible to non-technical users, opening up powerful insights for everyone.

If you're new to AI and databases, adopting a database agent is a smart way to save costs and improve efficiency in your data workflows. As the world of AI continues to evolve, database agents will play an increasingly important role in making data-driven decision-making fast, affordable, and accessible.

Building an AI-Powered SQL Assistant: A Beginner's Guide to the Code

 In this blog, we'll break down the provided Python code that creates a streamlit-based AI-powered SQL assistant. This assistant enables users to interact with a database, write optimized SQL queries, and retrieve results—all powered by OpenAI's GPT-4-turbo model. By the end of this blog, you'll understand how the code works and how you can use or modify it for your own projects.


Introduction

The code builds a user-friendly web app that:

  1. Connects to a PostgreSQL database.
  2. Uses OpenAI's GPT-4 model to generate SQL queries.
  3. Executes the queries on the database.
  4. Displays the results, SQL query, and explanations.

The app is beginner-friendly and uses Streamlit for the interface, SQLAlchemy for database interaction, and LangChain for managing the AI functionality.


Breaking Down the Code

1. Setup and Dependencies

Before diving into the code, the following libraries are used:

  • Streamlit: For creating the web-based user interface.
  • SQLAlchemy: For managing database connections.
  • LangChain: For managing interactions with the OpenAI language model.
  • dotenv: For loading environment variables securely.
  • Pandas: For handling query results as dataframes.
  • sqlparse and sqlglot: For validating and formatting SQL queries.

Install these dependencies using:

pip install streamlit sqlalchemy langchain pandas python-dotenv sqlparse sqlglot

2. Class Overview: SQLAgentApp

The code is organized into a class called SQLAgentApp. This class encapsulates the entire application logic, making the code modular and reusable.


3. Environment Setup

The _init_environment method ensures that the OpenAI API key is loaded from environment variables using the dotenv library.

def _init_environment(self):
    self.openai_key = os.getenv("OPENAI_API_KEY")
    if not self.openai_key:
        st.error("OPENAI_API_KEY not found in environment variables")
        st.stop()
  • If the API key is missing, the app stops and shows an error message.
  • This ensures secure handling of sensitive API keys.

4. Database Initialization

The _init_database method connects to a PostgreSQL database using SQLAlchemy. It retrieves database credentials from environment variables or uses default values.

def _init_database(self):
    db_config = {
        "user": os.getenv("PG_USER", "postgres"),
        "password": os.getenv("PG_PASSWORD", "postgres_password"),
        "host": os.getenv("PG_HOST", "localhost"),
        "port": os.getenv("PG_PORT", "5432"),
        "database": os.getenv("PG_DATABASE", "postgres_db")
    }

    self.DATABASE_URL = (
        f"postgresql+psycopg2://{db_config['user']}:{db_config['password']}"
        f"@{db_config['host']}:{db_config['port']}/{db_config['database']}"
    )
  • The code uses create_engine to establish a connection to the database.
  • If the connection fails, an error message is displayed using Streamlit, and the app stops.

5. Initializing the LLM

The _init_llm method creates an instance of OpenAI's GPT-4-turbo model using LangChain.

def _init_llm(self):
    self.llm = ChatOpenAI(
        model_name="gpt-4-turbo-preview",
        temperature=0,
        streaming=True,
        api_key=self.openai_key
    )
  • temperature=0 ensures deterministic responses.
  • Streaming is enabled for real-time responses in Streamlit.

6. Prompt Engineering

The _get_prompt_prefix method constructs a detailed prompt for the AI model. It includes:

  • The database schema (tables and columns).
  • Clear rules for generating SQL queries.
  • Formatting guidelines for the response.
def _get_prompt_prefix(self) -> str:
    schema = self._get_schema_info()
    current_year = datetime.now().year
    return f"""
    You are a world-class SQL expert with 15+ years of experience. ...
    Database Schema:
    {schema}
    """
  • The schema is dynamically retrieved using SQLAlchemy's inspect module.
  • The prompt ensures the AI generates PostgreSQL-compliant SQL.

7. Agent Initialization

The _init_agent method creates a LangChain SQL agent.

self.agent = create_sql_agent(
    llm=self.llm,
    toolkit=self.toolkit,
    agent_type=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    memory=self.memory,
    prefix=self._get_prompt_prefix(),
    format_instructions=self._get_format_instructions(),
    handle_parsing_errors=True,
    max_iterations=5,
    early_stopping_method="generate"
)
  • LangChain SQL Agent: Combines the AI model with SQL tools to generate and execute queries.
  • ConversationBufferMemory: Stores the chat history for better context.
  • Format Instructions: Ensures the AI provides clear, structured responses.

8. User Interface

The run_ui method builds the entire web interface using Streamlit.

Sidebar

The sidebar displays:

  1. Database Schema: A list of tables and columns retrieved dynamically.
  2. Chat History: Previous user and AI messages.

Main Chat Interface

The main interface handles:

  1. User Input: Users can ask questions about the database.
  2. AI Responses: The AI responds with SQL queries, results, and explanations.
if prompt := st.chat_input("Ask a data question..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    
    # AI generates response
    response = self.agent.run(prompt)
  • User input is sent to the AI agent.
  • The response includes the SQL query, results, and analysis.

9. SQL Execution and Processing

The _process_response method extracts and executes the SQL query generated by the AI.

if "```sql" in response:
    sql_part = response.split("```sql")[1].split("```")[0].strip()
    result["sql"] = sqlparse.format(sql_part, reindent=True)
    result["data"] = pd.read_sql_query(result["sql"], self.engine)
  • SQL Parsing: Extracts the SQL code from the response.
  • Execution: Runs the SQL query on the database using Pandas.
  • Result Formatting: Processes the query result into a readable format.

10. Error Handling

The code includes robust error handling:

  • If the database connection fails, the app stops with an error message.
  • SQL execution errors are caught and displayed to the user.

11. Running the App

Finally, the app is launched using:

if __name__ == "__main__":
    app = SQLAgentApp()
    app.run_ui()

How the App Works

Step-by-Step Workflow

  1. The user inputs a question, such as "Show all active products."
  2. The AI generates a PostgreSQL query based on the database schema.
  3. The query is executed, and the results are displayed in a table.
  4. The AI provides an explanation of the query and optimization tips.

Key Features

  • Dynamic Schema Retrieval: Automatically fetches database tables and columns.
  • AI-Generated SQL: Ensures accurate and optimized queries.
  • Real-Time Execution: Runs queries instantly and displays results.
  • Detailed Analysis: Explains the results and suggests optimizations.

Conclusion

This code demonstrates how to build an AI-powered SQL assistant that integrates database querying, AI, and a user-friendly interface. By following this guide, you can adapt the application to your own database and enhance it with additional features.

import os
import re
import sqlparse
from dotenv import load_dotenv
from typing import List, Dict, Any
from sqlalchemy import create_engine, inspect
from sqlglot import transpile # FIXED IMPORT
import pandas as pd
import streamlit as st
from datetime import datetime

# LangChain imports
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.agents import create_sql_agent, AgentType
from langchain.callbacks import StreamlitCallbackHandler
from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits import SQLDatabaseToolkit
import sqlparse

# Load environment variables
load_dotenv()
os.environ['OPENAI_API_KEY'] = "" # Replace with your OpenAI key
class SQLAgentApp:
def __init__(self):
self._init_environment()
self._init_database()
self._init_llm()
self._init_agent()

def _init_environment(self):
self.openai_key = os.getenv("OPENAI_API_KEY")
if not self.openai_key:
st.error("OPENAI_API_KEY not found in environment variables")
st.stop()

def _init_database(self):
# Initialize database connection
db_config = {
"user": os.getenv("PG_USER", "postgres"),
"password": os.getenv("PG_PASSWORD", "postgres_password"),
"host": os.getenv("PG_HOST", "localhost"),
"port": os.getenv("PG_PORT", "5432"),
"database": os.getenv("PG_DATABASE", "postgres_db")
}

self.DATABASE_URL = (
f"postgresql+psycopg2://{db_config['user']}:{db_config['password']}"
f"@{db_config['host']}:{db_config['port']}/{db_config['database']}"
)

try:
self.engine = create_engine(self.DATABASE_URL)
self.db = SQLDatabase(self.engine)
self.inspector = inspect(self.engine)
except Exception as e:
st.error(f"Database connection failed: {str(e)}")
st.stop()

def _init_llm(self):
self.llm = ChatOpenAI(
model_name="gpt-4-turbo-preview",
temperature=0,
streaming=True,
api_key=self.openai_key
)

def _get_prompt_prefix(self) -> str:
schema = self._get_schema_info()
current_year = datetime.now().year
return f"""
You are a world-class SQL expert with 15+ years of experience. Follow these rules:
1. Generate PostgreSQL-compliant SQL
2. Validate queries with SQLGlot before execution
3. Use CTEs for complex logic
4. Optimize for query performance
5. Add comments for complex operations
6. Prevent SQL injection attacks
7. Handle NULLs with COALESCE
8. Include LIMIT unless explicitly disabled
9. For date queries:
- Use CURRENT_DATE for today's date
- Use CURRENT_TIMESTAMP for current time
- For relative dates (e.g., last month), use date arithmetic
- Format dates consistently (YYYY-MM-DD)
- Current year is {current_year}

Database Schema:
{schema}

Response Format:
1. Valid SQL Query
2. Execution Result Summary
3. Step-by-Step Explanation
4. Optimization Tips
5. Alternative Approaches

"""

def _init_agent(self):
self.toolkit = SQLDatabaseToolkit(db=self.db, llm=self.llm)
self.memory = ConversationBufferMemory()

self.agent = create_sql_agent(
llm=self.llm,
toolkit=self.toolkit,
agent_type=AgentType.OPENAI_FUNCTIONS,
verbose=True,
memory=self.memory,
prefix=self._get_prompt_prefix(), # Now this method exists before being used
format_instructions=self._get_format_instructions(),
handle_parsing_errors=True,
max_iterations=5,
early_stopping_method="generate"
)

def _get_format_instructions(self) -> str:
return """
Always respond with markdown containing:
- **SQL Query** (syntax-validated)
- **Result** (summarized output)
- **Explanation** (detailed reasoning)
- **Optimizations** (index/partition suggestions)
- **Alternatives** (different methodologies)
"""

def _get_schema_info(self) -> str:
schema = []
for table in self.inspector.get_table_names():
columns = [f"{col['name']} ({col['type']})"
for col in self.inspector.get_columns(table)]
schema.append(f"Table {table}:\n " + "\n ".join(columns))
return "\n\n".join(schema)

def run_ui(self):
st.set_page_config(page_title="AI SQL Agent", layout="wide")
st.title("🛠️ AI-Powered SQL Assistant")

# Initialize chat history if not exists
if "messages" not in st.session_state:
st.session_state.messages = []

# Sidebar with schema info and chat history
with st.sidebar:
st.header("📖 Database Schema")
for table in self.inspector.get_table_names():
with st.expander(f"📑 {table}"):
cols = [
f"{col['name']} ({col['type']})"
for col in self.inspector.get_columns(table)
]
st.code("\n".join(cols))

st.divider()
st.header("📝 Chat History")
# Display chat history in sidebar
for msg in st.session_state.messages:
role = "💬 User" if msg["role"] == "user" else "🤖 Assistant"
with st.expander(f"{role}: {msg['content'][:50]}...", expanded=False):
st.markdown(msg["content"])

# Main chat interface
# Display all previous messages in the main area
for msg in st.session_state.messages:
with st.chat_message(msg["role"]):
st.markdown(msg["content"])

# Chat input
if prompt := st.chat_input("Ask a data question (e.g., 'Show all active products'):"):
# Add user message to chat history
st.session_state.messages.append({"role": "user", "content": prompt})

# Display user message immediately
with st.chat_message("user"):
st.markdown(prompt)

# Create a placeholder for the assistant's response
with st.chat_message("assistant"):
try:
st_callback = StreamlitCallbackHandler(st.container())
response = self.agent.run(prompt, callbacks=[st_callback])
processed = self._process_response(response)

# Display SQL
if processed["sql"]:
with st.expander("🔍 Generated SQL"):
st.code(processed["sql"], language="sql")

# Display Data
if processed["data"] is not None:
st.subheader("📊 Results Preview")
st.dataframe(processed["data"].head(10), use_container_width=True)

# Display Analysis
if processed["explanation"]:
st.subheader("📝 Explanation")
st.markdown(processed["explanation"])

# Add assistant's response to chat history with the summary
st.session_state.messages.append({
"role": "assistant",
"content": processed["summary"]
})

# Force a rerun to update the display while maintaining history
st.rerun()

except Exception as e:
error_message = f" Error: {str(e)}"
st.error(error_message)
st.session_state.messages.append({
"role": "assistant",
"content": error_message
})
# Force a rerun to update the display while maintaining history
st.rerun()

def _process_response(self, response: str) -> dict:
result = {
"sql": "",
"data": None,
"explanation": "",
"optimizations": "",
"alternatives": "",
"summary": "" # Add summary field
}

try:
# Handle both string and tuple responses
if isinstance(response, tuple):
response = response[0]

# Extract SQL Query
if "```sql" in response:
sql_part = response.split("```sql")[1].split("```")[0].strip()
result["sql"] = sqlparse.format(sql_part, reindent=True)

# Execute SQL directly
try:
result["data"] = pd.read_sql_query(result["sql"], self.engine)

# Format the data into a clean text explanation
if result["data"] is not None and not result["data"].empty:
explanation_parts = []
explanation_parts.append(f"Found {len(result['data'])} records:")

# Format each row in a clean text way
for _, row in result["data"].iterrows():
record_info = []
for col in result["data"].columns:
if col != 'id': # Skip ID in the display
value = row[col]
if pd.isna(value):
value = 'None'
record_info.append(f"{value}")
explanation_parts.append(f"• {' - '.join(record_info)}")

result["explanation"] = "\n".join(explanation_parts)
except Exception as e:
st.error(f"SQL Execution Error: {str(e)}")
result["error"] = f"SQL Execution Error: {str(e)}"

# Extract other response parts
sections = ["Result", "Explanation", "Optimizations", "Alternatives", "Execution Result Summary"]
current_section = None

for line in response.split("\n"):
if line.startswith("**"):
current_section = line.strip("*: ").lower()
elif current_section and current_section in result:
result[current_section] += line + "\n"
elif "Execution Result Summary" in line:
current_section = "summary"
elif current_section == "summary":
result["summary"] += line + "\n"

# If no summary was found, use the explanation as summary
if not result["summary"].strip():
result["summary"] = result["explanation"]

return result

except Exception as e:
return {"error": f"Response processing error: {str(e)}"}

if __name__ == "__main__":
app = SQLAgentApp() # Create an instance of the class
app.run_ui() # Call the method to start the UI


Wednesday, June 18, 2025

Optimizing LLM Queries for CSV Files to Minimize Token Usage: A Beginner's Guide

When working with large CSV files and querying them using a Language Model (LLM), optimizing your approach to minimize token usage is crucial. This helps reduce costs, improve performance, and make your system more efficient. Here’s a beginner-friendly guide to help you understand how to achieve this.


What Are Tokens, and Why Do They Matter?

Tokens are the building blocks of text that LLMs process. A single word like "cat" or punctuation like "." counts as a token. Longer texts mean more tokens, which can lead to higher costs and slower query responses. By optimizing how you query CSV data, you can significantly reduce token usage.


Key Strategies to Optimize LLM Queries for CSV Files

1. Preprocess and Filter Data

Before sending data to the LLM, filter and preprocess it to retrieve only the relevant rows and columns. This minimizes the size of the input text.

How to Do It:

  • Use Python or database tools to preprocess the CSV file.
  • Filter for only the rows and columns necessary for your query.
    import pandas as pd
    
    # Load CSV file
    df = pd.read_csv("data.csv")
    
    # Filter relevant data
    filtered_df = df[df["Category"] == "AI"]
    filtered_df = filtered_df[["Name", "Score"]]
    
    # Save filtered data to a smaller CSV file
    filtered_df.to_csv("filtered_data.csv", index=False)
    

Benefit: Instead of sending an entire CSV file, you send only the required subset of data.


2. Summarize Data

Use aggregation or summarization techniques to condense the data before passing it to the LLM.

Example Use Case:
Instead of sending 1,000 rows of sales data, aggregate it at a higher level:

  • Compute totals, averages, or other metrics.
  • Pass only the summary (e.g., "Total sales for 2025: $1,000,000").
summary = df.groupby("Year")["Sales"].sum()
summary_text = f"Total sales by year: {summary.to_dict()}"

Benefit: A single line of summary text replaces thousands of rows, saving tokens.


3. Use Metadata for Efficient Queries

Add metadata to your CSV file, such as tags or categories, to make filtering easier and faster.

Example:
If your CSV contains transaction logs, include metadata like:

  • Date
  • Category
  • Priority

Instead of passing raw logs, query the metadata first:

  • Query: "Show me high-priority transactions from June 2025."
  • Result: Filtered data with only relevant rows.

4. Chunk Large Data

If your CSV file is too large to process at once, split it into smaller chunks and query the LLM sequentially.

How to Do It:

  • Divide the file into multiple parts.
  • Process each chunk individually.
  • Combine the results.
# Split CSV into chunks
chunk_size = 100  # Number of rows per chunk
for i, chunk in enumerate(pd.read_csv("large_file.csv", chunksize=chunk_size)):
    chunk.to_csv(f"chunk_{i}.csv", index=False)

Benefit: Smaller chunks fit within the LLM’s context window, reducing token usage.


5. Use Embedding-Based Search

Convert your CSV data into embeddings (vector representations). Store these embeddings in a vector database and perform similarity searches to retrieve the most relevant rows for your query.

Tools to Use:

  • Use libraries like sentence-transformers to generate embeddings.
  • Store them in vector databases like Pinecone or Weaviate.

Example:

  • Query: "Find rows similar to 'AI projects with a score above 90'."
  • Result: Only the top 5–10 most relevant rows are sent to the LLM.

6. Pre-Format Data

Format your data into a compact structure, like JSON or a short table, before sending it to the LLM.

Example:
Instead of sending raw CSV rows:

Name, Score, Category
Alice, 95, AI
Bob, 88, Data Science

Send this:

[{"Name": "Alice", "Score": 95, "Category": "AI"}]

Benefit: Structured data reduces unnecessary tokens and improves query clarity.


7. Leverage Caching

Cache the results of frequently used queries. If a query has already been processed, return the cached result instead of re-querying the LLM.

How to Implement:

  • Use a key-value store (e.g., Redis) to save query results.
  • Check the cache before querying the LLM.

Example:

  • Query: "Summarize sales data for 2025."
  • Cache lookup: Return the previously computed summary if available.

8. Ask Specific Questions

Frame your queries to be as specific as possible. Avoid open-ended or vague queries, as they may require processing more data.

Example:

  • Instead of: "Tell me about the CSV file."
  • Ask: "What are the top 5 products by sales in 2025?"

A Practical Workflow for CSV Optimization

  1. Load and Preprocess Data

    • Use Python or database tools to filter and preprocess the CSV.
  2. Summarize or Chunk Data

    • Aggregate or split the data into smaller sets.
  3. Query Efficiently

    • Use metadata or embeddings to retrieve only relevant rows.
  4. Send Compact Data to LLM

    • Format the filtered data into a compact structure.
  5. Cache Results

    • Store frequently queried results to avoid redundant token usage.

Example Query Optimization

Scenario: You have a CSV file with 1,000 rows of sales data. You want to know the top 3 products by sales.

Non-Optimized Query:

"Here is my CSV file: [entire 1,000 rows]. What are the top 3 products by sales?"

Optimized Query:

  1. Preprocess the CSV:
    top_products = df.groupby("Product")["Sales"].sum().nlargest(3)
    
  2. Send the result to the LLM:
    "These are the top 3 products by sales: {'Product A': 500, 'Product B': 450, 'Product C': 400}. Provide insights on these."
    

Conclusion

By filtering, summarizing, and structuring your data before querying an LLM, you can significantly reduce token usage and costs. Start small, experiment with these techniques, and gradually build an efficient workflow for handling large CSV files. Happy optimizing! 🚀

Friday, June 13, 2025

Enhancing Prompt Engineering Accuracy in Software Requirements Processing with Multi-Agent Systems and Multiple LLMs

When processing software requirements or testing documents using large language models (LLMs), ensuring the accuracy and reliability of the results is a critical challenge. One effective approach is using a multi-agent system that leverages multiple LLMs to cross-validate outputs, compare results, and improve overall accuracy. This approach reduces reliance on a single model and ensures robustness in processing, especially for tasks like test case generation, requirement validation, and data summarization.

This blog explores how to use multi-agent systems with different LLMs to process software requirements, compare results, and measure accuracy.


What is a Multi-Agent System?

A multi-agent system involves multiple independent AI agents (in this case, LLMs) working together to:

  1. Perform the same task independently.
  2. Cross-validate or refine each other’s outputs.
  3. Provide diverse perspectives to improve the quality and reliability of the final result.

In our case, the agents will be different LLMs (e.g., GPT-4, Claude, Cohere, or Google Bard).


Use Case Overview

We’ll process a software requirements CSV file to:

  1. Extract high-priority functional requirements.
  2. Generate test cases for these requirements.
  3. Use multiple LLMs to process the data and compare their outputs.

Finally, we’ll measure accuracy by comparing the results and identifying discrepancies.


Step-by-Step Guide

Step 1: Setup and Prepare the Data

We’ll begin by preparing the software requirements data in a CSV format.

Example CSV File (requirements.csv):

ID,Requirement,Type,Priority,Status
1,Users must be able to register and log in using their email and password,Functional,High,Approved
2,Search functionality must return relevant results within 2 seconds,Functional,Medium,Pending
3,The platform must handle 500 concurrent users,Non-Functional,High,Approved
4,Payment processing must support credit cards and PayPal securely,Functional,High,Approved
5,Daily backups of all data must be performed automatically,Non-Functional,Medium,Pending

Load the CSV Data in Python

import pandas as pd

# Load the CSV file
csv_file = 'requirements.csv'
df = pd.read_csv(csv_file)

# Convert the data to JSON format
json_data = df.to_json(orient='records')
print("JSON Data for AI Query:")
print(json_data)

Step 2: Define the Prompt

We’ll use the following prompt to query multiple LLMs. The task involves extracting high-priority functional requirements and generating test cases.

Prompt Template:

Here is the software requirements data in JSON format:

{json_data}

Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.

Step 3: Query Multiple LLMs

We’ll use Python to query multiple LLMs (e.g., GPT-4, Claude, Cohere) simultaneously. Here’s how you can set up the multi-agent system.

Install Required Libraries

pip install openai anthropic cohere

Python Code for Multi-Agent Queries

import openai
import requests
import json

# Set API Keys for different LLMs
openai.api_key = 'your-openai-api-key'
anthropic_api_key = 'your-anthropic-api-key'
cohere_api_key = 'your-cohere-api-key'

# Define the prompt
prompt = f"""
Here is the software requirements data in JSON format:

{json_data}

Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.
"""

# Query GPT (OpenAI)
def query_gpt(prompt):
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=500,
        temperature=0
    )
    return response.choices[0].text.strip()

# Query Claude (Anthropic)
def query_claude(prompt):
    url = "https://api.anthropic.com/v1/complete"
    headers = {
        "x-api-key": anthropic_api_key,
        "Content-Type": "application/json"
    }
    data = {
        "prompt": prompt,
        "model": "claude-v1",
        "max_tokens_to_sample": 500,
        "temperature": 0
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()['completion'].strip()

# Query Cohere
def query_cohere(prompt):
    url = "https://api.cohere.ai/generate"
    headers = {"Authorization": f"Bearer {cohere_api_key}"}
    data = {
        "model": "command-xlarge-nightly",
        "prompt": prompt,
        "max_tokens": 500,
        "temperature": 0
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()['generations'][0]['text'].strip()

# Execute queries
gpt_output = query_gpt(prompt)
claude_output = query_claude(prompt)
cohere_output = query_cohere(prompt)

# Display outputs
print("GPT Output:")
print(gpt_output)

print("\nClaude Output:")
print(claude_output)

print("\nCohere Output:")
print(cohere_output)

Step 4: Compare and Validate Outputs

To measure the accuracy of the outputs, we’ll compare the results from all LLMs and identify discrepancies. This can be done by:

  1. Manual Comparison: Reviewing outputs side-by-side.
  2. Automated Comparison: Using Python to check for differences.

Automated Comparison Code

# Parse outputs into JSON
gpt_data = json.loads(gpt_output)
claude_data = json.loads(claude_output)
cohere_data = json.loads(cohere_output)

# Compare outputs
def compare_outputs(data1, data2):
    discrepancies = []
    for key in data1.keys():
        if data1[key] != data2[key]:
            discrepancies.append({
                "Key": key,
                "Data1": data1[key],
                "Data2": data2[key]
            })
    return discrepancies

# Compare GPT vs Claude
discrepancies_gpt_claude = compare_outputs(gpt_data, claude_data)

# Compare GPT vs Cohere
discrepancies_gpt_cohere = compare_outputs(gpt_data, cohere_data)

# Display discrepancies
print("Discrepancies between GPT and Claude:")
print(discrepancies_gpt_claude)

print("\nDiscrepancies between GPT and Cohere:")
print(discrepancies_gpt_cohere)

Step 5: Measure Accuracy

To evaluate the accuracy of each model:

  1. Define a ground truth (a set of manually verified correct outputs).
  2. Compare each model’s output against the ground truth using metrics like:
    • Precision: Correct results out of all results generated by the model.
    • Recall: Correct results out of all possible correct results.
    • F1-Score: Harmonic mean of precision and recall.

Accuracy Calculation Code

from sklearn.metrics import precision_score, recall_score, f1_score

def calculate_accuracy(predicted, ground_truth):
    # Convert results to sets for comparison
    predicted_set = set(predicted)
    ground_truth_set = set(ground_truth)
    
    # Calculate metrics
    tp = len(predicted_set & ground_truth_set)  # True positives
    fp = len(predicted_set - ground_truth_set)  # False positives
    fn = len(ground_truth_set - predicted_set)  # False negatives
    
    precision = tp / (tp + fp) if tp + fp > 0 else 0
    recall = tp / (tp + fn) if tp + fn > 0 else 0
    f1 = 2 * (precision * recall) / (precision + recall) if precision + recall > 0 else 0
    
    return {"Precision": precision, "Recall": recall, "F1-Score": f1}

# Example ground truth and predicted outputs
ground_truth = [
    "Users must be able to register with valid credentials.",
    "Payment can be processed securely via credit card."
]

predicted_gpt = [
    "Users must be able to register with valid credentials.",
    "Payment can be processed securely via credit card."
]

predicted_claude = [
    "Users must be able to register with valid credentials.",
    "Payments can be processed using PayPal securely."
]

# Calculate accuracy for GPT
accuracy_gpt = calculate_accuracy(predicted_gpt, ground_truth)
print("GPT Accuracy:", accuracy_gpt)

# Calculate accuracy for Claude
accuracy_claude = calculate_accuracy(predicted_claude, ground_truth)
print("Claude Accuracy:", accuracy_claude)

Benefits of Multi-Agent Systems

  1. Increased Reliability: Cross-validating outputs from multiple models ensures higher accuracy.
  2. Error Detection: Discrepancies highlight potential errors or ambiguities in the data or model behavior.
  3. Diverse Perspectives: Different LLMs may interpret prompts differently, providing complementary insights.

Conclusion

Using a multi-agent system with multiple LLMs (e.g., GPT, Claude, Cohere), you can improve the accuracy and reliability of software requirement processing tasks. By comparing outputs, identifying discrepancies, and calculating metrics like precision and recall, you can ensure robust results for tasks like test case generation, requirement extraction, and validation.

This approach is ideal for critical projects where accuracy is paramount, and errors can have significant consequences. With Python and Generative AI, you can build a scalable and reliable pipeline for processing technical documents. 🚀

Processing Software Requirements or Testing Documents in CSV Format with Python and Generative AI

 Software requirements or testing documents often contain structured data, and querying or processing these documents effectively can make tasks like test case generation, requirement analysis, and data summarization much easier. In this blog, we’ll explore how to use Python and Generative AI to process software requirements or testing documents stored in CSV files.

We’ll cover:

  1. Reading and preparing the CSV file.
  2. Writing queries for Generative AI using prompt engineering.
  3. Using Generative AI to extract, process, and generate additional data based on the CSV.
  4. Saving results back to a CSV file.

1. Understanding the Example CSV

Here’s an example of a software requirements CSV file (requirements.csv):

ID,Requirement,Type,Priority,Status
1,Users must be able to register and log in using their email and password,Functional,High,Approved
2,Search functionality must return relevant results within 2 seconds,Functional,Medium,Pending
3,The platform must handle 500 concurrent users,Non-Functional,High,Approved
4,Payment processing must support credit cards and PayPal securely,Functional,High,Approved
5,Daily backups of all data must be performed automatically,Non-Functional,Medium,Pending

2. Python Code to Read and Process the CSV

We’ll use Python to:

  1. Read the CSV file into a DataFrame.
  2. Convert it to JSON for Generative AI queries.
  3. Write prompts to query the data.
  4. Save the AI-generated results back to a CSV.

Setup

Install Required Libraries:

pip install pandas openai

Step 1: Read the CSV File

We’ll use pandas to read the CSV file into a DataFrame and convert it into JSON format for use with Generative AI.

import pandas as pd

# Read the CSV file
csv_file = 'requirements.csv'
df = pd.read_csv(csv_file)

# Display the DataFrame
print("Original DataFrame:")
print(df)

# Convert the DataFrame into JSON for AI queries
json_data = df.to_json(orient='records')
print("\nData in JSON format:")
print(json_data)

Output:

Original DataFrame:
   ID                                          Requirement           Type  \
0   1  Users must be able to register and log in using...    Functional   
1   2  Search functionality must return relevant resul...    Functional   
2   3     The platform must handle 500 concurrent users  Non-Functional   
3   4  Payment processing must support credit cards an...    Functional   
4   5  Daily backups of all data must be performed aut...  Non-Functional   

  Priority    Status  
0     High  Approved  
1   Medium   Pending  
2     High  Approved  
3     High  Approved  
4   Medium   Pending  

Data in JSON format:
[{"ID":1,"Requirement":"Users must be able to register and log in using their email and password","Type":"Functional","Priority":"High","Status":"Approved"}, ...]

Step 2: Write a Query Using Prompt Engineering

To interact with the JSON data, we’ll craft a prompt for Generative AI. For this example, we’ll filter all high-priority functional requirements and generate test cases for each.

Prepare the Prompt

import openai

# Set your OpenAI API key
openai.api_key = 'your-openai-api-key'

# Define the prompt
prompt = f"""
Here is the software requirements data in JSON format:

{json_data}

Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.
"""

# Display the prompt
print("Generated Prompt:")
print(prompt)

Step 3: Query the Generative AI Model

We’ll send the prompt to OpenAI’s API (e.g., GPT-4) and process the output.

# Query the OpenAI API
response = openai.Completion.create(
    engine="text-davinci-003",  # Use the appropriate engine
    prompt=prompt,
    max_tokens=500,
    temperature=0
)

# Extract the response text
ai_output = response.choices[0].text.strip()

# Display AI's Output
print("\nAI Output:")
print(ai_output)

Expected AI Output:

{
  "High Priority Requirements": [
    {
      "Requirement": "Users must be able to register and log in using their email and password",
      "Test Cases": [
        "Verify that the user can register with a valid email and password.",
        "Verify that the user cannot register with an invalid email format."
      ]
    },
    {
      "Requirement": "Payment processing must support credit cards and PayPal securely",
      "Test Cases": [
        "Verify that payment can be processed securely via credit card.",
        "Verify that payment can be processed securely via PayPal."
      ]
    }
  ]
}

Step 4: Save the Results Back to a CSV

Now, we’ll parse the AI’s JSON output and save the results in a structured CSV file.

Parse and Save the Results

import json

# Parse the AI output (assumes it is valid JSON)
output_data = json.loads(ai_output)

# Flatten the data for CSV storage
flattened_data = []
for item in output_data['High Priority Requirements']:
    for test_case in item['Test Cases']:
        flattened_data.append({
            "Requirement": item['Requirement'],
            "Test Case": test_case
        })

# Convert to a DataFrame
output_df = pd.DataFrame(flattened_data)

# Save to a new CSV file
output_file = 'high_priority_test_cases.csv'
output_df.to_csv(output_file, index=False)

print(f"\nGenerated test cases saved to {output_file}")

Generated CSV (high_priority_test_cases.csv):

Requirement,Test Case
Users must be able to register and log in using their email and password,Verify that the user can register with a valid email and password.
Users must be able to register and log in using their email and password,Verify that the user cannot register with an invalid email format.
Payment processing must support credit cards and PayPal securely,Verify that payment can be processed securely via credit card.
Payment processing must support credit cards and PayPal securely,Verify that payment can be processed securely via PayPal.

Complete Python Script

Here’s the complete code for the workflow:

import pandas as pd
import openai
import json

# Set OpenAI API key
openai.api_key = 'your-openai-api-key'

# Step 1: Read the CSV file
csv_file = 'requirements.csv'
df = pd.read_csv(csv_file)
json_data = df.to_json(orient='records')

# Step 2: Define the prompt
prompt = f"""
Here is the software requirements data in JSON format:

{json_data}

Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.
"""

# Step 3: Query the OpenAI API
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=500,
    temperature=0
)

# Extract and parse the AI output
ai_output = response.choices[0].text.strip()
output_data = json.loads(ai_output)

# Step 4: Flatten the data for saving to CSV
flattened_data = []
for item in output_data['High Priority Requirements']:
    for test_case in item['Test Cases']:
        flattened_data.append({
            "Requirement": item['Requirement'],
            "Test Case": test_case
        })

# Convert to DataFrame and save to CSV
output_df = pd.DataFrame(flattened_data)
output_file = 'high_priority_test_cases.csv'
output_df.to_csv(output_file, index=False)

print(f"Generated test cases saved to {output_file}")

Conclusion

With this workflow, you can:

  1. Load and process CSV data for software requirements or testing documents.
  2. Use Generative AI with prompt engineering to extract, analyze, or generate additional information (e.g., test cases).
  3. Save AI-generated results into a structured CSV file for further use.

This approach is highly modular and can be adapted for different tasks, such as summarizing requirements, identifying gaps, or validating completeness. Happy engineering! 🚀

AI's Impact on the IT Industry 2026