Software requirements or testing documents often contain structured data, and querying or processing these documents effectively can make tasks like test case generation, requirement analysis, and data summarization much easier. In this blog, we’ll explore how to use Python and Generative AI to process software requirements or testing documents stored in CSV files.
We’ll cover:
- Reading and preparing the CSV file.
- Writing queries for Generative AI using prompt engineering.
- Using Generative AI to extract, process, and generate additional data based on the CSV.
- Saving results back to a CSV file.
1. Understanding the Example CSV
Here’s an example of a software requirements CSV file (requirements.csv
):
ID,Requirement,Type,Priority,Status
1,Users must be able to register and log in using their email and password,Functional,High,Approved
2,Search functionality must return relevant results within 2 seconds,Functional,Medium,Pending
3,The platform must handle 500 concurrent users,Non-Functional,High,Approved
4,Payment processing must support credit cards and PayPal securely,Functional,High,Approved
5,Daily backups of all data must be performed automatically,Non-Functional,Medium,Pending
2. Python Code to Read and Process the CSV
We’ll use Python to:
- Read the CSV file into a DataFrame.
- Convert it to JSON for Generative AI queries.
- Write prompts to query the data.
- Save the AI-generated results back to a CSV.
Setup
Install Required Libraries:
pip install pandas openai
Step 1: Read the CSV File
We’ll use pandas to read the CSV file into a DataFrame and convert it into JSON format for use with Generative AI.
import pandas as pd
# Read the CSV file
csv_file = 'requirements.csv'
df = pd.read_csv(csv_file)
# Display the DataFrame
print("Original DataFrame:")
print(df)
# Convert the DataFrame into JSON for AI queries
json_data = df.to_json(orient='records')
print("\nData in JSON format:")
print(json_data)
Output:
Original DataFrame:
ID Requirement Type \
0 1 Users must be able to register and log in using... Functional
1 2 Search functionality must return relevant resul... Functional
2 3 The platform must handle 500 concurrent users Non-Functional
3 4 Payment processing must support credit cards an... Functional
4 5 Daily backups of all data must be performed aut... Non-Functional
Priority Status
0 High Approved
1 Medium Pending
2 High Approved
3 High Approved
4 Medium Pending
Data in JSON format:
[{"ID":1,"Requirement":"Users must be able to register and log in using their email and password","Type":"Functional","Priority":"High","Status":"Approved"}, ...]
Step 2: Write a Query Using Prompt Engineering
To interact with the JSON data, we’ll craft a prompt for Generative AI. For this example, we’ll filter all high-priority functional requirements and generate test cases for each.
Prepare the Prompt
import openai
# Set your OpenAI API key
openai.api_key = 'your-openai-api-key'
# Define the prompt
prompt = f"""
Here is the software requirements data in JSON format:
{json_data}
Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.
"""
# Display the prompt
print("Generated Prompt:")
print(prompt)
Step 3: Query the Generative AI Model
We’ll send the prompt to OpenAI’s API (e.g., GPT-4) and process the output.
# Query the OpenAI API
response = openai.Completion.create(
engine="text-davinci-003", # Use the appropriate engine
prompt=prompt,
max_tokens=500,
temperature=0
)
# Extract the response text
ai_output = response.choices[0].text.strip()
# Display AI's Output
print("\nAI Output:")
print(ai_output)
Expected AI Output:
{
"High Priority Requirements": [
{
"Requirement": "Users must be able to register and log in using their email and password",
"Test Cases": [
"Verify that the user can register with a valid email and password.",
"Verify that the user cannot register with an invalid email format."
]
},
{
"Requirement": "Payment processing must support credit cards and PayPal securely",
"Test Cases": [
"Verify that payment can be processed securely via credit card.",
"Verify that payment can be processed securely via PayPal."
]
}
]
}
Step 4: Save the Results Back to a CSV
Now, we’ll parse the AI’s JSON output and save the results in a structured CSV file.
Parse and Save the Results
import json
# Parse the AI output (assumes it is valid JSON)
output_data = json.loads(ai_output)
# Flatten the data for CSV storage
flattened_data = []
for item in output_data['High Priority Requirements']:
for test_case in item['Test Cases']:
flattened_data.append({
"Requirement": item['Requirement'],
"Test Case": test_case
})
# Convert to a DataFrame
output_df = pd.DataFrame(flattened_data)
# Save to a new CSV file
output_file = 'high_priority_test_cases.csv'
output_df.to_csv(output_file, index=False)
print(f"\nGenerated test cases saved to {output_file}")
Generated CSV (high_priority_test_cases.csv
):
Requirement,Test Case
Users must be able to register and log in using their email and password,Verify that the user can register with a valid email and password.
Users must be able to register and log in using their email and password,Verify that the user cannot register with an invalid email format.
Payment processing must support credit cards and PayPal securely,Verify that payment can be processed securely via credit card.
Payment processing must support credit cards and PayPal securely,Verify that payment can be processed securely via PayPal.
Complete Python Script
Here’s the complete code for the workflow:
import pandas as pd
import openai
import json
# Set OpenAI API key
openai.api_key = 'your-openai-api-key'
# Step 1: Read the CSV file
csv_file = 'requirements.csv'
df = pd.read_csv(csv_file)
json_data = df.to_json(orient='records')
# Step 2: Define the prompt
prompt = f"""
Here is the software requirements data in JSON format:
{json_data}
Task:
1. Extract all "High Priority" functional requirements.
2. Generate 2 test cases for each extracted requirement.
3. Return the output in a structured JSON format.
"""
# Step 3: Query the OpenAI API
response = openai.Completion.create(
engine="text-davinci-003",
prompt=prompt,
max_tokens=500,
temperature=0
)
# Extract and parse the AI output
ai_output = response.choices[0].text.strip()
output_data = json.loads(ai_output)
# Step 4: Flatten the data for saving to CSV
flattened_data = []
for item in output_data['High Priority Requirements']:
for test_case in item['Test Cases']:
flattened_data.append({
"Requirement": item['Requirement'],
"Test Case": test_case
})
# Convert to DataFrame and save to CSV
output_df = pd.DataFrame(flattened_data)
output_file = 'high_priority_test_cases.csv'
output_df.to_csv(output_file, index=False)
print(f"Generated test cases saved to {output_file}")
Conclusion
With this workflow, you can:
- Load and process CSV data for software requirements or testing documents.
- Use Generative AI with prompt engineering to extract, analyze, or generate additional information (e.g., test cases).
- Save AI-generated results into a structured CSV file for further use.
This approach is highly modular and can be adapted for different tasks, such as summarizing requirements, identifying gaps, or validating completeness. Happy engineering! 🚀
Comments