Prerequisites
Python 3.10 or higher
PraisonAI Agents package installed
pandas
package installed
Use Pandas Tools to manipulate and analyze data with AI agents.
Install Dependencies
First, install the required packages:
pip install praisonaiagents pandas
Import Components
Import the necessary components:
from praisonaiagents import Agent, Task, PraisonAIAgents
from praisonaiagents.tools import read_csv, read_excel, write_csv, write_excel, filter_data, get_summary, group_by, pivot_table
Create Agent
Create a data analysis agent:
data_agent = Agent(
name = "DataAnalyst" ,
role = "Data Analysis Specialist" ,
goal = "Analyze and transform data efficiently." ,
backstory = "Expert in data manipulation and statistical analysis." ,
tools = [read_csv, read_excel, write_csv, write_excel, filter_data, get_summary, group_by, pivot_table],
self_reflect = False
)
Define Task
Define the analysis task:
analysis_task = Task(
description = "Analyze sales data to identify trends and patterns." ,
expected_output = "Statistical summary and key insights." ,
agent = data_agent,
name = "sales_analysis"
)
Run Agent
Initialize and run the agent:
agents = PraisonAIAgents(
agents = [data_agent],
tasks = [analysis_task],
process = "sequential"
)
agents.start()
What are Pandas Tools? Pandas Tools provide data analysis capabilities for AI agents:
Data filtering and selection
Statistical analysis
Data transformation
Group operations
Pivot table creation
Key Components
Data Agent Create specialized data agents:
Agent( tools = [read_csv, read_excel, write_csv, write_excel, filter_data, get_summary, group_by, pivot_table])
Analysis Task Define analysis tasks:
Task( description = "analysis_query" )
Process Types Sequential or parallel processing:
Analysis Options Customize analysis parameters:
group_by = "category" , agg_func = "mean"
Available Functions
from praisonaiagents.tools import read_csv
from praisonaiagents.tools import read_excel
from praisonaiagents.tools import write_csv
from praisonaiagents.tools import write_excel
from praisonaiagents.tools import filter_data
from praisonaiagents.tools import get_summary
from praisonaiagents.tools import group_by
from praisonaiagents.tools import pivot_table
Function Details
read_csv(filepath: str, **kwargs)
Reads CSV files into pandas DataFrames:
Flexible data import
Error handling
Additional pandas options
# Basic usage
df = read_csv( "data.csv" )
# With additional options
df = read_csv(
"data.csv" ,
encoding = "utf-8" ,
sep = "," ,
header = 0 ,
usecols = [ "name" , "age" , "salary" ]
)
# Returns: pd.DataFrame
# Error case: Returns dict with 'error' key
read_excel(filepath: str, **kwargs)
Reads Excel files into pandas DataFrames:
Multiple sheet support
Error handling
Additional pandas options
# Basic usage
df = read_excel( "data.xlsx" )
# With sheet specification
df = read_excel(
"data.xlsx" ,
sheet_name = "Sheet1" ,
header = 0 ,
usecols = "A:D"
)
# Returns: pd.DataFrame
# Error case: Returns dict with 'error' key
write_csv(df: pd.DataFrame, filepath: str, **kwargs)
Writes DataFrames to CSV files:
Directory creation
Error handling
Formatting options
# Basic usage
success = write_csv(df, "output.csv" )
# With formatting options
success = write_csv(
df,
"output.csv" ,
index = False ,
encoding = "utf-8" ,
sep = "," ,
date_format = "%Y-%m- %d "
)
# Returns: bool (True if successful)
write_excel(df: pd.DataFrame, filepath: str, **kwargs)
Writes DataFrames to Excel files:
Directory creation
Error handling
Excel-specific options
# Basic usage
success = write_excel(df, "output.xlsx" )
# With formatting options
success = write_excel(
df,
"output.xlsx" ,
sheet_name = "Data" ,
index = False ,
freeze_panes = ( 1 , 0 )
)
# Returns: bool (True if successful)
filter_data(df: pd.DataFrame, query: str)
Filters DataFrames using query strings:
SQL-like syntax
Complex conditions
Efficient filtering
# Basic filtering
filtered_df = filter_data(df, "age > 25" )
# Complex conditions
filtered_df = filter_data(
df,
"age > 25 and salary >= 50000 and city == 'New York'"
)
# Returns: pd.DataFrame
# Error case: Returns dict with 'error' key
get_summary(df: pd.DataFrame)
Generates comprehensive data summaries:
Basic statistics
Data types
Memory usage
Null counts
# Get data summary
summary = get_summary(df)
# Returns: Dict[str, Any]
# Example output:
# {
# "shape": [1000, 5],
# "columns": ["name", "age", "salary", "city"],
# "dtypes": {
# "name": "object",
# "age": "int64",
# "salary": "float64",
# "city": "object"
# },
# "null_counts": {
# "name": 0,
# "age": 2,
# "salary": 1,
# "city": 0
# },
# "numeric_summary": {
# "age": {
# "count": 998,
# "mean": 35.5,
# "std": 12.3,
# "min": 18,
# "25%": 28,
# "50%": 35,
# "75%": 45,
# "max": 65
# },
# ...
# },
# "memory_usage": 40960
# }
group_by(df: pd.DataFrame, columns: Union[str, List[str]], agg_dict: Dict[str, Union[str, List[str]]])
Groups and aggregates data:
Multiple grouping columns
Multiple aggregations
Custom functions
# Basic grouping
grouped_df = group_by(
df,
columns = "city" ,
agg_dict = { "salary" : "mean" }
)
# Complex grouping
grouped_df = group_by(
df,
columns = [ "city" , "department" ],
agg_dict = {
"salary" : [ "mean" , "sum" , "count" ],
"age" : [ "min" , "max" , "mean" ]
}
)
# Returns: pd.DataFrame
# Error case: Returns dict with 'error' key
pivot_table(df: pd.DataFrame, index: Union[str, List[str]], columns: Optional[Union[str, List[str]]] = None, values: Optional[Union[str, List[str]]] = None, aggfunc: str = “mean”)
Creates pivot tables:
Multiple index levels
Column pivoting
Custom aggregations
Flexible reshaping
# Basic pivot
pivot_df = pivot_table(
df,
index = "city" ,
columns = "department" ,
values = "salary"
)
# Complex pivot
pivot_df = pivot_table(
df,
index = [ "city" , "department" ],
columns = "year" ,
values = [ "salary" , "headcount" ],
aggfunc = {
"salary" : "mean" ,
"headcount" : "sum"
}
)
# Returns: pd.DataFrame
# Error case: Returns dict with 'error' key
Example Agent Configuration
from praisonaiagents import Agent
from praisonaiagents.tools import (
read_csv, read_excel, write_csv, write_excel,
filter_data, get_summary, group_by, pivot_table
)
agent = Agent(
name = "DataAnalyzer" ,
description = "An agent that analyzes data using pandas" ,
tools = [
read_csv, read_excel, write_csv, write_excel,
filter_data, get_summary, group_by, pivot_table
]
)
Dependencies
The pandas tools require the following Python packages:
pandas: For data manipulation and analysis
numpy: For numerical operations (automatically installed with pandas)
openpyxl: For Excel file support (optional)
These will be automatically installed when needed.
Error Handling
All functions include comprehensive error handling:
File I/O errors
Data format errors
Query syntax errors
Memory errors
Errors are handled consistently:
File operations return bool for success/failure
Data operations return error details in result
All errors are logged for debugging
Common Use Cases
Data Analysis:
# Load and analyze sales data
df = read_csv( "sales_data.csv" )
summary = get_summary(df)
print ( f "Total records: { summary[ 'shape' ][ 0 ] } " )
print ( f "Missing values: { summary[ 'null_counts' ] } " )
# Filter high-value transactions
high_value = filter_data(df, "amount > 10000" )
write_excel(high_value, "high_value_sales.xlsx" )
Sales Reporting:
# Create sales summary by region and product
sales_summary = group_by(
df,
columns = [ "region" , "product" ],
agg_dict = {
"amount" : [ "sum" , "mean" , "count" ],
"profit" : [ "sum" , "mean" ]
}
)
# Create pivot table for quarterly analysis
quarterly_sales = pivot_table(
df,
index = "product" ,
columns = "quarter" ,
values = "amount" ,
aggfunc = "sum"
)
Data Transformation:
# Load and clean customer data
customers = read_excel( "customers.xlsx" )
filtered = filter_data(
customers,
"age >= 18 and status == 'active'"
)
# Group by demographics
demographics = group_by(
filtered,
columns = [ "age_group" , "region" ],
agg_dict = {
"spending" : [ "mean" , "sum" ],
"visits" : "count"
}
)
write_csv(demographics, "demographic_analysis.csv" )
Examples
Basic Data Analysis Agent
from praisonaiagents import Agent, Task, PraisonAIAgents
from praisonaiagents.tools import read_csv, filter_data, get_summary, group_by
# Create data analysis agent
data_agent = Agent(
name = "DataExpert" ,
role = "Data Analyst" ,
goal = "Process and analyze data efficiently." ,
backstory = "Expert in data analysis and statistical methods." ,
tools = [read_csv, filter_data, get_summary, group_by],
self_reflect = False
)
# Define analysis task
analysis_task = Task(
description = "Analyze sales performance data." ,
expected_output = "Sales analysis report." ,
agent = data_agent,
name = "sales_analysis"
)
# Run agent
agents = PraisonAIAgents(
agents = [data_agent],
tasks = [analysis_task],
process = "sequential"
)
agents.start()
Advanced Data Operations with Multiple Agents
# Create data import/export agent
io_agent = Agent(
name = "DataIO" ,
role = "Data IO Specialist" ,
goal = "Handle data import and export operations." ,
tools = [read_csv, read_excel, write_csv, write_excel],
self_reflect = False
)
# Create analysis agent
analysis_agent = Agent(
name = "Analyzer" ,
role = "Data Analysis Specialist" ,
goal = "Perform complex data analysis." ,
tools = [filter_data, get_summary, group_by, pivot_table],
self_reflect = False
)
# Define tasks
io_task = Task(
description = "Import and export data" ,
agent = io_agent,
name = "data_io"
)
analysis_task = Task(
description = "Analyze data" ,
agent = analysis_agent,
name = "data_analysis"
)
# Run agents
agents = PraisonAIAgents(
agents = [io_agent, analysis_agent],
tasks = [io_task, analysis_task],
process = "sequential"
)
agents.start()
Best Practices
Common Patterns
Data Analysis Pipeline
# Data preparation agent
prep_agent = Agent(
name = "Preparer" ,
role = "Data Preparation" ,
tools = [read_csv, read_excel, write_csv, write_excel, filter_data, get_summary, group_by, pivot_table]
)
# Analysis agent
analyst = Agent(
name = "Analyst" ,
role = "Data Analyst"
)
# Define tasks
prep_task = Task(
description = "Clean and prepare data" ,
agent = prep_agent
)
analysis_task = Task(
description = "Analyze prepared data" ,
agent = analyst
)
# Run workflow
agents = PraisonAIAgents(
agents = [prep_agent, analyst],
tasks = [prep_task, analysis_task]
)