Agate Data Analysis Library

1.14.2 · active · verified Sun Mar 29

Agate is a Python data analysis library optimized for humans instead of machines. It is presented as an alternative to numpy and pandas that solves real-world problems with readable code. The current version is 1.14.2, and the library is actively maintained by the wireservice team with a steady release cadence.

Warnings

Install

Imports

Quickstart

This quickstart loads data from a CSV (simulated in memory), filters rows, groups them, aggregates each group, and orders the final table. It highlights `agate`'s immutable operations: methods such as `where()` and `group_by()` return new objects rather than modifying the table they are called on.

import io

import agate

# Create a dummy CSV in memory for a runnable example
csv_data = """name,age,city,salary
Alice,30,New York,70000
Bob,24,London,50000
Charlie,30,New York,75000
David,35,London,90000
Eve,24,Paris,60000
"""

# Use io.StringIO to simulate a file for from_csv
with io.StringIO(csv_data) as f:
    # agate automatically infers types with TypeTester by default
    table = agate.Table.from_csv(f)

print("Original Table:")
table.print_table()

# Filter rows where age is less than 30
filtered_table = table.where(lambda row: row['age'] < 30)
print("\nFiltered Table (age < 30):")
filtered_table.print_table()

# Group by city, then compute the average salary and the number of employees per city
from agate.aggregations import Count, Mean

by_city = table.group_by('city')
averages = by_city.aggregate([
    ('average_salary', Mean('salary')),
    ('total_employees', Count())  # Count() with no arguments counts the rows in each group
])

print("\nAggregated by City (Average Salary & Total Employees):")
averages.print_table()

# Order by average salary, descending
ordered_averages = averages.order_by('average_salary', reverse=True)
print("\nOrdered by Average Salary (Descending):")
ordered_averages.print_table()
