Agate Data Analysis Library

1.14.2 verified Tue May 12 auth: no python install: verified quickstart: stale

Agate is a Python data analysis library optimized for humans instead of machines. It is presented as an alternative to numpy and pandas and is designed to solve real-world problems with readable code. The current version is 1.14.2; the library is actively maintained by the wireservice team and released at a steady cadence.

pip install agate
error ModuleNotFoundError: No module named 'agate'
cause This error occurs when the 'agate' library is not installed in your Python environment.
fix
Install the 'agate' library using pip: 'pip install agate'.
error ImportError: cannot import name 'Table' from 'agate'
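cause This error usually means Python imported something other than the real library: a local file or directory named 'agate' (for example 'agate.py') that shadows it, or a broken or partial installation. If the library were missing entirely, Python would raise 'ModuleNotFoundError' instead.
fix
Rename any local 'agate.py' file or 'agate' directory, reinstall the library if the installation is damaged, and use the import statement 'from agate import Table'.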
error AttributeError: module 'agate' has no attribute 'Table'
cause This error occurs when the 'agate' module is not properly installed or there is a naming conflict with another module.
fix
Verify that 'agate' is installed correctly and that there are no conflicting module names in your project.
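A naming conflict can be checked by asking Python which file it would load for the name 'agate'. The following is a minimal diagnostic sketch that uses only the standard library; the helper name `locate_module` is made up for illustration.

```python
import importlib.util


def locate_module(name):
    """Return the file Python would import `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A path inside your own project (e.g. ./agate.py) instead of site-packages
# suggests that a local file is shadowing the installed library.
print(locate_module("agate"))
```

If the printed path points into your project, renaming that file or directory is usually enough to resolve the error.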
error agate.exceptions.CastError: Can not parse value '200.000.000' as Decimal.
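cause This error occurs when 'agate' encounters a number that uses periods as thousands separators. Under the default 'en_US' locale a period is read as the decimal separator, so a value with several periods cannot be parsed as a decimal number.
fix
Reformat the values to use commas as thousands separators (or no separators at all) before loading them, or declare the column with an 'agate.Number' type configured for the source format, for example through its 'locale' argument.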
error ImportError: cannot import name 'isawaitable' from 'inspect'
cause This error occurs on Python versions older than 3.5, because 'inspect.isawaitable' was added to the standard library in Python 3.5.
fix
Upgrade your Python interpreter to version 3.5 or newer, where 'isawaitable' is available in the 'inspect' module.
breaking Agate has dropped official support for Python 2.x. Users must ensure they are running Python 3.5 or newer. PyPI listings indicate active testing and support for Python 3.10-3.14.
fix Upgrade your Python environment to Python 3.5+ (preferably a recent stable version like 3.9+). If migrating from very old projects, update `agate` accordingly.
gotcha Agate's core design principle dictates that `Table` objects are immutable. Operations like `select()`, `where()`, or `order_by()` do not modify the original table in-place; instead, they return *new* `Table` instances.
fix Always assign the result of table operations to a new variable (e.g., `new_table = original_table.where(...)`) or chain operations (e.g., `table.where(...).order_by(...)`).
gotcha When loading data from sources like CSV, `agate` uses a `TypeTester` to automatically infer column data types. While generally effective, it can sometimes guess incorrectly, especially with ambiguous data.
fix If type inference is wrong, manually specify column types when loading data using the `column_types` argument in `Table.from_csv` or by instantiating `TypeTester` with `force` overrides. Example: `tester = agate.TypeTester(force={'my_column': agate.Text()}); table = agate.Table.from_csv(..., column_types=tester)`.
gotcha There are multiple Python libraries with 'Agate' in their name. This entry refers to `wireservice/agate`, a data analysis library (`pip install agate`). Another common one is `obiba-agate`, which is a client for an 'Agate server' and has different use cases and dependencies.
fix Always verify the correct library by its PyPI slug (`agate`) and maintainer (`wireservice`) to ensure you are installing the intended data analysis library. Check documentation links (e.g., `agate.rtfd.org`) to confirm.
breaking The `agate.aggregations.Sum` aggregate function does not accept a `cast` keyword argument. Attempting to pass `cast=True` (or any value) to its constructor will result in a `TypeError`. This applies to most `agate` aggregate functions; type conversion should generally be handled before aggregation.
fix Remove the `cast` argument from the `agate.aggregations.Sum` constructor. Ensure the column being aggregated is already of the appropriate numeric type for summation. If type conversion is necessary, apply it to the column using `table.compute()` or similar methods *before* performing the aggregation.
pip install agate[icu]
python  os / libc      variant  status  wheel  install  import  disk
3.10    alpine (musl)  agate            wheel  -        0.43s   53.1M
3.10    alpine (musl)  icu              wheel  -        0.42s   53.1M
3.10    alpine (musl)  agate            -      -        0.43s   53.1M
3.10    alpine (musl)  icu              -      -        0.43s   53.1M
3.10    slim (glibc)   agate            wheel  2.4s     0.34s   54M
3.10    slim (glibc)   icu              wheel  2.5s     0.30s   54M
3.10    slim (glibc)   agate            -      -        0.31s   54M
3.10    slim (glibc)   icu              -      -        0.32s   54M
3.11    alpine (musl)  agate            wheel  -        0.59s   55.3M
3.11    alpine (musl)  icu              wheel  -        0.60s   55.3M
3.11    alpine (musl)  agate            -      -        0.60s   55.3M
3.11    alpine (musl)  icu              -      -        0.60s   55.3M
3.11    slim (glibc)   agate            wheel  2.5s     0.47s   56M
3.11    slim (glibc)   icu              wheel  2.4s     0.47s   56M
3.11    slim (glibc)   agate            -      -        0.45s   56M
3.11    slim (glibc)   icu              -      -        0.49s   56M
3.12    alpine (musl)  agate            wheel  -        0.52s   47.1M
3.12    alpine (musl)  icu              wheel  -        0.47s   47.1M
3.12    alpine (musl)  agate            -      -        0.52s   47.1M
3.12    alpine (musl)  icu              -      -        0.51s   47.1M
3.12    slim (glibc)   agate            wheel  2.3s     0.50s   48M
3.12    slim (glibc)   icu              wheel  2.2s     0.50s   48M
3.12    slim (glibc)   agate            -      -        0.53s   48M
3.12    slim (glibc)   icu              -      -        0.53s   48M
3.13    alpine (musl)  agate            wheel  -        0.47s   46.8M
3.13    alpine (musl)  icu              wheel  -        0.47s   46.8M
3.13    alpine (musl)  agate            -      -        0.49s   46.7M
3.13    alpine (musl)  icu              -      -        0.50s   46.7M
3.13    slim (glibc)   agate            wheel  2.4s     0.45s   47M
3.13    slim (glibc)   icu              wheel  2.3s     0.50s   47M
3.13    slim (glibc)   agate            -      -        0.49s   47M
3.13    slim (glibc)   icu              -      -        0.49s   47M
3.9     alpine (musl)  agate            wheel  -        0.34s   52.6M
3.9     alpine (musl)  icu              wheel  -        0.34s   52.6M
3.9     alpine (musl)  agate            -      -        0.37s   52.6M
3.9     alpine (musl)  icu              -      -        0.38s   52.6M
3.9     slim (glibc)   agate            wheel  2.6s     0.33s   53M
3.9     slim (glibc)   icu              wheel  2.8s     0.31s   53M
3.9     slim (glibc)   agate            -      -        0.32s   53M
3.9     slim (glibc)   icu              -      -        0.31s   53M

This quickstart demonstrates loading data from a CSV (simulated in-memory), filtering rows, grouping data, aggregating results (mean and sum), and ordering the final table. It highlights `agate`'s immutable operations, where methods like `where()` and `group_by()` return new table objects.

import agate
import io

# Create a dummy CSV in memory for a runnable example
csv_data = """name,age,city,salary
Alice,30,New York,70000
Bob,24,London,50000
Charlie,30,New York,75000
David,35,London,90000
Eve,24,Paris,60000
"""

# Use io.StringIO to simulate a file for from_csv
with io.StringIO(csv_data) as f:
    # agate automatically infers types with TypeTester by default
    table = agate.Table.from_csv(f)

print("Original Table:")
table.print_table()

# Filter rows where age is less than 30
filtered_table = table.where(lambda row: row['age'] < 30)
print("\nFiltered Table (age < 30):")
filtered_table.print_table()

# Group by city, then compute per-city aggregates
from agate.aggregations import Count, Mean, Sum

by_city = table.group_by('city')
averages = by_city.aggregate([
    ('average_salary', Mean('salary')),
    ('total_salary', Sum('salary')),   # Sum takes only a column name; it has no cast argument
    ('total_employees', Count())       # Count() with no arguments counts the rows in each group
])

print("\nAggregated by City (Average Salary, Total Salary & Employee Count):")
averages.print_table()

# Order by average salary, descending
ordered_averages = averages.order_by('average_salary', reverse=True)
print("\nOrdered by Average Salary (Descending):")
ordered_averages.print_table()