Flupy: Fluent Data Processing

1.2.3 · active · verified Sat Apr 11

Flupy is a lightweight Python library and command-line interface (CLI) for building data pipelines with a fluent, chainable interface. Because it is built on generators, it evaluates lazily and keeps memory use roughly constant for streaming operations, which makes it suitable for large datasets. The current stable version is 1.2.3, and the project is under active development.
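To make the "fluent interface over generators" idea concrete, here is a minimal sketch of how such a wrapper could work. The `Fluent` class and its methods are hypothetical stand-ins written for illustration only; they are not Flupy's actual implementation, and the real `flu` object offers many more methods.

```python
from itertools import count, islice


class Fluent:
    """Hypothetical fluent wrapper around an iterator (illustration only)."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        # A generator expression defers all work until iteration.
        return Fluent(func(x) for x in self._it)

    def filter(self, pred):
        return Fluent(x for x in self._it if pred(x))

    def take(self, n):
        # islice stops after n items, so infinite sources are safe.
        return Fluent(islice(self._it, n))


evens_squared = list(
    Fluent(count())
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
    .take(4)
)
print(evens_squared)  # [0, 4, 16, 36]
```

Each method wraps the previous iterator in a new generator and returns a new wrapper, so only one item is in flight at a time and nothing runs until the final `list(...)` consumes the chain.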

Warnings

Install

Imports

Quickstart

This example demonstrates creating a fluent pipeline to process an infinite sequence lazily, squaring numbers, filtering them, chunking the results, and taking a limited number of chunks. Operations are chained, with each method returning a new `flu` object.

from itertools import count
from flupy import flu

# Example: Process an infinite sequence in constant memory
pipeline = (
    flu(count()) # Start with an infinite sequence
    .map(lambda x: x**2) # Square each number
    .filter(lambda x: x % 517 == 0) # Keep only multiples of 517
    .chunk(5) # Group into chunks of 5
    .take(3) # Take the first 3 chunks
)

# Nothing is computed until the pipeline is consumed
results = list(pipeline)

print(results)
# Expected output (deterministic: 517 is square-free, so x**2 is divisible by 517 only when x is):
# [[0, 267289, 1069156, 2405601, 4276624], [6682225, 9622404, 13097161, 17106496, 21650409], [26728900, 32341969, 38489616, 45171841, 52388644]]
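For comparison, the same computation can be sketched with only the standard library, which shows what each step of the fluent chain does. This is an equivalent written under the assumption that chunks are lists of up to five items; the `chunked` helper is hypothetical and is not part of Flupy.

```python
from itertools import count, islice


def chunked(iterable, size):
    """Yield successive lists of up to `size` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        block = list(islice(iterator, size))
        if not block:
            return
        yield block


squares = (x ** 2 for x in count())                # lazy, infinite
divisible = (s for s in squares if s % 517 == 0)   # lazy filter
first_chunks = list(islice(chunked(divisible, 5), 3))  # only 15 matches are ever produced

print(first_chunks[0])  # [0, 267289, 1069156, 2405601, 4276624]
```

The nested generators read inside-out, whereas the fluent chain reads top to bottom in the order the data flows. That readability difference is the main motivation for the chained style.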
