{"id":3499,"library":"flupy","title":"Flupy: Fluent Data Processing","description":"Flupy is a lightweight Python library and command-line interface (CLI) for building data pipelines with a fluent, chainable interface. Built on generators, it evaluates pipelines lazily, so streaming operations can run in constant memory, which makes it suitable for large datasets. The current stable version is 1.2.3, and the project is actively maintained.","status":"active","version":"1.2.3","language":"en","source_language":"en","source_url":"https://github.com/olirice/flupy","tags":["python","data-processing","fluent-interface","generators","lazy-evaluation","cli","functional-programming"],"install":[{"cmd":"pip install flupy","lang":"bash","label":"Install latest version"}],"dependencies":[],"imports":[{"symbol":"flu","correct":"from flupy import flu"}],"quickstart":{"code":"from itertools import count\nfrom flupy import flu\n\n# Example: process an infinite sequence in constant memory\npipeline = (\n    flu(count())  # Start with an infinite sequence\n    .map(lambda x: x**2)  # Square each number\n    .filter(lambda x: x % 517 == 0)  # Keep only squares divisible by 517\n    .chunk(5)  # Group into chunks of 5\n    .take(3)  # Take the first 3 chunks\n)\n\n# Nothing is computed until the pipeline is iterated\nresults = list(pipeline)\n\nprint(results)\n# Expected output:\n# [[0, 267289, 1069156, 2405601, 4276624], [6682225, 9622404, 13097161, 17106496, 21650409], [26728900, 32341969, 38489616, 45171841, 52388644]]","lang":"python","description":"This example builds a fluent pipeline that lazily processes an infinite sequence: it squares each number, keeps the squares divisible by 517, groups them into chunks of 5, and takes the first 3 chunks. Operations are chained, with each method returning a new `flu` object."},"warnings":[{"fix":"To run a pipeline again, re-create the `flu` object from the original data source. If you need to reuse the results, materialize them on the first pass with `list()` or `.collect()`.","message":"Flupy pipelines are built on generators and evaluate lazily. Operations return new lazy `flu` objects, and data is only processed when the pipeline is iterated (e.g., in a `for` loop or by calling `.collect()`). A pipeline can be consumed only once: if you assign it to a variable and iterate it, a second iteration over the same variable yields no further results.","severity":"gotcha","affected_versions":"<=1.2.3"},{"fix":"Always use the new `flu` object returned by each method call to continue the pipeline. If you need a final, concrete collection, use terminal operations like `.collect()` or `list()`.","message":"Methods on the `flu` object (like `.map()`, `.filter()`) return *new* `flu` instances, in keeping with a functional programming style. They do not modify the original iterable in place, so calling a method and discarding its return value has no effect on the source data.","severity":"gotcha","affected_versions":"<=1.2.3"},{"fix":"When writing Python code, always explicitly import `flu` using `from flupy import flu` and initialize it with your iterable (e.g., `flu(my_list)`).","message":"The `flupy` library has an accompanying command-line interface (CLI) named `flu`. In the CLI, input data is automatically assigned to a variable named `_`. This `_` variable exists only in the CLI context; do not rely on it, or confuse it with the `flu` object, when writing Python code.","severity":"gotcha","affected_versions":"<=1.2.3"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}