{"id":10042,"library":"pypiper","title":"Pypiper","description":"Pypiper is a lightweight Python toolkit for building robust, restartable command-line pipelines. It simplifies complex data processing workflows by handling logging, error recovery, and status tracking. The current version is 0.15.1.","status":"active","version":"0.15.1","language":"en","source_language":"en","source_url":"https://github.com/databio/pypiper/","tags":["pipeline","cli","workflow","bioinformatics","automation","restartable"],"install":[{"cmd":"pip install pypiper","lang":"bash","label":"Install Pypiper"}],"dependencies":[{"reason":"Core dependency for pipeline result reporting and status tracking; significant changes in integration from v0.14.0 onwards.","package":"pipestat","optional":false},{"reason":"Provides shared utilities for `pypiper` and related tools.","package":"ubiquerg","optional":false},{"reason":"Used for YAML-based configuration management.","package":"yacman","optional":false}],"imports":[{"symbol":"PipelineManager","correct":"from pypiper import PipelineManager"},{"note":"Toolkit of helper functions for common NGS (bioinformatics) tasks; used alongside a PipelineManager rather than in place of it.","symbol":"NGSTk","correct":"from pypiper import NGSTk"}],"quickstart":{"code":"import pypiper\nimport os\n\n# Define pipeline name and output directory\nPIPELINE_NAME = \"my_pypiper_example\"\nOUTDIR = \"pypiper_output\"\nos.makedirs(OUTDIR, exist_ok=True)\n\n# Initialize PipelineManager (the output directory parameter is named `outfolder`)\npm = pypiper.PipelineManager(name=PIPELINE_NAME, outfolder=OUTDIR)\n\nprint(f\"\\n--- Starting Pypiper Pipeline: {PIPELINE_NAME} ---\")\n\n# Stage 1: Create an initial file\ninput_file = os.path.join(OUTDIR, \"raw_data.txt\")\ncmd1 = f\"echo 'Line 1\\nLine 2\\nLine 3' > {input_file}\"\npm.run(cmd1, target=input_file)\n\n# Stage 2: Process the file (e.g., count lines)\noutput_file = 
os.path.join(OUTDIR, \"processed_data.txt\")\ncmd2 = f\"wc -l {input_file} > {output_file}\"\npm.run(cmd2, target=output_file)\n\n# Report the line count parsed from the `wc -l` output (recorded via pipestat)\nwith open(output_file) as f:\n    line_count = int(f.read().split()[0])\npm.report_result(\"lines_counted\", line_count)\n\n# Stop the pipeline (flushes logs, finishes reporting)\npm.stop_pipeline()\n\nprint(f\"--- Pipeline Finished! Check '{OUTDIR}' for results. ---\")\nprint(f\"Content of {output_file}:\")\nwith open(output_file, 'r') as f:\n    print(f.read().strip())\n\n# Clean up (optional for quickstart demonstration)\n# import shutil\n# shutil.rmtree(OUTDIR)\n","lang":"python","description":"This quickstart initializes a `PipelineManager`, runs shell commands with `pm.run()`, specifies target files for restartability, and reports a result. It creates an output directory, generates a file, counts its lines, and reports the count."},"warnings":[{"fix":"Upgrade your Python environment to 3.10 or newer. Pypiper requires Python >=3.10.","message":"Pypiper v0.14.0 dropped support for Python 2.7. Users on older Python versions will encounter `SyntaxError` or `ModuleNotFoundError`.","severity":"breaking","affected_versions":">=0.14.0"},{"fix":"Review your `PipelineManager` initialization and `report_result`/`report_object` calls. Update parameter names and ensure `message_raw` values conform to `pipestat`'s `value_dict` expectation.","message":"Significant changes to `pipestat` integration parameters occurred in v0.14.0 and v0.14.1. 
The `pipestat_project_name` parameter was removed, `pipestat_sample_name` was renamed to `pipestat_record_identifier`, and the `message_raw` type changed.","severity":"breaking","affected_versions":">=0.14.0"},{"fix":"To keep previously reported results from being overwritten, pass `force_overwrite=False` to `pm.report_result()` or `pm.report_object()`.","message":"The default value of `force_overwrite` changed from `False` to `True` in v0.14.1. The flag applies to result reporting (`report_result`/`report_object`), not to `pm.run()`: rerunning a pipeline now overwrites previously reported results by default.","severity":"gotcha","affected_versions":">=0.14.1"},{"fix":"Ensure that the `target` passed to `pm.run()` is actually created by the command. When debugging, check it with `os.path.isfile(target)` after the command runs.","message":"Pypiper relies on `target` files for restartability. If a command does not create its `target` file, Pypiper reruns the command on the next invocation; conversely, a stale `target` left over from an earlier run causes the command to be skipped.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Run `pip install pypiper` to install the library.","cause":"The pypiper library is not installed in the current Python environment.","error":"ModuleNotFoundError: No module named 'pypiper'"},{"fix":"Change `pipestat_sample_name` to `pipestat_record_identifier` in your `PipelineManager` constructor. Also remove any use of `pipestat_project_name`, which no longer exists.","cause":"You are using an older parameter name for `pipestat` integration with a newer version of Pypiper.","error":"TypeError: PipelineManager.__init__() got an unexpected keyword argument 'pipestat_sample_name'"},{"fix":"Upgrade your Python interpreter to version 3.10 or newer. 
For example, use `python3.10 your_script.py` or create a new virtual environment with a newer Python version.","cause":"Your Python environment is too old for the current version of Pypiper, which requires Python >=3.10.","error":"SyntaxError: invalid syntax (from trying to run pypiper code on Python 2.7)"},{"fix":"Examine the pipeline log file (usually `<outfolder>/<pipeline_name>_log.md`) for the specific error messages from your shell command. Debug the command as you would outside Pypiper.","cause":"A command executed by `pm.run()` returned a non-zero exit code, indicating failure. Pypiper caught this and marked the stage as failed.","error":"ERROR: Pipeline stage 'my_stage' failed!"}]}