{"id":3738,"library":"pipelinewise-singer-python","title":"PipelineWise Singer Python Library","description":"This library is a fork of Singer's singer-python, tailored for PipelineWise compatibility. It provides utilities for implementing the Singer.io data replication specification, enabling taps (data extractors) and targets (data loaders) to communicate using a standard JSON-based message format over stdout. The current version is 2.0.1. Releases are infrequent and are typically driven by critical bug fixes or performance improvements.","status":"maintenance","version":"2.0.1","language":"en","source_language":"en","source_url":"https://github.com/transferwise/pipelinewise-singer-python","tags":["singer.io","etl","data integration","pipelinewise","data pipeline","tap","target","json"],"install":[{"cmd":"pip install pipelinewise-singer-python","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Switched to orjson in v2.0.0 for faster JSON serialization/deserialization.","package":"orjson","optional":false},{"reason":"Used for retry logic, bumped to 1.11.1 in v2.0.0.","package":"backoff","optional":false},{"reason":"Used for timezone handling, bumped to latest in v2.0.0.","package":"pytz","optional":false}],"imports":[{"note":"The entire library's functionality is typically accessed via the top-level 'singer' module after import.","symbol":"singer","correct":"import singer"}],"quickstart":{"code":"import singer\nimport sys\n\n# Define a simple schema for demonstration\nschema = {\n    'type': 'object',\n    'properties': {\n        'id': {'type': 'integer'},\n        'name': {'type': 'string'},\n        'value': {'type': 'number'}\n    }\n}\n\n# Write the schema message; 'id' is declared as the key property\nsinger.write_schema('my_stream', schema, ['id'])\n\n# Write some record messages\nrecords = [\n    {'id': 1, 'name': 'Item A', 'value': 100.5},\n    {'id': 2, 'name': 'Item B', 'value': 200.0},\n    {'id': 3, 
'name': 'Item C', 'value': 150.75}\n]\n\nfor record in records:\n    singer.write_record('my_stream', record)\n\n# Write a state message (optional, but good practice for incremental processing)\nsinger.write_state({'last_processed_id': records[-1]['id']})\n\n# The messages above go to stdout, where a Singer target reads them.\n# Send diagnostics to stderr so they do not corrupt that message stream.\nprint(f'Emitted {len(records)} records', file=sys.stderr)","lang":"python","description":"This quickstart demonstrates basic usage of the `pipelinewise-singer-python` library to emit Singer.io compliant messages (schema, record, state) to standard output. These messages can then be consumed by a Singer.io target. The example defines a simple schema and writes three records, followed by a state message."},"warnings":[{"fix":"Review custom JSON serialization/deserialization logic in your application. Test thoroughly after upgrading to ensure compatibility with `orjson`. Consult `orjson` documentation for any behavioral differences.","message":"Version 2.0.0 replaced the standard `json` library with `orjson` for improved performance. While generally a drop-in replacement, applications with custom JSON handling or those relying on specific `json` library behaviors not supported by `orjson` (e.g., certain `json.dumps` parameters) may experience unexpected issues.","severity":"breaking","affected_versions":">=2.0.0"},{"fix":"Implement a logging configuration in your application that utilizes `pipelinewise-singer-python`. Refer to Python's `logging` module documentation or set the `LOGGING_CONF_FILE` environment variable to a valid logging configuration file path.","message":"The library does not provide a default logging configuration. Users are responsible for setting up their own logging using standard Python `logging` module practices. If no logging is configured, log messages from the library might not appear or might go to stderr without proper formatting. 
An environment variable `LOGGING_CONF_FILE` can be used to point to a logging configuration file.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure all taps and targets in your Singer.io pipeline are updated to versions that explicitly support the `BATCH` message type, especially if you intend to leverage its performance benefits. Check component documentation for `BATCH` support.","message":"The `BATCH` message type, introduced in `v1.2.0` and enhanced in `v1.3.0` (with `time_extracted`), allows for more efficient data transfer. However, older taps or targets in a Singer.io pipeline might not fully support this message type, leading to compatibility issues or data processing failures if not all components are updated.","severity":"gotcha","affected_versions":"<1.2.0 (for pipelines not supporting BATCH)"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}