JSON Stream

2.5.0 · active · verified Sat Apr 11

json-stream is a Python library (version 2.5.0, actively maintained) for streaming JSON encoding and decoding. It processes JSON data incrementally rather than loading the entire document into memory, which keeps memory use low and lets work begin before a large file or network stream has been fully read. It provides a Pythonic dict/list-like interface for reading and uses generators for writing, making it suitable for web applications, data pipelines, and other workloads that handle large JSON documents.

Quickstart

This quickstart demonstrates both streaming JSON decoding and encoding. For decoding, `json_stream.load()` reads from a file-like object, allowing you to access elements as they are parsed without loading the entire structure into memory. For encoding, `streamable_dict` and `streamable_list` wrap Python generators so that `json.dump()` or `json.dumps()` can serialize large or dynamically generated data without the full object graph being constructed first.

import json_stream
from json_stream.writer import streamable_dict, streamable_list
import io
import json

# --- Reading JSON (Decoding) ---
json_data_str = '{"name": "Alice", "items": [1, 2, 3], "settings": {"active": true}}'

# Simulate a file-like object for streaming
json_stream_input = io.StringIO(json_data_str)

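# Load the stream in transient mode (the default): parsed values are not retained
data = json_stream.load(json_stream_input)

# Access data in document order; values are parsed as they are accessed.
# Transient streams cannot be rewound; pass persistent=True to load() to revisit earlier keys.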
name = data['name']
first_item = data['items'][0]
setting_active = data['settings']['active']

print(f"Decoded Name: {name}")
print(f"Decoded First Item: {first_item}")
print(f"Decoded Setting Active: {setting_active}")

# --- Writing JSON (Encoding) ---
def generate_items():
    for i in range(3):
        yield i + 1

def generate_data():
    yield 'id', 123
    yield 'status', 'processed'
    yield 'results', streamable_list(generate_items())

# Use streamable_dict for the top-level object
streaming_output = streamable_dict(generate_data())

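# Write to a file-like object using the standard json module.
# The streamable_dict/list objects are dict/list subclasses that json.dump() accepts.
# json.dump() writes output piece by piece; json.dumps() would build the whole string in memory.
output_buffer = io.StringIO()
json.dump(streaming_output, output_buffer)
encoded_json = output_buffer.getvalue()
print(f"Encoded JSON: {encoded_json}")

# The output is a complete JSON object containing the id, status, and results keys.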
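
To make the generator-wrapping idea from the quickstart more concrete, the following standard-library-only sketch shows one way a generator could be adapted so that `json.dump()` consumes it lazily. The `LazyList` and `LazyDict` classes are hypothetical names invented for this illustration; they are not json-stream's actual implementation and only approximate the concept behind `streamable_list` and `streamable_dict`. The sketch relies on `json.dump()` using the pure-Python encoder, which iterates the object it is given.

```python
import io
import json


class LazyList(list):
    """Hypothetical list subclass that serializes items from an iterator."""

    def __init__(self, iterable):
        super().__init__()          # the real list storage stays empty
        self._source = iter(iterable)

    def __iter__(self):
        return self._source         # the encoder pulls items one at a time

    def __bool__(self):
        return True                 # avoid the encoder's empty-list shortcut


class LazyDict(dict):
    """Hypothetical dict subclass that serializes (key, value) pairs lazily."""

    def __init__(self, pairs):
        super().__init__()          # the real dict storage stays empty
        self._source = iter(pairs)

    def items(self):
        return self._source         # the encoder iterates over items()

    def __bool__(self):
        return True                 # avoid the encoder's empty-dict shortcut


def squares(n):
    for i in range(n):
        yield i * i


def record():
    yield "id", 123
    yield "results", LazyList(squares(3))


buffer = io.StringIO()
json.dump(LazyDict(record()), buffer)   # written chunk by chunk
lazy_result = buffer.getvalue()
print(lazy_result)  # {"id": 123, "results": [0, 1, 4]}
```

Each wrapper can be serialized only once, because the underlying generator is exhausted after the first pass.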
