StreamingJSON

0.0.5 · active · verified Tue Apr 14

StreamingJSON is a Python library that preprocesses incomplete JSON strings, completing them into valid, parseable JSON in real time. It targets streaming JSON parsing, which is especially relevant for output from Large Language Models (LLMs), so that data can be processed immediately instead of waiting for the full JSON document to be generated. It works by completing the fragmented JSON, so its output can be passed to any standard JSON library for parsing. The library is under active development with a rapid release cadence; the latest release is version 0.0.5.

Quickstart

This example demonstrates how to initialize the `Lexer`, append string segments incrementally, and retrieve the completed (syntactically valid) JSON at any point in the stream. It highlights the use of a new `Lexer` instance per stream and the library's ability to handle various JSON fragments and escaped characters.

import streamingjson

# NOTE: A new Lexer instance is required for each JSON stream.
lexer = streamingjson.Lexer()

# Append initial JSON segment
lexer.append_string('{"a":')
print(lexer.complete_json()) # Expected: {"a":null}

# Append more JSON segments
lexer.append_string('[tr')
print(lexer.complete_json()) # Expected: {"a":[true]}

lexer.append_string('ue], "b": "hello')
print(lexer.complete_json()) # Expected: {"a":[true], "b": "hello"}

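# Example with escaped characters
# A raw string keeps the backslashes intact, so the lexer receives
# JSON-level escapes (\" and \\) rather than Python-level ones.
new_lexer = streamingjson.Lexer()
new_lexer.append_string(r'{"key": "value with \"quote\" and \\slash"')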
print(new_lexer.complete_json()) # Expected: {"key": "value with \"quote\" and \\slash"}
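
Because the completed output is meant to be handed to a standard JSON library, a common pattern is to parse a snapshot after every chunk. The following is a minimal sketch of that pattern, not part of the library: `iter_snapshots` is a hypothetical helper name, and it assumes only the `append_string` and `complete_json` methods shown above. Skipping snapshots that fail to parse is also an assumption made for illustration.

```python
import json
from typing import Any, Iterable, Iterator


def iter_snapshots(lexer: Any, chunks: Iterable[str]) -> Iterator[Any]:
    """Feed chunks to a Lexer-like object and yield a parsed snapshot after each one.

    `lexer` is any object exposing append_string() and complete_json(),
    such as a fresh streamingjson.Lexer() (one per stream).
    """
    for chunk in chunks:
        lexer.append_string(chunk)
        completed = lexer.complete_json()
        try:
            yield json.loads(completed)
        except json.JSONDecodeError:
            # Assumption: skip a snapshot that cannot be parsed yet
            # instead of aborting the whole stream.
            continue


# Usage sketch (assumes the streamingjson package is installed):
#   import streamingjson
#   chunks = ['{"a":', '[tr', 'ue], "b": "hello']
#   for snapshot in iter_snapshots(streamingjson.Lexer(), chunks):
#       print(snapshot)
```

Passing the lexer in as an argument keeps the helper independent of how the stream is produced and makes the one-lexer-per-stream rule explicit at the call site.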
