JSON Repair
json-repair is a Python library that automatically fixes malformed or invalid JSON strings so that standard JSON parsers can read them. It handles common problems such as missing quotes, trailing commas, and incomplete structures, which makes it particularly useful for data from less strict sources like LLM output or web scraping. The library is actively maintained, with frequent minor releases for bug fixes and performance improvements. The current version is 0.58.7.
Warnings
- gotcha The `schema_repair_mode='salvage'` option, introduced in v0.58.0, prioritizes data recovery but might yield an output that is not fully compliant with the provided JSON schema. There have been past regressions (fixed in v0.58.4) where `salvage` mode repaired less aggressively than `standard` mode.
- breaking Starting from v0.57.1, valid JSON input that does not conform to an optionally provided JSON schema may be *coerced* to match the schema's types (e.g., a string '1' might become an integer 1). This can alter data types in ways that might be unexpected if strict validation without coercion was desired.
- gotcha By default, `json-repair` first attempts to parse the input using Python's standard `json.loads()`. Only if this fails does it proceed with its repair logic. Calling `json.loads()` yourself in a `try...except` block and falling back to `json_repair.repair_json()` is therefore redundant and an anti-pattern.
- gotcha The `strict=True` parameter changes the repair behavior to be more validation-oriented. In strict mode, the parser raises a `ValueError` for structural issues like duplicate keys, missing separators, or multiple top-level elements, rather than attempting to fix them. This results in less aggressive repair.
- gotcha Versions prior to v0.58.5 had significantly lower parser performance, especially for common JSON strings; v0.58.5 introduced a 60% performance improvement.
Install
pip install json-repair
Imports
- repair_json
from json_repair import repair_json
- loads
from json_repair import loads
Quickstart
from json_repair import repair_json, loads
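import json
# Deliberately malformed input: single-quoted strings, an unquoted key (age),
# a Python-style True literal, and trailing commas.
broken_json_string = """{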
'name': 'Alice',
age: 30,
'isStudent': True,
'hobbies': ['reading', 'gaming',],
}"""
# Using repair_json to get a fixed JSON string
repaired_string = repair_json(broken_json_string)
print(f"Repaired string: {repaired_string}")
parsed_data = json.loads(repaired_string)
print(f"Parsed data (using json.loads): {parsed_data}")
# Using loads to parse directly into a Python object
# loads already runs the repair logic internally, so no separate repair step is needed
parsed_data_direct = loads(broken_json_string)
print(f"Parsed data (using json_repair.loads): {parsed_data_direct}")