Pybtex: BibTeX-compatible bibliography processor
Pybtex is a BibTeX-compatible bibliography processor written in Python, designed to read citation information from various formats (BibTeX, BibTeXML, YAML) and produce formatted bibliographies. It supports traditional BibTeX `.bst` styles and also allows for defining custom styles directly in Python, with output options including LaTeX, HTML, Markdown, or plain text. Currently at version 0.26.1, it is actively maintained and serves as a flexible alternative or drop-in replacement for the original BibTeX.
Warnings
- gotcha Pybtex uses two distinct 'engines' for bibliography formatting: a BibTeX engine (for `.bst` styles) and a Python engine (for Pythonic styles). The BibTeX engine can only output LaTeX, while the Python engine supports LaTeX, HTML, Markdown, and plain text. When using Pythonic styles from the command line, pass `--style-language python` and `--output-backend <format>`; when using the Python API, select the output backend explicitly.
- gotcha When working with the `Entry` objects held by a `BibliographyData` instance, access person-related fields (authors, editors, etc.) through the `entry.persons` dictionary, not `entry.fields`. `entry.fields` contains the plain string fields, while `entry.persons` maps each role to a list of `Person` objects. `Entry` objects are also not directly indexable: write `entry.fields['title']`, not `entry['title']`.
- gotcha Pybtex's strict mode behaves differently on the command line and in library use. When run from the command line, errors are by default reported as warnings on stderr, for compatibility with BibTeX. When Pybtex is used as a library, every error raises an exception by default. Code that parses or formats untrusted input should catch `pybtex.exceptions.PybtexError`, or it may crash on input that the command-line tool would only warn about.
- deprecated The API for reading and writing bibliography data has evolved. Earlier methods such as `Entry.get_field()` were deprecated, and a new API for rich text and general data handling was introduced around versions 0.19 and 0.22.
Install
-
pip install pybtex
Imports
- BibliographyData
from pybtex.database import BibliographyData
- Parser
from pybtex.database.input.bibtex import Parser
- Backend (HTML output)
from pybtex.backends.html import Backend
- BaseStyle
from pybtex.style.formatting import BaseStyle
- Text
from pybtex.richtext import Text
Quickstart
import os

from pybtex.backends import html
from pybtex.database.input import bibtex
from pybtex.style.formatting import plain

# Create a dummy .bib file for demonstration.
# The raw string keeps the backslash in \"o (LaTeX umlaut) intact.
bib_content = r"""
@article{einstein1905relativity,
    title={Zur Elektrodynamik bewegter K\"orper},
    author={Einstein, Albert},
    journal={Annalen der Physik},
    volume={322},
    number={10},
    pages={891-921},
    year={1905},
    doi={10.1002/andp.19053221006}
}
"""
with open('references.bib', 'w', encoding='utf-8') as f:
    f.write(bib_content)

# 1. Parse the bibliography data
parser = bibtex.Parser()
bib_data = parser.parse_file('references.bib')

# 2. Select a formatting style (e.g., 'plain' or a custom one)
# For a custom style, you would inherit from BaseStyle
style = plain.Style()

# 3. Format the bibliography
# The 'citations' argument specifies which entries to include.
# If not provided, all entries in bib_data will be formatted.
formatted_bibliography = style.format_bibliography(bib_data, citations=['einstein1905relativity'])

# 4. Write the formatted bibliography to an output file (e.g., HTML)
# HTML is an output backend (pybtex.backends), not a database writer.
backend = html.Backend()
with open('output.html', 'w', encoding='utf-8') as f:
    backend.write_to_stream(formatted_bibliography, f)
print("Bibliography processed and saved to output.html")

# Clean up the demonstration files
os.remove('references.bib')
os.remove('output.html')