PyOsmium
PyOsmium provides Python bindings for libosmium, a high-performance C++ library designed for processing OpenStreetMap (OSM) data. It enables efficient reading, writing, and manipulation of various OSM file formats (PBF, XML, O5M) and change files, making it suitable for large-scale geospatial data tasks. The library is actively maintained with regular updates, typically aligning with new libosmium releases, and is currently at version 4.3.1.
Common errors
- TypeError: cannot convert 'osmium._osmium.Node' object to Python type
  cause: Storing a direct reference to an Osmium object (Node, Way, Relation) outside of its handler callback. The object is a view on C++ memory that is released once the callback returns, so any later access fails.
  fix: Instead of storing the entire Osmium object, extract and store only the necessary data (e.g., `node.id`, `node.tags.get('name')`, `node.location`) in a new Python dictionary or custom object.
- AttributeError: 'FileProcessor' object has no attribute 'apply_file'
  cause: Confusing the `osmium.FileProcessor` class (used for iterative processing of a file) with the `apply_file()` method, which belongs to `osmium.SimpleHandler`.
  fix: For basic file processing using a handler, instantiate your custom handler and call `handler.apply_file(filename)`. `FileProcessor` is used differently: it is an iterable, so loop over it with `for obj in osmium.FileProcessor(filename)`.
- FileNotFoundError: [Errno 2] No such file or directory: 'your_file.osm.pbf'
  cause: The specified OpenStreetMap data file (.osm.pbf, .osm, .o5m, etc.) does not exist at the given path, or the path is incorrect or inaccessible.
  fix: Verify that the file path is correct (preferably absolute), or that the file exists in the current working directory. Ensure the Python process has read permission for the file.
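The fix for the first error above (copy the data out, do not keep the object) can be sketched as a small helper. This is a minimal illustration, not part of PyOsmium: `snapshot_node` is a hypothetical name, and the chosen fields (`id`, the `name` tag, longitude, latitude) are only an example of what one might keep.

```python
def snapshot_node(n):
    """Copy selected fields of a callback object into plain Python values.

    `n` is expected to look like the node passed to a handler's node()
    callback: it has .id, .tags.get(key) and .location.lon / .location.lat.
    The returned dict holds no reference to `n`.
    """
    return {
        "id": n.id,
        "name": n.tags.get("name"),  # None when the tag is absent
        "lon": n.location.lon,
        "lat": n.location.lat,
    }
```

Inside a handler, the callback would then append `snapshot_node(n)` to a list instead of appending `n` itself, so nothing outlives the callback except the copied values.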
Warnings
- breaking Prior to v4.0.0, PyOsmium loaded entire OSM files into memory, which was inefficient for large files. Version 4.0.0 introduced `osmium.FileProcessor` for iterative processing. Scripts written for older versions that implicitly assumed full in-memory loading will likely fail or be extremely slow with large datasets.
- gotcha Objects (Node, Way, Relation) passed to handler callbacks are only valid for the duration of that specific callback. Keeping direct Python references to these objects beyond the callback's scope will lead to errors, as the underlying C++ memory is deallocated. This is a common source of `TypeError: cannot convert 'osmium._osmium.Node' object to Python type`.
- deprecated The method `ReplicationServer.open_url()` was deprecated in favor of `ReplicationServer.set_request_parameter()` in v3.7.0. Direct overrides of `open_url()` are no longer supported, making old custom replication logic incompatible.
- breaking Version 4.3.0 removed the direct dependency on Boost C++. While this simplifies the build process, custom C++ extensions or very specific build environments that previously relied on PyOsmium's Boost dependency might need adjustments.
- breaking The build process for installing PyOsmium from source changed significantly in v4.1.0. `pybind11` is now installed from PyPI, and custom location variables for `libosmium`, `protozero`, or `boost` (e.g., `LIBOSMIUM_PREFIX`) have been replaced by CMake's standard `Libosmium_ROOT`, `Protozero_ROOT` variables.
- breaking Version 4.3.1 fixed a critical regression introduced in `libosmium` 2.23.0 (used by PyOsmium 4.3.0) where deletions in extract diffs were not handled correctly. Users processing OSM change files, especially for extracts like those from Geofabrik, might encounter incorrect data if using earlier versions.
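For the last warning, a script that applies extract diffs could check the installed version before running. The sketch below assumes, based only on the warning text, that exactly release 4.3.0 is affected; the helper names `parse_version` and `has_extract_diff_regression` are hypothetical and not part of PyOsmium.

```python
import re


def parse_version(text):
    """Return the first three numeric components of a version string."""
    parts = [int(p) for p in re.findall(r"\d+", text)]
    # Pad short versions such as "4.3" so the result always has three items.
    return tuple((parts + [0, 0, 0])[:3])


def has_extract_diff_regression(version_text):
    """True for the release the warning describes as affected (assumed: 4.3.0 only)."""
    return parse_version(version_text) == (4, 3, 0)


# Assumed usage; the distribution name matches `pip install osmium`:
#   from importlib.metadata import version
#   if has_extract_diff_regression(version("osmium")):
#       raise SystemExit("Upgrade to osmium 4.3.1 or later before applying diffs")
```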
Install
- pip install osmium
Imports
- SimpleHandler
from osmium import SimpleHandler
- FileProcessor
from osmium import FileProcessor
- Node
from osmium.osm import Node
- Location
from osmium.osm import Location
- SimpleWriter
from osmium import SimpleWriter
Quickstart
import osmium
import osmium.osm.mutable
import tempfile
import os
from datetime import datetime

# Create a dummy PBF file for demonstration purposes.
# In a real scenario, you'd process an existing .osm.pbf file.
temp_pbf_file = os.path.join(tempfile.gettempdir(), "test.osm.pbf")
if os.path.exists(temp_pbf_file):
    os.remove(temp_pbf_file)  # SimpleWriter does not overwrite an existing file by default
writer = osmium.SimpleWriter(temp_pbf_file)
# Objects handed to the writer are built from the osmium.osm.mutable classes;
# the read-only osmium.osm.Node/Way/Relation types are not constructed directly.
writer.add_node(osmium.osm.mutable.Node(id=1, location=osmium.osm.Location(1.0, 1.0), user='test_user', timestamp=datetime.now()))
writer.add_node(osmium.osm.mutable.Node(id=2, location=osmium.osm.Location(2.0, 2.0), user='test_user', timestamp=datetime.now()))
writer.add_way(osmium.osm.mutable.Way(id=3, nodes=[1, 2], user='test_user', timestamp=datetime.now()))
writer.add_relation(osmium.osm.mutable.Relation(id=4, members=[], user='test_user', timestamp=datetime.now()))
writer.close()

class ElementCounter(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.nodes = 0
        self.ways = 0
        self.relations = 0

    def node(self, n):
        self.nodes += 1

    def way(self, w):
        self.ways += 1

    def relation(self, r):
        self.relations += 1

try:
    # Process the (dummy) PBF file using the handler
    handler = ElementCounter()
    handler.apply_file(temp_pbf_file)
    print(f"Nodes: {handler.nodes}")
    print(f"Ways: {handler.ways}")
    print(f"Relations: {handler.relations}")
finally:
    # Clean up the temporary file
    if os.path.exists(temp_pbf_file):
        os.remove(temp_pbf_file)
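The same count can also be written in the iterative style that `osmium.FileProcessor` (v4.0.0 and later) supports. The sketch below is an illustration under assumptions: `count_by_type` is a hypothetical helper, and it relies only on the `is_node()`, `is_way()` and `is_relation()` checks of the objects it receives, so it accepts any iterable of such objects.

```python
from collections import Counter


def count_by_type(objects):
    """Count nodes, ways and relations in an iterable of OSM-like objects."""
    counts = Counter()
    for obj in objects:
        # Each object is inspected and then dropped; nothing is stored.
        if obj.is_node():
            counts["nodes"] += 1
        elif obj.is_way():
            counts["ways"] += 1
        elif obj.is_relation():
            counts["relations"] += 1
    return counts


# Assumed usage with PyOsmium 4.x installed:
#   import osmium
#   counts = count_by_type(osmium.FileProcessor("your_file.osm.pbf"))
#   print(counts["nodes"], counts["ways"], counts["relations"])
```

Compared with the handler class, this keeps the control flow in ordinary Python, which can be easier to combine with early exits or generators; the handler form remains the simpler choice when each object type needs its own callback.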