pyreadr
pyreadr is a Python library that reads and writes R RData and Rds files, converting them to and from pandas DataFrames. It wraps the C library librdata, so R does not need to be installed. The current version is 0.5.6. Releases are infrequent and mostly cover build system improvements, dependency compatibility, or minor feature additions.
Common errors
- ImportError: DLL load failed while importing _readr (Windows) or ImportError: librdata.so: cannot open shared object file: No such file or directory (Linux)
cause: The C libraries that `pyreadr` relies on (like `librdata` or `libiconv`) were not correctly compiled, linked, or found at runtime.
fix: Prefer a pre-built wheel: upgrade pip and reinstall with `pip install --upgrade pip` followed by `pip install --force-reinstall pyreadr`. If you must build from source on Linux, make sure a C compiler and the relevant development headers are installed (e.g., `sudo apt-get install librdata-dev libiconv-hook-dev`; package names vary by distribution and may not exist on yours). You can also try reinstalling with `PYREADR_LINK_ICONV=1 pip install pyreadr`. On Windows, use the pre-built wheels or set up a compatible build environment.
- KeyError: '<object_name>' when accessing objects from the `read_r` result.
cause: The RData file does not contain an object with the exact name specified, or the name has unexpected casing or characters. Note that an Rds file stores a single unnamed object, which `read_r` returns under the key `None`.
fix: Inspect the keys returned by `pyreadr.read_r(file_path).keys()` to get the actual object names present in the file. R object names are case-sensitive and may contain characters such as dots.
- ValueError: Pandas version X.Y.Z is not supported. Please update to a newer version of pandas or pyreadr.
cause: Your installed `pyreadr` version requires a specific range of pandas versions, and your pandas version falls outside that range.
fix: First, try updating `pyreadr`: `pip install --upgrade pyreadr`. If the error persists, check `pyreadr`'s documentation or release notes for specific pandas version requirements and adjust your pandas installation accordingly (e.g., `pip install pandas==X.Y.Z`).
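The KeyError case above can be softened with a small lookup helper. The following is a minimal sketch, not part of pyreadr: `find_object` is a hypothetical name, and the plain dictionary stands in for the dict-like result that a pyreadr read returns (its values would normally be DataFrames).

```python
from typing import Any, Mapping


def find_object(result: Mapping[Any, Any], name: str) -> Any:
    """Look up `name` in a mapping of R objects, ignoring case and outer whitespace.

    Hypothetical helper for illustration; it is not a pyreadr function.
    """
    # An exact match always wins.
    if name in result:
        return result[name]
    # Fall back to a case-insensitive comparison over string keys only
    # (an Rds result uses the key None, which is skipped here).
    wanted = name.strip().lower()
    matches = [k for k in result if isinstance(k, str) and k.strip().lower() == wanted]
    if len(matches) == 1:
        return result[matches[0]]
    # Report what is actually available instead of a bare KeyError.
    raise KeyError(f"{name!r} not found; available objects: {list(result)}")


# Stand-in for a read result; real values would be pandas DataFrames.
objects = {"Sales_2020": "frame A", "customers": "frame B"}
print(find_object(objects, "sales_2020"))
```

The error message lists every available key, which is usually enough to spot a casing or spelling difference.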
Warnings
- gotcha On some Linux distributions, `pyreadr`'s underlying C libraries (`librdata`, `libiconv`) might fail to link during installation, leading to `ImportError`. This often happens when `libiconv` development headers are not correctly found.
- breaking `pyreadr` versions older than 0.5.5 may not be compatible with pandas 3.0 or newer due to internal changes in pandas. This could lead to various `TypeError` or `AttributeError` exceptions.
- gotcha The underlying `librdata` library may not support all possible RData/Rds file versions or complex R object types (e.g., S4 objects, environments, specific user-defined types). Attempting to read unsupported structures might result in errors or incomplete data.
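When chasing the compatibility problems described in these warnings, it helps to know exactly which versions are installed. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); `installed_versions` is a hypothetical helper name, and the package list is just an example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is absent."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = None
    return report


report = installed_versions(["pyreadr", "pandas", "numpy"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

Reading the metadata this way does not import the packages themselves, so it still works when `import pyreadr` fails with one of the errors listed earlier.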
Install
- pip install pyreadr
Imports
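- read_r (RData files; returns a dict-like object keyed by R object name)
import pyreadr
result = pyreadr.read_r('file.RData')
- write_rdata (writes a single DataFrame; `df_name` sets the R object name)
import pyreadr
pyreadr.write_rdata('file.RData', df_obj, df_name='df')
- read_r (Rds files; the single object is stored under the key None)
import pyreadr
result = pyreadr.read_r('file.rds')
df_obj = result[None]
- write_rds (path first, then the DataFrame)
import pyreadr
pyreadr.write_rds('file.rds', df_obj)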
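Because pyreadr's `read_r` returns a dict-like result for both file types (with a single entry keyed `None` for Rds files), code that expects exactly one object can unwrap it without caring about the key. This is a minimal sketch; `single_object` is a hypothetical helper, and the dictionaries below stand in for real read results.

```python
from typing import Any, Mapping


def single_object(result: Mapping[Any, Any]) -> Any:
    """Return the only value in a read result, whatever its key is.

    Hypothetical helper for illustration; it is not a pyreadr function.
    """
    if len(result) != 1:
        raise ValueError(
            f"expected exactly one object, found {len(result)}: {list(result)}"
        )
    # Works for an Rds-style result ({None: df}) and a one-object RData result.
    return next(iter(result.values()))


# Stand-ins for read results; real values would be pandas DataFrames.
rds_like = {None: "only frame"}
rdata_like = {"df": "named frame"}
print(single_object(rds_like))
print(single_object(rdata_like))
```

Raising on zero or several objects keeps the helper from silently picking an arbitrary object out of a multi-object RData file.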
Quickstart
import os
import pandas as pd
import pyreadr
# Create a dummy RData file for testing.
# write_rdata stores a single DataFrame; df_name is the object name seen by R.
df_for_r = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'c']})
pyreadr.write_rdata("dummy.RData", df_for_r, df_name="df")
# Read the RData file; read_r handles both RData and Rds files
result_rdata = pyreadr.read_r("dummy.RData")
# result_rdata is a dict-like object whose keys are R object names
print("Objects in RData:", list(result_rdata.keys()))
df_from_r = result_rdata['df']
# Create a dummy Rds file for testing (single object); the path comes first
df_to_rds = pd.DataFrame({'colA': [10, 20], 'colB': ['x', 'y']})
pyreadr.write_rds("dummy.rds", df_to_rds)
# Read the Rds file
result_rds = pyreadr.read_r("dummy.rds")
# An Rds file holds one unnamed object, stored under the key None
df_from_rds = result_rds[None]
print("DataFrame from RData:\n", df_from_r)
print("DataFrame from Rds:\n", df_from_rds)
# Clean up dummy files
os.remove("dummy.RData")
os.remove("dummy.rds")