fsspec — Filesystem Interfaces for Python
Filesystem Spec (fsspec) provides a unified, Pythonic interface to local, remote, and embedded file systems and byte stores, including S3, GCS, Azure, HTTP, SFTP, and in-memory storage. It is the file-system abstraction layer used internally by Dask, pandas, PyArrow, Zarr, and many others. The current version is 2026.3.0; releases follow a monthly cadence and use calendar versioning (CalVer) in the form YYYY.MM.PATCH.
Warnings
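- gotcha Filesystem instances are cached and keyed on a hash of the class and its constructor arguments, so calling fsspec.filesystem('s3') twice with identical arguments returns the same object. Credentials that were resolved implicitly at first construction (environment variables, config files) are reused by every later call with the same arguments, which can cause silent credential cross-contamination. Pass skip_instance_cache=True, or call clear_instance_cache() on the filesystem class, to get a fresh instance.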
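- breaking Calling sync fsspec methods (e.g. fs.ls()) on an async-backed filesystem from inside a running asyncio event loop can raise NotImplementedError: 'Calling sync() from within a running loop', and in any case blocks the loop for the duration of the call. This affects Jupyter notebooks, FastAPI handlers, and other async frameworks. In async code, use the filesystem's async API or move the blocking call to a worker thread.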
- gotcha fsspec.open() returns an OpenFile placeholder — the remote file is NOT opened until you enter a `with` block. Calling .read() on the raw OpenFile object without a context manager will fail.
- gotcha Cloud backend packages (s3fs, gcsfs, adlfs) are NOT included in the base fsspec install. Attempting to use fsspec.filesystem('s3') without s3fs raises an ImportError with a hint, but the hint may be missed in automated pipelines.
- gotcha Async filesystem instances (s3fs, gcsfs) are incompatible with os.fork() (used by PyTorch DataLoader, multiprocessing). After fork, cached instances hold references to dead event loops, causing hangs or cryptic errors.
- deprecated Passing `trim` as a keyword argument to fsspec.spec.AbstractBufferedFile is deprecated; set it inside `cache_options` instead.
- breaking fsspec.asyn.maybe_sync was removed. Older pinned versions of s3fs (<=0.5.2) and other ecosystem libraries that imported maybe_sync will break on recent fsspec.
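Several of the warnings above can be reproduced locally with the in-memory backend, which needs no network or credentials. The following is a minimal sketch, not a definitive recipe: the path `memory://warnings_demo/note.txt` and the helper names `backend_available` and `list_dir` are made up for illustration, and the worker-thread pattern is only one assumed way of keeping blocking calls off an event loop.

```python
import asyncio

import fsspec

# --- Instance caching: identical arguments return the same object ---
a = fsspec.filesystem("memory")
b = fsspec.filesystem("memory")
fresh = fsspec.filesystem("memory", skip_instance_cache=True)
same_instance = a is b            # served from the instance cache
bypassed_cache = fresh is not a   # cache skipped, new object created

# --- OpenFile is only a placeholder until it is entered ---
demo_path = "memory://warnings_demo/note.txt"  # hypothetical path
with fsspec.open(demo_path, "wt") as f:
    f.write("payload")

placeholder = fsspec.open(demo_path, "rt")          # nothing is opened yet
has_read_before_enter = hasattr(placeholder, "read")
with placeholder as f:                               # the real file object
    text = f.read()


# --- Backend availability: fail early instead of deep inside a pipeline ---
def backend_available(protocol: str) -> bool:
    """Return True if fsspec can import the class for this protocol."""
    try:
        fsspec.get_filesystem_class(protocol)
    except (ImportError, ValueError):  # missing package or unknown protocol
        return False
    return True


memory_ok = backend_available("memory")


# --- Async code: move the blocking call to a worker thread ---
async def list_dir(fs, path):
    """Run the synchronous fs.ls() without blocking the event loop."""
    return await asyncio.to_thread(fs.ls, path, detail=False)


listing = asyncio.run(list_dir(a, "/warnings_demo"))
```

In a real deployment, a check such as `backend_available("s3")` could run at startup so that a missing s3fs install is reported before any data access is attempted.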
Install
- pip install fsspec
- pip install fsspec[s3]
- pip install fsspec[gcs]
- pip install fsspec[ssh]
- pip install fsspec[full]
Imports
- fsspec.open
import fsspec
with fsspec.open('s3://bucket/file.txt', 'rt') as f: ...
- fsspec.filesystem
import fsspec
fs = fsspec.filesystem('s3', key='...', secret='...')
- AbstractFileSystem
from fsspec.spec import AbstractFileSystem
- AsyncFileSystem
from fsspec.asyn import AsyncFileSystem
- LocalFileSystem
from fsspec.implementations.local import LocalFileSystem
- known_implementations
from fsspec.registry import known_implementations
- register_implementation
from fsspec.registry import register_implementation
register_implementation('myproto', MyFS)
- get_mapper
import fsspec
mapper = fsspec.get_mapper('s3://bucket/path/')
- OpenFile
from fsspec.core import OpenFile
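The `register_implementation('myproto', MyFS)` entry above leaves `MyFS` undefined. As a rough sketch of how the pieces might fit together, the example below defines a hypothetical `MyFS` by subclassing the built-in in-memory filesystem; the class, the `myproto` name, and the `/registry_demo/` path are illustrative assumptions, and a real backend would more likely subclass `AbstractFileSystem` and implement its own I/O methods.

```python
import fsspec
from fsspec.implementations.memory import MemoryFileSystem
from fsspec.registry import register_implementation


class MyFS(MemoryFileSystem):
    """Hypothetical backend: the in-memory filesystem under a new protocol name.

    It shares the class-level store of MemoryFileSystem, so it is only
    suitable as a demonstration of registration.
    """

    protocol = "myproto"


# clobber=True allows re-running the snippet after the name is already registered
register_implementation("myproto", MyFS, clobber=True)

# The registered name now resolves through the normal fsspec entry points
resolved = fsspec.get_filesystem_class("myproto")
fs = fsspec.filesystem("myproto")

fs.pipe_file("/registry_demo/blob.bin", b"abc")  # write bytes in one call
data = fs.cat_file("/registry_demo/blob.bin")    # read them back
```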
Quickstart
```python
import os
import fsspec

# 1. Open any URL transparently (protocol auto-detected from URL)
#    HTTP URLs need the http extra: pip install fsspec[http]
with fsspec.open(
    "https://raw.githubusercontent.com/fsspec/filesystem_spec/master/README.md",
    "rt",
) as f:
    first_line = f.readline()
print("README first line:", first_line.strip())

# 2. Use fsspec.filesystem() for repeated operations on one backend
fs = fsspec.filesystem("memory")  # in-process, no I/O
fs.mkdir("/mydir")
with fs.open("/mydir/hello.txt", "wt") as f:
    f.write("Hello from fsspec!")
with fs.open("/mydir/hello.txt", "rt") as f:
    print(f.read())
print("Files:", fs.ls("/mydir"))

# 3. S3 example (needs pip install fsspec[s3])
# Uses env-var credentials; safe pattern for agents
# aws_key = os.environ.get('AWS_ACCESS_KEY_ID', '')
# aws_secret = os.environ.get('AWS_SECRET_ACCESS_KEY', '')
# fs_s3 = fsspec.filesystem('s3', key=aws_key, secret=aws_secret)
# print(fs_s3.ls('my-bucket/'))

# 4. Zarr-style key-value mapping over any backend
mapper = fsspec.get_mapper("memory://zarr-root/")
mapper["chunk-0"] = b"\x00" * 128
print("Mapper keys:", list(mapper))
```
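Step 4 writes through the mapping interface only. As a small follow-up sketch, the same bytes can be read back through the filesystem the mapper wraps, which shows that each key is stored as an ordinary file under the mapper root. The `memory://mapper_demo/` root used here is a made-up location chosen so the example does not collide with the Quickstart data.

```python
import fsspec

mapper = fsspec.get_mapper("memory://mapper_demo/")  # hypothetical root
mapper["chunk-0"] = b"\x00" * 128

# The mapper exposes the filesystem and the root path it wraps
fs = mapper.fs
chunk_path = f"{mapper.root}/chunk-0"

raw = fs.cat_file(chunk_path)  # same bytes, read via the filesystem API
keys = sorted(mapper)          # keys are paths relative to the root
```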