Parfive: Parallel File Downloader

2.3.1 · active · verified Fri Apr 17

Parfive is a parallel file downloader for Python that supports HTTP and FTP. It uses `asyncio` to download multiple files concurrently and provides progress bars, a configurable limit on concurrent connections, and the ability to retry failed downloads. The current version is 2.3.1. Releases are made periodically to add features, fix issues, and keep pace with newer Python versions, and the project is actively developed.
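To make the ideas of connection throttling and retries concrete, here is a minimal standard-library sketch. It is not Parfive's implementation: `FakeServer`, `fetch_with_retry`, `download_all`, and `run_demo` are hypothetical names invented for illustration, and no network I/O takes place. It assumes only that a semaphore caps the number of transfers in flight, in the spirit of a `max_conn` setting.

```python
import asyncio


class FakeServer:
    """Hypothetical stand-in for a remote host; performs no network I/O."""

    def __init__(self, fail_once=()):
        self.fail_once = set(fail_once)  # names that fail on their first attempt
        self.active = 0                  # transfers currently in flight
        self.peak = 0                    # highest concurrency observed

    async def fetch(self, name):
        self.active += 1
        self.peak = max(self.peak, self.active)
        try:
            await asyncio.sleep(0.01)    # simulate transfer time
            if name in self.fail_once:
                self.fail_once.discard(name)
                raise ConnectionError(f"transient failure for {name}")
            return f"contents of {name}"
        finally:
            self.active -= 1


async def fetch_with_retry(server, name, limiter, retries=2):
    """Fetch one item, holding a semaphore slot during each attempt."""
    last_error = None
    for _ in range(retries + 1):
        async with limiter:
            try:
                return await server.fetch(name)
            except ConnectionError as exc:
                last_error = exc
    raise last_error


async def download_all(server, names, max_conn=2):
    """Run all fetches concurrently, never exceeding max_conn at once."""
    limiter = asyncio.Semaphore(max_conn)
    tasks = [fetch_with_retry(server, name, limiter) for name in names]
    return await asyncio.gather(*tasks)  # results keep the order of `names`


def run_demo():
    server = FakeServer(fail_once={"b.txt"})
    names = ["a.txt", "b.txt", "c.txt", "d.txt"]
    results = asyncio.run(download_all(server, names, max_conn=2))
    return results, server.peak
```

Calling `run_demo()` returns the four simulated payloads along with the peak number of simultaneous transfers, which stays at the configured limit even though one item needs a second attempt.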

Quickstart

This example demonstrates how to use `parfive.Downloader` to download multiple public files concurrently to a local directory. It shows how to initialize the downloader, add URLs, and await the results. The `asyncio.run()` function is used to execute the asynchronous main function.

import parfive
import asyncio
import os

async def main():
    # Define some public URLs to download
    urls = [
        "https://raw.githubusercontent.com/sunpy/parfive/main/README.md",
        "https://raw.githubusercontent.com/sunpy/parfive/main/LICENSE"
    ]

    # Create a directory for downloads if it doesn't exist
    download_dir = "parfive_downloads"
    os.makedirs(download_dir, exist_ok=True)

    # Initialize the Downloader with a maximum of 5 concurrent connections
    # and display a progress bar.
    downloader = parfive.Downloader(max_conn=5, progress=True)

    # Queue each URL with enqueue_file(), specifying the target directory
    for url in urls:
        downloader.enqueue_file(url, path=download_dir)

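    # Execute the downloads asynchronously. run_download() is the coroutine;
    # download() is the blocking call for synchronous code and is not awaitable.
    print(f"Starting download of {len(urls)} files to '{download_dir}'...")
    results = await downloader.run_download()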

    # Print the paths of the downloaded files
    print("Downloaded files:")
    for filepath in results:
        print(f"- {filepath}")

    # Optionally clean up the downloaded files
    # for filepath in results:
    #     os.remove(filepath)
    # os.rmdir(download_dir)

if __name__ == "__main__":
    asyncio.run(main())
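A download can finish with some files saved and others failed, so it is worth inspecting the returned object before using the paths. The sketch below assumes, based on Parfive's documented `Results` type, that the object iterates over saved file paths and exposes an `errors` list whose entries have `url` and `exception` fields; verify this against your installed version. `summarize_results` and `StubResults` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic the assumed shape of a results object.
StubError = namedtuple("StubError", ["filepath_partial", "url", "exception"])


class StubResults(list):
    """A list of saved paths with an `errors` attribute for failures."""

    def __init__(self, paths=(), errors=()):
        super().__init__(paths)
        self.errors = list(errors)


def summarize_results(results):
    """Split a finished download into saved paths and (url, error) pairs."""
    saved = [str(path) for path in results]
    failed = [
        (err.url, repr(err.exception))
        for err in getattr(results, "errors", [])
    ]
    return saved, failed


def demo_summary():
    results = StubResults(
        paths=["parfive_downloads/README.md"],
        errors=[StubError(None, "https://example.invalid/LICENSE", TimeoutError("timed out"))],
    )
    return summarize_results(results)
```

In the quickstart, the same helper could be called on the object returned by the downloader to report failures instead of silently printing only the files that succeeded.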
