EMR Notebooks Magics
EMR Notebooks Magics is a Python library providing Jupyter magics specifically for Amazon EMR Notebooks. These magics enhance the interactive experience by allowing direct interaction with EMR cluster resources, such as mounting S3 workspaces and executing other notebooks. The library is currently at version 0.2.4 and is actively maintained by AWS Labs, with updates tied to EMR Notebooks and EMR Studio feature development.
Warnings
- gotcha After installing `emr-notebooks-magics` (e.g., via `pip install`), you must restart the Jupyter kernel before the magics become available. Installing via bootstrap actions is not supported.
- breaking The `%mount_workspace_dir` magic is intended for Python 3 kernels. When the magic is used with a Python 3 kernel, Spark executors do not have access to the mounted directory.
- gotcha By default, `%mount_workspace_dir` mounts are read-only. Enabling write access is irreversible and applies changes directly to your S3 Workspace.
- gotcha Using magics like `%execute_notebook` requires specific IAM permissions for the EMR-EC2 instance role. Lack of proper S3 read access for the EC2 instance profile will lead to failures.
- breaking EMR Serverless and Amazon EMR on EKS clusters do not support EMR Notebooks magics, including those from `emr-notebooks-magics`.
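The S3 read access mentioned in the warnings can be sketched as an IAM policy document. This is only an illustration under stated assumptions: the source does not list the exact permissions each magic needs, so the sketch assumes that "read access" means the standard `s3:ListBucket` and `s3:GetObject` actions, and `my-example-bucket` is a placeholder bucket name. The helper name `s3_read_only_policy` is hypothetical and is not part of `emr-notebooks-magics`.

```python
import json


def s3_read_only_policy(bucket: str) -> dict:
    """Build a generic read-only S3 policy document for one bucket.

    Assumption: read access is modeled as s3:ListBucket on the bucket
    and s3:GetObject on its objects. The real requirements of each
    magic may differ and should be checked against the AWS documentation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object reads apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# "my-example-bucket" is a placeholder, not a real bucket.
policy = s3_read_only_policy("my-example-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to the role behind the EC2 instance profile, but treat it as a starting point to adapt, not as a verified minimum set of permissions.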
Install
- pip install emr-notebooks-magics
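Because the magics only become available after a kernel restart, it can help to confirm that the package is actually present in the kernel's environment before troubleshooting further. A minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and is not part of `emr-notebooks-magics`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("emr-notebooks-magics")
if version is None:
    print("emr-notebooks-magics is not installed in this kernel's environment")
else:
    # Being installed is not enough: the kernel must be restarted
    # after installation before the magics can be used.
    print(f"emr-notebooks-magics {version} is installed")
```

This check only reports whether the distribution is installed. It does not tell you whether the kernel has been restarted since the install.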
Imports
- %mount_workspace_dir
%mount_workspace_dir .
- %generate_s3_download_url
%generate_s3_download_url s3://my_bucket/path/to/object
Quickstart
# After installing the package and restarting the kernel in an EMR Notebook.

# Mount the entire Workspace directory to the EMR cluster instance
%mount_workspace_dir .

# Generate a presigned URL for an S3 object
# Replace with your actual S3 bucket and object key
%generate_s3_download_url s3://my-example-bucket/path/to/my-file.csv

# Execute another notebook in the background
# Make sure 'another_notebook.ipynb' exists in your workspace
%execute_notebook another_notebook.ipynb
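The `%generate_s3_download_url` magic takes an `s3://bucket/key` URI. A malformed URI is an easy mistake, so it may be worth checking the value in plain Python before passing it to the magic. The sketch below is an assumption-labeled helper, not part of the library: `split_s3_uri` is a hypothetical name, and it relies only on `urllib.parse` from the standard library.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key).

    Caveat: urlparse treats '?' and '#' as query and fragment
    separators, so keys containing those characters need extra care.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError(f"S3 URI has no object key: {uri!r}")
    return parsed.netloc, key


# Placeholder URI taken from the quickstart; replace with a real object.
bucket, key = split_s3_uri("s3://my-example-bucket/path/to/my-file.csv")
print(bucket)  # my-example-bucket
print(key)     # path/to/my-file.csv
```

A `ValueError` here points to a typo in the URI before any call to S3 is made, which is usually easier to diagnose than a failure inside the magic.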