Apache MXNet: Ultra-scalable Deep Learning Framework
Apache MXNet was an ultra-scalable deep learning framework known for its flexibility, efficiency, and multi-language support. It allowed users to mix and match imperative and symbolic programming for deep learning tasks. The project's last stable release was 1.9.1 in May 2022, and it was moved to the Apache Attic in September 2023, signifying that it is no longer actively developed or maintained.
Warnings
- breaking Apache MXNet has been moved to the Apache Attic as of September 2023 and is no longer actively developed or maintained. No new releases, features, or official support are expected.
- gotcha MXNet has known compatibility issues with newer NumPy releases, often requiring an older NumPy (e.g., `<1.20.0`, or a specific pin such as `1.23.5` for MXNet 1.8) to avoid errors like `module 'numpy' has no attribute 'bool'`, which appears once NumPy 1.24 removes the `np.bool` alias.
- gotcha GPU installations (`mxnet-cuXXX`) require strict matching of the installed CUDA Toolkit version with the MXNet package. With the project abandoned, there is no official support for recent CUDA versions (e.g., CUDA 12.x).
- deprecated The (unreleased) 2.0.0 beta versions of MXNet introduced significant API changes, deprecating legacy APIs like `Model`, `Module`, `Symbol`, and the original `NDArray` API in favor of a NumPy-compatible `np` and `npx` interface and an enhanced Gluon API.
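Given the NumPy pin mentioned above, a quick preflight check before importing MXNet can save debugging time. This is a minimal sketch, not an official MXNet utility; the `(1, 24)` cutoff reflects NumPy 1.24 removing the `np.bool` alias that legacy MXNet relies on:

```python
import numpy as np

def numpy_ok_for_legacy_mxnet(version: str) -> bool:
    """Return True if this NumPy version still ships the deprecated
    aliases (np.bool, etc.) that MXNet 1.x uses; NumPy 1.24 removed them."""
    major, minor = (int(p) for p in version.split(".")[:2])
    return (major, minor) < (1, 24)

# Check the currently installed NumPy before attempting `import mxnet`
print(numpy_ok_for_legacy_mxnet(np.__version__))
```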
Install
- cpu
pip install mxnet
- cuda 11.2
pip install mxnet-cu112
- cuda 10.2
pip install mxnet-cu102
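With the project in the Attic, a fully pinned install is the safest option. A hedged sketch combining the last stable release with a NumPy old enough to keep the `np.bool` alias:

```shell
# Sketch: pin the final MXNet release together with a NumPy version
# that still ships np.bool (the alias was removed in NumPy 1.24).
pip install "mxnet==1.9.1" "numpy<1.24"
```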
Imports
- mxnet
import mxnet as mx
- ndarray
from mxnet import nd
- gluon
from mxnet import gluon
- autograd
from mxnet import autograd
Quickstart
import mxnet as mx
from mxnet import gluon, nd
from mxnet.gluon import nn

# Define a simple neural network
class MLP(nn.Block):
    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.dense0 = nn.Dense(128, activation='relu')
        self.dense1 = nn.Dense(64, activation='relu')
        self.dense2 = nn.Dense(10)

    def forward(self, x):
        x = self.dense0(x)
        x = self.dense1(x)
        x = self.dense2(x)
        return x

# Create an instance of the network
net = MLP()

# Initialize parameters
ctx = mx.cpu(0)  # or mx.gpu(0) if a GPU and an mxnet-cuXXX build are available
net.initialize(mx.init.Xavier(), ctx=ctx)

# Create a dummy input (e.g., a batch of 1 with 784 features)
dummy_input = nd.random.uniform(shape=(1, 784), ctx=ctx)

# Perform a forward pass
output = net(dummy_input)
print(f"Network output shape: {output.shape}")

# Simple tensor operation
a = nd.ones((2, 3), ctx=ctx)
b = a * 2
print(f"Simple NDArray operation result: {b.asnumpy()}")
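Since MXNet is unmaintained and may not install on a current Python/NumPy stack, the same forward pass can be sketched in plain NumPy for reference. The layer sizes mirror the Gluon MLP above (784 -> 128 -> 64 -> 10), and the hand-rolled `xavier` initializer here is only a stand-in for `mx.init.Xavier()`:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier(n_in, n_out):
    # Xavier/Glorot uniform init: limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def dense(x, w, b, relu=True):
    # Fully connected layer with optional ReLU activation
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y

# Parameters for the three layers of the MLP above
w0, b0 = xavier(784, 128), np.zeros(128)
w1, b1 = xavier(128, 64), np.zeros(64)
w2, b2 = xavier(64, 10), np.zeros(10)

# Dummy input: a batch of 1 with 784 features
x = rng.uniform(size=(1, 784))
out = dense(dense(dense(x, w0, b0), w1, b1), w2, b2, relu=False)
print(out.shape)  # (1, 10)
```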