{"id":4664,"library":"onnxslim","title":"OnnxSlim","description":"OnnxSlim is an open-source toolkit for optimizing ONNX (Open Neural Network Exchange) models. It reduces model size and improves inference speed through techniques such as node elimination, constant folding, and shape inference. As of version 0.1.91, it is under active development with frequent releases, aiming to provide a robust solution for model deployment.","status":"active","version":"0.1.91","language":"en","source_language":"en","source_url":"https://github.com/inisis/OnnxSlim","tags":["onnx","model optimization","deep learning","neural networks","model compression"],"install":[{"cmd":"pip install onnxslim","lang":"bash","label":"Install OnnxSlim"}],"dependencies":[{"reason":"Core library for ONNX model manipulation, required for loading and saving models.","package":"onnx","optional":false},{"reason":"Recommended for validating and running slimmed ONNX models.","package":"onnxruntime","optional":true}],"imports":[{"symbol":"slim","correct":"from onnxslim import slim"}],"quickstart":{"code":"import onnx\nfrom onnxslim import slim\nimport os\n\n# Create a minimal dummy ONNX model for demonstration\n# In a real-world scenario, you would load your existing model:\n# input_model_path = \"path/to/your/model.onnx\"\n# If you don't have one, this creates a simple Add operation model:\nfrom onnx.helper import make_model, make_node, make_graph, make_tensor_value_info\nfrom onnx import TensorProto\n\ninput_name = 'input'\noutput_name = 'output'\ninput_tensor = make_tensor_value_info(input_name, TensorProto.FLOAT, [1, 2, 3])\noutput_tensor = make_tensor_value_info(output_name, TensorProto.FLOAT, [1, 2, 3])\nnode = make_node('Add', [input_name, input_name], [output_name]) # example: output = input + input\ngraph = make_graph([node], 'simple_add_graph', [input_tensor], [output_tensor])\ndummy_model = make_model(graph, 
opset_imports=[onnx.helper.make_opsetid(\"\", 13)]) # Opset 13 is common\ninput_model_path = \"dummy_model_to_slim.onnx\"\noutput_model_path = \"dummy_model_slimmed.onnx\"\nonnx.save(dummy_model, input_model_path)\n\nprint(f\"Dummy ONNX model created at {input_model_path}\")\n\ntry:\n    # Perform the slimming\n    slimmed_model_proto = slim(input_model_path)\n    # The 'slim' function returns an ONNX ModelProto object.\n    # To save it, use onnx.save:\n    onnx.save(slimmed_model_proto, output_model_path)\n    print(f\"Model successfully slimmed and saved to {output_model_path}\")\n\n    # Optional: Load and verify the slimmed model\n    # slimmed_model_loaded = onnx.load(output_model_path)\n    # print(f\"Loaded slimmed model with graph name: {slimmed_model_loaded.graph.name}\")\n\nexcept Exception as e:\n    print(f\"An error occurred during slimming: {e}\")\n\nfinally:\n    # Clean up dummy files\n    if os.path.exists(input_model_path):\n        os.remove(input_model_path)\n    if os.path.exists(output_model_path):\n        os.remove(output_model_path)\n    print(\"Cleaned up dummy ONNX files.\")","lang":"python","description":"This quickstart demonstrates how to load an ONNX model (or create a dummy one for the example) and then use `onnxslim.slim()` to optimize it. The `slim` function returns an ONNX ModelProto object, which can then be saved to a new file using `onnx.save()`."},"warnings":[{"fix":"Run inference on the slimmed model with sample data and compare outputs (e.g., using ONNX Runtime) against the original model. Evaluate performance metrics like latency and throughput in your target environment.","message":"Always verify the slimmed model's output and performance. While OnnxSlim aims for lossless optimization, aggressive slimming or specific model architectures can sometimes subtly alter behavior or reduce compatibility with certain ONNX runtimes or hardware accelerators. 
Extensive testing after optimization is crucial.","severity":"gotcha","affected_versions":"All versions"},{"fix":"After `slim()` and `onnx.save()`, manually copy or manage the external data files. If `onnxslim` changes paths or removes nodes that referenced external data, you might need to re-export the model with external data or verify paths carefully.","message":"Models that store weights as external data files (common for very large models) require careful handling. OnnxSlim primarily operates on the `.onnx` protobuf definition. After slimming, ensure that any associated external data files are correctly moved or regenerated alongside the new slimmed `.onnx` file, maintaining their relative paths if applicable.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Provide explicit fixed or symbolic shapes for dynamic inputs via the `input_shapes` option of `slim` (consult the OnnxSlim documentation for the exact argument format), then inspect the slimmed model's declared inputs (e.g., `model.graph.input`) to confirm the shapes are as expected.","message":"For models with dynamic input shapes, incorrect shape inference by `onnxslim` can lead to runtime errors. While `onnxslim` includes shape inference, complex dynamic scenarios might require explicit configuration.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}