TFLite Runtime

2.14.0 · verified Fri May 01

TensorFlow Lite Runtime is a lightweight library for on-device machine learning inference, optimized for mobile and embedded devices. Version 2.14.0 supports model execution with hardware acceleration; model conversion requires the full TensorFlow package. Releases are tied to TensorFlow Lite versions.

pip install tflite-runtime
error ImportError: cannot import name 'Interpreter' from 'tflite_runtime'
cause Importing from the top-level package; Interpreter lives in the tflite_runtime.interpreter submodule.
fix
Replace with 'from tflite_runtime.interpreter import Interpreter'.
error ValueError: Cannot set tensor: Dimension mismatch. Got 3 but expected 4 for input 0.
cause Input shape mismatch. TFLite models often expect a batch dimension.
fix
Reshape input data to include batch dimension, e.g., np.array([[1,2,3]]) instead of np.array([1,2,3]).
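The batch-dimension fix above can be sketched with plain NumPy (no model needed); the array values here are illustrative:

```python
import numpy as np

# 1-D input like this triggers the "Got 3 but expected 4" style mismatch
data = np.array([1.0, 2.0, 3.0], dtype=np.float32)
# data.shape is (3,)

# Add the leading batch dimension the model expects
batched = np.expand_dims(data, axis=0)
# batched.shape is now (1, 3), equivalent to np.array([[1.0, 2.0, 3.0]])
```

np.expand_dims is often clearer than nesting brackets by hand, and data.reshape(1, -1) works equally well for this case.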
breaking tflite_runtime does not include training ops. If your model uses custom ops or training-only ops, inference may fail.
fix Use full TensorFlow for training; export to TFLite and ensure ops are supported.
gotcha The Interpreter class must be imported from tflite_runtime.interpreter, not from top-level tflite_runtime. Common mistake: `from tflite_runtime import Interpreter` -> AttributeError.
fix Use `from tflite_runtime.interpreter import Interpreter`.
deprecated Support for Python 3.6 ended in tflite-runtime 2.7. Check your Python version if you encounter installation errors.
fix Upgrade to Python 3.7+ or use an older tflite-runtime version if absolutely necessary.

Basic inference with TFLite Runtime.

import numpy as np
from tflite_runtime.interpreter import Interpreter

# Load the model and allocate memory for its tensors
interpreter = Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

# Query input/output tensor metadata (shape, dtype, index)
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Input must match the model's expected shape and dtype,
# including the leading batch dimension
input_data = np.array([[1.0, 2.0, 3.0]], dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

# Run inference and read the result back out
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)