{"id":5262,"library":"ipex-llm","title":"IPEX-LLM","description":"IPEX-LLM is a PyTorch-based library developed by Intel for optimizing Large Language Models (LLMs) on Intel CPUs and GPUs (XPUs). It provides tools for efficient inference and fine-tuning, leveraging Intel hardware accelerations. The current stable version is 2.2.0, with frequent nightly builds and updates, often released in conjunction with the broader BigDL project.","status":"active","version":"2.2.0","language":"en","source_language":"en","source_url":"https://github.com/intel-analytics/ipex-llm","tags":["LLM","AI","Intel","optimization","deep-learning","pytorch","transformers"],"install":[{"cmd":"pip install ipex-llm[cpu]","lang":"bash","label":"For CPU inference"},{"cmd":"pip install --pre --upgrade ipex-llm[xpu]","lang":"bash","label":"For Intel GPUs (XPUs), often with nightly builds"}],"dependencies":[{"reason":"IPEX-LLM is built on PyTorch and requires a compatible version.","package":"torch","optional":false}],"imports":[{"symbol":"LLM","correct":"from ipex_llm import LLM"},{"note":"The direct import `from ipex_llm import optimize_model` is correct for the current API. 
Older versions or specific submodules might have used `from ipex_llm.transformers import optimize_model`, which may still work but is less idiomatic now.","wrong":"from ipex_llm.transformers import optimize_model","symbol":"optimize_model","correct":"from ipex_llm import optimize_model"},{"symbol":"AutoModel","correct":"from ipex_llm.transformers import AutoModel"},{"note":"IPEX-LLM's own examples load the tokenizer directly from Hugging Face Transformers.","symbol":"AutoTokenizer","correct":"from transformers import AutoTokenizer"}],"quickstart":{"code":"from ipex_llm import LLM\n\n# Instantiate LLM model\nmodel = LLM(\n    model_name='/path/to/your/model',\n    optimize_type='int4',\n    dtype='auto',\n    trust_remote_code=True\n)\n\n# Example for text generation\nprompt = \"What is the capital of France?\"\noutput = model(prompt)\nprint(output)\n\n# Hugging Face-style loading: generate() needs a causal-LM head,\n# so load the model with AutoModelForCausalLM rather than AutoModel\nfrom ipex_llm.transformers import AutoModelForCausalLM\nfrom transformers import AutoTokenizer\n\nmodel_id = \"TinyLlama/TinyLlama-1.1B-Chat-v1.0\"\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_id,\n    load_in_4bit=True,  # or load_in_low_bit=\"sym_int4\"\n    torch_dtype='auto'\n)\n\n# Tokenize the prompt, generate up to 32 new tokens, and decode the result\ninput_ids = tokenizer.encode(prompt, return_tensors=\"pt\")\noutput = model.generate(input_ids, max_new_tokens=32)\nprint(tokenizer.decode(output[0], skip_special_tokens=True))","lang":"python","description":"This quickstart demonstrates how to load an LLM using the `ipex_llm.LLM` class for simple inference, or `ipex_llm.transformers.AutoModelForCausalLM` together with the Hugging Face `AutoTokenizer` for more fine-grained control and compatibility with Hugging Face Transformers. Ensure your `model_name` or `model_id` points to a valid local path or Hugging Face model ID."},"warnings":[{"fix":"Update your `pip install` commands from `bigdl-llm` to `ipex-llm`. Adjust import statements from `from bigdl.llm...` to `from ipex_llm...`.","message":"The library was rebranded from `BigDL-LLM` to `ipex-llm`. 
This changes package names, import paths, and some CLI tools.","severity":"breaking","affected_versions":"All versions prior to 2.x (BigDL-LLM) when migrating to 2.x (ipex-llm)."},{"fix":"Always use `pip install ipex-llm[cpu]` or `pip install ipex-llm[xpu]` as appropriate for your system. Refer to the official documentation for detailed hardware requirements.","message":"IPEX-LLM installations are hardware-specific. Users must install the correct extras for their target platform (`[cpu]` for Intel CPUs or `[xpu]` for Intel GPUs). Installing without the correct extra may lead to missing dependencies or suboptimal performance.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always check the official IPEX-LLM documentation and release notes for the recommended PyTorch and oneAPI versions for your IPEX-LLM release. Consider using the `--pre` flag during installation to get the latest compatible builds.","message":"IPEX-LLM's performance is highly dependent on specific PyTorch and underlying Intel oneAPI library versions. Incompatible versions can lead to errors or degraded performance.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For new code, prefer using `ipex_llm.LLM` or `ipex_llm.transformers.AutoModel.from_pretrained()` with `load_in_4bit`/`load_in_low_bit` for easier integration and more streamlined workflows.","message":"While still functional, the `ipex_llm.optimize_model` API is being superseded by the higher-level `ipex_llm.LLM` and `ipex_llm.transformers.AutoModel` APIs for model loading and quantization.","severity":"deprecated","affected_versions":"2.x onwards"}],"env_vars":null,"last_verified":"2026-04-13T00:00:00.000Z","next_check":"2026-07-12T00:00:00.000Z"}