{"id":10276,"library":"tabmat","title":"Tabmat","description":"Tabmat provides efficient matrix representations for working with tabular data, designed to integrate seamlessly with various dataframe libraries. It offers specialized matrix types like DenseMatrix, CategoricalMatrix, and SplitMatrix for performance-critical statistical and machine learning tasks, especially useful for generalized linear models. The current version is 4.2.1, with an active development pace and frequent releases addressing bug fixes and new features.","status":"active","version":"4.2.1","language":"en","source_language":"en","source_url":"https://github.com/Quantco/tabmat","tags":["matrix","tabular data","numerical computing","statistics","machine learning","sparse matrix"],"install":[{"cmd":"pip install tabmat","lang":"bash","label":"Install tabmat"}],"dependencies":[{"reason":"Core numerical operations and array representations.","package":"numpy","optional":false},{"reason":"Used for sparse matrix operations and components like `sps.csc_matrix`.","package":"scipy","optional":false},{"reason":"Commonly used for input dataframes, especially with `from_df` and `from_formula`.","package":"pandas","optional":true},{"reason":"Enables support for various dataframe types (e.g., Polars, PyArrow) via `from_df` and `from_formula`.","package":"narwhals","optional":true}],"imports":[{"symbol":"from_df","correct":"from tabmat import from_df"},{"symbol":"from_formula","correct":"from tabmat import from_formula"},{"note":"Class is directly in the tabmat package, not a submodule.","wrong":"import tabmat.DenseMatrix","symbol":"DenseMatrix","correct":"from tabmat import DenseMatrix"},{"symbol":"CategoricalMatrix","correct":"from tabmat import CategoricalMatrix"},{"symbol":"SplitMatrix","correct":"from tabmat import SplitMatrix"}],"quickstart":{"code":"import pandas as pd\nimport tabmat as tm\nimport numpy as np\n\n# Create a sample DataFrame\ndf = pd.DataFrame({\"numeric_col\": [1, 2, 3, 4],\n          
         \"categorical_col\": pd.Categorical([\"A\", \"B\", \"A\", \"C\"]),\n                   \"bool_col\": [True, False, True, False]})\n\n# Build a tabmat matrix from the DataFrame, dropping the first\n# category level when encoding categorical columns\nmatrix = tm.from_df(df, drop_first=True)\n\nprint(f\"Matrix shape: {matrix.shape}\")\nprint(f\"Matrix type: {type(matrix).__name__}\")\n\n# Example of matrix-vector multiplication\nvec = np.random.rand(matrix.shape[1])\nresult = matrix.matvec(vec)\nprint(f\"Result of matvec (first 5 elements): {result[:5]}\")","lang":"python","description":"This quickstart demonstrates how to build a tabmat matrix from a pandas DataFrame using `tabmat.from_df`. Numeric and categorical columns are dispatched to suitable matrix types (typically combined in a `SplitMatrix`), and `drop_first=True` drops the first level of each categorical column. Standardization is a separate step (the matrix's `standardize` method), not a `from_df` argument. The example then performs a matrix-vector multiplication with `matvec`, a common operation for `tabmat` objects."},"warnings":[{"fix":"Use the `.unpack()` method (or `.toarray()` for `DenseMatrix`) to explicitly convert to the underlying array type before performing operations that require a standard NumPy array or SciPy sparse matrix. For example, `dense_matrix.unpack()`.","message":"As of v4.0.0, `DenseMatrix` and `SparseMatrix` no longer inherit from `numpy.ndarray` and `scipy.sparse.csc_matrix` respectively. Direct array-like access (e.g., `.A`) or implicit conversion will now fail.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Upgrade your Python environment to version 3.10 or higher. If unable to upgrade, install an older version of tabmat, e.g., `pip install \"tabmat<4.2.0\"`.","message":"Tabmat v4.2.0 and later require Python 3.10 or newer. 
Installation via `pip` will fail with an incompatibility error on older Python versions.","severity":"breaking","affected_versions":">=4.2.0"},{"fix":"Upgrade to tabmat >=4.2.1, which includes fixes for read-only buffer handling across various matrix operations. If upgrading is not an option, ensure that any input arrays passed to tabmat methods are writable (e.g., by making a copy: `my_array.copy()`).","message":"Methods of `CategoricalMatrix` and related internal functions in versions prior to 4.2.1/4.1.3 might raise a `RuntimeError` when operating on read-only buffers (e.g., NumPy arrays with `writeable=False`).","severity":"gotcha","affected_versions":"<4.2.1"},{"fix":"Test your existing code with the new versions. Consult the `narwhals` documentation if you encounter unexpected behavior, especially when working with non-pandas dataframes or specific dataframe operations.","message":"`tabmat.from_df` and `tabmat.from_formula` now use `narwhals`' v2 API and support a wider range of dataframes (including `polars`). 
While this enhances compatibility, users should be aware of potential subtle behavioral changes if they were relying on specific `pandas` dataframe quirks or older `narwhals` API behavior.","severity":"gotcha","affected_versions":">=4.1.4"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Use `dense_matrix.unpack()` or `dense_matrix.toarray()` to get the underlying NumPy array for direct array manipulation.","cause":"Attempting to access the underlying NumPy array using the `.A` attribute, which was removed in tabmat v4.0.0 when `DenseMatrix` stopped inheriting from `np.ndarray`.","error":"AttributeError: 'DenseMatrix' object has no attribute 'A'"},{"fix":"Explicitly convert the `DenseMatrix` to a NumPy array using `dense_matrix.unpack()` or `dense_matrix.toarray()` before passing it to functions expecting a `np.ndarray`.","cause":"Trying to pass a `DenseMatrix` directly where a `numpy.ndarray` is expected, due to the breaking change in tabmat v4.0.0 that removed direct inheritance from `np.ndarray`.","error":"TypeError: can't convert DenseMatrix to numpy.ndarray implicitly"},{"fix":"Upgrade your Python environment to 3.10 or newer. If an upgrade is not possible, install an older version of tabmat: `pip install \"tabmat<4.2.0\"`.","cause":"Attempting to install or use tabmat version 4.2.0 or higher with an incompatible Python version (older than 3.10).","error":"ERROR: Package 'tabmat' requires Python '>=3.10' but the running Python is 3.X.Y"},{"fix":"Upgrade to tabmat version 4.2.1 or newer, which includes fixes for operating on read-only buffers. 
If an upgrade is not possible, ensure any input arrays are writable, e.g., by creating a copy: `my_array.copy(order='C')`.","cause":"Certain `CategoricalMatrix` methods or related internal operations in older tabmat versions were called with an immutable (read-only) NumPy array or buffer, which they attempted to modify.","error":"RuntimeError: buffer source array is read-only"}]}