{"id":23437,"library":"cocoindex","title":"CocoIndex","description":"CocoIndex is a Python library that automatically maintains search indexes derived from declarative transformations. Users define how to transform source data into an index, and CocoIndex incrementally updates the index when sources change, minimizing recomputation. As of version 1.0.2, it requires Python >= 3.11 and is under active development with monthly releases.","status":"active","version":"1.0.2","language":"python","source_language":"en","source_url":"https://github.com/cocoindex-cx/cocoindex","tags":["indexing","data-engineering","embeddings","ETL","vector-search"],"install":[{"cmd":"pip install cocoindex","lang":"bash","label":"Install from PyPI"}],"dependencies":[],"imports":[{"note":"Main entry point for declaring data flows and indexes.","symbol":"cocoindex","correct":"import cocoindex"},{"note":"DataFlow is part of the public API; avoid internal module paths.","wrong":"from cocoindex._core import DataFlow","symbol":"DataFlow","correct":"from cocoindex import DataFlow"}],"quickstart":{"code":"import cocoindex\n\n# Define a simple data flow that indexes documents into a vector index\nclass MyIndex(cocoindex.DataFlow):\n    def transform(self) -> None:\n        source = self.load_csv(\"data/documents.csv\")  # columns: id, text\n        source[\"embedding\"] = source[\"text\"].apply(lambda text: [0.0] * 384)  # placeholder embedding\n        self.create_index(source, on=\"embedding\", name=\"documents_index\")\n\n# Build the index (synchronous example)\ncocoindex.build(MyIndex(), output_dir=\"./mydb\")\nprint(\"Index built successfully.\")","lang":"python","description":"Minimal example: define a DataFlow subclass, transform data, and build a persistent index."},"warnings":[{"fix":"Use cocoindex.rebuild() or cocoindex.update() after changing source data.","message":"CocoIndex does not automatically detect changes to source files; you must explicitly trigger a rebuild or update to apply incremental changes.","severity":"gotcha","affected_versions":"<=1.0.2"},{"fix":"Provide a valid embedding vector (list of floats) for each document in the 'embedding' column.","message":"Embedding generation is the user's responsibility; CocoIndex only indexes the embeddings you provide. Common footgun: forgetting to generate embeddings before indexing.","severity":"gotcha","affected_versions":"all"},{"fix":"Use public API imports: from cocoindex import DataFlow, build, rebuild, etc.","message":"Importing classes from cocoindex._core is deprecated and will be removed in a future version.","severity":"deprecated","affected_versions":"<=1.0.2"}],"env_vars":null,"last_verified":"2026-05-01T00:00:00.000Z","next_check":"2026-07-30T00:00:00.000Z","problems":[{"fix":"Run pip install cocoindex --upgrade in the correct Python environment (>=3.11).","cause":"CocoIndex is not installed, or is installed in a different environment.","error":"ModuleNotFoundError: No module named 'cocoindex'"},{"fix":"Ensure the transform() method always calls self.create_index(), even when there is no data.","cause":"The transform() method did not call self.create_index() or returned before creating the index.","error":"TypeError: 'NoneType' object is not iterable when calling cocoindex.build()"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}