Marqo

3.18.0 · active · verified Fri Apr 17

Marqo is an open-source, RAG-ready vector database and tensor search engine built on OpenSearch. It supports multimodal search across text, images, and other data types by generating and indexing embeddings, which simplifies building search and Retrieval-Augmented Generation (RAG) applications. This page covers version 3.18.0; the project is under active development.

Quickstart

This quickstart demonstrates how to initialize a Marqo client, create an index, add documents, and perform a basic search. It includes environment variable support for connecting to a Marqo instance and handles index creation idempotently.

import marqo
import os

# For local Marqo instances (e.g., via Docker), the URL is often http://localhost:8882
# For cloud instances, set MARQO_URL and optionally MARQO_API_KEY
marqo_url = os.environ.get('MARQO_URL', 'http://localhost:8882')
marqo_api_key = os.environ.get('MARQO_API_KEY', None)

mq = marqo.Client(url=marqo_url, api_key=marqo_api_key)

index_name = "my-first-marqo-index"

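# Create the index only if it does not already exist
try:
    mq.get_index(index_name=index_name)
    print(f"Index '{index_name}' already exists.")
# MarqoError is the base class of the client's exceptions, so this also
# covers versions that report a missing index as MarqoWebError rather
# than MarqoApiError.
except marqo.errors.MarqoError as e:
    if "index_not_found" in str(e).lower():
        print(f"Creating index '{index_name}'...")
        mq.create_index(index_name=index_name)
    else:
        # Not a missing-index error: re-raise with the original traceback
        raise
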
# Add documents to the index
docs = [
    {
        "_id": "doc1",
        "title": "The Art of Computer Programming",
        "description": "A series of comprehensive monographs by Donald Knuth covering many topics in computer science."
    },
    {
        "_id": "doc2",
        "title": "Structure and Interpretation of Computer Programs",
        "description": "An influential computer science textbook by Abelson and Sussman, known as SICP."
    }
]

response_add = mq.add_documents(index_name=index_name, documents=docs)
print("Added documents:", response_add)

# Perform a search
search_query = "computer science textbooks"
response_search = mq.search(index_name=index_name, q=search_query)
print(f"\nSearch results for '{search_query}':")
for hit in response_search['hits']:
    print(f"  Title: {hit['title']}, Score: {hit['_score']:.2f}")

# Clean up (optional): delete the index
# mq.delete_index(index_name=index_name)
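
The idempotent index creation and the result formatting used above can be factored into two small helpers. The following is a minimal sketch, not part of the Marqo client: `ensure_index`, `format_hits`, and `FakeClient` are hypothetical names introduced here for illustration. The sketch assumes only that the client exposes `get_index` and `create_index` as in the quickstart, and that a missing index produces an error whose message contains `index_not_found`. `FakeClient` is an in-memory stand-in so the logic can be exercised without a running Marqo instance.

```python
from typing import Any, Dict, List


def ensure_index(client: Any, index_name: str,
                 not_found_marker: str = "index_not_found") -> bool:
    """Create `index_name` if the client reports it as missing.

    Returns True if the index was created, False if it already existed.
    Any error that does not look like a missing-index error is re-raised.
    """
    try:
        client.get_index(index_name=index_name)
        return False
    except Exception as exc:  # the quickstart catches a marqo.errors exception here
        if not_found_marker not in str(exc).lower():
            raise
        client.create_index(index_name=index_name)
        return True


def format_hits(response: Dict[str, Any], field: str = "title") -> List[str]:
    """Render each search hit as '<field value>: <score to 2 decimals>'."""
    return [
        f"{hit.get(field, '<missing>')}: {hit['_score']:.2f}"
        for hit in response.get("hits", [])
    ]


class FakeClient:
    """Hypothetical in-memory stand-in for a Marqo client (testing only)."""

    def __init__(self) -> None:
        self.indexes = set()

    def get_index(self, index_name: str) -> str:
        if index_name not in self.indexes:
            # Mimics the assumed shape of a missing-index error message
            raise RuntimeError(f"index_not_found: {index_name}")
        return index_name

    def create_index(self, index_name: str) -> None:
        self.indexes.add(index_name)


if __name__ == "__main__":
    fake = FakeClient()
    print("created:", ensure_index(fake, "my-first-marqo-index"))
    print("created again:", ensure_index(fake, "my-first-marqo-index"))
    print(format_hits({"hits": [{"title": "SICP", "_score": 0.8123}]}))
```

With a real client, `ensure_index(mq, index_name)` would stand in for the try/except block in the quickstart. Matching on the error message is fragile, so checking a structured error code is preferable wherever the client version in use exposes one.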
