Elasticsearch Python Client (v8)
The official Python client for Elasticsearch 8.x. It is a low-level client for the Elasticsearch REST API and supports both synchronous and asynchronous use. The current version is 8.19.3; major versions follow the Elasticsearch release cycle.
Warnings
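- gotcha The PyPI package `elasticsearch8` installs a module with the same name, so the import is `from elasticsearch8 import Elasticsearch`. `from elasticsearch import Elasticsearch` works only if the unsuffixed `elasticsearch` package is also installed; otherwise it raises a `ModuleNotFoundError`.
- breaking Migrating from the 7.x client to `elasticsearch8` (v8.x) involves several breaking changes and deprecations. Key changes include: API methods accept keyword arguments only; `hosts` entries must be full URLs with scheme and port (for example `https://localhost:9200`); the `timeout` client option is deprecated in favor of `request_timeout`; `http_auth` is deprecated in favor of `basic_auth` and `bearer_auth`; per-request settings such as `ignore` move to `client.options(ignore_status=...)`. Sniffing is still available but remains off by default, and its options were renamed (for example, `sniff_on_connection_fail` became `sniff_on_node_failure`). `verify_certs` defaults to `True`.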
- gotcha For asynchronous operations, you must install the `[async]` extra (e.g., `pip install elasticsearch8[async]`) and import `AsyncElasticsearch` instead of `Elasticsearch`. The default client is synchronous.
- gotcha Certificate verification is enabled by default (`verify_certs=True`). Connecting to an HTTPS endpoint that uses a self-signed certificate, or one signed by a CA your system does not trust, fails with a TLS error unless you pass the CA bundle via `ca_certs` or pin the certificate with `ssl_assert_fingerprint`.
- gotcha When connecting to Elastic Cloud, prefer `cloud_id` and `api_key` for authentication. For self-managed instances, `hosts` and `basic_auth` (username/password) are common. Missing or incorrect credentials produce an authentication error (HTTP 401), not a connection failure.
- deprecated Mapping types were removed in Elasticsearch 8, so documents no longer have a type. The v8 client's document APIs do not take a `doc_type` parameter; remove it from calls when migrating.
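The authentication and certificate warnings above can be sketched as a small helper that builds the constructor arguments. This is a minimal sketch, not the library's own configuration mechanism: the function names `connection_kwargs` and `make_client` and the `ELASTICSEARCH_CA_CERTS` variable are hypothetical, and the client does not read any of these environment variables by itself.

```python
import os


def connection_kwargs(env):
    """Build keyword arguments for Elasticsearch(...) from a mapping of settings.

    The setting names are illustrative assumptions; ELASTICSEARCH_CA_CERTS in
    particular is a hypothetical name for the path to a CA bundle.
    """
    cloud_id = env.get("ELASTIC_CLOUD_ID")
    api_key = env.get("ELASTIC_API_KEY")
    if cloud_id and api_key:
        # Elastic Cloud: cloud_id identifies the deployment, api_key authenticates.
        return {"cloud_id": cloud_id, "api_key": api_key}

    kwargs = {
        # The v8 client expects a full URL including scheme and port.
        "hosts": [env.get("ELASTICSEARCH_HOST", "https://localhost:9200")],
        "basic_auth": (
            env.get("ELASTICSEARCH_USERNAME", "elastic"),
            env.get("ELASTICSEARCH_PASSWORD", ""),
        ),
    }
    ca_certs = env.get("ELASTICSEARCH_CA_CERTS")
    if ca_certs:
        # Trust a specific CA bundle instead of turning verify_certs off.
        kwargs["ca_certs"] = ca_certs
    return kwargs


def make_client(env=os.environ):
    # Imported lazily so connection_kwargs() is usable without the package installed.
    from elasticsearch8 import Elasticsearch

    return Elasticsearch(**connection_kwargs(env))
```

Keeping the argument construction separate from the client constructor makes the Cloud versus self-managed choice easy to inspect before any network call is made.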
Install
- pip install elasticsearch8
- pip install "elasticsearch8[async]"
Imports
- Elasticsearch
from elasticsearch8 import Elasticsearch
- AsyncElasticsearch
from elasticsearch8 import AsyncElasticsearch
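For the asynchronous client, a minimal sketch might look like the following. It assumes the `[async]` extra is installed and a node is reachable at `http://localhost:9200`; the helper name `cluster_version` is hypothetical, and `main()` is defined but not run here.

```python
import asyncio


async def cluster_version(client):
    """Return the version string reported by the cluster's info endpoint."""
    info = await client.info()
    return info["version"]["number"]


async def main():
    # Imported lazily; requires the elasticsearch8 package with the async extra.
    from elasticsearch8 import AsyncElasticsearch

    # Assumption: an unsecured local node. Adjust hosts and auth for your setup.
    client = AsyncElasticsearch("http://localhost:9200")
    try:
        print(await cluster_version(client))
    finally:
        # The async client holds open connections, so close it explicitly.
        await client.close()
```

To run it against a live cluster, call `asyncio.run(main())`.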
Quickstart
import os
from elasticsearch8 import Elasticsearch
# --- Configuration for connecting to Elasticsearch ---
# For Elastic Cloud (recommended)
CLOUD_ID = os.environ.get("ELASTIC_CLOUD_ID", "")
API_KEY = os.environ.get("ELASTIC_API_KEY", "") # Base64 encoded API Key ID and Key
# For self-managed Elasticsearch
ES_HOST = os.environ.get("ELASTICSEARCH_HOST", "http://localhost:9200")
ES_USERNAME = os.environ.get("ELASTICSEARCH_USERNAME", "elastic")
ES_PASSWORD = os.environ.get("ELASTICSEARCH_PASSWORD", "changeme")
client = None
if CLOUD_ID and API_KEY:
    print("Connecting to Elastic Cloud...")
    client = Elasticsearch(
        cloud_id=CLOUD_ID,
        api_key=API_KEY,
    )
elif ES_HOST:
    print(f"Connecting to self-managed Elasticsearch at {ES_HOST}...")
    client = Elasticsearch(
        hosts=[ES_HOST],
        basic_auth=(ES_USERNAME, ES_PASSWORD),
        verify_certs=False  # CAUTION: Only for local dev/testing without proper CA config
    )
else:
    raise ValueError(
        "Elasticsearch connection details not provided. Set ELASTIC_CLOUD_ID/ELASTIC_API_KEY "
        "or ELASTICSEARCH_HOST/ELASTICSEARCH_USERNAME/ELASTICSEARCH_PASSWORD environment variables."
    )
# --- Verify Connection ---
if client.ping():
    print("Successfully connected to Elasticsearch!")
else:
    print("Could not connect to Elasticsearch. Please check your configuration.")
    raise SystemExit(1)
# --- Index a Document ---
index_name = "my-documents"
doc_id = "1"
document = {
    "author": "John Doe",
    "text": "Elasticsearch is a powerful open-source search and analytics engine.",
    "timestamp": "2024-05-15T12:00:00Z"
}
print(f"Indexing document with ID '{doc_id}' into index '{index_name}'...")
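# refresh="wait_for" makes the document visible to the search below before the call returns.
response = client.index(index=index_name, id=doc_id, document=document, refresh="wait_for")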
print(f"Indexed document: Result - {response['result']}")
# --- Search for Documents ---
print(f"Searching for documents in index '{index_name}'...")
search_query = {"match": {"text": "search engine"}}
response = client.search(index=index_name, query=search_query)
print(f"Found {response['hits']['total']['value']} hits:")
for hit in response['hits']['hits']:
    print(f" ID: {hit['_id']}, Source: {hit['_source']}")
# --- Clean Up (Optional) ---
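# In v8, per-request settings such as ignored status codes go through client.options().
# client.options(ignore_status=[400, 404]).indices.delete(index=index_name)
# print(f"Index '{index_name}' deleted (if it existed).")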
client.close()
print("Client connection closed.")
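The optional clean-up step can be wrapped in a helper that uses `client.options()` for per-request settings. This is a sketch under assumptions: the name `delete_index_if_exists` and the 10-second default are illustrative, and the helper relies on the response exposing its parsed JSON as `.body`.

```python
def delete_index_if_exists(client, index_name, timeout_s=10):
    """Delete an index and report whether the cluster acknowledged it.

    A 404 (index does not exist) is treated as a normal outcome and yields False.
    """
    # options() returns a client copy carrying these per-request settings.
    scoped = client.options(ignore_status=[404], request_timeout=timeout_s)
    response = scoped.indices.delete(index=index_name)
    # With ignore_status, a 404 is returned as an error body instead of raising,
    # so the "acknowledged" key is simply absent in that case.
    return bool(response.body.get("acknowledged", False))
```

Called as `delete_index_if_exists(client, "my-documents")` before `client.close()`, it returns `True` when the index was removed and `False` when there was nothing to delete.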