{"library":"openevals","title":"OpenEvals","description":"OpenEvals is an open-source Python library providing ready-made evaluators for Large Language Model (LLM) applications. It offers a structured approach to LLM evaluation, similar to traditional software testing, with built-in functionalities like LLM-as-judge evaluators and prebuilt prompts for common evaluation scenarios such as correctness, conciseness, and hallucination detection. Developed by LangChain, it aims to streamline the process of bringing LLM applications to production by making evaluation more accessible and transparent. The current version is 0.2.0, with ongoing development and updates.","language":"python","status":"active","last_verified":"Thu May 14","install":{"commands":["pip install openevals"],"cli":null},"imports":["from openevals.llm import create_llm_as_judge","from openevals.prompts import CORRECTNESS_PROMPT","from openevals.prompts import CONCISENESS_PROMPT","from openevals.prompts import HALLUCINATION_PROMPT"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import os\nfrom openevals.llm import create_llm_as_judge\nfrom openevals.prompts import CORRECTNESS_PROMPT\n\n# Ensure your OpenAI API key is set as an environment variable\n# For example: os.environ[\"OPENAI_API_KEY\"] = \"sk-...\"\n# For quickstart, we use .get to avoid immediate error if not set, but it's required for actual use.\nif not os.environ.get(\"OPENAI_API_KEY\"): print(\"WARNING: OPENAI_API_KEY not set. 
Quickstart will fail without it.\")\n\n# Create a correctness evaluator using an LLM-as-judge\ncorrectness_evaluator = create_llm_as_judge(\n    prompt=CORRECTNESS_PROMPT,\n    model=\"openai:o3-mini\",  # provider:model string selecting OpenAI's o3-mini model\n)\n\n# Define inputs, outputs, and reference outputs for evaluation\ninputs = \"How much has the price of doodads changed in the past year?\"\noutputs = \"Doodads have increased in price by 10% in the past year.\"\nreference_outputs = \"The price of doodads has decreased by 50% in the past year.\"\n\n# Run the evaluator\neval_result = correctness_evaluator(\n    inputs=inputs,\n    outputs=outputs,\n    reference_outputs=reference_outputs\n)\n\nprint(eval_result)\n# Example output (the comment text varies by model; the structure is consistent):\n# { 'key': 'score', 'score': False, 'comment': 'The provided answer stated that doodads increased in price by 10%, which conflicts with the reference output...' }","lang":"python","description":"This quickstart demonstrates how to set up and run a basic LLM-as-judge correctness evaluation. It uses a prebuilt prompt and an OpenAI model. Ensure your `OPENAI_API_KEY` environment variable is set for the example to run successfully. 
The evaluator returns a dictionary containing a score and a comment based on the LLM's judgment.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-14","installed_version":"0.1.0","pypi_latest":"0.2.0","is_stale":true,"summary":{"python_range":"3.9–3.13","success_rate":100,"avg_install_s":11.5,"avg_import_s":2.49,"wheel_type":"wheel"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.19,"mem_mb":28.9,"disk_size":"107.0M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":2.36,"mem_mb":28.6,"disk_size":"104.3M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":12.2,"import_time_s":1.72,"mem_mb":28.9,"disk_size":"115M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":1.96,"mem_mb":28.6,"disk_size":"112M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.83,"mem_mb":31.4,"disk_size":"115.9M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.19,"mem_mb":31.1,"disk_size":"112.9M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":11.4,"import_time_s":2.59,"mem_mb":31.4,"disk_size":"124M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":2.58,"mem_mb":31.1,"disk_size":"121M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.77,"mem_mb":30.8,"disk_size":"106.1M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.12,"mem_mb":30.5,"disk_size":"103.2M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":9,"import_time_s":2.77,"mem_mb":30.8,"disk_size":"114M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.06,"mem_mb":30.5,"disk_size":"111M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.43,"mem_mb":31.3,"disk_size":"105.9M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":2.9,"mem_mb":31,"disk_size":"102.8M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":9.2,"import_time_s":2.47,"mem_mb":31.3,"disk_size":"114M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":2.85,"mem_mb":31,"disk_size":"111M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.07,"mem_mb":27.7,"disk_size":"135.0M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":2.15,"mem_mb":27.6,"disk_size":"133.0M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"openevals","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":15.9,"import_time_s":1.91,"mem_mb":27.7,"disk_size":"142M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"openevals","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":1.88,"mem_mb":27.6,"disk_size":"140M"}]}}