{"id":4748,"library":"rouge","title":"Python ROUGE Score Implementation","description":"The 'rouge' library provides a full, native Python implementation of the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metric, used for evaluating automatic text summarization and machine translation. Unlike some other ROUGE packages, it is not a wrapper around the original Perl script. The current stable version is 1.0.1, with releases occurring periodically to introduce features and fixes.","status":"active","version":"1.0.1","language":"en","source_language":"en","source_url":"https://github.com/pltrdy/rouge","tags":["NLP","evaluation","summarization","ROUGE","text processing"],"install":[{"cmd":"pip install rouge","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Used for Python 2/3 compatibility, though Python 3 is the primary target for recent versions.","package":"six","optional":false}],"imports":[{"symbol":"Rouge","correct":"from rouge import Rouge"}],"quickstart":{"code":"from rouge import Rouge\n\nhypothesis = \"the cat sat on the mat\"\nreference = \"the cat is on the mat\"\n\nrouge = Rouge()\nscores = rouge.get_scores(hypothesis, reference)\nprint(scores)","lang":"python","description":"Calculate ROUGE-1, ROUGE-2, and ROUGE-L scores for a single hypothesis-reference pair. The `get_scores` method returns a list with one dictionary per pair, keyed by 'rouge-1', 'rouge-2', and 'rouge-l'; each value holds 'f' (F1-score), 'p' (precision), and 'r' (recall)."},"warnings":[{"fix":"Be aware of potential minor score differences. If strict adherence to ROUGE-1.5.5 results is required, consider using packages that wrap the Perl script or `rouge-score` by Google, which aims for Perl script replication.","message":"This 'rouge' library (pltrdy/rouge) is a native Python implementation and explicitly states that its results may be slightly different from the 'official' ROUGE-1.5.5 Perl script. Small discrepancies against the Perl script, or against a different Python ROUGE implementation (like 'rouge-score' by Google), are therefore expected behavior for this package.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure you are importing `from rouge import Rouge` for this library. If you intend to use Google's implementation, install `rouge-score`, import `from rouge_score import rouge_scorer`, and construct `rouge_scorer.RougeScorer`.","message":"There are multiple Python libraries for ROUGE, notably `rouge` (this package, from pltrdy) and `rouge-score` (from Google Research). They have different import paths and API interfaces. Confusing them is a common footgun.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Pre-tokenize your input strings into space-separated tokens before passing them to the `Rouge().get_scores()` method. For example, `hypothesis = 'word1 word2 word3'`.","message":"The library is language-agnostic and expects tokenized input. For optimal results, especially with non-English texts or specific NLP tasks, users should pre-process and tokenize the hypothesis and reference texts before passing them to the `get_scores` method.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade to version 0.3 or higher to ensure correct ROUGE-L calculations, especially for multi-sentence inputs.","message":"Prior to version 0.3, there was an error in ROUGE-L calculation when handling sequences with multiple sentences. This was fixed in version 0.3.","severity":"deprecated","affected_versions":"<0.3"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}