{"id":4917,"library":"csvw","title":"CSVW Python Library","description":"The `csvw` Python library (version 3.7.0) provides an API for reading and writing relational, tabular data following the W3C CSV on the Web (CSVW) specification. It can parse CSVW-described data, convert it to JSON, and validate metadata. The project is actively developed, with regular releases.","status":"active","version":"3.7.0","language":"en","source_language":"en","source_url":"https://github.com/cldf/csvw","tags":["csv","data","metadata","linked-data","csvw","w3c"],"install":[{"cmd":"pip install csvw","lang":"bash","label":"Install stable release"}],"dependencies":[{"reason":"Requires Python 3.8 or higher.","package":"Python","optional":false}],"imports":[{"note":"The primary class for interacting with CSVW data and metadata.","symbol":"CSVW","correct":"from csvw import CSVW"}],"quickstart":{"code":"import json\nfrom csvw import CSVW\n\n# Example using a remote TSV file from the csvw test fixtures.\n# Note: In a real application, you might use a local file path.\nurl = 'https://raw.githubusercontent.com/cldf/csvw/master/tests/fixtures/test.tsv'\n\ntry:\n    data = CSVW(url)\n    # Convert the CSVW data to JSON\n    json_output = data.to_json()\n    print(json.dumps(json_output, indent=2))\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\n    print(\"Please ensure the URL is correct and accessible.\")","lang":"python","description":"This quickstart instantiates a `CSVW` object from a URL pointing to a TSV file (or a CSVW metadata file) and converts the described data to a JSON representation. The `to_json()` method serializes the tabular data according to the CSVW specification."},"warnings":[{"fix":"Install with `pip install csvw` and import with `from csvw import CSVW`. Do not confuse it with `csvwlib`, which uses `from csvwlib import CSVWConverter`.","message":"Several Python libraries have 'csvw' in their name, notably `csvw` (this library) and `csvwlib`. Their APIs and functionality differ, so installing and importing `csvwlib` instead of `csvw` leads to incompatible API calls and unexpected behavior.","severity":"breaking","affected_versions":"All versions"},{"fix":"If strict positional matching is required, explicitly specify `\"header\": false` and `\"skipRows\": 1` in the table's dialect description within your CSVW metadata.","message":"The `csvw` library does not implement the *full* CSVW specification. In particular, when reading CSV files with a header, columns are matched to column descriptions by comparing the header text with each description's `name` or `titles` attribute, not strictly by position as a strict reading of the spec might expect. This allows more flexibility but deviates from a strict interpretation.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Be aware of these `csv` module limitations, particularly when setting `commentPrefix`, `escapechar`, `quoteChar`, and `doubleQuote` in your dialect. Test thoroughly with data that contains special characters and quoting.","message":"Because the library relies on Python's standard `csv` module, some behavior related to `escapechar` and `commentPrefix` can be inconsistent or unexpected. For instance, if `commentPrefix` is specified in a `Dialect` instance, rows starting with it are skipped even if the value was quoted. Also, cell content containing `escapechar` may not round-trip as expected when `doubleQuote==False` and minimal quoting is used.","severity":"gotcha","affected_versions":"All versions"},{"fix":"When working with `anyURI` values, be aware that the string representation may change due to normalization. If exact string preservation is critical for non-normalized URIs, consider storing them with the `string` datatype instead, or normalize them explicitly before passing them to `anyURI`.","message":"The `anyURI` datatype in `csvw.datatypes` normalizes URLs according to RFC 3986 when serializing to a string. As a result, round-tripping (serializing and then deserializing) a URI is not guaranteed to yield an identical string if the original URI was not in normalized form.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}