{"id":8329,"library":"morphys","title":"Morphys (Python library)","description":"Morphys is a Python library (version 1.0) for smart and common conversions between Unicode (Python `str`) and bytes types. It aims to simplify character-encoding handling by offering a consistent interface for these text processing operations. It is available on PyPI, but no ongoing development or release cadence is publicly documented beyond its initial release in 2018.","status":"active","version":"1.0","language":"en","source_language":"en","source_url":"https://github.com/mkalinski/morphys","tags":["unicode","bytes","encoding","text processing","conversion"],"install":[{"cmd":"pip install morphys","lang":"bash","label":"Install latest version"}],"dependencies":[],"imports":[{"note":"Assumed conversion function from bytes to unicode string.","symbol":"to_unicode","correct":"from morphys import to_unicode"},{"note":"Assumed conversion function from unicode string to bytes.","symbol":"to_bytes","correct":"from morphys import to_bytes"}],"quickstart":{"code":"from morphys import to_unicode, to_bytes\n\n# Example 1: Convert a Unicode string to bytes\nunicode_string = \"Hello, world! 👋\"\nencoded_bytes = to_bytes(unicode_string, encoding=\"utf-8\")\nprint(f\"Unicode string: {unicode_string}\")\nprint(f\"Encoded bytes: {encoded_bytes}\")\n\n# Example 2: Convert bytes back to a Unicode string\ndecoded_string = to_unicode(encoded_bytes, encoding=\"utf-8\")\nprint(f\"Decoded string: {decoded_string}\")\n\n# Example 3: Decode bytes that are not valid UTF-8\n# Assumption: to_unicode accepts `encoding` and `errors` like bytes.decode();\n# if morphys applies smarter handling, it may fall back to another encoding instead\n# b'caf\\xe9' is Latin-1 encoded, so strict UTF-8 decoding is expected to fail\ntry:\n    latin1_bytes = b'caf\\xe9'\n    print(f\"\\nAttempting to decode latin1 bytes: {latin1_bytes}\")\n    decoded_from_latin1 = to_unicode(latin1_bytes, encoding=\"utf-8\", errors=\"strict\")\n    print(f\"Decoded with strict UTF-8: {decoded_from_latin1}\")\nexcept UnicodeDecodeError as e:\n    print(f\"Error decoding with strict UTF-8 (expected): {e}\")\n    # Using 'replace' error handling for robustness\n    decoded_from_latin1_safe = to_unicode(latin1_bytes, encoding=\"utf-8\", errors=\"replace\")\n    print(f\"Decoded with UTF-8 (replace errors): {decoded_from_latin1_safe}\")","lang":"python","description":"This quickstart demonstrates the core functionality of `morphys`: converting between Python's Unicode `str` type and `bytes`. It encodes a string to bytes and then decodes the bytes back to a string, using the common `utf-8` encoding. It also includes an example of handling bytes that do not conform to the expected encoding, illustrating common error handling strategies (assuming `morphys` wraps standard Python encoding/decoding with similar `errors` parameters)."},"warnings":[{"fix":"For mission-critical applications or advanced encoding needs, consider Python's built-in `str.encode()` and `bytes.decode()` methods with explicit encoding and error handling, or more actively maintained libraries like `chardet` for robust encoding detection.","message":"The `morphys` library (version 1.0) appears to have been minimally maintained since its 2018 release. While functional, it might not receive updates for newer Python versions or complex encoding scenarios.","severity":"gotcha","affected_versions":"1.0"},{"fix":"Ensure that `str` objects are passed to functions expecting `str` and `bytes` objects to functions expecting `bytes`. Explicitly specify `encoding` (e.g., `'utf-8'`) and `errors` (e.g., `'strict'`, `'ignore'`, `'replace'`) arguments during conversion to prevent unexpected behavior.","message":"Confusing `str` and `bytes` in Python is a common source of `TypeError` or `UnicodeDecodeError`/`UnicodeEncodeError`. Always be explicit about the type you're working with and the expected encoding.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Explicitly specify the encoding, typically `utf-8`, when converting a string to bytes or writing to a file. For example: `to_bytes(my_string, encoding='utf-8')` or `open('file.txt', 'w', encoding='utf-8')`.","cause":"Attempting to encode a Unicode string containing characters not representable in the target (often default system) encoding, or writing non-ASCII characters to a file opened without specifying a UTF-8 encoding.","error":"UnicodeEncodeError: 'charmap' codec can't encode character..."},{"fix":"Determine the correct encoding of the incoming bytes and use it for decoding. If the encoding is unknown or mixed, consider using `errors='replace'` or `errors='ignore'` for lenient decoding, or a library like `chardet` for detection. Example: `to_unicode(my_bytes, encoding='latin-1')` or `to_unicode(my_bytes, encoding='utf-8', errors='replace')`.","cause":"Attempting to decode a byte sequence using an incorrect encoding. For instance, trying to decode Latin-1 encoded bytes with a UTF-8 decoder.","error":"UnicodeDecodeError: 'utf-8' codec can't decode byte 0x... in position ...: invalid start byte"},{"fix":"Convert the `str` to `bytes` using `to_bytes()` before passing it, or convert `bytes` to `str` using `to_unicode()` if the function expects a string. Example: `func_expecting_bytes(to_bytes(my_string))` or `func_expecting_string(to_unicode(my_bytes))`.","cause":"Passing a Python `str` object to a function or operation that explicitly expects a `bytes` object (or vice-versa), without performing the necessary conversion.","error":"TypeError: a bytes-like object is required, not 'str'"}]}