This repository has been archived by the owner on Jul 3, 2019. It is now read-only.
But non-ASCII charset encodings indeed should be supported.
RFC 2426 explicitly allows 8-bit strings. Because the CHARSET parameter was removed in vCard 3.0 (compared to 2.1) and the file contains no information about the text charset, you can assume that the file uses UTF-8.
But the vcard tool can't parse UTF-8 files. The error is:
... skipped ...
property_.values = get_vcard_property_values(values_string)
File "/home/citrin/.local/lib/python2.7/site-packages/vcard/vcard_validator.py", line 295, in get_vcard_property_values
values.append(get_vcard_property_sub_values(sub))
File "/home/citrin/.local/lib/python2.7/site-packages/vcard/vcard_validator.py", line 333, in get_vcard_property_sub_values
raise VCardValueError('{0}: {1}'.format(NOTE_INVALID_SUB_VALUE, sub_value), {})
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)
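The last line of the traceback is consistent with Python 2 implicitly encoding a Unicode value to ASCII while formatting the error message. A minimal Python 3 sketch of the same failure, with the encode step made explicit; the sample value and the helper name `encode_ascii_error` are hypothetical, not taken from vcard:

```python
# Hypothetical sub-value: six non-ASCII characters, chosen to match "position 0-5".
sub_value = "Привет"


def encode_ascii_error(text):
    """Return the UnicodeEncodeError message, or None if text is ASCII-only."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


message = encode_ascii_error(sub_value)
print(message)
```

So the exception seems to come from reporting the invalid sub-value, not from the validation check itself.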
It seems that vcard first converts the input text to Python's internal form (from raw octets to Unicode characters) and then applies a regex based on the grammar from RFC 2426 (NON-ASCII = %x80-FF). But this is wrong: %x80-FF should be applied to octets, not Unicode characters, since Unicode code points can of course be greater than 0xFF.
UTF-8 characters should not be detected as invalid text.
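A small sketch of the difference, assuming Python 3 and a hypothetical sample value. The pattern names are illustrative and are not vcard's own. The same %x80-FF range rejects the decoded text but accepts its UTF-8 octets:

```python
import re

# NON-ASCII = %x80-FF from RFC 2426, applied to two different representations.
NON_ASCII_CHARS = re.compile(r"[\x80-\xff]+")    # Unicode code points U+0080..U+00FF
NON_ASCII_OCTETS = re.compile(rb"[\x80-\xff]+")  # raw octets 0x80..0xFF

text = "Привет"                # hypothetical value; code points are above U+00FF
octets = text.encode("utf-8")  # every octet of the encoded form is in 0x80..0xFF

matches_chars = NON_ASCII_CHARS.fullmatch(text) is not None
matches_octets = NON_ASCII_OCTETS.fullmatch(octets) is not None

print(matches_chars, matches_octets)
```

Matching on octets, or widening the character class when matching decoded text, would both let such values through. Which one fits vcard better depends on where it decodes its input.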