Handle non-ASCII encoding #20

l0b0 · 2013-06-12T17:57:02Z

UTF-8 characters should not be detected as invalid text.

citrin · 2016-09-11T15:00:24Z

Quoted-printable encoding is obsolete and not allowed in vCard 3.0 (RFC 2426).
QP can be used only in vCard 2.1.

citrin · 2016-09-12T00:19:53Z

But non-ASCII charset encodings should indeed be supported.
RFC 2426 explicitly allows 8-bit strings. Because the CHARSET parameter was removed in vCard 3.0 (compared to 2.1) and the file contains no information about the text charset, you can assume that the file uses UTF-8.

But the vcard tool can't parse UTF-8 files. The error is:

... skipped ...
    property_.values = get_vcard_property_values(values_string)
  File "/home/citrin/.local/lib/python2.7/site-packages/vcard/vcard_validator.py", line 295, in get_vcard_property_values
    values.append(get_vcard_property_sub_values(sub))
  File "/home/citrin/.local/lib/python2.7/site-packages/vcard/vcard_validator.py", line 333, in get_vcard_property_sub_values
    raise VCardValueError('{0}: {1}'.format(NOTE_INVALID_SUB_VALUE, sub_value), {})
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)
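
The last line of the traceback is the message Python produces when non-ASCII text is pushed through the ASCII codec. A minimal sketch that reproduces the same message in Python 3, assuming a hypothetical six-character Cyrillic value (the real value from the test file is not shown above, and `sub_value` here is only an illustration):

```python
# Hypothetical sub-value; any six non-ASCII characters behave the same way.
sub_value = "Привет"

message = None
try:
    # Python 2 performs this encoding implicitly when a byte-string template
    # is formatted with a unicode argument; here it is done explicitly.
    sub_value.encode("ascii")
except UnicodeEncodeError as error:
    message = str(error)

print(message)
```

If this reading is right, the UnicodeEncodeError is raised while the VCardValueError message is being built, so it hides the validation error the tool was about to report.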

It seems that vcard first converts the input text to Python's internal form (from raw octets to Unicode characters) and then applies a regex based on the grammar from RFC 2426 (NON-ASCII = %x80-FF). But that is wrong: %x80-FF should be applied to octets, not to Unicode characters, because Unicode code points can be greater than 0xFF.
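
The difference between the two readings of %x80-FF can be sketched as follows. This is only an illustration, not the validator's actual code: the pattern names and the sample value are assumptions, and the real grammar covers more than this one rule.

```python
import re

# NON-ASCII = %x80-FF read as a range of octets (bytes pattern).
NON_ASCII_OCTETS = re.compile(rb"[\x80-\xff]+")
# The same range read as Unicode code points U+0080..U+00FF (str pattern).
NON_ASCII_CHARS = re.compile("[\u0080-\u00ff]+")

# Hypothetical sample value, not taken from the test file.
raw = "Привет".encode("utf-8")  # UTF-8 octets, every one in 0x80-0xFF
text = raw.decode("utf-8")      # code points in the Cyrillic block, above U+00FF

octets_match = NON_ASCII_OCTETS.fullmatch(raw) is not None
chars_match = NON_ASCII_CHARS.fullmatch(text) is not None

print(octets_match, chars_match)
```

Matching on the raw octets accepts the value, while matching on the decoded characters rejects it, which is consistent with the failure described above.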

A test vCard in UTF-8 can be found here: http://www.citrin.ru/tmp/minimal.vcf

l0b0 mentioned this issue Jun 12, 2013

Refactor/sets for membership #18

Closed
