[Vobject] saving unicode vcard
Jeffrey Harris
jeffrey at osafoundation.org
Mon Jun 2 15:53:26 CDT 2008
Hi Anil,
> Also, maybe allowQP should be isQP?
> I guess I need to read in the file and search for quoted-printable and
> if found call readComponents with allowQP...
The problem is fairly easy to summarize. vCard 2.1 is a horrific standard.
vCard 2.1 followed a model where different lines could have different
charset encodings (charset=utf-8, or charset=latin-1, or whatever you
want) and individual lines could be wrapped either with
quoted-printable soft line breaks (a trailing "="), or by starting the
continuation line with whitespace.
So for vCard 2.1, you can't just try to decode the entire stream into
unicode with one encoding, and because lines may be quoted-printable,
it's tough to parse with a regular expression.
In the instance (rare for calendar users like me) that you're working
with a vCard 2.1 file, you need to parse with a state machine, which is
much slower than regular expressions, and you need to decode each line
into unicode separately.
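
To make the state machine idea concrete, here is a minimal sketch of
just the unfolding step, working on raw bytes so that nothing is decoded
before the per-line charset is known. This is not vobject's actual
parser: the name unfold_lines is made up for illustration, and the
sketch assumes that a whitespace fold drops exactly one leading
whitespace character (formats differ on whether that character is kept).

```python
def unfold_lines(raw: bytes) -> list[bytes]:
    """Join physical lines into logical lines (hypothetical helper).

    Handles two wrapping styles:
      * quoted-printable soft breaks: the physical line ends with "="
        and the property is declared QUOTED-PRINTABLE
      * whitespace folding: the continuation line starts with space/tab
    """
    logical = []
    current = None        # logical line being assembled, or None
    is_qp = False         # does the current property declare QP?
    soft_break = False    # did the previous physical line end with "="?

    for line in raw.splitlines():
        if soft_break:
            # Inside a QP value: the next physical line is content as-is.
            current += line
        elif current is not None and line[:1] in (b" ", b"\t"):
            # Whitespace fold. Assumption: drop one leading character.
            current += line[1:]
        else:
            # Start of a new logical line; flush the previous one.
            if current is not None:
                logical.append(current)
            current = line if line else None   # blank lines are skipped
            head = line.partition(b":")[0].upper()
            is_qp = b"QUOTED-PRINTABLE" in head

        soft_break = is_qp and line.endswith(b"=")
        if soft_break:
            current = current[:-1]   # remove the trailing "=" marker

    if current is not None:
        logical.append(current)
    return logical
```

The point of tracking is_qp per property is that a trailing "=" only
means "continued on the next line" when the property is quoted-printable,
which is the kind of context a single regular expression handles poorly.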
All in all, vCard 2.1 support is alpha quality in vobject. I'd love to
have a more elegant solution for seamlessly parsing both well-behaved
iCalendar and poorly-behaved vCard 2.1, but I haven't come up with an
approach I'm psyched about.
Probably, when allowQP is set, we should parse each line, and if there's
no explicit charset or quoted-printable declaration, decode parameters
and values as utf-8.
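
As a rough illustration of that idea (not what vobject does today), the
per-line decoding could look something like the sketch below. It uses
only the standard library's quopri module. The function name
decode_property_line is hypothetical, and the sketch assumes it is given
one already-unfolded line as bytes, and that the part before the first
":" can itself be decoded with the default charset.

```python
import quopri


def decode_property_line(raw: bytes, default_charset: str = "utf-8"):
    """Return (name_and_params, value) as unicode strings.

    Hypothetical helper: honours an explicit CHARSET parameter and a
    QUOTED-PRINTABLE declaration, and otherwise falls back to utf-8.
    """
    head, sep, value = raw.partition(b":")
    if not sep:
        raise ValueError("line has no ':' separator")

    # Assumption: the name and parameters are in the default charset.
    head_text = head.decode(default_charset)

    charset = default_charset
    is_qp = False
    for param in head_text.split(";")[1:]:
        key, _, val = param.partition("=")
        key = key.strip().upper()
        val = val.strip()
        if key == "CHARSET" and val:
            charset = val
        elif key == "ENCODING" and val.upper() == "QUOTED-PRINTABLE":
            is_qp = True
        elif key == "QUOTED-PRINTABLE" and not val:
            # vCard 2.1 also allows the bare parameter form.
            is_qp = True

    if is_qp:
        # Undo quoted-printable first; the result is still bytes.
        value = quopri.decodestring(value)

    return head_text, value.decode(charset)
```

For example, decode_property_line(b"FN;CHARSET=ISO-8859-1;ENCODING=QUOTED-PRINTABLE:Ren=E9")
decodes the value with latin-1 after undoing the quoted-printable
escaping, while a plain b"FN:..." line is simply decoded as utf-8.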
Unfortunately, I don't have the time to focus on making this work better
right now.
Sincerely,
Jeffrey