[Vobject] Pickle and vObject

Jeffrey Harris jeffrey at osafoundation.org
Tue Jun 3 16:17:14 CDT 2008


Hi Jonathan,

Note that your messages to the list are rejected because you aren't a 
subscriber (too much spam if I don't do that).

> My application is an HTTP JSON calendar store. You POST in ICS files,
> or update them with PUT, and then can get back events in a range on
> that calendar with a GET:
> 
>     /calendars/5?start=2008-01-01&end=2008-01-30
> 
> You then get back a nice structured JSON response with a list of
> easy to parse and understand "events" that are pre-bursted from your
> calendar (so you don't have to think about recurrence).

Very cool!

> I have found that the parse overhead from vObject is *very* high,
> so I was attempting to optimize by caching the pre-parsed objects
> in memory, but this prevents me from running two instances of my
> application for load balancing and high availability, so that's why I
> was thinking about just pickling the objects and shoving them into
> memcached or a database.

I'm a little surprised you're having performance issues.  After I moved 
to a regular expression for parsing, most of my performance issues went 
away.  But I've mainly used vobject in a desktop environment, doing 
one-time parsing of small files, and I haven't really put much energy 
into optimization.

Actually, I wonder if most of your performance pain is in recurrence 
expansion?  That's the area where I've had most of my performance 
headaches.  Periodically I ponder going through and optimizing 
dateutil's rrule.
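
One way to check whether parsing or recurrence expansion dominates is to 
run a single request under the standard-library profiler.  The sketch 
below is only an illustration: parse_stage, expand_stage and 
handle_request are hypothetical stand-ins for the application's own 
parsing and expansion code, not vobject or dateutil functions.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(15)  # show the top 15 entries
    return result, buffer.getvalue()


# Hypothetical stand-ins for the two stages being compared.
def parse_stage(text):
    return text.splitlines()


def expand_stage(lines):
    return [line.upper() for line in lines for _ in range(100)]


def handle_request(text):
    return expand_stage(parse_stage(text))


if __name__ == "__main__":
    _, report = profile_call(handle_request, "BEGIN:VCALENDAR\nEND:VCALENDAR\n" * 1000)
    print(report)
```

In the report, the cumulative-time column for the real parse and 
expansion functions shows which stage is worth caching or optimizing.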

> I might give pickle another shot later, but I am concerned that the
> overhead from deserializing the pickled `Component` might actually
> be almost as bad as the parse overhead.
> 
> Any other ideas for speeding up my use case?

Pickle is extremely fast.

The problem you're having is with dateutil's tzical, which provides a 
Python tzinfo class given a VTIMEZONE.  It uses rrules to represent DST 
transitions.  As long as you don't have any tzical timezones, vobject 
pickles fine.
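
A minimal sketch of a pickle-backed cache that tolerates objects which 
refuse to pickle (for example, components carrying tzical timezones). 
Everything here is an assumption made for illustration: the cache is a 
plain dict standing in for memcached or a database table, parse is 
whatever callable turns ICS text into an object, and cache_key and 
load_or_parse are hypothetical helper names.

```python
import hashlib
import pickle


def cache_key(ics_text):
    """Derive a cache key from the raw ICS text (hypothetical key scheme)."""
    return "ics:" + hashlib.sha256(ics_text.encode("utf-8")).hexdigest()


def load_or_parse(cache, ics_text, parse):
    """Return the parsed object, using pickled bytes from cache when present."""
    key = cache_key(ics_text)
    blob = cache.get(key)
    if blob is not None:
        return pickle.loads(blob)

    parsed = parse(ics_text)
    try:
        cache[key] = pickle.dumps(parsed, protocol=pickle.HIGHEST_PROTOCOL)
    except (pickle.PicklingError, TypeError, AttributeError):
        # The object (or something it references) cannot be pickled;
        # serve it uncached rather than failing the request.
        pass
    return parsed
```

Only unpickle data your own application wrote: pickle.loads can execute 
arbitrary code, so the cache must not be writable by untrusted parties.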

Here's what I would do.  Chandler has a routine that converts from 
tzical to PyICU tzinfo classes; see convertToICUtzinfo in 
http://svn.osafoundation.org/chandler/trunk/chandler/parcels/osaf/pim/calendar/TimeZone.py

Unfortunately PyICU has a painful API, and that routine makes various 
calls to view.tzinfo.whatever.  I think all those calls could be 
replaced with PyICU.ICUtzinfo.whatever, but you'd have to spend a little 
while doing the replacements.

Next, I'd monkey-patch vobject.icalendar.registerTzid.  Any time you're 
handed a tzinfo that's a tzical timezone, run it through your version of 
convertToICUtzinfo and register the result instead.

I think once you've done that, pickle should work, because all datetimes 
will have PyICU timezones.
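
The monkey-patch described above could be sketched roughly as follows. 
This assumes registerTzid takes (tzid, tzinfo), which is worth 
confirming against the installed vobject.  is_tzical and convert are 
hypothetical callables you would supply (convert being your adapted 
convertToICUtzinfo), and fake_icalendar is a stand-in namespace so the 
sketch runs without vobject or PyICU installed.

```python
import functools
import types


def patch_register_tzid(module, is_tzical, convert):
    """Wrap module.registerTzid so tzical-derived tzinfos are converted first.

    is_tzical(tzinfo) -> bool and convert(tzinfo) -> tzinfo-or-None are
    hypothetical callables supplied by the caller.
    """
    original = module.registerTzid

    @functools.wraps(original)
    def patched(tzid, tzinfo):
        if is_tzical(tzinfo):
            converted = convert(tzinfo)
            if converted is not None:  # keep the original if no match was found
                tzinfo = converted
        return original(tzid, tzinfo)

    module.registerTzid = patched
    return original  # returned so the patch can be undone


# Stand-in for vobject.icalendar (assumption, for demonstration only).
registry = {}
fake_icalendar = types.SimpleNamespace(
    registerTzid=lambda tzid, tzinfo: registry.__setitem__(tzid, tzinfo)
)
```

With the real library the call would presumably look like 
patch_register_tzid(vobject.icalendar, is_tzical, convert), applied 
before any calendars are parsed.  Code that imported registerTzid by 
name before the patch would still hold the unpatched function.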

One gotcha with this approach is that convertToICUtzinfo isn't 100% 
foolproof.  It pretty diligently attempts to find the timezone with DST 
transitions similar to the ones listed in the VTIMEZONE, but there's no 
guarantee that iCalendar files will have reasonable VTIMEZONEs.  But 
this approach works 99% of the time for Chandler.

Sincerely,
Jeffrey

