Archive for September, 2006

Super Easy XML Parsing in Python

Saturday, September 30th, 2006

Getting information I need out of XML is one of those tasks that occurs rarely enough that I never get to develop that profound understanding of XML documents and their parsing that we all so desire. It also means that every time I want to work with a piece of XML I have to re-learn the bare minimum of Python XML processing.

To fix this problem I am posting the snippet of bare-minimum, cheap XML processing code I usually come up with, using my latest XML processing problem as an illustration.

Be sure to advise me if I’m doing something dreadfully wrong, or if you’ve come up with a superior code snippet for processing XML that involves less cruft. Also stay tuned to my announcements page to see how this code is going to contribute to automatically adding birthdays to Google Calendar!

And finally here is how I get XML into a list of dictionaries where each dictionary contains the important values of the calendar entry elements for a piece of XML like this:

<entry>
	<id>
		http://www.google.com/calendar/feeds/default/private/full/...
	</id>
	<published>
		2006-09-15T04:55:44.000Z
	</published>
	<updated>
		2006-09-15T04:55:44.000Z
	</updated>
	<category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/g/2005#event"/>
	<title type="text">
		Mom's Birthday
	</title>
	<content type="text"/>
	<link href="http://www.google.com/calendar/event?eid=..." rel="alternate" title="alternate" type="text/html"/>
	<link href="http://www.google.com/calendar/feeds/default/private/full/..." rel="self" type="application/atom+xml"/>
	<link href="http://www.google.com/calendar/feeds/default/private/full/..." rel="edit" type="application/atom+xml"/>
	<author>
		<name>
			....
		</name>
		<email>
			....
		</email>
	</author>
	<gd:comments>
		<gd:feedLink href="http://www.google.com/..."/>
	</gd:comments>
	<gd:visibility value="http://schemas.google.com/g/2005#event.default"/>
	<gd:eventStatus value="http://schemas.google.com/g/2005#event.confirmed"/>
	<gd:transparency value="http://schemas.google.com/g/2005#event.opaque"/>
	<gCal:sendEventNotifications value="true"/>
	<gd:where valueString=""/>
	<gd:when endTime="2006-09-21" startTime="2006-09-20">
		<gd:reminder minutes="2880"/>
	</gd:when>
</entry>

Here is the code. I just feed this function the raw XML returned by Google Calendar and it gives me a list of dictionaries, where each dictionary holds information from one event, like its title and start time. It’s not that complicated, but I always get hung up on all that firstChild and data stuff, so it’s easier for me to copy this snippet and modify it for whatever XML I’m dealing with than to redo it each time.

import xml.dom.minidom

def parse_entries(raw_xml):
    dom = xml.dom.minidom.parseString(raw_xml)  # Make the DOM from the raw XML
    entries = dom.getElementsByTagName('entry')  # Pull out all the entries
    result_entries = []  # Make an empty container to fill up and return
    for entry in entries:
        dentry = {}  # Make an empty dict to hold info on one entry
        # Fill up the dict with the text of each child element
        dentry['id'] = entry.getElementsByTagName('id')[0].firstChild.data
        dentry['published'] = entry.getElementsByTagName('published')[0].firstChild.data
        dentry['updated'] = entry.getElementsByTagName('updated')[0].firstChild.data
        dentry['title'] = entry.getElementsByTagName('title')[0].firstChild.data
        try:
            dentry['content'] = entry.getElementsByTagName('content')[0].firstChild.data
        except AttributeError:  # An empty <content/> has no firstChild
            dentry['content'] = ''
        # The start and end times are attributes, not child text
        dentry['startTime'] = entry.getElementsByTagName('gd:when')[0].getAttribute('startTime')
        dentry['endTime'] = entry.getElementsByTagName('gd:when')[0].getAttribute('endTime')
        result_entries.append(dentry)
    return result_entries
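
For comparison, here is one possible way to cut down on the repeated firstChild/data lines. It is only a sketch, written for Python 3: the helper child_text, the function name parse_entries_compact, and the trimmed-down SAMPLE feed are all made up for illustration and are not part of Google's API. Note that minidom complains about an unbound prefix if the gd: namespace is not declared somewhere, which is why the sample wraps the entry in a feed element that declares it.

```python
import xml.dom.minidom

# A cut-down, hypothetical feed containing one entry like the one above.
SAMPLE = """<feed xmlns:gd="http://schemas.google.com/g/2005">
  <entry>
    <title type="text">
      Mom's Birthday
    </title>
    <content type="text"/>
    <gd:when endTime="2006-09-21" startTime="2006-09-20"/>
  </entry>
</feed>"""

def child_text(parent, tag, default=''):
    """Return the stripped text of the first <tag> under parent, or default."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default  # Missing element or empty element like <content/>
    return nodes[0].firstChild.data.strip()

def parse_entries_compact(raw_xml):
    dom = xml.dom.minidom.parseString(raw_xml)
    results = []
    for entry in dom.getElementsByTagName('entry'):
        when = entry.getElementsByTagName('gd:when')[0]
        results.append({
            'title': child_text(entry, 'title'),
            'content': child_text(entry, 'content'),
            'startTime': when.getAttribute('startTime'),
            'endTime': when.getAttribute('endTime'),
        })
    return results

entries = parse_entries_compact(SAMPLE)
print(entries)
```

The helper also strips the whitespace that surrounds the text in the feed, so the title comes back without the line breaks and indentation around it.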

In the future I’d like to try pyRXP instead, which promises to be 97% faster than minidom and parses XML directly into some kind of mix of tuples and other Python primitives.

By the way, if you’re interested in learning more about working with Google Calendar’s API, I’ll be making a few posts on Answer My Searches soon detailing how to do that. (And I may end up even releasing a library for Python). So go ahead and subscribe if you haven’t yet ;-)

[tags]Python, XML, Python XML, XML Parsing, minidom, PyRXP, dom, parse[/tags]

What the Minus Sign Does in xs:dateTime

Sunday, September 24th, 2006

Have you ever seen some XML like the following and wondered what that “-08:00” was for?

<gd:when startTime="2005-06-06T17:00:00-08:00"/>

Well, after much searching, it turns out that it specifies the time zone offset from Greenwich Mean Time: -08:00 means the time given is eight hours behind GMT. Otherwise the app accepting your XML might assume you meant Greenwich Mean Time and put your tea time at some weird hour!


This guy’s slides explain the xs:dateTime format a bit:

xs:dateTime – an ISO 8601 date plus a time.
The date is separated from the time by a ‘T’: 2005-03-10T02:00:00-08:00
The time format is ‘hh:mm:ss’ with optional fractional seconds and optional timezone offset.
The timezone is Greenwich Mean Time unless you specify the offset.
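
Since one of my searches was about getting from a Python datetime to xs:dateTime, here is a minimal sketch of that, assuming Python 3 and its standard datetime module. The variable names and the fixed -8 hour offset are just for illustration; a real program would want to pick the offset from the user's actual time zone.

```python
from datetime import datetime, timedelta, timezone

# A fixed offset of eight hours behind UTC, matching the -08:00 above.
eight_behind = timezone(timedelta(hours=-8))

start = datetime(2005, 6, 6, 17, 0, 0, tzinfo=eight_behind)

# isoformat() on a timezone-aware datetime includes the offset.
stamp = start.isoformat()
print(stamp)  # 2005-06-06T17:00:00-08:00

# The same instant expressed in UTC is eight hours later.
as_utc = start.astimezone(timezone.utc).isoformat()
print(as_utc)  # 2005-06-07T01:00:00+00:00
```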

And by the way, if you want to find out what your UTC time zone offset is, you can look it up here.

And here are my myriad search terms:

  1. python datetime to xs:dateTime
  2. “time zone offset”
  3. how to find my time zone offset
  4. “xs:dateTime” + minus + mean
  5. What does -08:00 Mean in xs:dateTime
  6. “xs:dateTime” + timezone, xs:dateTime

[tags]xs:dateTime, time zone, xml, schema, ISO 8601[/tags]

Super Easy Way to Reverse a String in Python

Saturday, September 23rd, 2006

Here’s what I discovered today. To reverse a string in Python you just do:
text[::-1]

It’s that easy! Here is a more filling example:

>>> text='greg'
>>> print text[::-1]
gerg

And finally here is why it works, for the curious.
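
In short, the slice syntax is text[start:stop:step]. Leaving out start and stop means "the whole string", and a step of -1 walks through it backwards. The sketch below (written for Python 3, so print is a function; the variable names are mine) shows the step at work, plus the built-in reversed() as an alternative.

```python
text = 'greg'

# A step of -1 walks the whole string from the last character to the first.
reversed_text = text[::-1]
print(reversed_text)  # gerg

# A positive step skips forward instead: every second character.
every_other = text[::2]
print(every_other)  # ge

# reversed() yields the characters back to front; join glues them together.
joined = ''.join(reversed(text))
print(joined)  # gerg
```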

How to Make a Macro in GoldMine

Thursday, September 14th, 2006

GoldMine macros let you automate the more tedious sequences of key strokes and clicks you do within GoldMine.
Here’s an easy to follow, pictorial tutorial on how to make one of these fellows.

Command Line Arguments for GoldMine

Thursday, September 14th, 2006

Yes, it turns out you can give the GoldMine executable (gmw7.exe for me) some parameters when you launch it. Some of them are even useful! Who knew?

Doug Castell says in this forum post:

Here are the ones I know off the top of my head:
/u: pass the username
/p: pass the password
/b: pass a folder in which remote installation files reside. Legacy feature, no longer applicable(?)
/c: pass a folder in which contact files reside. Example: /c:c:\apps\goldmine\common\
/t: ‘thin client mode’. Removes a number of bandwidth-heavy operations. Most appropriate for thin client users (thin bandwidth network/vpn, graphon, term server)
/m: pass a recorded goldmine macro number to run
/s: run in silent server mode. Options include /s:DDE and /s:GOLDSYNC

Please comment if you can think of any others. Of course it’s never been documented before!

As an example, here’s how I get GoldMine to run a macro (this even works if GoldMine is already running):
J:\gmw7.exe /m:803

803 is the number at the end of the macro file in the J:\macros directory (J:\ being GoldMine’s system directory for me). The full name of that file is gpinero.803.
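
If you would rather kick this off from a script, the same command line can be assembled in Python. This is only a sketch under assumptions: the function names goldmine_command and run_macro are made up, the /u: and /p: switches are used in the same /x:value shape as /m: above (I have only tried /m: myself), and the path is the one from my machine.

```python
import subprocess

def goldmine_command(exe_path, macro_number, username=None, password=None):
    """Build the argument list for launching GoldMine with a macro.

    Hypothetical helper: the switch names come from the forum post,
    and the /x:value format is assumed from the /m:803 example.
    """
    args = [exe_path]
    if username is not None:
        args.append('/u:' + username)
    if password is not None:
        args.append('/p:' + password)
    args.append('/m:%d' % macro_number)
    return args

def run_macro(exe_path, macro_number, **login):
    """Launch GoldMine with the macro; only works where GoldMine is installed."""
    return subprocess.call(goldmine_command(exe_path, macro_number, **login))

cmd = goldmine_command(r'J:\gmw7.exe', 803)
print(cmd)
```

Building the arguments as a list and handing them to subprocess avoids having to worry about quoting if the path contains spaces.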

This GoldMine support page also lists a few more switches for you.

[tags]GoldMine, Macro, GoldMine Macros, GoldMine Help[/tags]