Archive for the ‘Web Services’ Category

FTP Ignorance

Wednesday, December 13th, 2006

I spent over an hour trying to run the sample code below, which comes from the ftputil sample code (ftputil is the best FTP library for Python, by the way).

# download some files from the login directory
import ftputil
host = ftputil.FTPHost('ftp://ftp.kernel.org/')
names = host.listdir(host.curdir)
for name in names:
    if host.path.isfile(name):
        host.download(name, name, 'b')        # remote, local, binary mode

And I just kept getting this error!

raise FTPOSError(ftp_error)
FTPOSError: (11001, 'getaddrinfo failed')
Debugging info: ftputil 2.1.1, Python 2.4.2 (win32)

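Anyway, the simple mistake ended up being that I needed to use ftp.kernel.org and not ftp://ftp.kernel.org/. FTPHost wants a bare host name, not a URL, so the name lookup (getaddrinfo) fails when it is handed the whole URL. So that’s the answer if you’re ever stuck: omit the ftp:// prefix.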
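If the address might arrive either as a URL or as a bare host name, one way to guard against this mistake is to strip it down to the host name first. This is only a minimal sketch using Python 3's standard urllib.parse (not anything from ftputil itself), and the helper name ftp_hostname is made up for illustration:

```python
from urllib.parse import urlparse

def ftp_hostname(address):
    """Return just the host name from either a URL or a bare host name."""
    # urlparse only recognises the host part when it sees "//",
    # so add it for bare names like "ftp.kernel.org".
    if "://" not in address:
        address = "//" + address
    return urlparse(address).hostname

print(ftp_hostname("ftp://ftp.kernel.org/"))  # ftp.kernel.org
print(ftp_hostname("ftp.kernel.org"))         # ftp.kernel.org
```

The result is what you would then hand to the FTP library in place of the full URL.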

Super Easy XML Parsing in Python

Saturday, September 30th, 2006

Getting information I need out of XML is one of those tasks that occurs rarely enough that I never get to develop that profound understanding of XML documents and their parsing that we all so desire. It also means that every time I want to work with a piece of XML I have to re-learn the bare minimum of Python XML processing.

To fix this problem I am posting the snippet of bare-minimum, cheap XML processing code I usually come up with, using my latest XML processing problem as an illustration.

Be sure to advise me if I’m doing something dreadfully wrong, or if you’ve come up with a superior code snippet for processing XML that involves less cruft. Also stay tuned to my announcements page to see how this code is going to contribute to automatically adding birthdays to Google Calendar!

And finally, here is how I turn XML into a list of dictionaries, where each dictionary holds the important values of one calendar entry. The XML looks like this:

<entry>
    <id>
        http://www.google.com/calendar/feeds/default/private/full/...
    </id>
    <published>
        2006-09-15T04:55:44.000Z
    </published>
    <updated>
        2006-09-15T04:55:44.000Z
    </updated>
    <category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/g/2005#event"/>
    <title type="text">
        Mom's Birthday
    </title>
    <content type="text"/>
    <link href="http://www.google.com/calendar/event?eid=..." rel="alternate" title="alternate" type="text/html"/>
    <link href="http://www.google.com/calendar/feeds/default/private/full/..." rel="self" type="application/atom+xml"/>
    <link href="http://www.google.com/calendar/feeds/default/private/full/..." rel="edit" type="application/atom+xml"/>
    <author>
        <name>
            ....
        </name>
        <email>
            ....
        </email>
    </author>
    <gd:comments>
        <gd:feedLink href="http://www.google.com/..."/>
    </gd:comments>
    <gd:visibility value="http://schemas.google.com/g/2005#event.default"/>
    <gd:eventStatus value="http://schemas.google.com/g/2005#event.confirmed"/>
    <gd:transparency value="http://schemas.google.com/g/2005#event.opaque"/>
    <gCal:sendEventNotifications value="true"/>
    <gd:where valueString=""/>
    <gd:when endTime="2006-09-21" startTime="2006-09-20">
        <gd:reminder minutes="2880"/>
    </gd:when>
</entry>

Here is the code. I feed this function the raw XML returned by Google Calendar, and it gives me a list of dictionaries, each holding information from one event, such as the title and start time. It’s not that complicated, but I always get hung up on all that firstChild and data stuff, so it’s easier to copy this snippet and modify it for whatever XML I’m dealing with than to redo it each time.

import xml.dom.minidom

def parse_entries(raw_xml):
    dom = xml.dom.minidom.parseString(raw_xml)  # Build the DOM from the raw XML
    entries = dom.getElementsByTagName('entry')  # Pull out all entry elements
    result_entries = []  # Empty container to fill up and return
    for entry in entries:
        dentry = {}  # Empty dict to hold info on one entry
        # Fill up the dict with the text inside each simple child element
        dentry['id'] = entry.getElementsByTagName('id')[0].firstChild.data
        dentry['published'] = entry.getElementsByTagName('published')[0].firstChild.data
        dentry['updated'] = entry.getElementsByTagName('updated')[0].firstChild.data
        dentry['title'] = entry.getElementsByTagName('title')[0].firstChild.data
        # An empty <content/> has no text node, so firstChild is None
        try:
            dentry['content'] = entry.getElementsByTagName('content')[0].firstChild.data
        except AttributeError:
            dentry['content'] = ''
        # The start and end times are attributes of <gd:when>, not text nodes
        when = entry.getElementsByTagName('gd:when')[0]
        dentry['startTime'] = when.getAttribute('startTime')
        dentry['endTime'] = when.getAttribute('endTime')
        result_entries.append(dentry)
    return result_entries
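For anyone who trips over firstChild and data as often as I do, here is one possible way to cut down the cruft: wrap the lookup in a small helper that copes with empty elements and strips the surrounding whitespace. Treat it as a sketch, not the definitive way to do it. The names text_of, parse_sample and SAMPLE are made up for illustration, and the sample feed is a cut-down stand-in that declares the gd namespace itself (minidom refuses a prefix that is never declared):

```python
import xml.dom.minidom

# Cut-down, hypothetical stand-in for a calendar feed with a single entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:gd="http://schemas.google.com/g/2005">
  <entry>
    <title type="text">
      Mom's Birthday
    </title>
    <content type="text"/>
    <gd:when endTime="2006-09-21" startTime="2006-09-20"/>
  </entry>
</feed>"""

def text_of(parent, tag, default=''):
    """Return the stripped text of the first <tag> under parent, or default.

    Only meant for simple elements whose first child is a text node.
    """
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default  # element missing, or empty like <content/>
    return nodes[0].firstChild.data.strip()

def parse_sample(raw_xml):
    """Turn each <entry> into a dict of the values we care about."""
    dom = xml.dom.minidom.parseString(raw_xml)
    results = []
    for entry in dom.getElementsByTagName('entry'):
        when = entry.getElementsByTagName('gd:when')[0]
        results.append({
            'title': text_of(entry, 'title'),
            'content': text_of(entry, 'content'),
            'startTime': when.getAttribute('startTime'),
            'endTime': when.getAttribute('endTime'),
        })
    return results

print(parse_sample(SAMPLE))
```

The helper trades the try/except for an explicit check, so adding another field is a one-line change.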

In the future I’d like to try pyRXP instead, which promises to be 97% faster than minidom and parses XML directly into a mix of tuples and other Python primitives.

By the way, if you’re interested in learning more about working with Google Calendar’s API, I’ll be making a few posts on Answer My Searches soon detailing how to do that. (And I may end up even releasing a library for Python). So go ahead and subscribe if you haven’t yet ;-)

[tags]Python, XML, Python XML, XML Parsing, minidom, PyRXP, dom, parse[/tags]

What the Minus Sign Does in xs:dateTime

Sunday, September 24th, 2006

Have you ever seen some XML like the following and wondered what that “-08:00” was for?

<gd:when startTime="2005-06-06T17:00:00-08:00"/>

Well, after much searching, it turns out that it specifies the time zone offset from Greenwich Mean Time (UTC): -08:00 means the time given is eight hours behind GMT. Without it, the app accepting your XML might assume you meant Greenwich Mean Time and put your tea time at some weird hour!


This guy’s slides explain the xs:dateTime format a bit:

xs:dateTime – an ISO 8601 date plus a time.
The date is separated from the time by a ‘T’: 2005-03-10T02:00:00-08:00
The time format is ‘hh:mm:ss’ with optional fractional seconds and optional timezone offset.
The timezone is Greenwich Mean Time unless you specify the offset.

And by the way, if you want to find out what your UTC time zone offset is, you can look it up here.
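Here is a minimal sketch of producing such a timestamp from Python. It assumes Python 3's standard datetime module, and the helper name to_xs_datetime is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

def to_xs_datetime(naive, offset_hours):
    """Attach a fixed UTC offset to a naive datetime and format it."""
    tz = timezone(timedelta(hours=offset_hours))
    return naive.replace(tzinfo=tz).isoformat()

five_pm = datetime(2005, 6, 6, 17, 0, 0)
print(to_xs_datetime(five_pm, -8))  # 2005-06-06T17:00:00-08:00

# The same instant expressed in UTC: eight hours later, on the next day.
utc = five_pm.replace(tzinfo=timezone(timedelta(hours=-8))).astimezone(timezone.utc)
print(utc.isoformat())  # 2005-06-07T01:00:00+00:00
```

The second print shows why the offset matters: 5 pm at -08:00 is 1 am the next day in GMT.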

And here are my myriad search terms:

  1. python datetime to xs:dateTime
  2. “time zone offset”
  3. how to find my time zone offset
  4. “xs:dateTime” + minus + mean
  5. What does -08:00 Mean in xs:dateTime
  6. “xs:dateTime” + timezone, xs:dateTime

[tags]xs:dateTime, time zone, xml, schema, ISO 8601[/tags]

Why Everyone Should Backup Google Calendar to Their Own Computer

Saturday, August 5th, 2006


The most compelling reason is that it’s easy. Just follow these steps:

  1. Log into your Google Calendar
  2. Click on the Manage Calendars link (on the left side)
  3. Click on the calendar you want to back up
  4. Scroll down to the section that says Private Address
  5. There will be a link that says “ical”. There be your data!

At this point you have a few options:

  1. Right-click the ical link, click “Save Link As”, and save that ical data.
  2. Use that URL in a custom backup program you write.
  3. Import the ical data into Outlook or some other calendar software.
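For the second option, a custom backup program can be very small. The following is only a sketch, assuming Python 3's standard library; the function name backup_calendar and the dated file name are made up for illustration, and you would pass in your own Private Address URL:

```python
import shutil
import urllib.request
from datetime import date
from pathlib import Path

def backup_calendar(ical_url, dest_dir):
    """Download the iCal data at ical_url into a dated .ics file in dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # One file per day, e.g. calendar-2006-08-05.ics
    target = dest_dir / f"calendar-{date.today().isoformat()}.ics"
    with urllib.request.urlopen(ical_url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Run it from a scheduled task with the URL copied from the “ical” link, and you get one backup file per day.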


What are the other reasons to back up your Google Calendar?

  1. Google could turn evil someday (and delete everyone’s data?)
  2. Google could lose or delete data by accident
  3. You could lose your internet connection for an extended period of time
  4. You may want to move to different calendar software in the future and not be able to export from Google Calendar at that time (see reason 1)

If you’re an avid user of Google Calendar you may also like this post:
How to automatically add birthdays to Google Calendar

[tags]Google, Google Calendar, Calendar, PIM, Backup, local data, schedule, backup software, web services[/tags]

How to Link to Search Queries for Google, Yahoo Shopping, and a Few More

Saturday, May 27th, 2006

In updating Grocist today I needed to know how to link to key shopping sites to search for each item on my grocery list.

Apparently none of these websites has ever thought that anyone would want to link to them :-) so they didn’t offer any help anywhere.

Here are the URL syntaxes I’m using to link to these sites, which I discovered by trial and error. I don’t claim they’re correct or that they won’t change, but they work for now:
(more…)