Jeff’s Brain Dump

Sometimes the first duty of intelligent men is the restatement of the obvious.

A tale of two Screencasts: How to suck less at Screencasting

Posted by Jeff December 13, 2007

Recently I came across two Python editing environments, Reinteract and Hotwire. Their screencasts could not be more different. It’s instructive to consider what makes a superior screencast.

Before I pontificate, what makes my opinion worth listening to? I have an eye for video - I am a top contributor to VideoSift. My screencasts on ShowMeDo have been well reviewed.

Disclaimer: I know nothing about the two projects beyond having seen these screencasts. Also, Hotwire lead Colin Walters notes that the Hotwire screencast is fan-made; an improved official vid may be in the works.

Let’s deconstruct these examples to figure out what makes a screencast suck or succeed.

Audio

Hotwire uses a hard-rocking song. The soundtrack is irrelevant to the action onscreen, and distracts. Currently a single YouTube comment asks for the song title.
Reinteract is narrated by the developer. He knows his stuff and his clarity of speech conveys precision. The pacing feels right.

Video

Hotwire is presented in what Yahtzee has dubbed TeenyWeenyEyestrainoVision. YouTube’s stingy real estate obliterates detail. Add AutoPanning and Beryl fx for added wooziness.
Reinteract is clean and sharp. No other distracting windows or desktop. Video is full size; details are preserved. The entire screencast takes place in one window. Overall: clean, simple, focused.

Pacing / Narrative

Hotwire has so many distracting elements it’s impossible for an outsider to follow. After 30 seconds of squinting, I gave up. Hotwire may have fantastic features, but this video does not communicate them.
Reinteract has a coherent, well-structured progression. The narrator explains features and benefits, and builds complexity gradually. As a viewer I see what makes it cool and useful and how I might apply it.

Summary

The purpose of Screencasts is to communicate concepts. Show the Sizzle. Principles of writing apply: dump anything that doesn’t contribute. Audio should be on topic. Video should be sharp, fullscreen, with no distractions. YouTube is a poor choice. Pacing and narrative should set a context, deliver benefits, and communicate something new and useful.

When done right, screencasts can communicate cheaply and effectively to a worldwide audience.

Parsing BookMooch’s Asins.xml with a SAX parser

Posted by Jeff May 01, 2007

I’ve been playing with BookMooch’s API recently. They have data files for:

  • Inventory: how many copies of each book are moochable.
  • Wishlists: how many people want each book.
  • ASINs: full details for each book.

My initial goal is to compute the availability (inventory count minus wishlist count) of each book. Examples:

Availability  Title
        -210  Omnivore’s Dilemma
        -172  The God Delusion
         115  The Da Vinci Code
         122  Jurassic Park

Omnivore’s Dilemma is in heavy demand; Jurassic Park is a stale meme. The value of a book decreases over time; it makes sense to trade in current books while you can.
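
The availability figure is just a subtraction per book. A minimal Python 3 sketch, assuming the inventory and wishlist counts have already been loaded into two dicts (the counts below are made up so that the differences match two rows of the table, and keying by title rather than ISBN is a simplification):

```python
# Hypothetical counts; only the differences correspond to the table above.
inventory = {"Jurassic Park": 130, "The God Delusion": 12}
wishlists = {"Jurassic Park": 8, "The God Delusion": 184}

def availability(inventory, wishlists):
    """Return {title: copies on offer minus people wishing for it}."""
    titles = set(inventory) | set(wishlists)
    # A book missing from one file counts as zero there.
    return {t: inventory.get(t, 0) - wishlists.get(t, 0) for t in titles}

scores = availability(inventory, wishlists)
# Most wanted (most negative) first.
for title, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(score, title)
```

Using `get(t, 0)` means a book that nobody wishes for, or that nobody offers, still gets a score.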

ASINS.xml is 983MB; a DOM parser builds the whole document in memory, which requires far too much of it. A streaming SAX parser is required to handle a file this size.

The requirement is a CSV file mapping ISBN to title. A pickled version of a Python dict would also be useful.

ASIN Detail

See example here.

SAX ContentHandler

A ContentHandler is supplied as a callback. The parser calls startElement, characters, and endElement as it walks the XML input stream. Since the id element repeats, examining the tag name is insufficient to know the location in the tree. Instead, a list of containing elements makes sense:

from xml.sax.handler import ContentHandler

class asinHandler(ContentHandler):
    def __init__(self):
        self.curElements = []  # Will have the path to the current location.
    def startElement(self, name, attrs):
        self.curElements.append(name)  # entering an element: push its name
    def endElement(self, name):
        self.curElements.pop()  # leaving it: pop back to the parent
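
To see the path stack at work, here is a small self-contained Python 3 sketch. The sample XML, including the nested seller element, is invented purely to show an id appearing at two depths; it is not the real BookMooch layout. `PathHandler` and `seen` are illustrative names.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler

# Made-up sample; element names and nesting are assumptions.
SAMPLE = b"<asins><asin><id>1</id><seller><id>2</id></seller></asin></asins>"

class PathHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.path = []   # names of the elements currently open
        self.seen = []   # every path visited, for inspection
    def startElement(self, name, attrs):
        self.path.append(name)
        self.seen.append("/".join(self.path))
    def endElement(self, name):
        self.path.pop()

handler = PathHandler()
parseString(SAMPLE, handler)
for p in handler.seen:
    print(p)
```

The two id elements show up under different paths, which is exactly what the tag name alone cannot tell you.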

Capturing ID, Title

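My only interest is in the title and ID of each book. The text is accumulated in characters(), because the parser may deliver one element’s text in several chunks:

    def characters(self, ch):
        # Only elements at depth 3 are of interest; ids elsewhere are skipped.
        if len(self.curElements) == 3:
            if self.curElements[2] == 'id':
                self.isbn = self.isbn + ch
            elif self.curElements[2] == 'Title':
                self.title = self.title + ch
    def startElement(self, name, attrs):
        # (in addition to pushing name onto curElements)
        if name == 'asin':
            # A new book starts: reset the buffers.
            self.isbn = ''
            self.title = ''
    def endElement(self, name):
        # (in addition to popping curElements)
        if name == 'asin':
            self.br.record(self.isbn.strip(), self.title.strip())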
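
A runnable Python 3 sketch of the same capture logic on a tiny in-memory document. The sample layout (a root element containing asin elements, each with id and Title children) is an assumption consistent with the depth check, and the ids are placeholders, not real ISBNs. Results are collected in a list instead of being written out.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler

# Assumed layout with placeholder ids.
SAMPLE = b"""<asins>
  <asin><id>1111111111</id><Title>Omnivore's Dilemma</Title></asin>
  <asin><id>2222222222</id><Title>Jurassic Park</Title></asin>
</asins>"""

class TitleHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.path = []
        self.books = []  # (isbn, title) pairs in document order
    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "asin":
            self.isbn = ""
            self.title = ""
    def characters(self, ch):
        # May be called several times per element, so append.
        if len(self.path) == 3:
            if self.path[2] == "id":
                self.isbn += ch
            elif self.path[2] == "Title":
                self.title += ch
    def endElement(self, name):
        if name == "asin":
            self.books.append((self.isbn.strip(), self.title.strip()))
        self.path.pop()

handler = TitleHandler()
parseString(SAMPLE, handler)
print(handler.books)
```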

Full Program

# IN: asins_fixed.xml
# OUT: isbns.txt, isbns.pickle

# http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/
import codecs, pickle
from xml.sax import make_parser
from xml.sax.handler import ContentHandler
BOOKNUM = 0  # running count of books seen, used for progress output

class bookRecorder:
    def __init__(self):
        # *Very Important* to open in the right encoding!
        self.f = codecs.open('isbns.txt', 'w', 'iso-8859-1')
        self.dict = {}
    def record(self, isbn, title):
        # One "isbn,title" line per book, plus an in-memory copy.
        self.f.write(isbn + ',' + title + '\r\n')
        self.dict[isbn] = title
    def close(self):
        self.f.close()
        # Pickle data is binary, so open the file in binary mode.
        p = open('isbns.pickle', 'wb', 200000)
        pickle.dump(self.dict, p)
        p.close()

class asinHandler(ContentHandler):
    def __init__(self):
        self.br = bookRecorder()
        self.curElements = []  # Will have the path to the current location.
    def characters(self, ch):
        # Text may arrive in several chunks, so accumulate it.
        if len(self.curElements) == 3:
            if self.curElements[2] == 'id':
                self.isbn = self.isbn + ch
            elif self.curElements[2] == 'Title':
                self.title = self.title + ch
    def close(self):
        self.br.close()
    def startElement(self, name, attrs):
        self.curElements.append(name)
        if name == 'asin':
            # A new book starts: reset the buffers.
            self.isbn = ''
            self.title = ''
            global BOOKNUM
            BOOKNUM = BOOKNUM + 1
            if BOOKNUM % 5000 == 0:
                print(BOOKNUM)
    def endElement(self, name):
        if name == 'asin':
            self.br.record(self.isbn.strip(), self.title.strip())
        self.curElements.pop()

parser = make_parser()
curHandler = asinHandler()
parser.setContentHandler(curHandler)
parser.parse('asins_fixed.xml')
curHandler.close()
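
One caveat: record() joins ISBN and title with a bare comma, so a title that itself contains a comma produces an ambiguous line. A hedged Python 3 sketch of one way around that, using the standard csv module, which quotes such fields. The ids are placeholders, the comma-bearing title is only there to trigger the quoting, and `to_csv` is an illustrative helper, not part of the program above.

```python
import csv
import io
import pickle

# Placeholder records; ids are made up.
books = {
    "1111111111": "Eats, Shoots & Leaves",
    "2222222222": "Jurassic Park",
}

def to_csv(books):
    """Render {isbn: title} as CSV text, quoting fields where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for isbn, title in sorted(books.items()):
        writer.writerow([isbn, title])
    return buf.getvalue()

text = to_csv(books)
print(text)

# Reading it back recovers the title intact, comma included.
rows = list(csv.reader(io.StringIO(text)))

# The pickled dict round-trips the same way (in memory here, not on disk).
restored = pickle.loads(pickle.dumps(books))
```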

Django and Paradox of Choice

Posted by Jeff August 19, 2006

I read the bombshell about Django like everyone else, and want to raise a point: this is a good thing for the Python community as a whole. The reason is the Paradox of Choice.

The more funds employers offer their employees in 401(k) retirement plans, the less likely the employees are to invest in any, even though in many cases, failing to do so costs them employer-matching funds of up to several thousand dollars a year.

Here is Prof. Barry Schwartz talking at Google: The Paradox of Choice - Why More Is Less (Flash). There is also an MP4 version; that link should work.

Linkage

Posted by Jeff August 16, 2006

technical

  • Python Challenge - Difficult but fun web challenge. I want to try to finish this.
  • lpy.py Literate programming in Python - produces HTML documentation mixing code and commentary. Very interesting (but yet another markup to learn).
  • NearlyFreeSpeech.net - pay $1/GB as you go - A “major slashdotting” of a site hosted on our service will cost you (on average) about $10, one time. Supports Python, Ruby, and Perl CGI.

silly

Python Stickers, revisited

Posted by Jeff August 15, 2006

Remember the Nod and Python? Well, I’ve been emailing and bothering various Python folks. I’ve learned:

  • There is an official Python store at CafePress. It’s hard to find, and is called “pydotorg”.
  • Stickers are now available! $3.49
  • There are T-shirts, hats, mugs, etc. The shirts on the official store are awfully expensive, I think - the least expensive is $23, before shipping.

Goodstorm is a new site with much lower base prices than CafePress. I’ve created a store there. Without any markup, the Python logo shirt is $8.40 for a men’s fitted T.

I’ll be getting a laptop sticker and a couple of T-shirts. So if you see a bearded geek around Harvard/Porter Square with a Python sticker, say Hi :)

Firefox Search plugins — Scroogle, PythonDocs, JavaBlogs

Posted by Jeff August 11, 2006

Here are search plugins for JavaBlogs, the Python documentation, and Scroogle. They install in the search box at the top of the Firefox window:

Click to install:

JavaBlogs: search javablogs.com

Python Docs: search python.org/doc

Scroogle: Google results without the cookies or search records.

To uninstall search engines, use SearchPluginHacks.  To right-click on text and invoke a search engine, I like SearchWith.

Python and “the Nod”

Posted by Jeff August 04, 2006

Kathy Sierra writes about “The Nod” - the acknowledgement of another person’s good taste in making the same choices as you.

What does this have to do with programming languages? Programmers care deeply about language choice. How do Python developers recognize each other? There should be an obvious signifier, like a laptop sticker. The guy on the train might be a MOT (Member of the Tribe), too.

I’ve looked around on CafePress and see a few, but they use the old logo. I want a sticker with the new logo.

NO Connection Fees!

Posted by Jeff July 21, 2006

Hangman - nice lil game built with GWT

JavaAlmanac - switching between too many languages? This site has the snippets you need.

Rubik’s Cube Solving Robot video

Python on the Nokia S60 - The geek in me wants this very much.

Fly Guy - Flash game. Strangely relaxing.

SqlObjectGuide : Best tips I’ve found for SQLObject, the Python O/R mapper.

Vacation’s All I ever wanted…

Posted by Jeff July 13, 2006

I’m on vacation, yet here I am on dialup, geeking out. Life moves slower here in the country... so do the downloads.

Textile your textareas with Greasemonkey

wikiPad - wiki app. Now open source. I’ve been using NoteLens for off-the-cuff notetaking, a little walled garden where I don’t have to worry about file names, but I’m jonesing for markup and hyperlinks… wikiPad might do the trick.

Lumpy generates UML diagrams from a running Python program - darn, missed the presentation at the Boston Python meetup.

The Nose Knows

Posted by Jeff May 31, 2006

Titus has written a nice intro to nose with examples. I emphatically agree:

the most important part of having unit tests is that they can be run quickly, easily, and without any thought

Thus, I was surprised not to see something like Nosy, to automagically run tests when code changes. To me, running tests by hand requires thought, and a context switch. Nosy keeps me in the flow.
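
The idea is simple enough to sketch. The following Python 3 code is not Nosy’s actual implementation, just an illustration of the approach under the assumption that polling file modification times is good enough; `snapshot` and `watch` are made-up names.

```python
import os
import subprocess
import time

def snapshot(root="."):
    """Map each .py file under root to its last-modified time."""
    stamps = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                stamps[path] = os.stat(path).st_mtime
    return stamps

def watch(command, root=".", interval=1.0):
    """Rerun command whenever a .py file is added, removed, or modified.

    Loops until interrupted (Ctrl-C).
    """
    last = None
    while True:
        current = snapshot(root)
        if current != last:
            subprocess.call(command)
            last = current
        time.sleep(interval)

# Example invocation, assuming nose's nosetests command is on the PATH:
# watch(["nosetests"])
```

Comparing whole snapshots catches new and deleted files as well as edits, at the cost of walking the tree once per interval.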

Pinocchio is an extension to filter out long-running tests so tests may be run quickly. This leads me to believe his test suite was too big to run automatically.

(What is it about nose that inspires bad puns? This blog title, nosy, pinocchio… well, what do you expect from a language named after a comedy troupe? :)
