Jeff’s Brain Dump

Sometimes the first duty of intelligent men is the restatement of the obvious.

A tale of two Screencasts: How to suck less at Screencasting

Posted by Jeff December 13, 2007

Recently I came across two Python editing environments, Reinteract and Hotwire. Their screencasts could not be more different. It’s instructive to consider what makes a superior screencast.

Before I pontificate, what makes my opinion worth listening to? I have an eye for video - I am a top contributor to VideoSift. My screencasts on ShowMeDo have been well reviewed.

Disclaimer: I know nothing about the two projects beyond having seen these screencasts. Also, Hotwire lead Colin Walters notes that the Hotwire screencast is fan-made; an improved official vid may be in the works.

Let’s deconstruct these examples to figure out: what makes a screencast suck or succeed?

Audio

Hotwire uses a hard-rocking song. The soundtrack is irrelevant to the action onscreen, and distracts. Currently a single YouTube comment asks for the song title.
Reinteract is narrated by the developer. He knows his stuff and his clarity of speech conveys precision. The pacing feels right.

Video

Hotwire is presented in what Yahtzee has dubbed TeenyWeenyEyestrainoVision. YouTube’s stingy real estate obliterates detail. Add AutoPanning and Beryl fx for added wooziness.
Reinteract is clean and sharp. No other distracting windows or desktop. Video is full size; details are preserved. The entire screencast takes place in one window. Overall: clean, simple, focused.

Pacing / Narrative

Hotwire has so many distracting elements it’s impossible for an outsider to follow. After 30 seconds of squinting, I gave up. Hotwire may have fantastic features, but this video does not communicate them.
Reinteract has a coherent, well structured progression. The narrator explains features and benefits, and builds complexity. As a viewer I see what makes it cool and useful and how I might apply it.

Summary

The purpose of Screencasts is to communicate concepts. Show the Sizzle. Principles of writing apply: dump anything that doesn’t contribute. Audio should be on topic. Video should be sharp, fullscreen, with no distractions. YouTube is a poor choice. Pacing and narrative should set a context, deliver benefits, and communicate something new and useful.

When done right, screencasts can communicate cheaply and effectively to a worldwide audience.


Parsing BookMooch’s Asins.xml with a SAX parser

Posted by Jeff May 01, 2007

I’ve been playing with BookMooch’s API recently. They have data files for:

  • Inventory: how many copies of each book are moochable.
  • Wishlists: how many people want each book.
  • ASINs: full details for each book.

My initial goal is the availability (Inventory-Wishlist) of each book. Examples:

Availability  Title
-210          Omnivore’s Dilemma
-172          The God Delusion
 115          The Da Vinci Code
 122          Jurassic Park


Omnivore’s Dilemma is in heavy demand; Jurassic Park is a stale meme. The value of a book decreases over time; it makes sense to trade in current books while you can.
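
The availability figure is just a subtraction per book. Here is a minimal sketch, assuming the inventory and wishlist counts have already been loaded into two dicts keyed by book ID. The IDs and counts below are made up for illustration; they are not BookMooch data.

```python
# Hypothetical counts, invented for illustration only. Real values would
# come from BookMooch's inventory and wishlist data files.
inventory = {"book-a": 12, "book-b": 140}
wishlist = {"book-a": 222, "book-c": 7}

def availability(inventory, wishlist):
    """Return {book_id: copies on offer minus people who want the book}."""
    ids = set(inventory) | set(wishlist)
    # A book missing from one file counts as zero there.
    return {i: inventory.get(i, 0) - wishlist.get(i, 0) for i in ids}

scores = availability(inventory, wishlist)

# Most in-demand (most negative) first.
for book_id in sorted(scores, key=scores.get):
    print(scores[book_id], book_id)
```

A negative score means more people want the book than are offering it.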

ASINS.xml is 983MB; a DOM parser requires far too much memory. A SAX parser is required to handle a file this size.

My requirement is a CSV file mapping ISBN to title. A pickled Python dict of the same mapping would also be useful.

ASIN Detail

See example here.

SAX ContentHandler

A ContentHandler is supplied as a callback. The parser calls startElement, characters, and endElement as it walks the XML input stream. Since the id element repeats, examining the tag name is insufficient to know the location in the tree. Instead, a list of containing elements makes sense:

class asinHandler(ContentHandler):
    def __init__(self):
        self.curElements=[] # Will have the path to the current location.
    def startElement(self, name, attrs):
        self.curElements.append(name)
    def endElement(self, name):
        self.curElements.pop()
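
To see why the path matters, here is a small self-contained sketch (Python 3) that runs a path-tracking handler over a tiny in-memory document. The XML layout, including the nested similar/item elements, is an assumption made up for this example, and the IDs are placeholders; the real asins file may be structured differently.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Assumed layout, for illustration only; not copied from the real file.
SAMPLE = b"""<asins>
  <asin><id>0000000001</id><Title>First Book</Title>
    <similar><item><id>0000000002</id></item></similar>
  </asin>
</asins>"""

class PathHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.path = []       # names of the elements currently open
        self.id_paths = []   # full path of every <id> encountered

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "id":
            self.id_paths.append("/".join(self.path))

    def endElement(self, name):
        self.path.pop()

handler = PathHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.id_paths)
```

Both elements are named id, but only the one whose path has three entries belongs to the book itself.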

Capturing ID, Title

My only interest is in the title and ID of each book.

    def characters(self,ch):
        # Only text directly inside a book's own id/Title counts.
        if len(self.curElements) ==3:
            if self.curElements[2] == 'id':
                self.isbn = self.isbn + ch
            elif self.curElements[2] == 'Title':
                self.title = self.title + ch
    def startElement(self, name, attrs):
        self.curElements.append(name)
        if name=='asin':
            # New book: reset the accumulators.
            self.isbn = ''
            self.title = ''
    def endElement(self, name):
        if name=='asin':
            self.br.record ( self.isbn.strip(), self.title.strip() )
        self.curElements.pop()

Full Program

# IN: asins_fixed.xml
# OUT: isbns.txt, isbns.pickle

# http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/
import codecs,pickle
from xml.sax import make_parser
from xml.sax.handler import ContentHandler
BOOKNUM = 0  # running count of asin records, for progress output

class bookRecorder:
    def __init__(self):
        # *Very Important* to open in the right encoding!
        self.f = codecs.open ('isbns.txt','w', 'iso-8859-1')
        self.dict = {}
    def record(self,isbn,title):
        self.f.write(isbn+',' +  title +'\r\n')
        self.dict [isbn]=title
    def close(self):
        self.f.close()
        # Pickles are binary data, so open the file in binary mode.
        p=open('isbns.pickle','wb',200000)
        pickle.dump (self.dict,p)
        p.close()

class asinHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.br=bookRecorder()
        self.curElements=[] # Will have the path to the current location.
    def characters(self,ch):
        # Only text directly inside a book's own id/Title counts.
        if len(self.curElements) ==3:
            if self.curElements[2] == 'id':
                self.isbn = self.isbn + ch
            elif self.curElements[2] == 'Title':
                self.title = self.title + ch
    def close(self):
        self.br.close()
    def startElement(self, name, attrs):
        self.curElements.append(name)
        if name=='asin':
            self.isbn = ''
            self.title = ''
            global BOOKNUM
            BOOKNUM = BOOKNUM + 1
            if BOOKNUM % 5000 == 0:
                print(BOOKNUM)
    def endElement(self, name):
        if name=='asin':
            self.br.record ( self.isbn.strip(), self.title.strip() )
        self.curElements.pop()

parser = make_parser()
curHandler = asinHandler()
parser.setContentHandler(curHandler)
# Binary mode lets the parser detect the encoding from the XML itself.
parser.parse(open('asins_fixed.xml','rb'))
curHandler.close()
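
One caveat with writing isbn,title lines by hand: a title that contains a comma yields a line that no longer splits cleanly into two fields. A possible alternative, sketched here for Python 3 with the standard csv module, lets the writer quote such fields. The helper name write_rows and the sample row are made up for this example; this is not part of the program above.

```python
import csv
import io

def write_rows(out, rows):
    """Write (isbn, title) pairs as CSV, quoting fields where needed."""
    writer = csv.writer(out)  # the default dialect quotes fields containing commas
    for isbn, title in rows:
        writer.writerow([isbn, title])

# Placeholder data, invented for illustration.
buf = io.StringIO(newline="")
write_rows(buf, [("0000000001", "A Title, With a Comma")])

# Reading it back recovers exactly two fields per row.
buf.seek(0)
rows = list(csv.reader(buf))
print(rows)
```

The same function should work with a real file opened in text mode with newline="" and whatever encoding the output needs.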


17 Antipatterns from the Worst Presidency Ever

Posted by Jeff November 10, 2006

So the American people have thrown the bums out. This administration embodies incompetence. What are the lessons to be gleaned from their mistakes?

1. Repeating a lie doesn’t make it true.
2. Credibility, once lost, is not easily restored.
3. There’s a fine line between optimism and deceit.
4. Actions speak louder than words.
5. You can’t say one thing and do another for long. People catch on.
6. Arrogance is tolerable only if you’re competent.
7. People see through word games. And hate you for trying to deceive them.  (definition of torture)
8. Listen to your customers, not yesmen.
9. Facts and performance are everything.
10. Getting elected is the easy part. Delivering the big picture is hard.
11. The world is becoming more transparent. Cheating doesn’t work in the long run.
12. People see your true motives.
13. Strategic thinking: planning and avoiding problems is the hard part. It isn’t as visible.
14. Feed people deception and they will seek truth.
15. You are not as smart as you think. Learn from history and others.
16. Being a nice guy helps. Being competent matters more.
17. Fear is not a sustainable motivation. Hope and vision are.


Akismet kills Comment Spam dead.

Posted by Jeff September 21, 2006

I feel terrible when I come across a good blog dripping in spam — the time wasted moderating all that crap! I installed Akismet (free) a while back, and spam hasn’t been a problem since.

Akismet is a centralized service to recognize blog comment spam. It works beautifully - you just never see spam comments. Akismet stats say 92% of comments are spam. I use the WordPress plugin - they have APIs and plugins for most blog engines and languages (Java, Python, etc.).

Wired covered it in their recent Splog article.


Django and Paradox of Choice

Posted by Jeff August 19, 2006

I read the bombshell about Django like everyone else, and want to raise a point: this is a good thing for the Python community as a whole. The reason is the Paradox of Choice.

The more funds employers offer their employees in 401(k) retirement plans, the less likely the employees are to invest in any, even though in many cases, failing to do so costs them employer-matching funds of up to several thousand dollars a year.

Here is Prof. Barry Schwartz talking at Google: The Paradox of Choice - Why More Is Less (Flash). There is also an MP4 version; that link should work.


HOWTO beautify your Bookmarks toolbar

Posted by Jeff August 19, 2006

Is your linkbar a mess like mine?
Now you can clean it to look like this:

I recently discovered a cool trick: if a site has a favicon, it’s shown next to the link text. But you can empty the name:

… while keeping the icon:

If your favorite website doesn’t have a favicon, send them a link to this article!

Update: FavIcon Picker lets you pick and assign your own icons.


Linkage

Posted by Jeff August 16, 2006

technical

  • Python Challenge - Difficult but fun web challenge. I want to try to finish this.
  • lpy.py Literate programming in Python - produces HTML documentation mixing code and commentary. Very interesting. (But yet another markup to learn.)
  • NearlyFreeSpeech.net - pay $1/GB as you go - A “major slashdotting” of a site hosted on our service will cost you (on average) about $10, one time. With Python, Ruby, Perl CGI.

silly


Python Stickers, revisited

Posted by Jeff August 15, 2006

Remember the Nod and Python? Well, I’ve been emailing and bothering various Python folks. I’ve learned:

  • There is an official Python store at CafePress. It’s hard to find, called “pydotorg”.
  • Stickers are now available! $3.49
  • There are T-shirts, hats, mugs, etc. The shirts on the official store are awfully expensive, I think - the least expensive is $23, before shipping.

Goodstorm is a new site with much lower base prices than CafePress. I’ve created a store there. Without any markup, the python logo shirt is $8.40 for a men’s fitted T.

I’ll be getting a laptop sticker and a couple t-shirts. So if you see a bearded geek around Harvard/Porter square with a python sticker, say Hi :)


Firefox Search plugins — Scroogle, PythonDocs, JavaBlogs

Posted by Jeff August 11, 2006

Here are search plugins for JavaBlogs, Python documentation, or Scroogle. They install in the search box at the top of the Firefox window:

Click to install:

JavaBlogs: search javablogs.com

Python Docs: search python.org/doc

Scroogle: Google results without the cookies or search records.

To uninstall search engines, use SearchPluginHacks.  To right-click on text and invoke a search engine, I like SearchWith.


Windows installer for Nosy

Posted by Jeff August 09, 2006

I’ve done a Windows installer for Nosy. Nosy automatically runs tests when your code changes. Screencast Details
The NSI script language is powerful yet clunky, SuperPiMP™ technology notwithstanding. But it does the job.
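
For readers wondering what "runs tests when your code changes" can look like, here is a rough polling sketch. It is not Nosy's actual code, only one plausible way to do it: take a snapshot of file modification times and sizes, and call a test runner whenever the snapshot changes. The names snapshot, watch, and run_tests are made up for this example.

```python
import os

def snapshot(root):
    """Map each .py file under root to its (mtime, size)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                st = os.stat(path)
                state[path] = (st.st_mtime, st.st_size)
    return state

def watch(root, run_tests, polls, sleep):
    """Poll root `polls` times; call run_tests() after each detected change."""
    last = snapshot(root)
    for _ in range(polls):
        sleep()
        current = snapshot(root)
        if current != last:
            run_tests()
            last = current
```

In practice sleep would be something like lambda: time.sleep(1), and run_tests would shell out to whatever test runner the project uses.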
