If you blog it they will come?

Sunday, August 16, 2009

When a band I like comes to town, I'll know

The List is a comprehensive listing of the known shows coming to the Bay Area, organized by artist and by venue.

There are hundreds of concerts and nearly 1500 bands in the list, so I threw together a script to scrape it and intersect those bands with the artists in my iTunes library.

As a result, I now know that these acts are playing in the near future:

atmosphere
bat for lashes
beirut
blink 182
butthole surfers
calexico
cat power
collective soul
dan deacon
deerhoof
deerhunter
dropkick murphys
elvis costello
fever ray
flipper
ghostface killah
girl talk
green day
grizzly bear
in flames
kenny rogers
lil wayne
m.i.a.
mastodon
meat puppets
mirah
modest mouse
mstrkrft
no age
nofx
pearl jam
placebo
porcupine tree
sunny day real estate
tenacious d
thievery corporation
tv on the radio
weezer
yo la tengo


The next step is to scrape the concert details as well, use fuzzy matching, run it automatically, and set up alerts.

But this only took 15 minutes to write in Python, and it would have taken me way longer to go through the list manually.
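The set-intersection idea can be sketched roughly as follows. This is a minimal sketch, not the actual script: the function names and the sample data are made up for illustration, and it assumes the band names have already been scraped from The List and pulled out of the iTunes library.

```python
def normalize(name):
    """Lowercase and collapse whitespace so names compare consistently."""
    return " ".join(name.lower().split())


def bands_in_common(listed_bands, library_artists):
    """Return the sorted names that appear in both collections."""
    listed = {normalize(b) for b in listed_bands}
    library = {normalize(a) for a in library_artists}
    return sorted(listed & library)


# Hypothetical sample data standing in for the scraped list and the library.
listed_bands = ["Weezer", "Yo La Tengo", "Some Local Opener"]
library_artists = ["weezer", "Yo  La Tengo", "Radiohead"]

common = bands_in_common(listed_bands, library_artists)
print(common)
```

A set intersection only finds exact matches after normalization, which is why small differences in spelling or punctuation slip through.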

EDIT:

Instead of doing a set intersection, I now use difflib to find "close matches." It's slower but still runs start to finish in about 10 seconds, which is fine, especially considering that the data changes infrequently.

I also unescape the ampersand in the iTunes XML, and filter out "The " because:

>>> import difflib
>>> difflib.get_close_matches('foo', ['the foo', 'foods'], n=1)
['foods']

...an exact match preceded by 'the ' is penalized more than a short suffix, because the four extra characters count against the score. So 'pixies' would match a name with a shorter addition (say, a made-up 'pixies uk') instead of 'the pixies' in the case where I only select the top match (since ideally there's a one-to-one mapping).
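The scores behind that result can be inspected directly. get_close_matches ranks candidates using SequenceMatcher's ratio(), which is 2*M/T, where M is the number of matching characters and T is the combined length of both strings. The sketch below just prints that ratio for each candidate; the helper name score is mine.

```python
import difflib


def score(a, b):
    """Similarity ratio between two strings, as computed by difflib."""
    return difflib.SequenceMatcher(None, a, b).ratio()


scores = {candidate: score("foo", candidate) for candidate in ["the foo", "foods"]}
for candidate, value in scores.items():
    print(candidate, round(value, 2))
```

'the foo' pays for four extra characters while 'foods' pays for only two, so 'foods' comes out ahead.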

Instead of writing my own fuzzy matching algorithm, for now I'll just chop off 'The ' and live with the results. Although some of the matches aren't useful, it does better at finding bands such as ...and you will know us by the trail of dead and The Ting Tings.
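Roughly, the cleanup plus fuzzy matching could look like the sketch below. It is an illustration under assumptions rather than the real script: normalize and close_match are hypothetical helper names, the sample data is invented, html.unescape stands in for whatever unescaping the script actually does, and the default difflib cutoff of 0.6 is left unchanged.

```python
import difflib
import html


def normalize(name):
    """Unescape XML entities, lowercase, and drop a leading 'the '."""
    name = html.unescape(name).strip().lower()
    if name.startswith("the "):
        name = name[len("the "):]
    return name


def close_match(artist, listed_bands):
    """Return the best fuzzy match for artist among listed_bands, or None."""
    by_normalized = {normalize(band): band for band in listed_bands}
    hits = difflib.get_close_matches(normalize(artist), list(by_normalized), n=1)
    return by_normalized[hits[0]] if hits else None


# Hypothetical sample data.
listed = ["The Ting Tings", "Foods", "Belle & Sebastian"]
print(close_match("Ting Tings", listed))
print(close_match("Belle &amp; Sebastian", listed))
print(close_match("Kenny Rogers", listed))
```

Normalizing both sides before comparing means 'The ' never enters the score, and keeping a map back to the original spelling lets the output show the name as it appears in the listing.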

Tuesday, August 11, 2009

timsort visualization

This blog post is quite effective at illustrating the timsort algorithm, found in Python (and soon Java).

Saturday, August 8, 2009

vim zen moment

There comes a certain time in one's life to put aside the variety of editors they might use or sometimes dabble in, and perhaps choose one they like best, or see the most potential with down the road, and work monogamously toward advanced proficiency in this editor, regardless of the bumps in the road or hardships which may provoke longing for other editors along the way.

I've made this commitment to vim recently and I'm still a novice.
But I just had what may be my first true zen moment with the editor!

The problem:
* I needed to fix a single spacing annoyance in a set of over 40 PHP files.

The solution:
* Open all PHP files rooted in foo; luckily, 90% were at the same leaf level
vim /foo/*/*/*\.php
* Start recording a macro in register a
qa
* Make edits
[a fair bit of jj and dd and i etc.]
* Write changes
:w
* Open next file
:bn
* Stop recording
q

After pressing @a a few times to execute the macro, or :bn if the file could be left alone, I was done.

I've always wanted to be able to do this sort of thing so easily, but it's elusive or cumbersome with most GUI editors. Many UNIX text munging tools exist too, but it's often easier to take the direct route of showing the machine what you want, rather than, say, building a DFA.

So, I'm liking the taste of vim kool-aid thus far.