IronMay

Heather and I are participating in an informal competition (along with some co-workers) that we’re calling IronMay: Accumulate the total distances of an Ironman triathlon in the course of a month — this month. You can track our progress here. The idea that people do this entire thing in a single go just floors me.

Balls

You sort of have to admire The Ladders for being ostentatious enough to eschew those pathetic schlubby job seekers making less than seventy-five thousand dollars a year in the current market.

All the same, at $180 per year for membership, you would think they would manage to avoid attributing the same motivational quotes to different people. I hope that Paul and Benjamin really enjoy their job, and I sincerely wish that their employer could look past their relationship and just respect them for the work that they share.

On knowing your audience

Loving the fact that Jonah Goldberg needs to point this out to his audience:

Michael Mann — not the Miami Vice guy — reviews the book in the Post today. [emphasis mine]

Keywords in Lightroom

The last time I wrote about Lightroom, I was using sqlite to pull out frequencies of focal length. This time it’s keywords: Lightroom lets you build any number of custom keyword sets to apply to photos. It automatically builds a set of “recently used” keywords, but I thought it would also be handy to have a set of my most commonly used keywords. While Lightroom has a command to export a list of keywords, that list doesn’t include frequencies. Keywords are stored in Lightroom in a table called AgLibraryTag. Conveniently, Lightroom writes a count of each keyword to the same table, so it’s easy to get out all the information we need. (Note: The frequency in this table is a cached value and may not reflect the up-to-the-minute reality within your database. Rather than constantly updating its database, Lightroom seems to update this count when you view the individual keywords. I’m not sure how to force a library-wide update of all keyword counts. This is probably close enough, and is simpler/quicker than counting keywords image-by-image.)

Rather than run this data through R to build a histogram as I did with the focal length data, I just use awk this time to make a list with the most frequently-used tags at the bottom. With this list, you can easily build a corresponding tag set in Lightroom.

Remember to change paths to suit, and that (on OSX) you’ll probably need to upgrade your version of sqlite for all this to work. Also, always always work from a copy of your database as this script does.

# display a sorted list of lightroom keywords
# work from a copy of the catalog, never the original
cp ~/Pictures/Lightroom/Lightroom\ Catalog.lrcat ~/lightroom.lrdb
/usr/local/bin/sqlite3 -csv ~/lightroom.lrdb 'select ImageCountCache, name from AgLibraryTag where kindName="AgKeywordTagKind";' > ~/lr-keywords.csv
# print "count name", sorted so the most-used keywords land at the bottom
awk -F , '{print $1" "$2}' ~/lr-keywords.csv | sort -n
rm ~/lightroom.lrdb
rm ~/lr-keywords.csv
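If all you want is the short list for a "most used" keyword set, a small filter over the same CSV can trim it to the top few entries. This is only a sketch: `top_keywords` is a hypothetical helper name, and it assumes the `count,name` layout produced by the query above, with keyword names possibly wrapped in double quotes by the CSV output.

```shell
#!/bin/bash
# Hypothetical helper: read "count,name" CSV rows on stdin and print the
# N most-used keywords (default 20), highest count first.
top_keywords() {
  # Sort numerically on the count field, largest first, and keep N rows.
  sort -t, -k1,1nr | head -n "${1:-20}" |
    awk -F, '{
      count = $1
      sub(/^[^,]*,/, "")   # drop the count field, keep the whole name (commas included)
      gsub(/"/, "")        # strip any double quotes added by the CSV output
      print count " " $0
    }'
}
```

Something like `top_keywords 25 < ~/lr-keywords.csv` would then print the 25 most-used keywords, provided it runs before the script removes the CSV file.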

Daydream: A map of how keywords relate to one another would be awesome.

Dots and lines

Sparklines are fun to tinker with and can provide quick glimpses of data. Here are some not-quite-realtime twitter sparklines, built with this small and useful tool and a bit of scripting. 30-days of twitter: How about a change plot: Or, if you like, the straight-up histogram:

Wondering if he meant it to sound that way

I’m only fifty pages into Andrew Keen’s The Cult of the Amateur, and passages like the following just keep popping up and confounding me:

When an article runs under the banner of a respected newspaper, we know that it has been weighed by a team of seasoned editors with years of training, assigned to a qualified reporter, researched, fact-checked, edited, proofread, and backed by a trusted news organization vouching for its truthfulness and accuracy. Take those filters away, and we, the general public, are faced with the impossible task of sifting through and evaluating an endless sea of the muddled musings of amateurs.

... Unlike professionally edited newspapers or magazines where the political slant of the paper is restricted to the op-ed page, the majority of blogs make radical, sweeping statements without evidence or substantiation. (pp. 52-53)

This assertion is made without evidence, substantiation, or a wink of self-awareness.

Lawrence Lessig has a long discussion of Keen’s book. He rightly points out that underlying Keen’s sweeping and unsubstantiated screed are some important questions (however unasked-by-Keen they may be): How do we create authority in collaborative and open systems? How do we build critical, capable audiences/creators of social media? How do we build new kinds of markets to make this stuff work commercially?

And then Lessig unloads:

But what is puzzling about this book is that it purports to be a book attacking the sloppiness, error and ignorance of the Internet, yet it itself is shot through with sloppiness, error and ignorance. It tells us that without institutions, and standards, to signal what we can trust (like the institution (Doubleday) that decided to print his book), we won’t know what’s true and what’s false. But the book itself is riddled with falsity — from simple errors of fact, to gross misreadings of arguments, to the most basic errors of economics.

So how could it be that a book criticizing the Internet — because the product of a standardless process where nothing is “vetted for accuracy” (as he says of Wikipedia) — could itself be so mistaken, when it, presumably, has been “vetted for accuracy” and was only selected for publication because it passed the high standards of truth imposed by its publisher — Doubleday?

And then it hit me: Keen is our generation’s greatest self-parodist. His book is not a criticism of the Internet. Like the article in Nature comparing Wikipedia and Britannica, the real argument of Keen’s book is that traditional media and publishing is just as bad as the worst of the Internet. Here’s a book — Keen’s — that has passed through all the rigor of modern American publishing, yet which is perhaps as reliable as your average blog post: No doubt interesting, sometimes well written, lots of times ridiculously over the top — but also riddled with errors. Keen’s obvious point is to show those with a blind faith in the traditional system that it can be just as bad as the worst of the Internet. Indeed, one might say even worse, since the Internet doesn’t primp itself with the pretense that its words are promised to be true.

Lessig elaborates on the several key fallacies he sees in the book (relating to, among others, piracy, efficiency, and experts) both in his post and on a wiki, The Keen Reader.

Trains composite

I love the discoverability of Lightroom. First you find how neatly it works to just manage lots of photos (and that process gets deeper as you use it more), and then you get better at processing raw images, and then you start to explore the other modules and see how they enable you to do creative things all in a single package.

Photo tool nrrdery

Update 6/27/2007: Lightroom v1.1 is out (it’s real, and it’s spectacular, see the O’Reilly Lightroom Blog for more), and it changes the database from a “library” to a “catalog.” In terms of this little tool, this change seems to only entail changing the filename referred to in the wrapper shell script — as I’ve done, below. Otherwise, generating a focal length histogram seems to work just as it did previously.

Camera Nrrdery

For fun: You can see that I use my fixed 50mm and 21mm lenses far more than anything else I’ve got. That’s because they’re so very pretty.

I use Adobe’s Lightroom to manage my RAW photos. It’s a wonderful, splendid tool. Among its features, it provides a handy metadata browser of your photo library, and includes the ability to browse by lens. Recently James Duncan Davidson mentioned being interested in plotting his use of various focal lengths, and commenters responded with a number of good solutions. Since Lightroom uses a SQLite database for its library, tools like SQLite Browser can be used to scan through the database file itself and export tables, at which point it’s straightforward to grep and find focal lengths. This is pretty slick all by itself, but I thought I’d put together a quick tool to automate the extraction and generation of this data. To do that, I use sqlite3 from the command line to dump the metadata table to a file, and then a short bit of R code finds the focal lengths and builds the histogram. The sqlite3 commands and the R code are invoked via a shell script that makes a copy of the main database to work with and cleans up the temp file when it’s all done.

Howto

If you made it this far, you might actually be interested in how it’s all done. After some tinkering, I found from Jeffrey Friedl’s Blog that Lightroom’s current database needs a newer version of sqlite3 than that which ships with OSX. With that update installed, sqlite3 will handle your Lightroom database without any problems.

Here’s the shell wrapper. Change paths to suit:

#!/bin/bash
cp ~/Pictures/Lightroom/Lightroom\ Catalog.lrcat ~/lightroom.lrdb
/usr/local/bin/sqlite3 -csv ~/lightroom.lrdb 'select xmp from Adobe_AdditionalMetadata;' > /Users/alan/lr-metadata.csv
R CMD BATCH /Users/alan/bin/lr-getfocallengths.R
rm ~/lightroom.lrdb
rm ~/lr-metadata.csv
convert ~/lr-focallengths.pdf ~/lr-focallengths.jpg

And here’s the R code, which lives in lr-getfocallengths.R and is called by the shell script. Again, fix paths for your own circumstances:

lr <- file("/Users/alan/lr-metadata.csv", "r")
lrlines <- readLines(lr)
temp <- gsub("(/1)", "", lrlines[grep("exif:FocalLength>", lrlines)])
lengths <- as.numeric(gsub("([^[:digit:]])", "", temp))
lengths<-lengths[lengths<=1000]
pdf("/Users/alan/lr-focallengths.pdf")
hist(lengths, main="Histogram of Focal Length Use",
   xlab="Focal length (mm)", ylab="Number", breaks=seq(0,200, by=4))
dev.off()

A few things to note:

  • Depending on how your version of R is compiled, you can use jpeg(…) instead of pdf(…) to make the output file. My R isn’t currently compiled with jpg support, so I build a pdf file and then use convert on it.
  • There’s some noise in the metadata that leads to the erroneous identification of focal lengths like 83456000. That’s not right at all. I skim off everything above 1000 in line 5 of the R code. (Which is still sort of silly. My longest lens is presently 200mm.)
  • Relatedly, the x axis of the histogram only goes up to 200. To change that, modify the seq(0,200, by=4) accordingly — you can change the upper bound as well as the width of the bins.
  • A really slick way to do all this would be to properly parse the exported table in order to combine data, in order to limit the data to, for example, “favorites” by focal length. These aren’t in their own fields in the database, however, but rather all within a single column that holds all an image’s metadata, which makes it harder to select on multiple conditions. That’s a trick for another day.
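As a possible variation on the extraction step, the focal lengths can also be pulled straight out of the dumped metadata in the shell, without stripping non-digits wholesale. This is a rough sketch under assumptions not confirmed above: that each value appears as an element like `<exif:FocalLength>50/1</exif:FocalLength>`, and that it is stored as a numerator/denominator pair. `focal_lengths` is a hypothetical helper name, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
#!/bin/bash
# Hypothetical helper: read dumped XMP text on stdin and print one focal
# length in mm per line, dropping anything above a cutoff (default 1000).
# Assumes values look like <exif:FocalLength>50/1</exif:FocalLength>.
focal_lengths() {
  # Pull out each focal-length element, wherever it sits on the line.
  grep -o '<exif:FocalLength>[0-9]*/[0-9]*</exif:FocalLength>' |
    # Strip the opening and closing tags, leaving "numerator/denominator".
    sed -e 's/<[^>]*>//g' |
    # Divide out the fraction and skip implausibly large values.
    awk -F/ -v max="${1:-1000}" '$2 > 0 { mm = $1 / $2; if (mm <= max) print mm }'
}
```

Run as `focal_lengths 1000 < ~/lr-metadata.csv`, this would produce a plain list of numbers that R (or `sort | uniq -c`) could tally. Whether dividing the fraction explains any of the noise mentioned above is a guess, not something checked against a real catalog.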

Long time, no blog

Hey, the blog is still here! It turned out to be a quiet springtime for me, as I conscientiously cut back on blogging while in BelRedSeattle. (I have been taking a lot of pictures — see those for something of a chronology of my Jan-May.) Now that I’m officially no longer attached to any particular institution of higher learning, I’ll be picking up around here a bit, I expect.

Shout outs to my internets buddies.

Also

Getting kind of cobwebby in here…

Embed LaTeX in Wordpress

I don’t run Wordpress, but this is pretty cool: you can embed LaTeX right into your Wordpress posts. (Official announcement at WP.com)

Trondheim bike lift

XXXL

This is a pretty fascinating story: How do all those champion shirts and hats get to the winning Super Bowl team so quickly? And what happens to the pre-manufactured gear prepared for the team that loses?

Distribution is a science. Twelve employees from Reebok and the N.F.L. huddle midway through the fourth quarter and handicap the game. If the score is lopsided, they stalk the sideline of the winning team, keeping the boxes out of sight.

But if the game is close, half the group goes to one side and half goes to the other. Each employee is assigned a star player to outfit. If the Colts win, for instance, someone immediately has to get a shirt and cap to quarterback Peyton Manning. If the Bears win, someone has to find linebacker Brian Urlacher.

The other team’s gear is hustled behind locked doors, to be given to a relief organization that sends them overseas, usually to Africa.

Feelin' fine

Way cool:

(image page at flickr)

We Feel Fine aggregates and provides clicky-feely visualizations of expressions of emotions online, via text found in blogs, flickr pages and google.

I spent a good chunk of today trying to figure out why a single dumb plot was coming out all hinky; these guys have colored affect balls swirling apparently effortlessly around your mouse cursor. I feel inadequate, sure, but I feel wildly enthusiastic, as well. This is cool stuff.

(Via Chris at Ruminate.)

Perfect

Jim Macdonald at Making Light responds to the President’s plan to make insurance premiums tax deductible:

Making yachts 100% tax-deductible won’t give everyone a yacht.


About, the short version

I’m a sociologist. This site is powered by Textpattern, TextDrive and the sociological imagination. For more about me and this site, see the long version.

Syndicate me with any of the following: Atom, RSS, sociology and linklog feeds are available.