schussman.com

Keywords in Lightroom

The last time I wrote about Lightroom, I was using sqlite to pull out frequencies of focal length. This time it’s keywords: Lightroom lets you build any number of custom keyword sets to apply to photos. It automatically builds a set of “recently used” keywords, but I thought it would also be handy to have a set of my most commonly used keywords. While Lightroom has a command to export a list of keywords, that list doesn’t include frequencies. Keywords are stored in Lightroom in a table called AgLibraryTag. Conveniently, Lightroom writes a count of each keyword to the same table (in the ImageCountCache column), so it’s easy to get out all the information we need. (Note: The frequency in this table is a cached value and may not reflect the up-to-the-minute reality within your database. Rather than constantly update its databases, Lightroom seems to update this count when you view the individual keywords. I’m not sure how to force a library-wide update of all keyword counts. This is probably close enough, and is simpler/quicker than counting keywords image-by-image.)

Rather than run this data through R to build a histogram as I did with the focal length data, I just use awk this time to make a list with the most frequently-used tags at the bottom. With this list, you can easily build a corresponding tag set in Lightroom.

Remember to change paths to suit, and that (on OSX) you’ll probably need to upgrade your version of sqlite for all this to work. Also, always always work from a copy of your database as this script does.

# display a sorted list of lightroom keywords
cp ~/Pictures/Lightroom/Lightroom\ Catalog.lrcat ~/lightroom.lrdb
/usr/local/bin/sqlite3 -csv ~/lightroom.lrdb 'select ImageCountCache, name from AgLibraryTag where kindName="AgKeywordTagKind";' > ~/lr-keywords.csv
awk -F , '{print $1" "$2}' ~/lr-keywords.csv | sort -n
rm ~/lightroom.lrdb
rm ~/lr-keywords.csv
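If the full list is longer than you want, the awk-and-sort step can be wrapped in a small shell function that prints only the most-used keywords. This is just a sketch: the function name top_keywords and the sample rows are made up for illustration, and like the awk call above it assumes keyword names contain no commas.

```shell
#!/bin/bash
# Hypothetical helper: read "count,name" CSV rows on stdin and print the
# N most frequently used keywords, most-used last (as in the script above).
# Assumes keyword names contain no commas.
top_keywords() {
  local n="${1:-20}"
  awk -F , '{print $1" "$2}' | sort -n | tail -n "$n"
}

# Made-up sample rows standing in for the sqlite3 output.
printf '%s\n' '12,arizona' '3,cactus' '47,tucson' '8,monsoon' | top_keywords 2
```

On the sample rows this prints "12 arizona" and then "47 tucson". With a real catalog you would feed it the CSV file written by sqlite3 instead of the printf line.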

Daydream: A map of how keywords relate to one another would be awesome.

Photo tool nrrdery

Update 6/27/2007: Lightroom v1.1 is out (it’s real, and it’s spectacular, see the O’Reilly Lightroom Blog for more), and it changes the database from a “library” to a “catalog.” In terms of this little tool, this change seems to only entail changing the filename referred to in the wrapper shell script — as I’ve done, below. Otherwise, generating a focal length histogram seems to work just as it did previously.

Camera Nrrdery

For fun: You can see that I use my fixed 50mm and 21mm lenses far more than anything else I’ve got. That’s because they’re so very pretty.

I use Adobe’s Lightroom to manage my RAW photos. It’s a wonderful, splendid tool. Among its features, it provides a handy metadata browser of your photo library, and includes the ability to browse by lens. Recently James Duncan Davidson mentioned being interested in plotting his use of various focal lengths, and commenters responded with a number of good solutions. Since Lightroom uses a SQLite database for its library, tools like SQLite Browser can be used to scan through the database file itself and export tables, at which point it’s straightforward to grep and find focal lengths. This is pretty slick all by itself, but I thought I’d put together a quick tool to automate the extraction and generation of this data. To do that, I use sqlite3 from the command line to dump the metadata table to a file, and then a short bit of R code finds the focal lengths and builds the histogram. The sqlite3 commands and the R code are invoked via a shell script that makes a copy of the main database to work with and cleans up the temp file when it’s all done.

Howto

If you made it this far, you might actually be interested in how it’s all done. After some tinkering, I found from Jeffrey Friedl’s Blog that Lightroom’s current database needs a newer version of sqlite3 than that which ships with OSX. With that update installed, sqlite3 will handle your Lightroom database without any problems.

Here’s the shell wrapper. Change paths to suit:

#!/bin/bash
cp ~/Pictures/Lightroom/Lightroom\ Catalog.lrcat ~/lightroom.lrdb
/usr/local/bin/sqlite3 -csv ~/lightroom.lrdb 'select xmp from Adobe_AdditionalMetadata;' > /Users/alan/lr-metadata.csv
R CMD BATCH /Users/alan/bin/lr-getfocallengths.R
rm ~/lightroom.lrdb
rm ~/lr-metadata.csv
convert ~/lr-focallengths.pdf ~/lr-focallengths.jpg
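Since the whole point of the wrapper is to never touch the real catalog, the copy-then-clean-up pattern could also be pulled into a function of its own. The following is only a sketch under a few assumptions: the names with_catalog_copy and dump_xmp are made up, mktemp is assumed to be available, and the copy is not cleaned up if the script is interrupted partway through.

```shell
#!/bin/bash
# Hypothetical helper: copy a catalog to a temporary file, run a command
# with the copy's path appended as its last argument, then delete the copy.
# The original catalog is only ever read by cp.
with_catalog_copy() {
  local catalog="$1"; shift
  local copy status
  copy="$(mktemp "${TMPDIR:-/tmp}/lightroom.XXXXXX")" || return 1
  cp "$catalog" "$copy" || { rm -f "$copy"; return 1; }
  "$@" "$copy"
  status=$?
  rm -f "$copy"
  return "$status"
}

# Hypothetical wrapper around the same query used above; $1 is the copy.
dump_xmp() {
  /usr/local/bin/sqlite3 -csv "$1" 'select xmp from Adobe_AdditionalMetadata;'
}

# Example (change paths to suit):
# with_catalog_copy ~/Pictures/Lightroom/Lightroom\ Catalog.lrcat dump_xmp > ~/lr-metadata.csv
```

The function returns the exit status of the command it ran, so a failed query can still be noticed by whatever calls it.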

And here’s the R code, which lives in lr-getfocallengths.R and is called by the shell script. Again, fix paths for your own circumstances:

lr <- file("/Users/alan/lr-metadata.csv", "r")
lrlines <- readLines(lr)
temp <- gsub("(/1)", "", lrlines[grep("exif:FocalLength>", lrlines)])
lengths <- as.numeric(gsub("([^[:digit:]])", "", temp))
lengths<-lengths[lengths<=1000]
pdf("/Users/alan/lr-focallengths.pdf")
hist(lengths, main="Histogram of Focal Length Use",
   xlab="Focal length (mm)", ylab="Number", breaks=seq(0,200, by=4))
dev.off()

A few things to note:

  • Depending on how your version of R is compiled, you can use jpeg(…) instead of pdf(…) to make the output file. My R isn’t currently compiled with jpg support, so I build a pdf file and then use convert on it.
  • There’s some noise in the metadata that leads to the erroneous identification of focal lengths like 83456000. That’s not right at all. I skim off everything above 1000 in line 5 of the R code. (Which is still sort of silly. My longest lens is presently 200mm.)
  • Relatedly, the x axis of the histogram only goes up to 200. To change that, modify the seq(0,200, by=4) accordingly — you can change the upper bound as well as the width of the bins.
  • A really slick way to do all this would be to properly parse the exported table so that fields can be combined, for example to plot focal lengths for “favorites” only. These values aren’t in their own fields in the database, however, but rather all sit within a single column that holds all of an image’s metadata, which makes it harder to select on multiple conditions. That’s a trick for another day.
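For a quick look without R, a rough tally can also be made in the shell. This sketch assumes, as the patterns in the R code suggest, that focal lengths show up in the dump on lines like <exif:FocalLength>50/1</exif:FocalLength>. The function name focal_length_counts and the sample lines are made up, and it only counts values written with a denominator of 1, skipping anything else rather than guessing at it.

```shell
#!/bin/bash
# Hypothetical helper: read the metadata dump on stdin and print
# "count focal-length" pairs, most-used focal length last.
# Only whole-millimetre values written as N/1 are counted.
focal_length_counts() {
  sed -n 's|.*exif:FocalLength>\([0-9][0-9]*\)/1<.*|\1|p' \
    | sort -n | uniq -c | sort -n | awk '{print $1" "$2}'
}

# Made-up sample lines standing in for the dumped metadata.
printf '%s\n' \
  '<exif:FocalLength>50/1</exif:FocalLength>' \
  '<exif:FocalLength>21/1</exif:FocalLength>' \
  '<exif:FocalLength>50/1</exif:FocalLength>' \
  '<exif:FocalLength>200/1</exif:FocalLength>' \
  '<exif:FocalLength>50/1</exif:FocalLength>' \
  '<exif:FocalLength>21/1</exif:FocalLength>' \
  | focal_length_counts
```

On the sample lines this prints "1 200", "2 21" and "3 50", one pair per line. It is no histogram, but it is enough to see which lenses get the most use.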

Posting other people's hints

I found myself today with the desire to incorporate a bit of watermark-ish text and graphic into a LaTeX document. The regular graphics commands don’t work very well with this, so I went googling and found the code below.

Note that the following isn’t my tip, but the page that references the code points to a 404 link. I tracked down the google cache to get the original link back out and thought I’d share.


\usepackage{eso-pic}
\usepackage{color}
\usepackage{graphicx} % provides \rotatebox, used below
\usepackage{type1cm}
\makeatletter
  \AddToShipoutPicture*{%
    \setlength{\@tempdimb}{.5\paperwidth}%
    \setlength{\@tempdimc}{.5\paperheight}%
    \setlength{\unitlength}{1pt}%
    \put(\strip@pt\@tempdimb,\strip@pt\@tempdimc){%
\makebox(0,0){\rotatebox{45}{\textcolor[gray]{0.75}
{\fontsize{5cm}{5cm}\selectfont{Draft}}}}
    }
}
\makeatother

Change the final makebox statement to include any text or command that you want to be stamped on the page. Also, remove the asterisk from the opening AddToShipoutPicture command to have your image stamped on every page, rather than just the first.

TextMate GTD

Haris Skiadas, who has made massive contributions to writing in LaTeX with TextMate (see for example his screencasts of good use of the LaTeX bundle), has put together a super TextMate GTD bundle. Haris has been hacking on it nearly continuously for several days—I confess to having harassed him significantly throughout development so far—and the bundle is a fully-capable GTD system: You can work with a single document, or as many as you want, can easily move projects around, add tasks, and add and modify contexts. The bundle has a number of commands to generate Next Actions lists, and it will archive completed tasks/projects to a separate log file.

Up until now, I’ve been using the Kinkless GTD system. Lately, however, that software began to feel a little cumbersome, a little too cognitively heavy and opaque. Since it lives in TextMate, Haris’s GTD bundle works with pure text, making it highly extensible, and it works great (it even knows how to convert your Kinkless document to its own format). Today Haris capped it off with a script that filters an inbox (fed via Quicksilver) into your GTD documents. Seriously cool. I highly recommend giving it a try if you’re using either TextMate or GTD (or need an excuse to give either one a test drive).

New-to-me Spotlight feature

I’ve spent the last several days digging information out of a set of files, essentially coding variables from a large group of archives. Previously, I’d only used Spotlight occasionally, but for this kind of data digging, Spotlight really, well, shines. Calling up Spotlight and telling it what I’m looking for brings up a short list of relevant files. That’s not new, but what was new to me is what Spotlight does next: When the file is opened, simply hitting CMD-G (for Find Again; works great in text files, Safari, Firefox, Mail) takes me right to the correct section within the file. This won’t work so well if the search string appears lots of times in the file, but if the file is a long list of mostly-unique records, it works great. I’ve used Spotlight’s similar ability to navigate directly to a reference in a PDF, but this was new to me, and slick. Even when the right document is already open, it is often far easier to invoke Spotlight and enter the search terms than to switch to the right application (which often means finding it in a long list of apps) from where I’m entering variables. All day long, it’s like Spotlight is reading my mind.

A wiki without the wacky

... That’s what some are calling Backpack, the newest web tool from the people who brought you Basecamp and Ta-da List. I haven’t used either of those, as they were released just as I was trying out yet another organizational scheme of my own, but I am intrigued by Backpack’s promise of free-form but still useful organization. Per the “wiki without the wacky” tagline, I’m assuming that Backpack will use some simplified wiki-like syntax to denote to-do lists and flag information of particular kinds. Sounds pretty neat.

Update: Hey, cool. I got a “golden ticket” to give Backpack a try (thanks to Jason Fried, and, I presume, technorati tags), and have been poking around with it for fifteen minutes or so. First impression: Pretty neat! I’ll write a bit more later after I make some progress on the syllabus revision that’s due tomorrow.

Google does maps

I was going to make this a simple link, but played around with Google Maps for a few minutes and found it too cool for a one-liner. As the top-level page of Google Maps suggests, you can search for directions, businesses, and locations. Try searching for “sushi in Tucson”—bingo! There’s a map with locations plotted on it. Another click gives you directions to or from any location. Mapquest isn’t nearly this slick.

Enter “tucson to flagstaff” in the directions search, and up comes the map straightaway; there’s no tabbing between all the address/city/state fields. If you know airport codes, enter them directly, to get directions from TUS to Ventana Canyon (for your golf weekend), for example. Want more detail about that freeway on-ramp? Click on any of the numbered waypoints for a close-up.

But it gets cooler. Find your location, and then click-drag on the map. It moves. Finer control? Use the arrow keys. See more in the tour.

Update: Hublog weighs in, saying that while Google Maps is pretty cool, “it still doesn’t beat the flying, zooming Java beast that is map24.” Java beast indeed, but the zoom is way cool.

Cartography

I spent a good chunk of this last weekend at Heather’s office, where they have chairs and desks purchased within the last decade—quite a difference from my own bomb-shelter steel desk at school. While there, I watched Heather do some of her work, which this weekend mostly consisted of creating and analyzing some GIS data.

Let me say that I consider myself relatively computer-savvy. I pick up concepts pretty quickly, can do some programming of a few different types, and feel comfortable with lots of different software on several platforms. So, I like to think that, you know, I get computers. But watching Heather work in GIS, well, I see that I just don’t get GIS. At all.

GIS is powerful. If you’re someone like a grass and fire ecologist, you can use it to map historical fire activity and predict the scope of future fires by hooking your GIS data into a statistical package. You can plot presence points of invasive species and build models of their future distribution. You can make really bitchin’ maps. But you first have to get it.

GIS, first of all, has this immense and complex vocabulary descended from various ways of making and describing maps, which is then crossed with technological vocabulary for rendering data: Topologies, cartographic mapping, vector- and point-based data. After a long day of cranking GIS, Heather routinely comes home and—after telling me what is wrong with the swamp cooler—speaks this foreign mapping language for thirty or forty minutes at a stretch. On top of all this conceptual complexity is the fact that the software—all of it, as far as I can tell—basically defies intuition. You’re thinking, “Hey, I like maps. They’re neat.” Well just try downloading GRASS GIS, a mature, open source GIS package, and see if you can do something with it.

The same goes for the molto-expensive GIS packages made by folks like ESRI, who seem to own much of the commercial GIS market. For example, one problem (and this reminds me of the balkanization of various open source projects) is that the complex vocabulary that I referred to earlier changes with your software package. What ESRI’s ArcView calls “themes,” ESRI’s ArcInfo calls “layers.” The same company’s package calls something by two different names! Ack. It’s madness, and try as I might, I still can’t decipher just what Heather does all day. “Maps,” I tell people. “Ah, stuff with maps.”


About, the short version

I’m a sociologist. This site is powered by Textpattern, TextDrive and the sociological imagination. For more about me and this site, see the long version.