I’ve noted recently how slick it is to be able to use nested keywords in Lightroom: It’s a piece of cake to select a set of photos, hit cmd-K, and enter “mount humphries > mountains” to assign the “mount humphries” as a child keyword of “mountains”.
However, as noted by a poster to a thread about keywording over on the Flickr Lightroom group, this creates a potential problem: If the child keyword is already used, Lightroom 2 will end up creating a duplicate keyword; you’ll end up with one “mount humphries” without a parent, and one “mount humphries” keyword with the parent of “mountains.” So, by trying to be hierarchical with your keywording, you’ve actually splintered your keywords. Not helpful!
Having already found a fun way to explore relationships between and frequencies of my keywords, it occurred to me that I might have some ready-made tools to help with this situation: The need to find and deal with duplicate keywords.
—
As this article became more popular, I worked through a couple of alternative methods and organized things a bit further. To date, I describe three methods of identifying duplicate keywords:
- Full auto: Requires some scripting but is the most expedient way to go about it (and my favorite).
- Semi-auto: Requires the awk tool to identify duplicates but doesn’t rely on any sqlite3 code to pull from the LR database.
- Full manual: Uses LR’s built-in export tool and MS Excel to get what you want. Lots of steps, but it works.
After finding your preferred method, read on to see what to do with all the duplicates you identify.
—
The full auto/scripted method
Thanks to Dieter for putting me on to a much quicker, straightforward way to identify duplicates using SQLite only — no awk:
select count(name) as num, name from AgLibraryKeyword group by name having num > 1
I’ve kept the original SQLite+awk method below for posterity:
First I needed to find any duplicates created — potentially — by the ad-hoc nesting of keywords. A quick modification to my keyword frequency script produces a list of dupes:
cp ~/Pictures/Lightroom/Lightroom\ 2\ Catalog.lrcat ~/lightroom.lrdb
/usr/local/bin/sqlite3 -csv ~/lightroom.lrdb 'select ImageCountCache, name from AgLibraryKeyword;' > /Users/alan/lr-keywords.csv
awk -F , '{print $2}' lr-keywords.csv | sort -n | uniq -d
rm ~/lightroom.lrdb
rm ~/lr-keywords.csv
As with the frequencies, this short script makes a backup copy of my Lightroom 2 database, then calls on sqlite3 to extract the list of keywords; the difference is in the 4th line, where I use awk to pull out the tags and then pipe them through two built-in unix functions to print a list of any duplicates. In my case, it yielded the following:

Perfect! A list of keywords that appear to be duplicates.
Note: The script above works with OS X Tiger and requires an upgrade to the default version sqlite. It ought to work with Leopard as long as sqlite is present. Windows? Don’t know; either via cygwin or separate binaries, awk and uniq should be available for Windows, and there is a sqlite for Windows download at the above link.
Skipping the sqlite step
Update 17/Dec/2008: If you’re not eager to delve into sqlite, you can still make this work, but you’ll still need to have the awk tool. OS X users, you should be good to go, since awk comes with the OS; Windows users, you can download awk for Windows. First, manually export your keywords list from your catalog: Metadata > Export Keywords…, and save the file as lrkeywords.txt, and then run the following one-liner script from a shell/terminal:
awk 'BEGIN {FS = " "}; {$2 = $2; if (match($0, /{/)0) print}’ lrkeywords.txt | sort -n | uniq -d
Just as in the sqlite version, this one-liner parses your keywords file and returns the list of keywords that appear more than once and are not identified as synonyms. You can then reconcile the duplicates as described above. I prefer the single-step version that extracts directly from the database, but hope this is useful to a few folks.
The Full Manual Process
If you’re averse to both the sql steps and to using awk, you can use MS Excel to identify the duplicates. I think it’s far more cumbersome than either of the above processes, but it works. Here goes:
Manually export your keywords from Lightroom: Metadata > Export Keywords ….
Open that export file in Excel: Open as a text delimited file, but uncheck all the delimiters; you don’t want excel to parse along spaces or tabs, since both of those characters appear in the file but not as record separators.
You should see a single column of keywords something like this:

We need to slightly clean up that column to remove leading whitespace. We’ll use the CLEAN function to do that: In the column next to your keywords column, enter the formula =clean(a1), and then drag that formula all the way down the keywords column.

In col B, you now have a whitespace-trimmed set of keywords, but because of the clean() formula, you can’t manipulate it further. Select that entire column, copy it, and then use paste special to paste the column as values into column C.

Now we’re ready for the final steps: Sort, flag, and filter. Select column C, then go to Data > Sort, and sort ascending by column C. Now, in column D, drag the following formula from the first row to the last: =IF(C2=C1, "!", "").

That just fills the column with a flag if column C has a duplicate. It’s a low-budget search, but it works as long as the list of keywords is sorted. Finally, use Data > Filter > Autofilter, and click-select the ! in column D. You’ll now have a filtered list of duplicates from your original keyword list, which you can resolve as described above. Note that you’ll have a number of keywords surrounded by {} or [] brackets; these are keywords entered as synonyms or categories, and you should ignore them when you are addressing duplicates.

You can see from my current duplicate list that I’ve been working heavily on food-related keywords as we cruise through the holiday season.
All told, that whole manual process should just take a couple of minutes once you have the steps sorted out. Because the steps are manual, it’s not as easily-repeatable as the automated sqlite+awk approach, but it does work. I hope someone finds it useful!
Dealing with the duplicates
Whatever method you’ve employed, at this point you have a list now — let’s check out if it means what I think it means. Switching back over to Lightroom, I can filter for all photos with the “mount humphries” keyword:

Sure enough, I have 12 images tagged with “mount humphries”, and 11 images with the same tag set as a child of “mountains” (as an aside, I see that I have well over a hundred images with the “mountains” tag that could probably use some more granular tagging).
My first impulse was to try to just drag the non-child “mount humphries” into the “mountains” tag; this works, after all, with other keywords. But in this case, it won’t do the trick, presumably because there is already a “mount humphries” keyword there — Lightroom won’t let me add a same-named child.
To reassign the keyword to the parent, you need to take a few more steps: First, click the right-pointed arrow to the right side of the duplicate, non-child keyword; this will navigate to all images assigned that keyword. Then, in the grid view, simply select all (cmd-A), and then check the child keyword to add it to all of the selected images (the checkbox is to the left of the keyword, and appears when the mouse cursor hovers over the keyword), and you’ll see the count increase accordingly. Next, un-check the duplicate, non-child keyword in the keywords panel. You’ll see its count drop to 0. The order of those two check-uncheck steps is important: If you uncheck the non-child keyword first, you’ll end up with an empty selection and nothing to apply the proper keyword to.

There! My “mount humphries” keyword as a child of “mountains” is now assigned to all 23 original images, and I can delete the duplicate, non-child tag.
So, with an approach like this, ad-hoc keyword nesting shouldn’t be feared: We can identify duplicates created by nesting, and, in a matter of seconds, apply the same nesting to any previously-tagged images. And, once you’ve resolved the duplicate, any further assignment of the focal keyword will always assign it, appropriately, as a nested tag. Pretty slick, I do say.
I spend a lot of time in Lightroom 2 these days. I’m nobody’s pro, but I shoot a lot of photos, and after having used Lightroom (and now Lightroom 2) for a while now, I think I have a pretty good, simple, enthusiast-style workflow sorted out. I’ll summarize the workflow itself (importing through working up images) in follow-up post. Here are a few general tips that seem to work well for me:
Essentials, or Stuff I use constantly: I use Picks and keywords extensively. Reviewing newly-imported photos, I mark anything that I like right off the bat as a Pick by simply hitting shift-P as I scan through the gallery (and shift-X to immediately mark others as rejects; the shift modifier will mark the current photo and move on to the next shot). As I revisit a set of shots later, I find myself repeating this process; while those subsequent passes primarily identify further Rejects, I do occasionally find more Picks after starting to work up other photos. After each pass through a gallery, I use cmd-DELETE to remove (and delete) all the Rejects.
This has been a nice insight for my process: It means that I am fairly conservative when it comes to Rejects. That is, I don’t mark as Rejected 1) unless a photo is obviously bad (bad focus, blur, composition I really dislike, etc.) OR 2) until I’ve spent some time on photos in a set that I do like right from the get-go. This frequently helps give me a sense for appealing qualities of photos that I might not have noticed or thought of initially.
With a gallery through at least a first pass of identifying Picks and Rejects, I apply keywords. As with many aspects of processing photos, Lightroom has lots of ways to do this. There’s a jobber called Keyword Painting that I don’t use, because it’s always been much faster to simply select sets of photos and then apply keywords to the selection. In Lightroom 2, cmd-K focuses on the keyword entry box, which will auto-complete as you type. Lightroom 2 also has “recommended keywords” functionality, so that as keywords are assigned to a photo or set of photos, a new set of co-occurring keywords is identified and displayed for easy additional assignment.
Although I like to use a large-ish image preview (hit = to increase the size of preview images in the gallery grid) for screening for Picks and Rejects, for keywords I like to shrink the grid size (keyboard shortcut -). This fits more images into the grid and allows me to select larger sets for group assignment of keywords.
Lightroom allows for keywords to be nested, and there’s a great shortcut for accomplishing this: When entering keywords, separate child from parent keywords with a > sign: flickers > birds, for example, or burgers > food.
Simplifying, or Things I don’t use in Lightroom: Beyond keywording, Lightroom has at least a trio of way to identify and categorize photos: You can flag photos as Picks, label them with colors, and rate them with zero through five stars. I don’t use colors or stars at all. They may be highly useful for some situations, but they just clutter the cognitive space where I think about my photos: “Is this a three-star green photo, or a four-star blue one?” So except in the rarest circumstances, I haven’t yet found a use for ratings and color labels.
Indispensable keyboard shortcuts: There are grundles of these, but the shortcuts I use all the time are:
- G, E, D: Gallery, Editor, and Develop modes
- P: Mark as Pick (modify with shift)
- X: Mark as Reject (modify with shift)
- cmd-delete: Remove Rejects (optionally delete from disk)
- cmd-K: Assign keywords
- W: Jump to White Balance selector in Develop mode
- R: Crop tool in Develop mode
- J: Show clipped darks and highlights (Developer only; in gallery, changes display of thumbnails)
- L: Cycle the lights (view on black)
- tab/shift-tab: reveal/hide menu panels
Next time: The library filter, file organization, workflow, and Lightroom+Flickr?
Update 7/26/2009: There are some fantastic new features of LR2/Mogrify to check out:
- Relative-sized borders (which I’ve written about before) are super.
- As is relative-sized annotation text. No more calculating your scaled text size to apply to an export, since you can set the size to a percentage of image width or height. Very cool. (This goes for watermarks, too)
- More border features! A checkbox for “identical borders” makes it much easier to set uniform borders. And inner borders are now supported for the creation of an inset frame with variable opacity.
- Setting compression by file size: Specify a file size and LR2/Mogrify will compress the image accordingly so as not to exceed that file size.
Give these a try and don’t forget to donate to Timothy’s work if you find it of use.
Also, note that you can use Lightroom’s post-crop vignette feature to generate curved borders (in black or white) with less flexibility but also without plugins.
—
Update 12/21/2008: Much of this writeup now has more historical than practical value, since Timothy Armes has updated LR2/Mogrify to support multiple border options within the plugin’s own control panel. You can specify different-sized frames & borders without any extra monkeying around. Nice work, Tim!
—
A question recently came up in one of the Lightroom groups over on flickr about creating images with large borders on just one side — space within a frame to place a title, for example, but just along a single edge of an image. The poster wanted to create images such as those found here, and wondered if it was doable without diving out to an external tool like Photoshop. The first working proposal was to use a graphical frame applied in the print module, but that isn’t an ideal solution for me; it still requires setting up that frame with something like Photoshop, and to apply it you have to switch modes. So I tinkered a while with a photo I took a couple of nights ago, and managed to get what I think is a nice solution via a direct export from Lightroom using the fantastic LR2/Mogrify plugin from Timothy Armes.
The out-of-the-box options for this plugin don’t provide a capability to create different-sized borders to an image, but the underlying engine for the plugin, Imagemagick’s mogrify tool, does — after a fashion. So, in a nutshell, the trick is to use the command line element of LR2/Mogrify in addition to its other features, to add to the picture’s canvas size before performing the other operations.

There are just a couple of tricks to get this to work smoothly using LR2/Mogrify. First is to add the extent command to the mogrify configuation, specifying the resulting size of the image you want to export:

I’ve specified the command -background white -extent 3008×2158 to be prepended to the mogrify command line that LR2/Mogrify will execute for me. I’m exporting an original image that’s 3008×2008, so I’ve specified 2008×2158 to the final image — adding 150 pixels, which will be filled with a white background. Next I use the built-in features of LR2/Mogrify to add the colored frames and the text overlay.

Because the extent command was applied to the beginning of the command line, the borders will be applied to the new image — the one with the bigger lower border created by extent.

The text overlay is just a bit strange. Note that instead of specifying the text to fall at the bottom, I’ve placed it at the top center of the image, with an offset of 2158 pixels. For some reason, directly placing it at the bottom center reverses the position of the new white border — it ends up at the top of the image, through some kink of mogrify that I can’t quite sort. It’s easy enough to compensate with the offset.
Export away, and that’s all it takes. You’ve built an image with a nice broad frame and caption, all right from Lightroom’s export panel. No Photoshop or print module necessary. Fun.