Tagging Old Entries

As part of the current upgrade effort, I decided it was time to go through and tag older blog entries, the ones written before the introduction of tags. These entries were categorized, and the filesystem was used to manage the categorization. Each directory represented a category, and each subdirectory a subcategory. The directories were given descriptive names, suitable for use with the pycategories plugin for pyblosxom.

An obvious approach to automating the retagging of these old entries was to use the category hierarchy itself to provide the tags. I wrote a Python program to walk the directory tree and add tags to the old entries.
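A program along these lines might look roughly like the sketch below. It is not the original script. It assumes entries are `.txt` files whose first line is the title, and that tags live in a `#tags a,b` metadata line directly under the title; the function names and the `SKIP` set are hypothetical, chosen for illustration.

```python
import os

# Assumption: categories listed here are too vague to be useful as tags.
SKIP = {"general"}


def tags_from_category(root, entry_path):
    """Return the directory names between root and the entry, minus SKIP."""
    rel_dir = os.path.dirname(os.path.relpath(entry_path, root))
    parts = [p for p in rel_dir.split(os.sep) if p]
    return [p for p in parts if p not in SKIP]


def add_tags_line(text, tags):
    """Insert a '#tags' line after the title, unless some later line
    already starts with '#tags' or there are no tags to add."""
    lines = text.split("\n")
    if not tags or any(line.startswith("#tags") for line in lines[1:]):
        return text
    return "\n".join([lines[0], "#tags " + ",".join(tags)] + lines[1:])


def tag_tree(root, ext=".txt"):
    """Walk root and add category-derived tags to every untagged entry."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(ext):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                original = f.read()
            updated = add_tags_line(original, tags_from_category(root, path))
            if updated == original:
                continue
            stat = os.stat(path)
            with open(path, "w", encoding="utf-8") as f:
                f.write(updated)
            # Assumption: the blog dates entries by file mtime, so put the
            # original timestamps back after rewriting the file.
            os.utime(path, (stat.st_atime, stat.st_mtime))
```

Calling `tag_tree("entries")` on a hypothetical `entries` directory would give `entries/tech/python/foo.txt` the line `#tags tech,python`, while anything directly under `entries/general` would be left alone.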

I chose to ignore the general category, though, because general doesn't provide much information as a tag. I think I might try a more sophisticated approach for those entries, analyzing the content of each entry to choose tags. I haven't worked out all the details yet, but I'm considering building a map from the tagged entries, based on the frequency of certain words or phrases that appear in them, then applying that map to a word histogram of each entry currently categorized as general.
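One way the frequency idea could be sketched is shown below. The details here are assumptions rather than a worked-out design: words are counted per tag, each tag's counts are normalized into relative frequencies, and an untagged entry is scored against each tag by multiplying its own word counts by those frequencies. The names `histogram`, `build_tag_map`, and `suggest_tags` are hypothetical.

```python
import re
from collections import Counter, defaultdict


def histogram(text):
    """Count lowercase words in a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def build_tag_map(tagged_entries):
    """Build {tag: {word: relative frequency}} from (text, tags) pairs."""
    counts = defaultdict(Counter)
    for text, tags in tagged_entries:
        words = histogram(text)
        for tag in tags:
            counts[tag].update(words)
    tag_map = {}
    for tag, counter in counts.items():
        total = sum(counter.values())
        tag_map[tag] = {word: n / total for word, n in counter.items()}
    return tag_map


def suggest_tags(text, tag_map, limit=3):
    """Rank tags by how well their word profile matches the entry's histogram."""
    words = histogram(text)
    scores = {
        tag: sum(n * profile.get(word, 0.0) for word, n in words.items())
        for tag, profile in tag_map.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Only suggest tags that share at least one word with the entry.
    return [tag for tag in ranked if scores[tag] > 0][:limit]
```

As written, very common words such as "the" would score highly for every tag, so a real version would probably need a stop-word list or some weighting that favors words distinctive to a tag.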

Tue, 26 Feb 2008 09:21

Comments: 0

Wishlist

'Tis the morning of Christmas and all through the house, not a creature is stirring, except me and the cats—actually, that's most of the creatures—all the creatures were stirring except Barbara, who's still having visions of sugarplums.

Instead of sugarplums, I had a dream last night about a method for tagging blog entries. Every concept used as a tag should have at least two aspects, one more general, one more specific. For example, to tag an item christmas, the item would receive the tags holiday and christmas. To add the tag season, one might use season and winter. Each tag is an orthogonal axis, and receives several more or less general points along that axis. I don't know if this is a terrible idea or not, but I suspect I should probably read more about taxonomies before I present my recently invented round-thing that I've decided to call the wheel.

I'm intrigued by the way F-Spot uses a tree structure to organize tags. If you have a tag Places, and under that tag, you add a child, Nebraska, and under that, another, Lincoln, when you add the Lincoln tag to a photo, it automatically and transparently behaves as if it also has the Places and Nebraska tags. That seems like a cool approach, though it makes adding a new tag a little more cumbersome: it requires that the tree structure be edited whenever a new tag is added, to put the new tag on the correct branch of the tree. Also, photos exported to Flickr only receive the leaf tags. I'm not sure yet whether that's a bug or a feature.
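The tree behavior is easy to imitate with a parent lookup. The sketch below is only an illustration of the idea, not how F-Spot stores its tags; the `PARENT` table and both function names are made up for the example.

```python
# Hypothetical tag tree: each tag maps to its parent, None marks a root.
PARENT = {
    "Places": None,
    "Nebraska": "Places",
    "Lincoln": "Nebraska",
}


def with_ancestors(tag, parent=PARENT):
    """Return the tag followed by each of its ancestors, nearest first."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = parent.get(tag)  # tags missing from the table count as roots
    return chain


def effective_tags(tags, parent=PARENT):
    """Expand explicitly applied tags to include every implied ancestor."""
    result = []
    for tag in tags:
        for t in with_ancestors(tag, parent):
            if t not in result:
                result.append(t)
    return result
```

With this table, `effective_tags(["Lincoln"])` yields Lincoln, Nebraska, and Places, while the list originally passed in still holds only the leaf tag, which is roughly the distinction the Flickr export seems to make.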

All that is just a digression from the actual subject of this post, which was intended to be a set of Christmas wishes, two front teeth style. So without further ado, here are a few things I'd like to find under the tree this morning.

  • Time and energy to finally debug and fix session.py, which is the only thing keeping comments from working on Get Up 8.
  • Less stuff. As I get older, I'm learning the emptier a room is, the more I like it.
  • More opportunity to practice Japanese.
  • Motivation to work on the house.
  • A physics class. Or cognitive science. Or LIS. Or...
  • Somebody to go to aikido classes with, and maybe trade blows with bokuto.
  • Is it too corny to ask for a world that's more sane, too?

Tue, 25 Dec 2007 09:44

Comments: 0