Wednesday, August 09, 2006

that's so meta!

Text mining is the idea of extracting information from texts and indexing the results. I like to think of it as an electronic concordance (which may or may not be correct, but that's how my analog brain works). It's used a lot in security, and databases rely on it too, because that index is basically what lets you search for a term across the full text of an article.
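To make the concordance idea a bit more concrete, here's a minimal sketch in Python. It's my own toy illustration, not how any particular database does it: the function names `build_index` and `search` and the two sample sentences are made up for the example. It records where every word occurs, then uses that record to answer a full-text search.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the (doc_id, word_position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        # Crude tokenizer (an assumption for this sketch): runs of letters, digits, apostrophes.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            index[word].append((doc_id, position))
    return index


def search(index, term):
    """Return the sorted ids of the documents that contain the term."""
    return sorted({doc_id for doc_id, _ in index.get(term.lower(), [])})


# Hypothetical sample texts, invented for illustration.
documents = {
    "a": "The Senate debated the budget.",
    "b": "The House passed a budget resolution.",
}

index = build_index(documents)
print(search(index, "budget"))   # ids of every document mentioning "budget"
print(index["senate"])           # concordance-style entry: which document, which word position
```

Real systems add a lot on top of this (stemming, stop words, ranking), but the core is the same: do the reading once, keep the index, and look terms up instead of rescanning every article.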

Recently, some folks decided to mine the Congressional Record. Ars Technica reported on a group of political scientists who wrote a formula to mine the Record, just to see exactly what gets discussed on the House and Senate floors. What did they find? Read their report!
