Text mining is the idea of extracting information from texts and indexing the results. I like to think of it as an electronic concordance (which may or may not be correct, but that's how my analog brain works). It's used a lot in security, but databases rely on it too, because it's basically what lets you search the full text of an article for a term.
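To make the concordance comparison a little more concrete, here is a minimal sketch in Python of the kind of index I have in mind: every word is mapped to the documents, and the word positions within them, where it shows up. The function names (`build_concordance`, `search`) and the two sample "articles" are made up for illustration. Real text-mining tools do a lot more than this (stemming, ranking, handling huge collections), so treat it as a toy, not as how any particular database actually does it.

```python
import re
from collections import defaultdict


def build_concordance(documents):
    """Map each lowercased word to the documents and word positions where it occurs."""
    # index[word][doc_id] -> list of word positions within that document
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        # Crude tokenizer (an assumption): runs of letters, digits, and apostrophes.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            index[word][doc_id].append(position)
    return index


def search(index, term):
    """Return {doc_id: [positions]} for a term, or an empty dict if it never occurs."""
    # .get() avoids creating an empty entry for terms that are not in the index.
    return {doc_id: list(positions)
            for doc_id, positions in index.get(term.lower(), {}).items()}


# Hypothetical sample data, just to exercise the index.
documents = {
    "article-1": "Text mining extracts information from texts.",
    "article-2": "A concordance lists every word in a text.",
}

index = build_concordance(documents)
hits = search(index, "text")
print(hits)
```

Note that this matches whole words only, so searching for "text" does not find "texts". Folding plurals and other variants together is one of the things a real system would have to decide how to handle.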
Recently, some folks decided to mine the Congressional Record (CR). Ars Technica reported on a group of political scientists who wrote a formula to mine the CR, just to see exactly what gets discussed on the House and Senate floors. What did they find? Read their report!
Wednesday, August 09, 2006