Roll your own real-time twitter search with thrudb
I’ve been using twitter quite a bit lately and really like the simplicity of the service and API. One thing that’s missing, though, is search, but there are some great sites like tweetscan and summize that let you search public tweets in close to real-time.
I decided indexing twitter would be a great application for thrudb, specifically the thrudex service. Thrudex is essentially a Thrift service for CLucene with some special sauce added. If you’d like to read about the inner workings, read this.
Anyway, I whipped up a demo (in perl) for a real-time twitter search and have indexed a few days of tweets (over 3 million!). Check it out here.
One of our regular contributors, Thai Duong, was kind enough to port it to python+django for you new school folks.
*Note* this is running on a single dev box, so be forgiving… It’s currently polling the public timeline feed, so it’s not going to catch every tweet, but it captures ~85%.
We’ve added the code as a tutorial for thrudb here. Take it and build your own service… Any takers on building a ruby version or a cross-site social search aggregation à la friendfeed?
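If you’re wondering why polling the public timeline misses some tweets, here’s a rough python sketch of the idea. This is not the tutorial code and it doesn’t touch the real thrudex or twitter APIs: `fetch_timeline` and `index_tweet` are hypothetical callables you’d supply yourself (one returning the latest batch of public tweets as dicts with an `id`, the other handing a tweet to whatever index you use). The loop just de-duplicates by tweet id so overlapping batches aren’t indexed twice.

```python
import time
from typing import Callable, Dict, Iterable, Set


def poll_once(
    fetch_timeline: Callable[[], Iterable[Dict]],
    index_tweet: Callable[[Dict], None],
    seen_ids: Set[int],
) -> int:
    """Fetch one batch, index the tweets not seen before, return how many were new."""
    new_count = 0
    for tweet in fetch_timeline():
        tweet_id = tweet["id"]
        if tweet_id in seen_ids:
            continue  # already indexed on an earlier poll
        seen_ids.add(tweet_id)
        index_tweet(tweet)
        new_count += 1
    return new_count


def run_poller(
    fetch_timeline: Callable[[], Iterable[Dict]],
    index_tweet: Callable[[Dict], None],
    polls: int,
    interval_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll a fixed number of times and return the total number of tweets indexed."""
    seen_ids: Set[int] = set()
    total = 0
    for _ in range(polls):
        total += poll_once(fetch_timeline, index_tweet, seen_ids)
        if interval_seconds > 0:
            sleep(interval_seconds)
    return total


if __name__ == "__main__":
    # Fake batches standing in for the public timeline (assumed shape).
    # Tweet 4 is posted and scrolls off between polls, so it is never seen.
    batches = iter([
        [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}],
        [{"id": 2, "text": "world"}, {"id": 3, "text": "thrudb"}],
        [{"id": 5, "text": "search"}, {"id": 6, "text": "index"}],
    ])
    indexed = []
    total = run_poller(lambda: next(batches), indexed.append, polls=3)
    print(total, [t["id"] for t in indexed])
```

Anything that appears and drops off the feed between two polls never shows up in a batch, which is why a poller like this can only approximate full coverage. A long-running version would also want to cap the size of `seen_ids` rather than let it grow forever.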





April 23rd, 2008 at 7:37 am
[…] THIRD RAIL » Blog Archive » Roll your own real-time twitter search with thrudb (tags: twitter thrudb search tutorial) […]
April 24th, 2008 at 11:59 pm
[…] « Roll your own real-time twitter search with thrudb […]
May 2nd, 2008 at 11:56 am
[…] happened to me a few times; I stay up late working on a great post and finish at 1am EST. In a rush of excitement I decide to submit it to reddit or del.icio.us and goto bed fully […]