Thrift Officially an Apache Incubator Project

June 12th, 2008

Followup: Best time to post (howto)

May 6th, 2008

Thursday at Noon is the best time to post and be noticed (PST)

May 2nd, 2008

It’s happened to me a few times: I stay up late working on a great post and finish at 1am EST. In a rush of excitement I decide to submit it to reddit or del.icio.us and go to bed fully expecting to see it on the front page of their sites the next morning. Of course this rarely happens… so, being a programmer, I figured I should do some analysis on the best time to post.

My approach was simple: look at the times of day and days of the week that have the most popular posts. To define popularity I used AideRSS’s PostRank™.

 PostRank™ is a scoring system that we have developed to rank each article on relevance and reaction.   PostRank ranges from 1-10. 

Using the AideRSS feed API, I fetched the last 10,000 posts on delicious, digg, reddit, and mixx, threw them into R, and plotted the number of posts with PostRank > 6 by weekday and by hour of day.
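For anyone who wants to try the same kind of counting without R, here is a minimal Python sketch of the bucketing step. It is not the script I actually ran: the sample records are made up, and the `popular_counts` name and the (timestamp, score) tuple layout are assumptions for illustration. It keeps hours in GMT and only counts posts scoring above 6.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sample records: (publish time in GMT, PostRank score).
# Real data would come from a feed API; these values are invented.
posts = [
    (datetime(2008, 5, 1, 19, 0, tzinfo=timezone.utc), 8.2),
    (datetime(2008, 5, 1, 20, 30, tzinfo=timezone.utc), 7.1),
    (datetime(2008, 4, 29, 19, 15, tzinfo=timezone.utc), 6.5),
    (datetime(2008, 5, 3, 9, 0, tzinfo=timezone.utc), 3.0),
    (datetime(2008, 5, 4, 6, 0, tzinfo=timezone.utc), 6.0),
]


def popular_counts(posts, threshold=6.0):
    """Count posts scoring above `threshold` by weekday and by GMT hour."""
    by_weekday = Counter()
    by_hour = Counter()
    for published, score in posts:
        if score > threshold:
            # datetime.weekday() returns 0 for Monday through 6 for Sunday.
            by_weekday[WEEKDAYS[published.weekday()]] += 1
            by_hour[published.hour] += 1
    return by_weekday, by_hour


by_weekday, by_hour = popular_counts(posts)
for day in WEEKDAYS:
    print(day, by_weekday[day])
for hour in sorted(by_hour):
    print(f"{hour:02d}:00 GMT", by_hour[hour])
```

From there the two counters can be handed to whatever plotting tool you like to get bar charts by weekday and by hour.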

*NOTE* Hours are displayed in GMT

Best time of day and week to post

It’s pretty clear that Tuesday through Friday, between 10am and 2pm PST, are the “hot times” for popular blog posts.

Now, I didn’t filter out non-English posts, and this doesn’t account for the time it took for the posts to get to the front page of these sites, but I do think it’s clear that posting late at night, on the weekend, or on Monday is a bad idea. Your post will most likely go unnoticed.

jake analysis, blogging, marketing

100 Most Popular Words Twittered This Week

April 24th, 2008

Earlier this week I showed how to create a twitter search with thrudb.

The service has been running for about a week now and has collected over 8 million tweets. I’ve run some stats on the Lucene db, and these are the top 100 words with more than 3 letters :)
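If you want to run a similar count over your own pile of tweets, here is a rough Python sketch. To be clear, my numbers came from the Lucene index, not from this code: the `top_words` function, its parameters, and the sample tweets are all hypothetical stand-ins, and the tokenizer is a plain lowercase a-z regex rather than anything Lucene does.

```python
import re
from collections import Counter


def top_words(texts, n=100, min_len=4):
    """Return the n most common words with at least min_len letters."""
    counts = Counter()
    for text in texts:
        # Crude tokenizer (an assumption): lowercase runs of a-z only.
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) >= min_len:  # "more than 3 letters"
                counts[word] += 1
    return counts.most_common(n)


# Made-up sample tweets, for illustration only.
sample = [
    "Just had great coffee",
    "coffee is great for coding",
    "the cat sat",
]

for word, count in top_words(sample, n=5):
    print(word, count)
```

Short words like "the" and "had" drop out because of the length cutoff, which is a cheap substitute for a real stop-word list.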

jake lucene, thrudb, twitter

Roll your own real-time twitter search with thrudb

April 21st, 2008

I’ve been using twitter quite a bit lately and really like the simplicity of the service and API. One thing that’s missing, though, is search, but there are some great sites like tweetscan and summize that let you search public tweets in close to real-time.

I decided indexing twitter would be a great application for thrudb, specifically the thrudex service. Thrudex is essentially a Thrift service for CLucene with some special sauce added. If you’d like to read about the inner workings, read this.

Anyway, I whipped up a demo (in perl) for a realtime twitter search and have indexed a few days of tweets (over 3 million!). Check it out here.

tweetsearch.gif

One of our regular contributors, Thai Duong, was kind enough to port it to python+django for you new school folks.
*Note* This is running on a single dev box, so be forgiving… It’s currently polling the public timeline feed, so it’s not going to catch every tweet, but it captures ~85%.
We’ve added the code as a tutorial for thrudb here. Take it and build your own service… Any takers on building a ruby version or a cross-site social search aggregation à la friendfeed?

jake thrudb, twitter