May 6th, 2008
Just a quick note to show how to use WordPress to publish a post at a set time. Once you’ve finished writing your post, expand the “Post Timestamp” and “Post Status” options. Check “Edit Timestamp” and set the date and time you want the post to go live. Also set the status to “Published”. That’s it!

Written by jake
Posted in blogging, hacks | No Comments »
May 2nd, 2008
It’s happened to me a few times: I stay up late working on a great post and finish at 1am EST. In a rush of excitement I submit it to reddit or del.icio.us and go to bed, fully expecting to see it on the front page the next morning. Of course this rarely happens… so, being a programmer, I figured I should do some analysis on the best time to post.
My approach was simple: look at the times of day and days of the week that have the most popular posts. To define popularity I used AideRSS’s PostRank™.
PostRank™ is a scoring system that we have developed to rank each article on relevance and reaction. PostRank ranges from 1-10.
Using the AideRSS feed API, I fetched the last 10,000 posts on delicious, digg, reddit, and mixx, threw them into R, and plotted the number of posts with PostRank > 6 by weekday and by hour of day.
*NOTE* Hours are displayed in GMT
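The bucketing step is simple enough to sketch. The original analysis was done in R; the C version below is only an illustration. The `post` struct, the function names, and the fixed 8-hour PST offset are assumptions made for this example, not part of the AideRSS API or the original script.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical record for one fetched post. The field names are
   assumptions for this sketch, not the AideRSS feed format. */
struct post {
    time_t published;  /* publication time, seconds since the Unix epoch */
    double postrank;   /* PostRank score, 1-10 */
};

/* Counts of popular posts per GMT hour and per weekday (0 = Sunday). */
struct histogram {
    unsigned by_hour[24];
    unsigned by_weekday[7];
};

/* Tally every post whose PostRank is strictly above `threshold`. */
void bucket_popular_posts(const struct post *posts, size_t n,
                          double threshold, struct histogram *out)
{
    size_t i;

    assert(out != NULL);
    assert(posts != NULL || n == 0);
    memset(out, 0, sizeof *out);

    for (i = 0; i < n; i++) {
        const struct tm *tm;

        if (posts[i].postrank <= threshold)
            continue;                      /* not popular enough */
        tm = gmtime(&posts[i].published);  /* break the time down in GMT */
        if (tm == NULL)
            continue;                      /* unrepresentable timestamp */
        out->by_hour[tm->tm_hour]++;
        out->by_weekday[tm->tm_wday]++;
    }
}

/* Convert a GMT hour (0-23) to PST, assuming a fixed UTC-8 offset
   and ignoring daylight saving time. */
int gmt_hour_to_pst(int gmt_hour)
{
    assert(gmt_hour >= 0 && gmt_hour < 24);
    return (gmt_hour + 24 - 8) % 24;
}
```

Calling `bucket_popular_posts(posts, n, 6.0, &h)` fills `h.by_hour` and `h.by_weekday`, which are the two series being plotted. Since the hours are in GMT, `gmt_hour_to_pst` is handy for reading them in Pacific time under the fixed-offset assumption.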

It’s pretty clear that Tuesday through Friday, between 10am and 2pm PST, are the “hot times” for popular blog posts.
Now, I didn’t filter out non-English posts, and this doesn’t account for the time it took for the posts to reach the front page of these sites, but I do think it’s clear that posting late at night, on the weekend, or on Monday is a bad idea. Your post will most likely go unnoticed.
Written by jake
Posted in analysis, blogging, marketing | 35 Comments »
April 24th, 2008
Earlier this week I showed how to create a twitter search with thrudb.
The service has been running now for about a week and it’s collected over 8 million tweets. I’ve run some stats on the lucene db and these are the top 100 words with more than 3 letters :)
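The counting itself is easy to sketch. The block below is a standalone illustration in C, not how the stats were pulled from the Lucene index. The table size, the function names, and the linear-scan lookup are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1024  /* capacity of the toy table */
#define MAX_LEN   32    /* longest word kept, including the NUL */

struct word_count {
    char word[MAX_LEN];
    unsigned count;
};

struct word_table {
    struct word_count entries[MAX_WORDS];
    size_t used;
};

/* Add one occurrence of `word`. A linear scan keeps the sketch short;
   a real index would use a hash table. New words are dropped once the
   table is full. */
static void table_add(struct word_table *t, const char *word)
{
    size_t i;

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->entries[i].word, word) == 0) {
            t->entries[i].count++;
            return;
        }
    }
    if (t->used < MAX_WORDS) {
        strcpy(t->entries[t->used].word, word);
        t->entries[t->used].count = 1;
        t->used++;
    }
}

/* Split `text` on non-letters, lowercase each word, and count the ones
   with more than `min_letters` letters. Words longer than MAX_LEN - 1
   letters are truncated. */
void count_words(struct word_table *t, const char *text, size_t min_letters)
{
    char buf[MAX_LEN];
    size_t len = 0;
    const char *p;

    assert(t != NULL && text != NULL);
    for (p = text; ; p++) {
        unsigned char c = (unsigned char)*p;

        if (isalpha(c)) {
            if (len < MAX_LEN - 1)
                buf[len++] = (char)tolower(c);
            continue;
        }
        if (len > min_letters) {  /* end of a long-enough word */
            buf[len] = '\0';
            table_add(t, buf);
        }
        len = 0;
        if (c == '\0')
            break;
    }
}

/* Return how many times `word` was counted, or 0 if never seen. */
unsigned word_frequency(const struct word_table *t, const char *word)
{
    size_t i;

    for (i = 0; i < t->used; i++)
        if (strcmp(t->entries[i].word, word) == 0)
            return t->entries[i].count;
    return 0;
}

/* qsort comparator: highest count first. */
static int by_count_desc(const void *a, const void *b)
{
    const struct word_count *x = a, *y = b;
    return (y->count > x->count) - (y->count < x->count);
}

/* Sort the table so the most frequent words come first. */
void sort_by_count(struct word_table *t)
{
    qsort(t->entries, t->used, sizeof t->entries[0], by_count_desc);
}
```

Starting from a zeroed `word_table`, call `count_words(&t, tweet_text, 3)` once per tweet, then `sort_by_count(&t)`; the first 100 entries are the top 100 words with more than 3 letters.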
Written by jake
Posted in lucene, twitter, thrudb | 1 Comment »
April 21st, 2008
I’ve been using twitter quite a bit lately and really like the simplicity of the service and API. One thing that’s missing, though, is search, but there are some great sites like tweetscan and summize that let you search public tweets in close to real time.
I decided indexing twitter is a great application for thrudb, specifically the thrudex service. Thrudex is essentially a Thrift service for CLucene with some special sauce added. If you’d like to read about the inner workings read this.
Anyway, I whipped up a demo (in perl) for a realtime twitter search and have indexed a few days of tweets (over 3 million!). Check it out here.

One of our regular contributors, Thai Duong, was kind enough to port it to python+django for you new school folks.
*Note* This is running on a single dev box, so be forgiving… It’s currently polling the public timeline feed, so it’s not going to catch every tweet, but it captures ~85%.
We’ve added the code as a tutorial for thrudb here. Take it and build your own service… Any takers on building a ruby version or a cross-site social search aggregation a la friendfeed?
Written by jake
Posted in twitter, thrudb | 5 Comments »
April 8th, 2008
Google App Engine is pretty amazing.
What makes it stand out is really the DataStore API and Admin Interface. These two are really a game changer.
The DataStore API is a BigTable interface obviously. Without this we would just have another hosted scaling appserver solution without a database. The fact that the data layer is proven and ubiquitous just makes development/deployment simple. Frankly it scares me to think of how low the bar is to deploy an app now.
I remember when I was out of college I realized that most people thought programming was HARD! I secretly enjoyed the fact that, little did they know, programming by itself is really pretty simple, the hard part was getting something out the door. The fact that it’s getting so easy exposes our little secret.
I think I’ll get used to it, just need to look on the bright side. More focus on creative ideas.
That said, App Engine doesn’t fit everything. BigTable still doesn’t include any free text search, which I think is ridiculous. I guess someone will write a python lib for creating an inverted index on BigTable. It also doesn’t fit backend services that crunch or enrich data. So I think it will mostly be OpenSocial folks who use it.
This does put a damper on a lot of startups that have been building user-facing apps. I think it’s going to take some time for everyone to adjust, but it’s all for the best, I hope.
As for me, my startup is thankfully primarily a data enrichment service, so App Engine isn’t a good fit. It also makes me think thrudb could pretty easily emulate the DataStore API so apps could be moved elsewhere.
Well done Google.
Written by jake
Posted in bigtable, appengine, google | No Comments »
April 6th, 2008
I’ve been thinking a lot about what it means to be successful and researching how others found success.
A lot of successful people will admit their wins involved a bit (or a lot) of luck. Or, as Guy Kawasaki puts it, good karma. Others believe they are the masters of their universe and cannot fail (this seems to fade with youth though…)
I recently read a great book called “Fooled by Randomness” by Nassim Taleb, which talks about the role randomness plays in life and specifically in financial markets. Nassim’s success as a trader came from his acceptance that failure will always occur and that you must place yourself in a position to expect and capitalize on those random failures.
I think this idea is applicable here since most successful entrepreneurs fail many many times before they succeed. In fact they often succeed because they bootstrap themselves from failures (I hope to be one of these). The best example of this is James Dyson’s story where he failed hundreds of times before succeeding at creating his famous vacuum.
That’s not to say that you can’t be successful when you are young. Look at Mark Zuckerberg or even The Million Dollar Homepage guy, but these two aren’t exactly the norm. I imagine if you suggested to them that luck was why their ideas took off over everyone else’s, they would be terribly insulted. We all want to believe deep down that we know exactly how to build the next big thing; I sure thought that. Athletes are told to envision the goal, the free throw, the tackle. This is valuable, but it by no means will help when it’s raining on the day of the game.
My 1 year old loves this show called The Wiggles. It was started by a couple of ex-rockers from Australia. They each wear a different color shirt and sing silly songs about their dog, sleeping and buying apples. They have earned ten multi-platinum awards for sales of over 17 million DVDs and four million CDs! What a great story… Their pub rock band disbanded after little success; then, after changing careers to become educators, they hit upon a children’s rock band that’s now a worldwide phenomenon.
As I see it the key to success is perseverance. Failing… getting back on the horse and trying again. Only those that stop playing the game truly fail.
Written by jake
Posted in business | No Comments »
March 25th, 2008
Libevent provides cross-platform asynchronous callbacks on sockets and file descriptors. Different operating systems have different ways of handling this efficiently; Linux, for example, has kernel support for this that can scale to tens of thousands of sockets. It’s all pretty complicated, but libevent makes it very simple. Along with a basic API that is used by highly scalable projects like memcached and thrift, it also has an asynchronous DNS lookup API and an HTTP server API.
Here’s an example of how simple it is to write a basic HTTP server.
#include <sys/types.h>
#include <sys/time.h>
#include <sys/queue.h>
#include <stdlib.h>
#include <err.h>
#include <event.h>
#include <evhttp.h>

/* Reply to any request by echoing back the requested URI. */
void generic_handler(struct evhttp_request *req, void *arg)
{
    struct evbuffer *buf;

    buf = evbuffer_new();
    if (buf == NULL)
        err(1, "failed to create response buffer");
    evbuffer_add_printf(buf, "Requested: %s\n", evhttp_request_uri(req));
    evhttp_send_reply(req, HTTP_OK, "OK", buf);
    /* The caller still owns buf after the reply is queued, so free it. */
    evbuffer_free(buf);
}

int main(int argc, char **argv)
{
    struct evhttp *httpd;

    event_init();
    /* Listen on all interfaces, port 8080. */
    httpd = evhttp_start("0.0.0.0", 8080);
    if (httpd == NULL)
        err(1, "failed to start http server");
    /* Set a callback for requests to "/specific". */
    /* evhttp_set_cb(httpd, "/specific", another_handler, NULL); */
    /* Set a callback for all other requests. */
    evhttp_set_gencb(httpd, generic_handler, NULL);
    event_dispatch();
    /* Not reached in this code as it is now. */
    evhttp_free(httpd);
    return 0;
}
But is it fast? Check out the benchmarks (on my laptop):
ab -c 1000 -n 10000 http://localhost/
Apache2: Requests per second: 1274.14 [#/sec] (mean)
libevent: Requests per second: 1584.37 [#/sec] (mean)
Kickass!
Written by jake
Posted in libevent, web service | 3 Comments »
March 24th, 2008
I’m sure many of the readers of this blog don’t script in perl anymore. It seems that nowadays python, ruby, and php rule the scripting world.
I was happy to see this presentation by Tim Bunce about how perl is still alive and well.
Here’s my two cents:
- Every job I’ve had since I graduated college has required perl coding; it’s truly the glue of many large infrastructures.
- CPAN modules like DBI, LWP and template-toolkit make it easy to build powerful back ends and front ends.
- mod_perl provides incredibly powerful hooks into apache.
- Perl OO lets you write code so clear most java developers can understand it.
I’m not trying to say perl is better than x, y and z. I just think it shouldn’t be written off the way so many developers do.
This is why I felt it was so important to add perl bindings to Thrift. Perl deserved them and the thrift philosophy is one that is inherently perl-ish.
Written by jake
Posted in perl | No Comments »
March 18th, 2008
My trip to SF is next week and I’ve got a lot going on. I’m going to be spending a lot of time with Ross but also have a good number of meetups with thrudb users.
Also managed to get a Thrift hackathon organized which should be awesome!
I’ve still got some time so let me know if any of you want to meet up.
You can track my progress on twitter.
Written by jake
Posted in trip | No Comments »
March 8th, 2008
I’m definitely a cloud computing groupie.
Xen has really opened up the options for hosting. For a looong time the only way to get root on your host was to pay for a dedicated machine which was unreliable and hard to reboot / upgrade.
3rd rail has been running on unixshell which pioneered the xen hosting approach back in 2005. They ran into problems though and have been asking us to move to a wack closed source virtualization platform.
We’ve also been using EC2, which is by far the leader in this arena, but it’s got some drawbacks… mainly cost of ownership: $70 a month for a single instance.
This afternoon we moved to a relatively new VPS provider called slicehost. The guys over there have really done a superb job building a scalable hosting solution.
The admin tools are first rate and they are building a great community. Best of all their service starts at $20 a month.
We’ll still use EC2 but not for small things like our blogs and low traffic sites.
Written by jake
Posted in slicehost, hosting, xen, ec2 | No Comments »