Archive for the ‘web’ Category

Remember Authority does not equal Accuracy

Monday, June 30th, 2008


When I was a kid I thought my parents were always right. Whether it was the best way to dress or the best way to write a sentence for English class, if my parents said it was the better way then it was.

It wasn’t until 8th grade that I realized my parents really didn’t know much about a lot of things, like clothes or music or grammar, but they would never have admitted it. Eventually I learned to weigh my parents’ views as opinions that I respect, while at the same time using my own brain to decide on the right way for me.

If you are an entrepreneur, keep that in mind when you read something from people, companies or bloggers with authority. If you find yourself always accepting what they say and do as correct, then you are probably like me in 7th grade.

Written by jake

IE8’s Odd Standard Compliance Mode

Saturday, January 26th, 2008

Ars Technica published an article about how Microsoft intends to further its standards compliance in their next browser version, Internet Explorer 8. Since IE5.5, MS has been trying to implement some sort of web standards compliance in their browser, but they’ve regularly come up short or just bungled the process altogether. Web developers already know what a mess version 5.5 was and how version 6 tried to fix the mess with a doctype switching mechanism that allowed developers to design around both versions. Unfortunately this resulted in a lot of hacks and workarounds to get sites working across other browsers and MS’s own.

When version 7 was released, MS furthered their standards compliance, which broke even more web sites and made businesses and users reluctant to upgrade. MS apparently will be forcing everyone to use IE7 very soon.

With IE8, MS intends to further their standards compliance yet again, to the level of Firefox, Safari and Opera. This is great news for designers and developers alike - but in true MS style, the implementation is crappy and just plain weird. First off, a third rendering mode will be implemented to take advantage of the new standards - the existing modes stay as they are for backward compatibility. To invoke this new rendering mode, MS is using a <meta> tag that IE8 will look for. This is really a poor idea, and it forces web developers/designers to keep considering IE separately - even if the page renders the same in FF/Safari/Opera… Just as strange, MS apparently worked closely with WaSP to develop this new tag. How weird, the Web Standards Project group developing a non-standard procedure! While I’m looking forward to easier web development and design, I just don’t get this silly procedure.
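
For reference, the opt-in switch that was announced takes the form of an X-UA-Compatible meta tag in the page head. The sketch below only builds that tag as a string. The helper name ieCompatMeta is made up for illustration, and the exact content values IE8 ends up accepting should be checked against Microsoft's own documentation.

```javascript
// Builds the <meta> tag that asks IE to render a page in a given mode.
// ieCompatMeta is a hypothetical helper name, used only for illustration.
function ieCompatMeta(ieVersion) {
  return '<meta http-equiv="X-UA-Compatible" content="IE=' + ieVersion + '" />';
}

// The tag would go inside <head>, before other elements where possible.
console.log(ieCompatMeta(8));
```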

Read the whole article.


Written by Rich

Chart me up!

Thursday, December 6th, 2007

Charts have always held a special place in my heart.

There is nothing like turning a nasty tabular dataset into a beautiful piece of art.

The problem with charting on the web is people want to interact with their charts, slice, zoom, and dice the data visually. This leaves few options other than flash… ugh.

Recently, however, there is a new wave of dhtml charting libraries. Here’s a quick list to consider.

MIT Simile Timeplot - Full-featured timeseries charts using dhtml canvas that work in all modern browsers. Nice features like rollovers and popups. Downside is it’s for timeseries only and is a bit heavy.

Flot - New jQuery plugin that has a very simple api and allows basic zooming. Works in all modern browsers. Downside is no rollovers.

Google Chart API - Not really a dynamic charting tool, but still a pretty simple way to generate slick charts.
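
To give a feel for how simple the Google Chart API is: a chart is just an image URL with the chart type, size, data and labels passed as query parameters. Here is a minimal sketch, assuming the chart.apis.google.com endpoint and the cht, chs, chd and chl parameters as I understand them from the API docs at the time. The chartUrl helper is a made-up name.

```javascript
// Builds a Google Chart API image URL from plain values.
// chartUrl is a hypothetical helper; cht, chs, chd and chl are the
// chart type, size, data and label parameters of the URL-based API.
function chartUrl(type, width, height, values, labels) {
  const params = [
    "cht=" + type,                 // chart type, e.g. "p3" for a 3D pie
    "chs=" + width + "x" + height, // image size in pixels
    "chd=t:" + values.join(","),   // data in the simple text encoding
    "chl=" + labels.map((label) => encodeURIComponent(label)).join("|"),
  ];
  return "http://chart.apis.google.com/chart?" + params.join("&");
}

// The resulting URL can be dropped straight into an <img src="...">.
console.log(chartUrl("p3", 250, 100, [60, 40], ["Hello", "World"]));
```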

     

    Written by jake

    Announcing: Thrudb - Document Oriented Database Services

    Sunday, November 4th, 2007

    There has been a lot of talk recently about how traditional relational databases no longer fit the bill for web development. This is certainly a bit over the top since every site I’ve ever built or seen built uses an RDBMS. But I think the point is that not a lot has changed in the world of data storage since the 70s. SQL, DDL and referential integrity are ideas that all came before the onset of the web. Databases are really just big spreadsheets, but is that the best storage structure for web data?

    A new breed of databases and data services has emerged in recent years to address this. The first product I came across was an XMLDB with XQuery, but this system was built to offer everything a regular database offers PLUS a bunch of new features like on-the-fly indexing of any field. The problem with this kind of approach is that it ends up complicating the API. Not to mention XML and performance don’t really fit together. I’m a big believer in simple/fast software components that can be put together to create powerful/fast systems. Google is the best known example of this. Their systems are built to be massively parallel, so much so that there was no way an RDBMS would work. Instead they first built the Google File System, which splits their data into 64MB chunks and spreads it across thousands of machines, making at least 3 copies of any chunk for redundancy. Then they use techniques like MapReduce to create indexes of these documents, split each index into shards and spread those across their network too. Finally they have services running on these machines that coordinate searches across the index shards, return the matching document ids and fetch the documents from the document store.
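
To make the indexing step a bit more concrete, here is a toy, single-process sketch of the map and reduce idea: a map function emits (word, document id) pairs and a reduce step groups them into an index. This is not Google's implementation, and the function names and sample documents are made up for illustration.

```javascript
// Map step: emit one [word, docId] pair per word in a document.
function mapDocument(docId, text) {
  return text
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 0)
    .map((word) => [word, docId]);
}

// Reduce step: group the emitted pairs into word -> sorted list of doc ids.
function reduceToIndex(pairs) {
  const grouped = new Map();
  for (const [word, docId] of pairs) {
    if (!grouped.has(word)) grouped.set(word, new Set());
    grouped.get(word).add(docId);
  }
  const index = {};
  for (const [word, ids] of grouped) index[word] = [...ids].sort();
  return index;
}

// Hypothetical sample documents.
const docs = { d1: "fast simple software", d2: "simple systems scale" };
const pairs = Object.entries(docs).flatMap(([id, text]) => mapDocument(id, text));
const index = reduceToIndex(pairs);

// Look up which documents contain a word.
console.log(index.simple);
```

In a real deployment the map calls would run on many machines in parallel and the grouped output would itself be split into shards, which is the part this sketch leaves out.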

    They have also built a system called BigTable, a column-oriented store that organizes a table by columns (grouped into column families) rather than by fixed rows, making it much simpler to distribute and parallelize.

    So why are these systems any better than a relational database? Well, for one thing they make it much easier to scale horizontally, meaning you can slap another box onto the network and increase your database capacity. This is exactly how webservers scale, but anyone who has tried to scale their website will tell you it’s never as easy to scale your database as it is your webservers, since traditional databases are inherently monolithic.

    Another benefit is your data structures can be sparsely populated and linked across any number of facets in these systems. The story of del.icio.us or Flickr trying to scale using tagging and MySQL is a great read because it illustrates the problem you run into when using fixed schemas to hold dynamic/fluid data that wants to be searched, mashed-up, split up and grouped any which way.

    Ok, so how do I as a developer address this… Isn’t it obvious? Build a solution from open source components!

    I never would have attempted this if it weren’t for Facebook’s Thrift project. It provides much of what I needed to get this off the ground, specifically the ability to build services that can communicate with almost any language. They used it internally to build much of their infrastructure, like search and the Facebook platform itself. Thrift on the surface looks like a stripped-down version of CORBA. You define structures and services in an IDL and use its code compiler to generate object definitions and a client/server interface. But Thrift offers so much more. Most importantly, the ability to transmit your objects over any protocol, be it binary, xml or json, as well as over any transport (tcp socket, http, file). Another big benefit of Thrift is you can adjust your structure definitions over time while keeping backwards compatibility with your previous definition. BINGO. This is a big deal because one of the big reasons I keep using databases like MySQL is so I can adjust my schema as I find bottlenecks or bugs. In fact Google has built a very similar system to Thrift, which is how they store data on GFS, using compressed serialized objects they call protocol buffers.
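
The backwards compatibility point is easier to see with a small sketch. Thrift tags every struct field with a numeric id, and a reader skips ids it does not recognize. The code below imitates that idea in plain Javascript. It is not Thrift's wire format or API, and the encode/decode helpers and the bookmark schemas are made up for illustration.

```javascript
// Encode an object as [fieldId, value] pairs using a schema of name -> id.
function encode(schema, obj) {
  return Object.keys(schema)
    .filter((name) => obj[name] !== undefined)
    .map((name) => [schema[name], obj[name]]);
}

// Decode pairs back into an object, skipping any field id the reader
// does not know about. That skip is what keeps old readers working.
function decode(schema, fieldPairs) {
  const nameById = {};
  for (const name of Object.keys(schema)) nameById[schema[name]] = name;
  const obj = {};
  for (const [id, value] of fieldPairs) {
    if (nameById[id] !== undefined) obj[nameById[id]] = value;
  }
  return obj;
}

// Hypothetical schemas: v2 adds a "tags" field under a new id.
const bookmarkV1 = { url: 1, title: 2 };
const bookmarkV2 = { url: 1, title: 2, tags: 3 };

const wire = encode(bookmarkV2, {
  url: "http://example.com",
  title: "Example",
  tags: ["a"],
});

// An old reader still gets url and title and silently ignores tags.
const seenByOldReader = decode(bookmarkV1, wire);
console.log(seenByOldReader);
```

The same holds in the other direction: a v2 reader given v1 data simply sees no tags field, as long as existing ids are never reused for a different meaning.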

    Ok, so I had a development platform, Thrift, now just add a few months of late night coding and a little Memcached, Spread, CLucene and Brackup and I ended up with…

    Thrudb is a set of simple services built on top of Facebook’s Thrift framework that provides indexing and document storage services for building and scaling websites. Its purpose is to offer web developers flexible, fast and easy-to-use services which can enhance or replace traditional data storage and access layers.

    Thrudb Features:

    • Client libraries for most languages
    • Multi-master replication
    • Incremental backups and redo logging
    • Multiple storage backends (S3 included)
    • Built for horizontal scalability
    • Simple and powerful search api (Lucene)

    Thrudb solves a lot of problems for me. Biggest of all: with Thrudb I can now use Amazon EC2 as a stable server farm, since my backend database writes directly to S3. In fact, I’ve successfully moved Junkdepot from a traditional hosting facility using a MySQL database to multiple EC2 instances using Thrudb in a week.

    check it out: http://thrudb.googlecode.com

    I’m not saying Thrudb is complete and production ready, but I do think it’s pretty reliable and simple to try out. I’m hoping you the reader can help make it better with your testing, coding and insight…

    Written by jake

    Javascript Arrays vs Object Literal

    Thursday, June 28th, 2007

    Recently, I’ve been learning to use Javascript object literals for holding similar sets of data, as opposed to using plain arrays. They are much more manageable and flexible than arrays, and I think even easier to read. Below is a variable holding form validation data: an array whose entries are object literals. I can simply loop through these just like I would any array; I can output the values, check against them and even call a function or set an event listener. With a flat array of values I am mostly limited to positional entries of common data types like string, int, boolean and such, with no names to say what each one means. The fact that I can make references to functions is really cool and allows a pretty flexible and powerful system. Next time you have to work with arrays, consider object literals.

    // One entry per form field; ce refers to a checkEmpty function defined elsewhere.
    var emptyValues = [
      {name:'firstname', id:'firstname', ce:checkEmpty, eid:'first_error', defVal:'First'},
      {name:'lastname', id:'lastname', ce:checkEmpty, eid:'last_error', defVal:'Last'},
      {name:'phone', id:'phone', ce:checkEmpty, eid:'phone_error', defVal:'Phone'},
      {name:'message', id:'message', ce:checkEmpty, eid:'message_error', defVal:'Message'},
      {name:'name', id:'name', ce:checkEmpty, eid:'name_error', defVal:'Full Name'},
      {name:'email', id:'email', ce:checkEmpty, eid:'error_email', defVal:'Email'}
    ];
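
Here is a rough sketch of how a structure like this might be used. The checkEmpty implementation, the shortened rules list and the submitted values are all assumptions for illustration. The real version would read values from the form fields by id and show the matching error elements.

```javascript
// Hypothetical validator: a field counts as empty when it is blank
// or still holds its default placeholder text.
function checkEmpty(value, defVal) {
  return value.trim() === "" || value === defVal;
}

// Same shape as the entries above, shortened to two fields.
const rules = [
  { name: "firstname", ce: checkEmpty, eid: "first_error", defVal: "First" },
  { name: "email", ce: checkEmpty, eid: "error_email", defVal: "Email" },
];

// Stand-in for values read from the form.
const submitted = { firstname: "First", email: "me@example.com" };

// Loop over the rules and call each entry's own check function,
// collecting the error element ids of the fields that failed.
const failed = rules
  .filter((rule) => rule.ce(submitted[rule.name], rule.defVal))
  .map((rule) => rule.eid);

console.log(failed);
```

Because each entry carries its own function reference, a different field could point ce at a different validator without changing the loop.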

    Written by Rich

    is scaling easy? it can be.

    Thursday, May 31st, 2007

    The Ruby on Rails folk say scaling is easy, and they are correct, but there are many different components to scale. Scaling web servers horizontally using reverse proxy tools like pound or perlbal and caching with memcached and varnish will get you pretty far, but real web applications need to scale their content and not just their service. Take Flickr for example: how do you scale millions of photos? Or Twitter: you can’t put all those tweets in a single database. That’s why the key to real scaling is considering your service a year down the road: what considerations can you make now to make it easier to scale later on (if you are lucky enough)?

    The most surefire way to scale your content is to make it easier to federate or partition your data. That means following these simple rules:

    1. Keep away from sequential primary keys in your database. Use UUIDs: they can be generated anywhere with practically no chance of collision. This way you can more easily move to a multi-master database model if you have to, or split your data into partitioned chunks based on hashing the UUID.

    2. Don’t use stored procs (ever!). Thankfully most of us are used to not having stored procs in MySQL so this isn’t a big deal, but if you split up your database into smaller pieces you can’t use a traditional stored proc to search across them all. Not to mention it’s bad to put business logic in your model layer.

    3. Think about using special search tools, like Lucene, for searching across specific types of data. Related to the above rule, searching across your data is hard when you split it up into pieces, but tools like Lucene make it easy to create small meta-indexes of your data which can easily fit a lot more info than a big InnoDB table.

    4. Don’t store binary data in your db. Unless you like pain, you should never store things like images in a database. Just store the path to each one. I’ve found it easy to take the MD5 of the image and use that as the name, since you can then partition your images evenly across many directories (and eventually disks). Or just use Amazon S3 :)

    5. Finally, only store what you need. Scaling becomes much harder when you build a lot of complexity and normalization into your data model. Keep it simple, stupid. People don’t like complicated apps, believe me I know :)
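
Rules 1 and 4 can be sketched in a few lines. This assumes a Node.js runtime recent enough to have crypto.randomUUID in its built-in crypto module. The helper names, the shard count and the two-level directory layout are made up for illustration, not a prescribed scheme.

```javascript
const crypto = require("crypto");

// Rule 1: pick a partition by hashing the UUID (hypothetical helper).
function shardFor(uuid, shardCount) {
  const digest = crypto.createHash("md5").update(uuid).digest("hex");
  // Use the first 8 hex digits (32 bits) as an integer, then wrap.
  return parseInt(digest.slice(0, 8), 16) % shardCount;
}

// Rule 4: name an image by the MD5 of its bytes and spread files across
// directories using the leading hex digits (hypothetical layout).
function imagePathFor(imageBytes, extension) {
  const digest = crypto.createHash("md5").update(imageBytes).digest("hex");
  return digest.slice(0, 2) + "/" + digest.slice(2, 4) + "/" + digest + "." + extension;
}

// A random UUID generated locally, with no trip to the database.
const id = crypto.randomUUID();
console.log(id, "-> shard", shardFor(id, 4));

// Only this path would be stored in the database, not the image itself.
console.log(imagePathFor(Buffer.from("fake image bytes"), "jpg"));
```

One caveat with the plain modulo: changing the shard count later moves most keys to a different shard, so it is worth picking the count (or a smarter mapping) with growth in mind.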

    There are some great talks about scaling data here.

    Written by jake