Wikipedia's Infrastructure

Responding to this article, which details Wikipedia's hardware setup.

A lot of the smaller companies I've worked with always look to the "big guys" to see how they're doing things and what policies should be implemented. This article got me thinking - aren't you better off looking at the biggest demand on the smallest budget, and seeing what works for them?

Wikipedia is entirely donation-funded, has only one data center with about 300 machines, and serves 50k HTTP requests a second against 1.5 terabytes of compressed text. They're running LAMP, which may be the most persuasive argument out there for folks to take a better look at PHP.

They also attribute a good bit of their speed to the Squid servers that sit out in front of the application servers, and those would be worth a hard look for any organization trying to efficiently build out a data center to support a growing application.
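The pattern here is Squid running in "accelerator" (reverse proxy) mode, caching rendered pages so most requests never touch the application servers. A minimal sketch of what that looks like in squid.conf - the hostnames, addresses, and cache size below are placeholders, not Wikipedia's actual settings:

```
# Hypothetical sketch: Squid as a reverse proxy ("accelerator") in front
# of an app server. All names and addresses here are made-up examples.
http_port 80 accel defaultsite=www.example.org

# Forward cache misses to the origin application server
cache_peer 10.0.0.10 parent 80 0 no-query originserver name=appserver

# Only accept requests for our own site
acl our_site dstdomain www.example.org
http_access allow our_site
cache_peer_access appserver allow our_site

# Memory devoted to hot cached objects
cache_mem 256 MB
```

The win is that a cache hit is served straight from Squid's memory or disk, so the expensive PHP/database work only happens on misses and when pages are invalidated after edits.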

As an aside, with all of Wikipedia in Tampa, one good hurricane could really ruin everyone's collective day.

Meanwhile, I'd think it's safe to say Google is also good at optimization, but runs a far larger infrastructure: 12+ data centers drawing ~100 MW of power, with fully customized software on an unknown number of machines.
