As you probably know, the companies that “run the web” (e.g. Google, Yahoo, Amazon, eBay, etc.) generally don’t use the “big iron” mainframe infrastructures used by many of our clients, but instead focus on using LOTS of cheap commodity boxes (e.g. Google had ~450,000 in 2006). Yahoo has an interesting article discussing an open source effort, (Hadoop), to implement one of the pieces of infrastructure that these companies typically use (and have built custom internally). Essentially, Hadoop is an open source software platform that lets applications process huge amounts of information within large clusters, and it’s already being used/taught/researched in some colleges. Technically, it’s an implementation of MapReduce, which is one of Google’s internal framework.