Open Compute Project: Facebook’s idea for the future data center.

I had the privilege of attending Facebook's Open Compute Project Summit in Palo Alto on June 17, 2011. Attendance appeared excellent; I met folks from silent-mode start-ups to Google, and from Adobe to ZT Systems. Although I didn't count heads, Facebook showed the logos of roughly 100 companies invited to the event. Impressive.

The idea behind Facebook's Open Compute Project is to move traditional hardware development toward the model followed by open source software development. The OCP's key tenets are efficiency, economy, environment, and openness. The Forum began the process of building a community, united by a common vision, to develop specifications and accelerate innovation.

An exciting element of the Open Compute Project, highlighted by both Tom Furlong and Amir Michael, was how Facebook's approach overturns the usual assumption about efficiency. Generally, higher efficiency is associated with higher cost. For Facebook, just the opposite is true. Done right, "Efficiency is Profitable," as the posters on the walls of the Forum proclaimed. A good proof point is the current OCP platform. Not only is the platform economical, but efficiency is designed in: 94%-efficient PSUs, a chassis that is pounds lighter than a standard server, and a layout that cuts routine service tasks such as HDD, DIMM, and PSU swaps from several minutes to under a minute. Efficiency thus spans the entire life cycle, from construction to installation, operation, and maintenance. And every step is cheaper. There's more detail in the introduction video below:

Presentations of customer requirements from Rackspace and Goldman Sachs had many common elements. Both are addressing the challenges of scale-out and looking to contain costs. Goldman Sachs noted that in scale-out, the only real differentiation left to vendors is the quality of service and support. Rackspace highlighted serviceability as a key selection criterion, driven by its SLA requirements.

I enjoyed sitting in on the networking hardware discussion. The key problem highlighted was that innovation in the network layer has not kept pace with innovation in software and server hardware. The contrast between the very limited capabilities available today in the network's monitoring, control, and data layers and what can be done in Linux, where features and capabilities can be stripped down and optimized for a particular use, underscored the opportunity. A particular use case was responding to a dynamic application stack. Facebook can run a weekend hackathon and completely change what its software demands of the network; Rackspace noted that a customer could swipe a credit card, fire up a bunch of machines, and completely alter the dynamics of what runs in its data center. Both need affordable flexibility to understand and customize network solutions quickly.

There were also some disagreements about requirements. For instance, in server spec design, which command set should a management solution support? Linkage to the Open Data Center Alliance (ODCA) could foster productive synergy: the end-user-defined usage models that the ODCA recently released and the infrastructure building blocks that the OCP has released look very complementary. On the hardware side, decisions about which feature sets go into which reference designs might lead either to divergence or to branches growing off a main server spec.

Overall, I'm very excited about the opportunities the Open Compute Project presents. It is a chance to innovate and create new models that better meet the needs of end users.