PaaS: Failure Is Not an Option

Intel IT is actively implementing platform as a service (PaaS) as the next logical step for our enterprise private cloud, to accelerate custom application deployment and promote cloud-aware application design principles. Our PaaS environment will build on our already successful infrastructure as a service (IaaS) efforts, and will provide an environment featuring self-service, on-demand tools, resources, automation, and a hosted platform runtime container.

But, as many companies discovered earlier this year during a well-publicized IaaS failure, a platform is only as useful as it is available. Companies that designed for failure weathered the outage far better than companies that simply hoped nothing would ever break. For example, one company that used stateless services, graceful degradation methodologies, and multiple availability zones (AZs) experienced only minor errors and higher latency; less prepared companies were knocked out completely.

In our PaaS environment, we promote design for failure at both the platform and the application levels.

Platform Level: We want the underlying platform to do as much as it can to provide cloud capabilities for applications written to PaaS. For our PaaS pilot, we are providing high availability within the platform, but that alone is not enough. We are working toward an active/active model in which instances of PaaS run in multiple AZs. In an active/active model, applications are deployed to, and synchronized between, a primary and a secondary PaaS instance. If the primary PaaS fails, we fail over seamlessly to the secondary PaaS using global load balancing. The platform will provide “eventual consistency” of data. This means that over a period of time, all updates propagate through the system, and the data associated with all the applications running on the various PaaS instances eventually becomes consistent across all AZs. Uncommitted transactions are resubmitted by end users. An important concept in providing eventual consistency is sharding, where data is horizontally partitioned in the database architecture.
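
The three platform ideas above (horizontal partitioning, failover from primary to secondary, and replicas that converge over time) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of our platform: the function names, the hash-based shard mapping, the health-check callback, and the "highest version wins" merge rule are all hypothetical choices made for the example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index (horizontal partitioning).

    Assumption: a stable hash of the key decides the shard. Python's
    built-in hash() is avoided because it varies between processes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def pick_instance(instances, is_healthy):
    """Return the first healthy PaaS instance, trying the primary first.

    `instances` is an ordered list such as ["primary", "secondary"];
    `is_healthy` is a hypothetical health-check callback.
    """
    for name in instances:
        if is_healthy(name):
            return name
    raise RuntimeError("no healthy PaaS instance available")


def merge_replicas(a: dict, b: dict) -> dict:
    """Combine two replicas so that both converge to the same state.

    Each replica maps key -> (version, value). Assumption: the entry
    with the higher version number wins.
    """
    merged = dict(a)
    for key, (version, value) in b.items():
        if key not in merged or version > merged[key][0]:
            merged[key] = (version, value)
    return merged
```

Because the merge keeps the higher version for every key, applying it in either direction yields the same result, which is the property that lets replicas in different AZs become consistent once all updates have propagated.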

Application Level: We are actively promoting the idea of cloud-aware applications. We want the application (that is, the application developers) to take more responsibility for designing for failure. Traditional applications take it for granted that five-nines (99.999 percent availability) infrastructure is in place. But in the cloud, that’s not necessarily true. Cloud-aware applications should expect and design for infrastructure outages. Building cloud-aware applications is a new area for Intel developers, and we are helping our development community build the skill sets needed to design simplified, fault-tolerant, modular services that run in a virtualized, elastic, multi-tenant environment.
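
As one example of what designing for failure can look like in application code, the sketch below retries a call to a dependency a few times and then degrades gracefully to a fallback instead of failing outright. It is a minimal illustration, not a pattern prescribed by our platform: `call_with_fallback`, its parameters, and the exponential backoff schedule are hypothetical.

```python
import time


def call_with_fallback(primary, fallback, retries=2, delay_seconds=0.0):
    """Call `primary`; on repeated failure, return `fallback()` instead.

    `primary` and `fallback` are zero-argument callables. The call is
    attempted `retries + 1` times, with an optional exponential backoff
    between attempts. A real service would catch narrower exception
    types than Exception.
    """
    for attempt in range(retries + 1):
        try:
            return primary()
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt < retries and delay_seconds:
                time.sleep(delay_seconds * (2 ** attempt))
    # Graceful degradation: serve a reduced result rather than an error.
    return fallback()
```

A caller might pass a live service request as `primary` and a cached or default response as `fallback`, so that an outage in one AZ shows up to users as stale data or higher latency rather than a complete failure.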

For more information on Intel IT's PaaS efforts, including a detailed discussion of our pilot project and key learnings, see “Extending Intel’s Enterprise Private Cloud with Platform as a Service.”