In just the last four months, Microsoft announced its HoloLens augmented reality headset, Google launched its VR View SDK that allows users to create interactive experiences from their own content, Facebook expanded its live video offering, Yahoo announced that it will live stream 180 Major League Baseball games, Twitter announced it will live stream 10 NFL games, Amazon acquired image recognition startup Orbeus, and Intel acquired immersive sports video startup Replay Technologies.
Are these events unrelated or are they part of something bigger? To me, they indicate the next wave of the Visual Cloud. The first wave was characterized by the emergence of Video on Demand (e.g., Netflix), User Generated Video Content (e.g., YouTube) and MMORPGs (e.g., World of Warcraft). The second wave will be characterized by virtual reality, augmented reality, 3D scene understanding, interactivity, and immersive live experiences. To paraphrase William Gibson, the announcements I listed above indicate that the future is already here – it’s just not evenly distributed. And it won’t take long for it to spread to the mainstream – remember that YouTube itself was founded in 2005 and Netflix only started streaming videos in 2007. By 2026, the second wave will seem like old technology. In the technology world, in five years, nothing changes; in ten years, everything changes.
But why now? As with any technology, a new wave requires the convergence of two things: compelling end user value and technology capability and maturity.
It’s pretty clear that this wave can provide enormous user value. One early example is Google Street View (launched 2007). I’m looking for a new house right now and I can’t tell you how much time I’ve saved not touring houses that are right next to a theater or service station or other unappealing neighbor. While this is a valuable consumer application, the Visual Cloud also unlocks many business and public sector applications like graphics-intensive design and modelling applications and cloud-based medical imaging.
But is the technology ready? The Visual Cloud Second Wave is an integration of several technologies – some are well established, some still emerging. The critical remaining technologies will mature over the next few years, driving widespread adoption of second wave applications and services. In my opinion, the key technologies are (in decreasing order of maturity):
1. Cloud Computing – the Visual Cloud requires capabilities that only cloud computing can deliver. In most ways, the Visual Cloud First Wave proved out this technology. These capabilities include:
- Massive, inexpensive, on-demand computing. Even something as comparatively simple as speech recognition (think Siri, Google Now, Cortana) requires the scale of the cloud to make it practical. Imagine the scale of compute required to support real time global video recognition for something like traffic management.
- Massive data access and storage capacity. Video content is big: a single high quality 4K video requires 30-50 GB of storage, depending on how it is compressed.
- Ubiquitous access. Many Visual Cloud applications are about sharing content between one user and another regardless of where they might be in the world or what devices they are using to create and consume content.
- Quick Start Development. Easy access to application development tools and resources through Infrastructure as a Service (IaaS) offerings like Amazon Web Services and Microsoft Azure makes it much faster for innovative Visual Cloud developers to create new applications and services and get them out to users.
2. High Speed Broadband. See above re: Video Content is Big. Even today, moving video data around is a challenge for many service providers. Video is already over 64% of consumer internet traffic and is expected to grow to over 80% by 2019. High quality visual experiences also require relatively predictable bandwidth. Sudden changes in latency and bandwidth wreak havoc on visual experiences, even with compensating technologies like HLS and MPEG-DASH. This is especially true for interactive experiences like cloud gaming or virtual and augmented reality. The deployment of wireless 5G technologies will be critical to enable the Visual Cloud to grow.
3. New End User Devices. Most of these advanced experiences don’t rely solely on the cloud. For both content capture and consumption, devices need to evolve and improve. Capture technologies like the depth images from Intel® RealSense Technology provide innovative visual information to applications that isn’t available from traditional devices. Consumption technologies and form factors like VR headsets are necessary to consume some experiences.
4. Visual Computing Technologies. While many visual computing technologies, like video encoding and decoding and raster and ray traced rendering, have been around for many years, they have not been scaled to the cloud in any significant way. This process is just beginning. Other technologies, like the voxel 3D point clouds used by Replay Technologies, are just emerging. Advanced technologies like 3D Scene Reconstruction and Videogrammetry are still several years from reaching the mainstream.
5. Deep Learning. Computer vision, image recognition, and video object identification have long depended on model based technologies like HOG. While these technologies have had some limited use, in the last couple of years, deep learning for image and video recognition (using neural networks to classify objects in image and video content) has emerged as one of the most significant new technologies in many years.
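To put the "Video Content is Big" point in perspective, here is a rough back-of-the-envelope sketch of how average bitrate and running time translate into file size. The two-hour duration and the specific bitrates are illustrative assumptions of mine, not figures from any particular service or codec, and the function names are hypothetical.

```python
def video_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate file size in decimal gigabytes for a given average bitrate.

    Assumes the stream averages `bitrate_mbps` megabits per second over
    `duration_s` seconds; container overhead and audio are ignored.
    """
    megabits = bitrate_mbps * duration_s
    megabytes = megabits / 8          # 8 bits per byte
    return megabytes / 1000           # 1000 MB per (decimal) GB


def required_bitrate_mbps(size_gb: float, duration_s: float) -> float:
    """Inverse calculation: average bitrate implied by a file size and duration."""
    return size_gb * 1000 * 8 / duration_s


if __name__ == "__main__":
    two_hours = 2 * 60 * 60  # assumed running time, in seconds
    for rate in (35, 45, 55):  # assumed average bitrates in Mbps
        size = video_size_gb(rate, two_hours)
        print(f"{rate} Mbps for 2 h -> {size:.1f} GB")
```

Under these assumptions, a two-hour video at an average of 35 Mbps comes to about 31.5 GB and at 55 Mbps to about 49.5 GB, which is in line with the 30-50 GB range quoted above. The same numbers hint at why predictable bandwidth matters: streaming that file in real time needs a sustained connection at roughly the same bitrate.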
If you’re interested in learning more about emerging workloads in the data center that are being made possible by the Visual Cloud, you can watch our latest edition of the Under the Hood video series or check out our Chip Chat podcasts recorded live at the 2016 NAB Show. Much more information about Intel’s role in the Visual Cloud can be found at www.intel.com/visualcloud.