From VentureBeat:
On Wednesday, Cloudera and Hortonworks announced a “merger of equals,” where Cloudera is acquiring Hortonworks with stock so that Cloudera shareholders end up with 60 percent of the combined company. The deal signifies that the Hadoop market could no longer sustain two big competitors. Hadoop has been synonymous with big data for years, but the market — and customer needs — have moved on. Several megatrends are driving this change:
The public cloud tide is rising
The first megatrend is the shift to public cloud. Companies of all sizes are increasing their adoption of AWS, Azure, and Google Cloud services at the expense of on-premises infrastructure and software. Enterprise server revenues reported by IDC and Gartner continue to decline. The top three cloud providers (90 percent of the market) offer their own managed Hadoop/Spark services, such as Amazon’s Elastic MapReduce (EMR). These are fully integrated offerings that have a lower cost of acquisition and are cheaper to scale. If you’re making the shift to cloud, it makes sense to look at alternative Hadoop offerings as part of that – it’s a natural decision point. Ironically, there has been no Cloud Era for Cloudera.
Crushing storage costs
The second megatrend? Cloud storage economics are crushing Hadoop storage costs. At introduction in 2005, the Hadoop Distributed File System (HDFS) was revolutionary: It took servers with ordinary hard drives and turned them into a distributed storage system capable of parallel I/O consumable by Java apps. There was nothing like it, and it was a crucial component that allowed large-scale data sets that didn’t fit onto a single machine to be processed in parallel. But that was 13 years ago. Today, there is a plethora of much cheaper alternatives, primarily object storage services like AWS S3, Azure Blob Storage, and Google Cloud Storage. A terabyte of cloud object storage costs about $20 a month, compared to about $100 a month for HDFS (not including the cost to operate it). That is why Google’s HDFS service, for example, is merely a shim that translates HDFS operations into object storage operations – because that’s 5x cheaper…
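To make the storage arithmetic concrete, here is a minimal sketch that uses only the two per-terabyte prices quoted above (about $20 a month for object storage, about $100 a month for HDFS, operating costs excluded). The function names and the 500 TB data set size are hypothetical, chosen purely for illustration, and actual prices vary by provider, region, and storage tier.

```python
# Approximate per-terabyte monthly prices as quoted in the article.
OBJECT_STORAGE_USD_PER_TB_MONTH = 20.0
HDFS_USD_PER_TB_MONTH = 100.0  # excludes the cost of operating the cluster


def monthly_cost(terabytes: float, usd_per_tb_month: float) -> float:
    """Return the monthly storage bill in USD for a given data volume."""
    return terabytes * usd_per_tb_month


def hdfs_to_object_ratio() -> float:
    """Return how many times more HDFS costs per terabyte than object storage."""
    return HDFS_USD_PER_TB_MONTH / OBJECT_STORAGE_USD_PER_TB_MONTH


if __name__ == "__main__":
    data_tb = 500  # hypothetical data set size, for illustration only
    hdfs = monthly_cost(data_tb, HDFS_USD_PER_TB_MONTH)
    objects = monthly_cost(data_tb, OBJECT_STORAGE_USD_PER_TB_MONTH)
    print(f"HDFS:           ${hdfs:,.0f}/month")
    print(f"Object storage: ${objects:,.0f}/month")
    print(f"Difference:     ${hdfs - objects:,.0f}/month")
    print(f"Ratio:          {hdfs_to_object_ratio():.0f}x")
```

Because both prices scale linearly with volume, the 5x ratio holds at any data size under these assumptions; only the absolute monthly difference grows.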