Hadoop’s star dims in the era of cloud object data storage and stream computing
One of the most noteworthy findings from Wikibon’s annual update to our big data market forecast was how seldom Hadoop was mentioned in vendors’ roadmaps.

I wouldn’t say that Hadoop, the open-source software for storing data and running applications on large hardware clusters, is entirely dead. Most big data analytics platform and cloud providers still support such Hadoop pillars as YARN, Pig, Hive, HBase, ZooKeeper and Ambari. However, none of those really represents the core of this open-source platform in the way that the Hadoop Distributed File System (HDFS) does. And HDFS is increasingly missing from big data analytics vendors’ core platform strategies.

The core reason HDFS is receding in vendors’ big data roadmaps is that their customers have moved far beyond the data-at-rest architectures it presupposes. Data-at-rest architectures, such as HDFS-based data lakes, are becoming less central to enterprise data strategies. When you hear “data lake” these days, it’s far more likely to refer to an enterprise’s data stored in Amazon S3, Microsoft Azure Data Lake Storage, Google Cloud Storage and the like.