Pentaho Releases “Filling the Data Lake” Big Data Blueprint for Hadoop
SAN JOSE, Calif., June 28, 2016 (GLOBE NEWSWIRE) – @HADOOPSUMMIT – @Pentaho, a @Hitachi Group Company, today announced “Filling the Data Lake”, a blueprint that helps organizations architect a modern data onboarding process that is flexible, scalable, and repeatable for ingesting big data into #Hadoop data lakes. Data management professionals can now offload the drudgery of data preparation and spend more time on higher-value projects.
According to Ventana Research, big data projects require organizations to spend 46 percent of their time preparing data and 52 percent of their time checking for data quality and consistency. By following Pentaho’s “Filling the Data Lake” blueprint, organizations can manage a changing array of data sources, establish repeatable processes at scale and maintain control and governance along the way. With this capability, developers can easily scale ingestion processes and automate every step of the data pipeline.
“With disparate sources of data numbering in the thousands, hand coding transformations for each source is time consuming and extremely difficult to manage and maintain,” said Chuck Yarbrough, Senior Director of Solutions Marketing at Pentaho, a Hitachi Group Company. “Developers and data analysts need the ability to create one process that can support many different data sources by detecting metadata on the fly and using it to dynamically generate instructions that drive transformation logic in an automated fashion.”
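The approach described in the quote, one process that reads each source's metadata and generates its own transformation logic, can be illustrated with a minimal Python sketch. This is not Pentaho's implementation or API. The function names (`detect_metadata`, `build_instructions`, `ingest`) and the sample data are hypothetical, and the sketch assumes comma-delimited text with a header row and infers types from the first data row only.

```python
import csv
import io

# Hypothetical mapping from a detected type label to a casting function.
CASTERS = {"integer": int, "number": float, "string": str}


def guess_type(value):
    """Guess a type label for one sample value (simplifying assumption)."""
    for caster, label in ((int, "integer"), (float, "number")):
        try:
            caster(value)
            return label
        except ValueError:
            continue
    return "string"


def detect_metadata(raw_text):
    """Read the header and the first data row, then describe each field."""
    reader = csv.reader(io.StringIO(raw_text))
    header = next(reader)
    sample = next(reader)
    return [
        {"name": name.strip(), "type": guess_type(value)}
        for name, value in zip(header, sample)
    ]


def build_instructions(fields):
    """Turn detected metadata into (field name, casting function) pairs."""
    return [(field["name"], CASTERS[field["type"]]) for field in fields]


def ingest(raw_text):
    """One generic process: detect metadata, generate instructions, apply them."""
    instructions = build_instructions(detect_metadata(raw_text))
    reader = csv.reader(io.StringIO(raw_text))
    next(reader)  # skip the header row
    return [
        {name: cast(value) for (name, cast), value in zip(instructions, row)}
        for row in reader
    ]


# Two differently shaped sources pass through the same process unchanged.
orders = "order_id,amount,region\n1,19.99,EMEA\n2,5.00,APAC\n"
sensors = "sensor,reading\nA7,0.5\n"
print(ingest(orders))
print(ingest(sensors))
```

Neither source needs hand-coded transformation logic here, because the casting instructions are derived from each source's own header and sample row. A production process would need more robust type detection than a single sampled row provides.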