DATA DO – データ道

Tag: Massively Parallel Processing (MPP) databases

Apache HAWQ: Building an easily accessible Data Lake

Data Lake vs. Data Warehouse: The Data Lake architecture is an up-and-coming approach to making all data accessible through several methods, whether in real time or through batch analysis. This includes unstructured as well as structured data. In this approach, the data is stored on HDFS and made accessible by several tools, including: Apache…

October 20, 2016

Apache HAWQ: Full SQL and MPP support on HDFS

Pivotal ported their massively parallel processing (MPP) database Greenplum to Hadoop and made it open source as an incubating project at Apache, called Apache HAWQ. This brings together full ANSI SQL with MPP capabilities and Hadoop integration. Integration into an existing Hadoop installation is easy, as you can integrate all existing data via external…

October 10, 2016

Hadoop and MPP

With Big Data, Map/Reduce is always the first term that comes to mind. But it is not the only way to handle large amounts of data. There are database systems built specifically to deal with huge amounts of data, and they are called Massively Parallel Processing (MPP) databases. MPP database systems have been around for a longer…

April 26, 2013
