Apache Zeppelin: Use with remote Spark cluster and Yarn
Apache Zeppelin is pretty useful for interactive programming in the web browser. It even comes with its own installation of Apache Spark. For further information... Read more.
Apache Zeppelin: Visualization and Spark data processing
Apache Zeppelin is a web-based notebook for interactive data analytics. It comes with features for all the steps of data analysis: Data Ingestion, Data Discovery... Read more.
Apache Spark 2.0
Apache Spark has released version 2.0, which is a major step forward in usability for Spark users and especially for people who refrained from using it due to the costs... Read more.
Python vs. R for Data Science
In Data Science, two languages compete for users: R on one side and Python on the other. Both have a huge user base, but there is some discussion,... Read more.
Apache Spark: The Next Big (Data) Thing?
Since Apache Spark became a Top-Level Project at Apache almost a year ago, it has seen wide coverage and adoption in the industry. Due to its promise of being... Read more.
Big Data and Data Warehouse Architecture
Further development and new additions to the Hadoop framework, such as Stinger from Hortonworks or Impala from Cloudera, try to bridge the gap between traditional... Read more.
Comparing Stinger to Impala
With Hadoop 2.0 and the new additions of Stinger and Impala, I did a (non-representative) test of the performance in VirtualBox running on my desktop computer. It... Read more.
SQL on Hadoop: Facebook’s Presto
Earlier this month, Facebook open-sourced its own product for using SQL on Hadoop. It is called Presto and is something like Facebook’s answer to Cloudera’s... Read more.
SQL and Hadoop
Bringing SQL to Hadoop has been one of the major trends in Big Data these last twelve months. Reason enough for me to take a closer look at that scene right now.... Read more.
REST (Representational state transfer) APIs and Big Data
Getting data, huge amounts of data, out of some systems tends to be quite a hassle sometimes. Often you are required to use techniques such as FTP or SSH for transferring... Read more.