SQL on Hadoop: Facebook’s Presto

Earlier this month Facebook open sourced its own product for using SQL on Hadoop. It is called Presto and is something like Facebook’s answer to Cloudera’s Impala or Hortonwork’s Stinger already presented in an earlier post called SQL and Hadoop on this site.
Presto is unlike Hive and more like Impala, since it doesn’t rely on MapReduce for its queries. This makes it about 10 times faster than Hive on large datasets, or so Facebook claims in a blog post.
This product may have a huge impact on the further development of SQL on Hadoop tools, if it’s taken up by enough companies. But since there is no commercial goal linked to it right now, it seems more like Facebook will develop it as their needs increase. So they will not be hurried along.
Like Impala it does support a huge subset of ANSI SQL contrary to Hive’s SQL like HiveQL. So it again aims on making Hadoop more accessible for a broader audience of analyst, that already are familiar with SQL.
Analysis on Big Data sets have been strengthened by this release even more and the entry level investments for more companies to use Hadoop as data storage system are decreasing with every development in this direction.

Posted

November 16, 2013

Big Data, Data Science, Machine Learning

marc

Tags:

Facebook, Hadoop, Hive, Impala, Presto, SQL

SQL on Hadoop: Facebook’s Presto

Comments

Leave a Reply Cancel reply