{"id":175,"date":"2013-11-16T22:03:34","date_gmt":"2013-11-16T20:03:34","guid":{"rendered":"http:\/\/datascientists.info\/?p=175"},"modified":"2013-11-16T22:03:34","modified_gmt":"2013-11-16T20:03:34","slug":"sql-on-hadoop-facebooks-presto","status":"publish","type":"post","link":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/","title":{"rendered":"SQL on Hadoop: Facebook&#8217;s Presto"},"content":{"rendered":"<p>Earlier this month <a href=\"http:\/\/facebook.com\" title=\"Facebook\" target=\"_blank\">Facebook<\/a> open sourced its own product for using SQL on Hadoop. It is called <a href=\"http:\/\/prestodb.io\/\" title=\"Presto\" target=\"_blank\">Presto<\/a> and is something like Facebook&#8217;s answer to Cloudera&#8217;s <a href=\"http:\/\/www.cloudera.com\/\" title=\"Cloudera\" target=\"_blank\">Impala<\/a> or Hortonwork&#8217;s <a href=\"http:\/\/hortonworks.com\/labs\/stinger\/\" title=\"Stinger\" target=\"_blank\">Stinger<\/a> already presented in an earlier post called <a href=\"http:\/\/datascientists.info\/sql-and-hadoop\/\" title=\"SQL and Hadoop\" target=\"_blank\">SQL and Hadoop<\/a> on this site.<br \/>\nPresto is unlike Hive and more like Impala, since it doesn&#8217;t rely on MapReduce for its queries. This makes it about 10 times faster than Hive on large datasets, or so Facebook claims in a <a href=\"https:\/\/www.facebook.com\/notes\/facebook-engineering\/presto-interacting-with-petabytes-of-data-at-facebook\/10151786197628920\" title=\"Blog post\" target=\"_blank\">blog post<\/a>.<br \/>\nThis product may have a huge impact on the further development of SQL on Hadoop tools, if it&#8217;s taken up by enough companies. But since there is no commercial goal linked to it right now, it seems more like Facebook will develop it as their needs increase. So they will not be hurried along.<br \/>\nLike Impala it does support a huge subset of ANSI SQL contrary to Hive&#8217;s SQL like HiveQL. So it again aims on making Hadoop more accessible for a broader audience of analyst, that already are familiar with SQL.<br \/>\nAnalysis on Big Data sets have been strengthened by this release even more and the entry level investments for more companies to use Hadoop as data storage system are decreasing with every development in this direction.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Earlier this month Facebook open sourced its own product for using SQL on Hadoop. It is called Presto and is something like Facebook&#8217;s answer to Cloudera&#8217;s Impala or Hortonwork&#8217;s Stinger already presented in an earlier post called SQL and Hadoop on this site. Presto is unlike Hive and more like Impala, since it doesn&#8217;t rely [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,5,7],"tags":[37,43,45,47,64,77],"ppma_author":[144],"class_list":["post-175","post","type-post","status-publish","format-standard","hentry","category-big-data","category-data-science","category-machine-learning","tag-facebook","tag-hadoop","tag-hive","tag-impala","tag-presto","tag-sql","author-marc"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>SQL on Hadoop: Facebook&#039;s Presto<\/title>\n<meta name=\"description\" content=\"Facebook open sourced its own version of SQL on Hadoop called Presto\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"SQL on Hadoop: Facebook&#039;s Presto\" \/>\n<meta property=\"og:description\" content=\"Facebook open sourced its own version of SQL on Hadoop called Presto\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/\" \/>\n<meta property=\"og:site_name\" content=\"DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataScientists\/\" \/>\n<meta property=\"article:published_time\" content=\"2013-11-16T20:03:34+00:00\" \/>\n<meta name=\"author\" content=\"Marc Matt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marc Matt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/\"},\"author\":{\"name\":\"Marc Matt\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\"},\"headline\":\"SQL on Hadoop: Facebook&#8217;s Presto\",\"datePublished\":\"2013-11-16T20:03:34+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/\"},\"wordCount\":219,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"keywords\":[\"Facebook\",\"Hadoop\",\"Hive\",\"Impala\",\"Presto\",\"SQL\"],\"articleSection\":[\"Big Data\",\"Data Science\",\"Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/\",\"name\":\"SQL on Hadoop: Facebook's Presto\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\"},\"datePublished\":\"2013-11-16T20:03:34+00:00\",\"description\":\"Facebook open sourced its own version of SQL on Hadoop called Presto\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2013\\\/11\\\/16\\\/sql-on-hadoop-facebooks-presto\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datascientists.info\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"SQL on Hadoop: Facebook&#8217;s Presto\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"name\":\"Data Scientists\",\"description\":\"Digging data, Big Data, Analysis, Data Mining\",\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datascientists.info\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\",\"name\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"width\":250,\"height\":174,\"caption\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/DataScientists\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\",\"name\":\"Marc Matt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"caption\":\"Marc Matt\"},\"description\":\"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\\\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.\",\"sameAs\":[\"https:\\\/\\\/data-do.de\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"SQL on Hadoop: Facebook's Presto","description":"Facebook open sourced its own version of SQL on Hadoop called Presto","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/","og_locale":"en_US","og_type":"article","og_title":"SQL on Hadoop: Facebook's Presto","og_description":"Facebook open sourced its own version of SQL on Hadoop called Presto","og_url":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/","og_site_name":"DATA DO - \u30c7\u30fc\u30bf \u9053","article_publisher":"https:\/\/www.facebook.com\/DataScientists\/","article_published_time":"2013-11-16T20:03:34+00:00","author":"Marc Matt","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marc Matt","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/#article","isPartOf":{"@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/"},"author":{"name":"Marc Matt","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19"},"headline":"SQL on Hadoop: Facebook&#8217;s Presto","datePublished":"2013-11-16T20:03:34+00:00","mainEntityOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/"},"wordCount":219,"commentCount":0,"publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"keywords":["Facebook","Hadoop","Hive","Impala","Presto","SQL"],"articleSection":["Big Data","Data Science","Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/","url":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/","name":"SQL on Hadoop: Facebook's Presto","isPartOf":{"@id":"https:\/\/datascientists.info\/#website"},"datePublished":"2013-11-16T20:03:34+00:00","description":"Facebook open sourced its own version of SQL on Hadoop called Presto","breadcrumb":{"@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/datascientists.info\/index.php\/2013\/11\/16\/sql-on-hadoop-facebooks-presto\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datascientists.info\/"},{"@type":"ListItem","position":2,"name":"SQL on Hadoop: Facebook&#8217;s Presto"}]},{"@type":"WebSite","@id":"https:\/\/datascientists.info\/#website","url":"https:\/\/datascientists.info\/","name":"Data Scientists","description":"Digging data, Big Data, Analysis, Data Mining","publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datascientists.info\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/datascientists.info\/#organization","name":"DATA DO - \u30c7\u30fc\u30bf \u9053","url":"https:\/\/datascientists.info\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","width":250,"height":174,"caption":"DATA DO - \u30c7\u30fc\u30bf \u9053"},"image":{"@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataScientists\/"]},{"@type":"Person","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19","name":"Marc Matt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc","url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","caption":"Marc Matt"},"description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.","sameAs":["https:\/\/data-do.de"]}]}},"authors":[{"term_id":144,"user_id":1,"is_guest":0,"slug":"marc","display_name":"Marc Matt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","author_category":"1","first_name":"Marc","last_name":"Matt","user_url":"https:\/\/data-do.de","job_title":"Senior Data Architect | GenAI & RAG Expert | GCP \/ AWS","description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities.\r\n\r\nI help clients:\r\n\r\n \tMigrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility.\r\n\r\n\r\n \tImplement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs.\r\n \tScale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow.\r\n\r\nProven track record leading engineering teams."}],"_links":{"self":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/175","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/comments?post=175"}],"version-history":[{"count":0,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/175\/revisions"}],"wp:attachment":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media?parent=175"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/categories?post=175"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/tags?post=175"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/ppma_author?post=175"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}