{"id":196,"date":"2014-01-30T10:56:41","date_gmt":"2014-01-30T08:56:41","guid":{"rendered":"http:\/\/datascientists.info\/?p=196"},"modified":"2018-01-21T11:33:38","modified_gmt":"2018-01-21T11:33:38","slug":"comparing-stinger-to-impala","status":"publish","type":"post","link":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/","title":{"rendered":"Comparing Stinger to Impala"},"content":{"rendered":"<p>With Hadoop 2.0 and the new additions of <a href=\"http:\/\/hortonworks.com\/labs\/stinger\/\" title=\"Stinger\" target=\"_blank\">Stinger<\/a> and <a href=\"http:\/\/www.cloudera.com\/content\/cloudera\/en\/products-and-services\/cdh\/impala.html\" title=\"Impala\" target=\"_blank\">Impala<\/a>, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer. It used the following setup:<\/p>\n<ul>\n<li>4 GB RAM<\/li>\n<li>Intel Core i5 2500 3.3 GHz<\/li>\n<\/ul>\n<p>The datasets were the following:<\/p>\n<ol>\n<li>Dataset 1: 71,386,291 rows and 5 columns<\/li>\n<li>Dataset 2: 132,430,086 rows and 4 columns<\/li>\n<li>Dataset 3: partitioned data of 2,153,924 rows and 32 columns<\/li>\n<li>Dataset 4: unpartitioned data of 2,153,924 rows and 32 columns<\/li>\n<\/ol>\n<p>The results were the following:<\/p>\n\n<table id=\"tablepress-1\" class=\"tablepress tablepress-id-1\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Query<\/th><th class=\"column-2\">Hive (0.10.0)<\/th><th class=\"column-3\">Impala<\/th><th class=\"column-4\">Stinger (Hive 0.12.0)<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">Join tables<\/td><td class=\"column-2\">167.61 sec<\/td><td class=\"column-3\">31.46 sec<\/td><td class=\"column-4\">122.58 sec<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">Partitioned tables Dataset 3<\/td><td class=\"column-2\">42.45 sec<\/td><td class=\"column-3\">0.29 sec<\/td><td class=\"column-4\">20.97 
sec<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">Unpartitioned tables Dataset 4<\/td><td class=\"column-2\">47.92 sec<\/td><td class=\"column-3\">1.20 sec<\/td><td class=\"column-4\">36.46 sec<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">Grouped Select Dataset 1<\/td><td class=\"column-2\">533.83 sec<\/td><td class=\"column-3\">81.11 sec<\/td><td class=\"column-4\">444.634 sec<\/td>\n<\/tr>\n<tr class=\"row-6\">\n\t<td class=\"column-1\">Grouped Select Dataset 2<\/td><td class=\"column-2\">323.56 sec<\/td><td class=\"column-3\">49.72 sec<\/td><td class=\"column-4\">313.98 sec<\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">Count Dataset 1<\/td><td class=\"column-2\">252.56 sec<\/td><td class=\"column-3\">66.48 sec<\/td><td class=\"column-4\">243.91 sec<\/td>\n<\/tr>\n<tr class=\"row-8\">\n\t<td class=\"column-1\">Count Dataset 2<\/td><td class=\"column-2\">158.93 sec<\/td><td class=\"column-3\">41.64 sec<\/td><td class=\"column-4\">174.46 sec<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-1 from cache -->\n<figure id=\"attachment_207\" aria-describedby=\"caption-attachment-207\" style=\"width: 604px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png\" alt=\"Compare Impala vs. Stinger\" width=\"604\" height=\"208\" class=\"size-large wp-image-207\" \/><\/a><figcaption id=\"caption-attachment-207\" class=\"wp-caption-text\">Compare Impala vs. Stinger<\/figcaption><\/figure>\n<p>This shows that Stinger is faster than Hive 0.10.0 on most of these queries, but since it still uses MapReduce to execute them, it is no match for Impala, which does not use MapReduce. So using Impala makes sense when you want to analyse data in Hadoop using SQL, even on a small installation. 
This gives you easy and fast access to all data stored in your Hadoop cluster, which was not possible before.<br \/>\nFacebook&#8217;s <a href=\"http:\/\/prestodb.io\/\" title=\"Presto\" target=\"_blank\">Presto<\/a> should achieve nearly the same results, since the underlying technique is similar. These latest additions and changes to the Hadoop framework look like a big step toward making Hadoop accessible to many more people.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer. It used the following setup: 4 GB RAM Intel Core i5 2500 3.3 GHz The datasets were the following: Dataset 1: 71,386,291 rows and 5 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,5,9],"tags":[26,45,46,47,64,80],"ppma_author":[144],"class_list":["post-196","post","type-post","status-publish","format-standard","hentry","category-big-data","category-data-science","category-tools","tag-cloudera","tag-hive","tag-hortonworks","tag-impala","tag-presto","tag-stinger","author-marc"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Comparing Stinger to Impala<\/title>\n<meta name=\"description\" content=\"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta 
property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Comparing Stinger to Impala\" \/>\n<meta property=\"og:description\" content=\"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/\" \/>\n<meta property=\"og:site_name\" content=\"DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataScientists\/\" \/>\n<meta property=\"article:published_time\" content=\"2014-01-30T08:56:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-01-21T11:33:38+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png\" \/>\n<meta name=\"author\" content=\"Marc Matt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marc Matt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/\"},\"author\":{\"name\":\"Marc Matt\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\"},\"headline\":\"Comparing Stinger to Impala\",\"datePublished\":\"2014-01-30T08:56:41+00:00\",\"dateModified\":\"2018-01-21T11:33:38+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/\"},\"wordCount\":203,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2014\\\/01\\\/vgl_impala_stinger.png\",\"keywords\":[\"Cloudera\",\"Hive\",\"Hortonworks\",\"Impala\",\"Presto\",\"Stinger\"],\"articleSection\":[\"Big Data\",\"Data Science\",\"Tools\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/\",\"name\":\"Comparing Stinger to 
Impala\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2014\\\/01\\\/vgl_impala_stinger.png\",\"datePublished\":\"2014-01-30T08:56:41+00:00\",\"dateModified\":\"2018-01-21T11:33:38+00:00\",\"description\":\"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#primaryimage\",\"url\":\"http:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2014\\\/01\\\/vgl_impala_stinger.png\",\"contentUrl\":\"http:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2014\\\/01\\\/vgl_impala_stinger.png\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2014\\\/01\\\/30\\\/comparing-stinger-to-impala\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datascientists.info\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Comparing Stinger to 
Impala\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"name\":\"Data Scientists\",\"description\":\"Digging data, Big Data, Analysis, Data Mining\",\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datascientists.info\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\",\"name\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"width\":250,\"height\":174,\"caption\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/DataScientists\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\",\"name\":\"Marc 
Matt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"caption\":\"Marc Matt\"},\"description\":\"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\\\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.\",\"sameAs\":[\"https:\\\/\\\/data-do.de\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Comparing Stinger to Impala","description":"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/","og_locale":"en_US","og_type":"article","og_title":"Comparing Stinger to Impala","og_description":"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.","og_url":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/","og_site_name":"DATA DO - \u30c7\u30fc\u30bf \u9053","article_publisher":"https:\/\/www.facebook.com\/DataScientists\/","article_published_time":"2014-01-30T08:56:41+00:00","article_modified_time":"2018-01-21T11:33:38+00:00","og_image":[{"url":"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png","type":"","width":"","height":""}],"author":"Marc Matt","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marc Matt","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#article","isPartOf":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/"},"author":{"name":"Marc Matt","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19"},"headline":"Comparing Stinger to Impala","datePublished":"2014-01-30T08:56:41+00:00","dateModified":"2018-01-21T11:33:38+00:00","mainEntityOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/"},"wordCount":203,"commentCount":0,"publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#primaryimage"},"thumbnailUrl":"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png","keywords":["Cloudera","Hive","Hortonworks","Impala","Presto","Stinger"],"articleSection":["Big Data","Data Science","Tools"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/","url":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/","name":"Comparing Stinger to 
Impala","isPartOf":{"@id":"https:\/\/datascientists.info\/#website"},"primaryImageOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#primaryimage"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#primaryimage"},"thumbnailUrl":"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png","datePublished":"2014-01-30T08:56:41+00:00","dateModified":"2018-01-21T11:33:38+00:00","description":"With Hadoop 2.0 and the new additions of Stinger and Impala, I ran a (non-representative) performance test on a VirtualBox VM running on my desktop computer.","breadcrumb":{"@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#primaryimage","url":"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png","contentUrl":"http:\/\/datascientists.info\/wp-content\/uploads\/2014\/01\/vgl_impala_stinger.png"},{"@type":"BreadcrumbList","@id":"https:\/\/datascientists.info\/index.php\/2014\/01\/30\/comparing-stinger-to-impala\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datascientists.info\/"},{"@type":"ListItem","position":2,"name":"Comparing Stinger to Impala"}]},{"@type":"WebSite","@id":"https:\/\/datascientists.info\/#website","url":"https:\/\/datascientists.info\/","name":"Data Scientists","description":"Digging data, Big Data, Analysis, Data 
Mining","publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datascientists.info\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/datascientists.info\/#organization","name":"DATA DO - \u30c7\u30fc\u30bf \u9053","url":"https:\/\/datascientists.info\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","width":250,"height":174,"caption":"DATA DO - \u30c7\u30fc\u30bf \u9053"},"image":{"@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataScientists\/"]},{"@type":"Person","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19","name":"Marc Matt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc","url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","caption":"Marc Matt"},"description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. 
I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.","sameAs":["https:\/\/data-do.de"]}]}},"authors":[{"term_id":144,"user_id":1,"is_guest":0,"slug":"marc","display_name":"Marc Matt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/196","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/comments?post=196"}],"version-history":[{"count":2,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/196\/revisions"}],"predecessor-version":[{"id":525,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/196\/revisions\/525"}],"wp:attachment":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media?parent=196"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/categories?post=196"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/tags?post=196"},{"taxonomy":"author","embeddable":true,"href":"https:\
/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/ppma_author?post=196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}