{"id":627,"date":"2022-12-06T08:51:12","date_gmt":"2022-12-06T08:51:12","guid":{"rendered":"https:\/\/datascientists.info\/?p=627"},"modified":"2022-12-06T08:53:01","modified_gmt":"2022-12-06T08:53:01","slug":"apache-nifi-on-gke","status":"publish","type":"post","link":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/","title":{"rendered":"Apache Nifi on Google Cloud Kubernetes Engine (GKE)"},"content":{"rendered":"\n<p><a href=\"https:\/\/nifi.apache.org\/\">Apache Nifi<\/a> on GKE can be a good fit if you want a low-code solution for processing streaming data. Because GKE is a managed version of Kubernetes, you get a managed, scalable environment and do not need to worry about handling the actual servers.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Setup of the Apache Nifi cluster<\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1054\" height=\"948\" src=\"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg\" alt=\"Apache Nifi setup on Google Cloud Kubernetes Engine\" class=\"wp-image-643\"\/><figcaption class=\"wp-element-caption\">Apache Nifi setup on Google Cloud Kubernetes Engine<\/figcaption><\/figure>\n\n\n\n<p>Setting up Apache Nifi on GKE can be automated with <a href=\"https:\/\/www.terraform.io\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Terraform<\/a>. This makes it easy to manage changes and keep track of everything.<\/p>\n\n\n\n<p>The diagram above shows an example architecture for an Apache Nifi on GKE setup. This setup uses Terraform to deploy the cluster and stores the processed data from Nifi in BigQuery and Cloud Storage. 
Nifi provides processors for both of these as data sinks.<\/p>\n\n\n\n<p>A Helm chart for the Nifi cluster makes it easy to manage all the needed components, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/zookeeper.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Zookeeper<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/nifi.apache.org\/registry.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Nifi Registry<\/a><\/li>\n\n\n\n<li>Nifi Nodes<\/li>\n<\/ul>\n\n\n\n<p>The chart is provided by <a href=\"https:\/\/cetic.github.io\/helm-charts\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Cetic<\/a> and can be installed on your cluster using the following commands.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>helm repo add cetic https:\/\/cetic.github.io\/helm-charts\nhelm repo update\nhelm install esb cetic\/nifi -f custom_values.yaml<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Customizing the Helm Chart<\/h3>\n\n\n\n<p>To customize your Apache Nifi on GKE deployment, adapt the <a href=\"https:\/\/github.com\/cetic\/helm-nifi\/blob\/master\/values.yaml\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">values.yaml<\/a> file provided on GitHub. It controls, for example, how many Nifi nodes to deploy and how authentication for the Nifi UI is set up.<\/p>\n\n\n\n<p>One of the important settings here is enabling Nifi Registry. If it is not enabled and set up, a crash of the cluster might result in you losing your flows.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Apache Nifi Registry Setup on GKE<\/h3>\n\n\n\n<p><a href=\"https:\/\/nifi.apache.org\/registry.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Nifi Registry<\/a> is an additional Nifi service that provides version control for your Nifi flows and offers two options for storing the versions: file system storage and Git. 
The XML configuration for <strong>providers.xml<\/strong> is shown below. It lists both variants; typically only one of them is active at a time.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;flowPersistenceProvider>\n    &lt;class>org.apache.nifi.registry.provider.flow.FileSystemFlowPersistenceProvider&lt;\/class>\n    &lt;property name=\"Flow Storage Directory\">\/opt\/nifi-registry\/nifi-registry-current\/flow_storage\/&lt;\/property>\n&lt;\/flowPersistenceProvider>\n&lt;flowPersistenceProvider>\n    &lt;class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider&lt;\/class>\n    &lt;property name=\"Flow Storage Directory\">\/opt\/nifi-registry\/nifi-registry-current\/git\/&lt;\/property>\n    &lt;property name=\"Remote To Push\">origin&lt;\/property>\n    &lt;property name=\"Remote Access User\">USERNAME&lt;\/property>\n    &lt;property name=\"Remote Access Password\">PASSWORD&lt;\/property>\n&lt;\/flowPersistenceProvider><\/code><\/pre>\n\n\n\n<p>In Google Cloud it is also possible to use Cloud Storage as a persistence backend. To set this up, you need to customize the container by adding <a href=\"https:\/\/github.com\/GoogleCloudPlatform\/gcsfuse\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">GCSFuse<\/a> to it. After that, adapt the start.sh file so that the bucket is actually mounted on container startup.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>echo \"Mounting GCS Fuse.\"\ngcsfuse -file-mode=777 -dir-mode=777 nifi-repository \/opt\/nifi-registry\/nifi-registry-current\/storage\/\necho \"Mounting completed.\"<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\"><em>Connecting Nifi to Registry<\/em><\/h4>\n\n\n\n<p>To connect Nifi to a registry, add a registry client in Nifi under &#8220;Options&#8221; -> &#8220;Controller Settings&#8221; -> &#8220;Registry Clients&#8221;. Use the Kubernetes cluster-internal IP of Nifi Registry here.<br>Then add a bucket to Nifi Registry. 
This bucket can then be selected in Nifi when setting up version control and will appear as a directory in the Git repository.<br>To version control a flow, it needs to be nested inside a &#8220;Process Group&#8221;. Once this is done, right-click on the &#8220;Process Group&#8221; and under &#8220;Version&#8221; click &#8220;Start version control&#8221;.<br>More documentation can be found <a href=\"https:\/\/nifi.apache.org\/docs\/nifi-registry-docs\/index.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">here<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Code examples<\/h3>\n\n\n\n<p>You can find examples for the Nifi Registry customization in this <a href=\"https:\/\/gitlab.com\/-\/snippets\/2467975\" target=\"_blank\" rel=\"noreferrer noopener\">snippet<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming data. If you set it up on GKE, a managed version of Kubernetes, you have a managed scalable environment and do not need to worry about handling the actual servers. 
Setup of the Apache [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":643,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,3,4,6,9],"tags":[17,93,50,120,121],"ppma_author":[144],"class_list":["post-627","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics-platform","category-big-data","category-data-lake","category-data-warehouse","category-tools","tag-apache-nifi","tag-google-cloud","tag-kubernetes","tag-nifi","tag-nifi-registry","author-marc"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053<\/title>\n<meta name=\"description\" content=\"Setting up Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"og:description\" content=\"Setting up Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming data.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/\" \/>\n<meta property=\"og:site_name\" content=\"DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataScientists\/\" \/>\n<meta 
property=\"article:published_time\" content=\"2022-12-06T08:51:12+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-12-06T08:53:01+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1054\" \/>\n\t<meta property=\"og:image:height\" content=\"948\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Marc Matt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marc Matt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/\"},\"author\":{\"name\":\"Marc Matt\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\"},\"headline\":\"Apache Nifi on Google Cloud Kubernetes Engine 
(GKE)\",\"datePublished\":\"2022-12-06T08:51:12+00:00\",\"dateModified\":\"2022-12-06T08:53:01+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/\"},\"wordCount\":488,\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/nifi_cluster_deployment-1.jpg\",\"keywords\":[\"Apache Nifi\",\"Google Cloud\",\"Kubernetes\",\"Nifi\",\"Nifi Registry\"],\"articleSection\":[\"Analytics Platform\",\"Big Data\",\"Data Lake\",\"Data Warehouse\",\"Tools\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/\",\"name\":\"Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/nifi_cluster_deployment-1.jpg\",\"datePublished\":\"2022-12-06T08:51:12+00:00\",\"dateModified\":\"2022-12-06T08:53:01+00:00\",\"description\":\"Setting up Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming 
data.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#primaryimage\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/nifi_cluster_deployment-1.jpg\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/nifi_cluster_deployment-1.jpg\",\"width\":1054,\"height\":948},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2022\\\/12\\\/06\\\/apache-nifi-on-gke\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datascientists.info\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Apache Nifi on Google Cloud Kubernetes Engine (GKE)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"name\":\"Data Scientists\",\"description\":\"Digging data, Big Data, Analysis, Data Mining\",\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datascientists.info\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\",\"name\":\"DATA DO - \u30c7\u30fc\u30bf 
\u9053\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"width\":250,\"height\":174,\"caption\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/DataScientists\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\",\"name\":\"Marc Matt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"caption\":\"Marc Matt\"},\"description\":\"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\\\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. 
Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.\",\"sameAs\":[\"https:\\\/\\\/data-do.de\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053","description":"Setting up Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/","og_locale":"en_US","og_type":"article","og_title":"Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053","og_description":"Setting up Apache Nifi on GKE can be a good solution, if you want to have a low code solution for processing streaming data.","og_url":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/","og_site_name":"DATA DO - \u30c7\u30fc\u30bf \u9053","article_publisher":"https:\/\/www.facebook.com\/DataScientists\/","article_published_time":"2022-12-06T08:51:12+00:00","article_modified_time":"2022-12-06T08:53:01+00:00","og_image":[{"width":1054,"height":948,"url":"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg","type":"image\/jpeg"}],"author":"Marc Matt","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marc Matt","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#article","isPartOf":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/"},"author":{"name":"Marc Matt","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19"},"headline":"Apache Nifi on Google Cloud Kubernetes Engine (GKE)","datePublished":"2022-12-06T08:51:12+00:00","dateModified":"2022-12-06T08:53:01+00:00","mainEntityOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/"},"wordCount":488,"publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg","keywords":["Apache Nifi","Google Cloud","Kubernetes","Nifi","Nifi Registry"],"articleSection":["Analytics Platform","Big Data","Data Lake","Data Warehouse","Tools"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/","url":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/","name":"Apache Nifi on Google Cloud Kubernetes Engine (GKE) - DATA DO - \u30c7\u30fc\u30bf \u9053","isPartOf":{"@id":"https:\/\/datascientists.info\/#website"},"primaryImageOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#primaryimage"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg","datePublished":"2022-12-06T08:51:12+00:00","dateModified":"2022-12-06T08:53:01+00:00","description":"Setting up Apache Nifi on GKE can be a good 
solution, if you want to have a low code solution for processing streaming data.","breadcrumb":{"@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#primaryimage","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2022\/12\/nifi_cluster_deployment-1.jpg","width":1054,"height":948},{"@type":"BreadcrumbList","@id":"https:\/\/datascientists.info\/index.php\/2022\/12\/06\/apache-nifi-on-gke\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datascientists.info\/"},{"@type":"ListItem","position":2,"name":"Apache Nifi on Google Cloud Kubernetes Engine (GKE)"}]},{"@type":"WebSite","@id":"https:\/\/datascientists.info\/#website","url":"https:\/\/datascientists.info\/","name":"Data Scientists","description":"Digging data, Big Data, Analysis, Data Mining","publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datascientists.info\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/datascientists.info\/#organization","name":"DATA DO - \u30c7\u30fc\u30bf 
\u9053","url":"https:\/\/datascientists.info\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","width":250,"height":174,"caption":"DATA DO - \u30c7\u30fc\u30bf \u9053"},"image":{"@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataScientists\/"]},{"@type":"Person","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19","name":"Marc Matt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc","url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","caption":"Marc Matt"},"description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. 
Proven track record leading engineering teams.","sameAs":["https:\/\/data-do.de"]}]}},"authors":[{"term_id":144,"user_id":1,"is_guest":0,"slug":"marc","display_name":"Marc Matt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/627","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/comments?post=627"}],"version-history":[{"count":5,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/627\/revisions"}],"predecessor-version":[{"id":647,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/627\/revisions\/647"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media\/643"}],"wp:attachment":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media?parent=627"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/categories?post=627"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/tags?post=627"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/ppma_author?post=627"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}