{"id":597,"date":"2020-05-29T05:40:47","date_gmt":"2020-05-29T05:40:47","guid":{"rendered":"https:\/\/datascientists.info\/?p=597"},"modified":"2020-05-29T05:40:47","modified_gmt":"2020-05-29T05:40:47","slug":"bringing-machine-learning-models-into-production","status":"publish","type":"post","link":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/","title":{"rendered":"Bringing machine learning models into production"},"content":{"rendered":"\n<p>Developing and bringing machine learning models into production is a task with a lot of challenges. These include model and attribute selection, dealing with missing values, normalization and others.<\/p>\n\n\n\n<p>Here I want to discuss a workflow that sets all the gears in motion, from data preprocessing and analysis, through building models and selecting the best-performing one, to serving the model in a real-time API.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Life cycle of a machine learning model<\/h2>\n\n\n\n<p>The life cycle of a machine learning model is essentially an iteration of the following four steps.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"789\" height=\"528\" src=\"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png\" alt=\"Life cycle of machine learning models\" class=\"wp-image-598\"\/><figcaption>Machine learning steps<\/figcaption><\/figure><\/div>\n\n\n\n<p>Each of these steps is under constant evaluation, especially when model performance can be improved by adding different data attributes or preprocessing methods.<\/p>\n\n\n\n<p>For the approach presented here we split the modeling process into two parts. Part one contains the four steps mentioned above, and we call it Manual Run Modeling. 
Part two automates the steps of part one.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"2123\" height=\"1490\" src=\"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image-1.png\" alt=\"machine learning for production\" class=\"wp-image-600\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Manual Run Modeling<\/h2>\n\n\n\n<p>In the manual part we start by analysing our new task. After that we come up with a hypothesis we want to prove and test.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Development and Prototyping Environment<\/h3>\n\n\n\n<p>First we set up a development environment for working on the new task. For this we spin up a <a rel=\"noreferrer noopener\" href=\"https:\/\/jupyter.org\/\" target=\"_blank\">Jupyter<\/a> notebook server. We deploy it on Google Cloud AI Platform, which provides ready-to-run containers for Jupyter. The notebook approach enables us to develop fast and share results with the team using a browser. With the ability to easily visualize data inline in a notebook, this approach is especially useful in the data extraction and preprocessing process.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Data preparation and visualization<\/h4>\n\n\n\n<p>Python provides some nice packages for generating graphics on data for faster insights. This speeds up our prototyping process in the notebook. We are especially fond of using <a rel=\"noreferrer noopener\" href=\"https:\/\/seaborn.pydata.org\/\" target=\"_blank\">Seaborn<\/a>.<\/p>\n\n\n\n<p>We load the data identified for this model into a dataframe in the notebook. After that we start looking at each attribute and its values, often in combination with the other attributes. For a first overview we use a pairplot provided by Seaborn.<\/p>\n\n\n\n<p>We use a combination of visualizations, such as a correlation matrix. Then we decide which attributes to use and how to handle outliers and missing values. 
Finally we use one-hot encoding for categorical attributes and normalize the continuous attributes to create the input for our models.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/SUDO13AtJFaqfl-1C6cSvf8ihKRaa6manEqB_ww4ZXycA9boVN4D1gdGzvLC2L-KWIJ6Y3pLFmkz8eoxuP9ON16z3q7AVuGg4FOgtGJ3uwHvjkXavRpgKUUAg0W73uqsPOczceIw\" alt=\"pairplot of attributes\"\/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Model Selection and Evaluation<\/h4>\n\n\n\n<p>When the data is ready we choose several models to find a solution for our problem. These models can range from a multilinear regression model through random forests to deep neural networks with TensorFlow.<\/p>\n\n\n\n<p>After splitting the data for training, evaluation and test we decide on a measure each model has to optimize for, e.g. mean squared error or precision. This depends on the kind of problem. Once we have identified the best-performing model, we start transforming the code for Google Cloud AI Platform.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">RT Prediction Deployment<\/h2>\n\n\n\n<p>After manual evaluation of preprocessing and modeling, we start the task of automating training and deployment for bringing machine learning models into production. This can be split into three tasks:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Training the model with hyperparameter optimization on Google Cloud AI Platform<\/li><li>Deploying the model on Google Cloud AI Platform<\/li><li>Deploying an API to access the model for real-time predictions<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Training on Google Cloud AI Platform<\/h3>\n\n\n\n<p>After deciding on a model to go forward into production with, we optimize our code for data extraction and preprocessing. The reason for this is to make it reusable and compliant with Google Cloud AI Platform rules. 
This essentially means we have to create a Python package out of the first three steps.<\/p>\n\n\n\n<p>A project could be set up as shown in the picture below.<\/p>\n\n\n\n<p class=\"has-text-align-center\"><a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/GoogleCloudPlatform\/cloudml-samples\/tree\/master\/sklearn\/sklearn-template\/template\" target=\"_blank\">Sample structure of AI platform package<\/a><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/Tf8CphGLpTcSjTMyYUcIrpqUZ0fC_EmSx5hD6ZdsoNGsEvxNBb6ogR_AnW3O_pM598VFeLjaLgU-ZOKY5KgW4kXYWK4OzNe40MLxiWtQEpuAGNu5oDgSuYLzya--0VdRztC676lM\" alt=\"Sample structure of AI platform package\"\/><\/figure>\n\n\n\n<p>This Python package is then deployed to Google Cloud Platform and executed there. If you have custom packages there is an option to supply those too. An example call for training in the cloud would then look like the following.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/S01NGicdd0Cv4PvxHbqIufmZf5os-U-quzE5XEF-NZ-7Bd97Bkcp33NgczLFCHS9-IlEJ-CLI2sZFGgRaU8lzGpAGCnDSai_uSAXamS_r0jQSVHN0lDHYbzIzSRLZvMohdWm8FBG\" alt=\"\" width=\"412\" height=\"180\"\/><\/figure><\/div>\n\n\n\n<p>One advantage of using Google Cloud AI Platform is the possibility of using automated <a rel=\"noreferrer noopener\" href=\"https:\/\/cloud.google.com\/ai-platform\/training\/docs\/using-hyperparameter-tuning\" target=\"_blank\">hyperparameter tuning<\/a> for models. This enables us to train a model automatically with different configurations. 
Then we select the one that performs best on the measure defined in hptuning_config.yaml.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/VfNW2g7UdS3ryEyZQP30TDxo6RlzKCk3qca8DY8XaLE6RB8iZi8mWf0tSCzfWZdvgdxhwcJcmXbGkRv2Sh4tqM-33ncQppN5H82S-dp2s9xMwDyDDodK6PiY6_nxPgKH0ZHooRlP\" alt=\"hyperparameter yaml example\"\/><\/figure>\n\n\n\n<p>In the AI Platform dashboard you can then see which combination of the values defined in <em>params<\/em> had the best results for the defined <em>hyperparameterMetricTag<\/em> and <em>goal<\/em>.<\/p>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"200\" src=\"https:\/\/lh3.googleusercontent.com\/tQ9O_OKkrgExid_AWh3gP8NIjUu10YdWErkmyM4BR0aj29aMGuH_63vkNEqsGtIcVUTlrb52lsSfs5M57D9rQvCBAvWpmKdC6GwyhHtBjdYlJGkgMbdjRzGTAbuapF0TbzqHo5Ju\"><\/p>\n\n\n\n<p>The identified model is then ready to be deployed to the platform, where Google provides a URL to access the model in real time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Deploying the model on GCP<\/h3>\n\n\n\n<p>Deploying to production is done with a <a href=\"https:\/\/www.jenkins.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">Jenkins<\/a> job. We use a <a href=\"https:\/\/www.jenkins.io\/doc\/book\/pipeline\/jenkinsfile\/\" target=\"_blank\" rel=\"noreferrer noopener\">Jenkinsfile<\/a> to define our jobs as part of our code. 
A model deployment consists of the following steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Copying the model to the correct GCP bucket (it differs for our three environments: development, staging and production)<\/li><li>Deploying the model to AI Platform using a gcloud command<\/li><li>Testing the model with a prepared test dataset<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/usDMiATl4D5XvqYmlhwMoC2H5eZsLATDhUGOzZVr1ROsx1M9bKlBpmdk2FrURy0Vy8-eoxeJNXpo5f5UKdXyLVAtlDPMGPxcP3IHS-nae1W4rM0PJ8aoWu7W6jaVsnjfRlx1T-Gj\" alt=\"deploy model on ai platform\"\/><\/figure>\n\n\n\n<p>If all of these steps are successful the model is ready for use in the specified environment via a URL endpoint.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Deploying the Real Time API<\/h3>\n\n\n\n<p>Since the model is deployed and accessible through a URL endpoint, we now have to build a transformation API that takes the input data, transforms it into the format the model endpoint needs, and then calls the model.<\/p>\n\n\n\n<p>To make using the model easier for other services, our data entry format is JSON. This makes the data human-readable, and any changes to steps concerning the model, except a change in the number of attributes, can be made without affecting our client services.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">REST Service<\/h4>\n\n\n\n<p>As the framework for our REST API we chose <a href=\"https:\/\/flask.palletsprojects.com\/en\/1.1.x\/\" target=\"_blank\" rel=\"noreferrer noopener\">Flask<\/a>, since it is lightweight, flexible, easy to use and also written in Python. Since the API and the model are written in the same language, we can reuse the preprocessing code from the training package described above. 
The main work here lies in adapting the code to process a single event instead of the batch prediction used to validate the results during training.<\/p>\n\n\n\n<p>For stability and security reasons we added some additional checks:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>JWT token authorization with <a href=\"https:\/\/pythonhosted.org\/Flask-JWT\/\" target=\"_blank\" rel=\"noreferrer noopener\">Flask-JWT<\/a><\/li><li>Input format checks<ul><li>all required fields in request<\/li><li>filling in default values for optional fields<\/li><li>checking values for validity (e.g. range or location checks)<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>We also created an extra package containing all transformation functions that we use in several of our models. This package contains, for example, min-max normalization and distance calculation functions.<\/p>\n\n\n\n<p>Speed is important in this component, so we store all data for enriching and transforming the incoming data inside a cache.<\/p>\n\n\n\n<p>After receiving the prediction from the model, we qualify the results for regression models by adding a confidence value. This helps our clients to better understand the results, especially if they are meant to be shown to end users.<\/p>\n\n\n\n<p>Each of our responses has its own error code and message that is supplied in the result. 
The result is again in JSON format with the following fields:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>success: true or false, indicating result of request<\/li><li>message: (error) message for response<\/li><li>prediction object<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Deployment of API<\/h4>\n\n\n\n<p>Deployment to our production system is then handled by a Jenkins job with the following steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Unit and integration testing of the Flask API<\/li><li>Building a Docker container for the Flask API<\/li><li>Pushing the container image to the GCP project repository<\/li><li>Deploying the container to <a href=\"https:\/\/cloud.google.com\/run\" target=\"_blank\" rel=\"noreferrer noopener\">Google Cloud Run<\/a><\/li><\/ul>\n\n\n\n<p>By using Cloud Run we do not need to worry about hardware configuration and can focus on optimizing the API and the model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>By following this process we keep the time spent on everything besides model building that is needed to bring machine learning models into production to a minimum, and we avoid managing hardware resources or worrying about availability.<\/p>\n\n\n\n<p>In particular, the part after the manual data and model selection process can be used as a template to speed up deployment. This is thanks to the tools provided by Google and to our extraction of reusable functions into their own Python package.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Developing and bringing machine learning models into production is a task with a lot of challenges. These include model and attribute selection, dealing with missing values, normalization and others. 
Finding a workflow that puts all the gears, from data preprocessing and analysis over building models and selecting the best performing one to serving the model [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,5,10],"tags":[98,52,96,99,97],"ppma_author":[144],"class_list":["post-597","post","type-post","status-publish","format-standard","hentry","category-analytics-platform","category-data-science","category-visualization","tag-jupyter","tag-machine-learning","tag-production","tag-seaborn","tag-tensorflow","author-marc"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf \u9053<\/title>\n<meta name=\"description\" content=\"Bringing machine learning models into production is a task with a lot of challenges. We want to discuss one way of doing it.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"og:description\" content=\"Bringing machine learning models into production is a task with a lot of challenges. 
We want to discuss one way of doing it.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/\" \/>\n<meta property=\"og:site_name\" content=\"DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataScientists\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-05-29T05:40:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png\" \/>\n<meta name=\"author\" content=\"Marc Matt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marc Matt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/\"},\"author\":{\"name\":\"Marc Matt\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\"},\"headline\":\"Bringing machine learning models into 
production\",\"datePublished\":\"2020-05-29T05:40:47+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/\"},\"wordCount\":1345,\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2020\\\/05\\\/image.png\",\"keywords\":[\"Jupyter\",\"Machine Learning\",\"Production\",\"Seaborn\",\"Tensorflow\"],\"articleSection\":[\"Analytics Platform\",\"Data Science\",\"Visualization\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/\",\"name\":\"Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf \u9053\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2020\\\/05\\\/image.png\",\"datePublished\":\"2020-05-29T05:40:47+00:00\",\"description\":\"Bringing machine learning models into production is a task with a lot of challenges. 
We want to discuss one way of doing it.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#primaryimage\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2020\\\/05\\\/image.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2020\\\/05\\\/image.png\",\"width\":789,\"height\":528},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2020\\\/05\\\/29\\\/bringing-machine-learning-models-into-production\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datascientists.info\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Bringing machine learning models into production\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"name\":\"Data Scientists\",\"description\":\"Digging data, Big Data, Analysis, Data Mining\",\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datascientists.info\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\",\"name\":\"DATA DO - 
\u30c7\u30fc\u30bf \u9053\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"width\":250,\"height\":174,\"caption\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/DataScientists\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\",\"name\":\"Marc Matt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"caption\":\"Marc Matt\"},\"description\":\"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\\\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. 
Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.\",\"sameAs\":[\"https:\\\/\\\/data-do.de\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf \u9053","description":"Bringing machine learning models into production is a task with a lot of challenges. We want to discuss one way of doing it.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/","og_locale":"en_US","og_type":"article","og_title":"Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf \u9053","og_description":"Bringing machine learning models into production is a task with a lot of challenges. We want to discuss one way of doing it.","og_url":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/","og_site_name":"DATA DO - \u30c7\u30fc\u30bf \u9053","article_publisher":"https:\/\/www.facebook.com\/DataScientists\/","article_published_time":"2020-05-29T05:40:47+00:00","og_image":[{"url":"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png","type":"","width":"","height":""}],"author":"Marc Matt","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marc Matt","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#article","isPartOf":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/"},"author":{"name":"Marc Matt","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19"},"headline":"Bringing machine learning models into production","datePublished":"2020-05-29T05:40:47+00:00","mainEntityOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/"},"wordCount":1345,"publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png","keywords":["Jupyter","Machine Learning","Production","Seaborn","Tensorflow"],"articleSection":["Analytics Platform","Data Science","Visualization"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/","url":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/","name":"Bringing machine learning models into production - DATA DO - \u30c7\u30fc\u30bf 
\u9053","isPartOf":{"@id":"https:\/\/datascientists.info\/#website"},"primaryImageOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#primaryimage"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png","datePublished":"2020-05-29T05:40:47+00:00","description":"Bringing machine learning models into production is a task with a lot of challenges. We want to discuss one way of doing it.","breadcrumb":{"@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#primaryimage","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2020\/05\/image.png","width":789,"height":528},{"@type":"BreadcrumbList","@id":"https:\/\/datascientists.info\/index.php\/2020\/05\/29\/bringing-machine-learning-models-into-production\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datascientists.info\/"},{"@type":"ListItem","position":2,"name":"Bringing machine learning models into production"}]},{"@type":"WebSite","@id":"https:\/\/datascientists.info\/#website","url":"https:\/\/datascientists.info\/","name":"Data Scientists","description":"Digging data, Big Data, Analysis, Data 
Mining","publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datascientists.info\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/datascientists.info\/#organization","name":"DATA DO - \u30c7\u30fc\u30bf \u9053","url":"https:\/\/datascientists.info\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","width":250,"height":174,"caption":"DATA DO - \u30c7\u30fc\u30bf \u9053"},"image":{"@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataScientists\/"]},{"@type":"Person","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19","name":"Marc Matt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc","url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","caption":"Marc Matt"},"description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. 
I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.","sameAs":["https:\/\/data-do.de"]}]}},"authors":[{"term_id":144,"user_id":1,"is_guest":0,"slug":"marc","display_name":"Marc Matt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/597","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/comments?post=597"}],"version-history":[{"count":2,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/597\/revisions"}],"predecessor-version":[{"id":601,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/597\/revisions\/601"}],"wp:attachment":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media?parent=597"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/categories?post=597"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/tags?post=597"},{"taxonomy":"author","embeddable":true,"href":"https:\
/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/ppma_author?post=597"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}