Machine Learning Model Deployment

Rudraksh Singh, Senior Consultant, Qarar

If a machine learning (ML) model makes a prediction in your system, is anyone around to hear it? Probably not.

Deploying models is the key to making them useful. This is true not only if you are building a product, in which case deployment is a necessity, but also if you are generating reports for management. Ten years ago, it was unthinkable that execs would not question assumptions and plug their own numbers into an Excel sheet to see what changed. Today, a PDF of impenetrable matplotlib figures might impress junior VPs, but could well fuel ML skepticism in the eyes of experienced C-suite execs. Deploying models in your IT infrastructure is therefore key to fully exploiting the opportunities offered by machine learning.

Building and deploying models involves key roles played by different contributors, depending on the value the organization requires from the ML algorithm:

The data science department looks after model development
The devops team needs the skills to design a data architecture that can absorb the model into the production environment
The software engineering team needs the skill set to integrate model outputs into software, which ultimately uses those outputs for business benefit

These stakeholders interact at different levels to ensure quality output. The information traded between them ranges from business requirements and system specifications to model output formats and more.

An ML system architecture is designed around the strengths and limitations of the ML system’s contributors. An architecture that seamlessly integrates the development and production environments needs to adhere to some core principles, which are important for successful deployment of the developed model. The extent of adherence, however, will depend on the organization’s limitations. For example, if the ML model is planned for use in only one piece of software, scalability might be compromised on; reproducibility and modularity, however, will still hold.

An ML architecture is then selected based on the above principles, business needs and technical limitations.

The most common way of implementing ML is to serve the model via a REST API. This REST API can be embedded in the software that is the ultimate consumer of the output. Batch prediction-based architectures are usually slow and are therefore less commonly used in real-time scenarios; they are, however, more common in reporting scenarios.
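As a rough illustration of the REST-based approach, the sketch below exposes a prediction endpoint using only the Python standard library. Everything in it is an assumption made for illustration: the fixed linear scorer stands in for a trained model, and the names predict, app and serve, the /predict path and the JSON field "features" are hypothetical. A real deployment would load a trained model and would normally sit behind a production-grade web server.

```python
import json
from wsgiref.simple_server import make_server

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Return a score for one feature vector (placeholder model)."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def app(environ, start_response):
    """WSGI app: POST /predict with a JSON body such as {"features": [2, 4]}."""
    is_predict = (environ.get("REQUEST_METHOD") == "POST"
                  and environ.get("PATH_INFO") == "/predict")
    if not is_predict:
        status, payload = "404 Not Found", {"error": "use POST /predict"}
    else:
        try:
            size = int(environ.get("CONTENT_LENGTH") or 0)
            body = json.loads(environ["wsgi.input"].read(size))
            payload = {"prediction": predict(body["features"])}
            status = "200 OK"
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON, a missing "features" key or bad feature values.
            status, payload = "400 Bad Request", {"error": str(exc)}
    data = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(data)))])
    return [data]


def serve(host="127.0.0.1", port=8000):
    """Run the app locally; intended for experimentation only."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling serve() starts the endpoint on localhost. The consuming software then only needs to agree with the model team on the request and response format, which is the kind of information the stakeholders above have to exchange.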

Streaming-based predictions are very common in ecommerce-like environments, where models are trained and predictions are made dynamically. They mainly rely on big data technologies such as Spark, Kafka and others.
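The streaming pattern, in which each event is scored as it arrives and then used to update the model, can be sketched in a few lines. This is a toy under stated assumptions: a plain Python iterable stands in for a real stream such as a Kafka topic, and OnlineMean and score_stream are hypothetical names for a deliberately trivial model that predicts the running mean.

```python
from typing import Iterable, Iterator, Tuple


class OnlineMean:
    """Toy online model: predicts the running mean of the values seen so far."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def predict(self) -> float:
        return self.mean

    def update(self, value: float) -> None:
        # Incremental mean update, so no history needs to be stored.
        self.count += 1
        self.mean += (value - self.mean) / self.count


def score_stream(events: Iterable[float],
                 model: OnlineMean) -> Iterator[Tuple[float, float]]:
    """Yield (prediction, actual) per event, predicting before learning from it."""
    for value in events:
        prediction = model.predict()
        model.update(value)
        yield prediction, value
```

The point of the design is the ordering inside the loop: the prediction is made before the model sees the true value, so training and prediction happen continuously on the same stream.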

With the advent of smartphone-based infrastructure, it is also possible to embed an ML model in a mobile phone via an app. Apple’s iOS frameworks make this possible without any dependence on the internet or a big data framework.

Whichever ML architecture is selected, it is essential to understand the difference between a research and a production environment in order to plan properly.

A production environment poses certain challenges that cannot be foreseen in a research (model development) environment. Reproducibility is the major challenge; it can be avoided by keeping the software (for example, Python) the same across research and production. Good infrastructure planning is key to ensuring that the model integrates properly with software deployment and seamlessly with the IT infrastructure.
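One hedged way to check that the software really is the same across research and production is to record an environment fingerprint alongside the model and compare it at deployment time. The sketch below assumes Python 3.8 or later for importlib.metadata; the function names environment_fingerprint and environment_mismatches are hypothetical, and the packages to track are up to the team.

```python
import platform
from importlib import metadata


def environment_fingerprint(packages):
    """Record the Python version and installed versions of the named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {"python": platform.python_version(), "packages": versions}


def environment_mismatches(recorded, current):
    """Return human-readable differences between two fingerprints."""
    problems = []
    if recorded["python"] != current["python"]:
        problems.append(f"python: {recorded['python']} != {current['python']}")
    for name, version in recorded["packages"].items():
        now = current["packages"].get(name)
        if now != version:
            problems.append(f"{name}: {version} != {now}")
    return problems
```

In use, the research side would save the fingerprint when the model is trained, and the production side would recompute it and refuse to load the model, or at least log a warning, if environment_mismatches returns anything.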

Thus, the first challenge for the organization is to design a good CI/CD pipeline based on the resources it currently has available on the research and development front; it should then find a way to integrate that pipeline with the IT infrastructure of the production environment.

This requires a good level of information exchange between devops and data scientists. This interaction needs to be documented, because any error is paid for in time and in redevelopment of the pipeline.

Once the model is ready and deployed, the software team comes into play to integrate the model output into the interfacing software.

Data scientists are typically responsible for the model’s accuracy and stability, whereas devops assumes responsibility for speed and interfacing. However, speed can fall within the data scientist’s remit if the business does not have infrastructure that matches business demand.

To conclude, the architecture for deploying a machine learning model requires considerable planning of resources and infrastructure. If an organization does not have enough relevant infrastructure to consume a machine learning model, then IaaS, PaaS and SaaS are good options to consider. The choice of infrastructure will likely depend on the organization’s vision of digitization for itself, and on how the business envisions its interaction with artificial intelligence.

Rudraksh is an analytically focused professional with a proven track record of delivering excellence, skilled in machine learning, predictive modelling, strategy and optimization. Rudraksh specializes in end-to-end development, implementation and presentation of machine learning models which ensure consistent profitability for clients.