This is a guest blog written by Dotscience's Luke Marsden, CEO and founder of Dotscience.
In this blog, we explain why you need DevOps for machine learning (also known as MLOps), how it differs from regular DevOps for software engineering, and how DevOps for ML can be implemented with OpenShift and Dotscience.
Why MLOps?
Many people may ask, “What is MLOps (DevOps for ML), and why do I need it?” A data scientist may say, “DevOps is only for engineers and IT.” Engineers may ask, “I know DevOps (combining software development with its deployment in production via IT operations), but how is it different for ML than for software engineering?” And managers may ask, “Is DevOps something I need urgently, or something that would be nice to have in the future?” The answer to all of these comes down to the same point: if you want to use ML in the real world to create value for your business, with reproducibility and accountability, then you need DevOps for ML.
There is a fundamental reason for this: data science is iterative (Figure 1). At each of the major stages of the process (data preparation, model development, model serving, inferencing, and monitoring), issues can occur that require revisiting one of the other stages. The most obvious issue is that a model's performance degrades in production and the model has to be retrained. Another possible issue is that a model works well on its training, validation, and test data but fails in production when inferencing; the user then has to go back and retrain the model or re-prepare the data. Or a feature passed to the model may turn out to be a cheat variable that leaks label information into the inputs, producing unrealistically good offline performance. While these particular scenarios may not happen every time, in general some earlier step will need revisiting from a later one.
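One simple, automatable check for the cheat-variable case is to look for input features that are near-perfect copies of the label. The sketch below is illustrative only (the function and column names are hypothetical, and real leakage detection is more involved), but it shows the idea:

```python
# Illustrative sketch: flag "cheat" features by checking whether any
# input column is suspiciously correlated with the label.
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def suspected_leaks(features, labels, threshold=0.99):
    """Return names of feature columns nearly identical to the label."""
    return [name for name, col in features.items()
            if abs(pearson(col, labels)) >= threshold]

labels = [0, 1, 0, 1, 1, 0, 1, 0]
features = {
    "age_scaled":   [0.2, 0.9, 0.1, 0.7, 0.6, 0.3, 0.8, 0.4],
    "outcome_flag": [0, 1, 0, 1, 1, 0, 1, 0],  # copies the label: a cheat variable
}
print(suspected_leaks(features, labels))  # → ['outcome_flag']
```

A check like this can run as part of the data-preparation stage, so a leaky feature is caught before it inflates test metrics.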
Figure 1: Data science is iterative at all stages.
The iterative nature of data science means that when you add AI to your business using data science and machine learning, you cannot run experiments, build models, and then “code it properly later” by handing off to an engineering team. The input data and the models will change, as will business requirements and key personnel. Reflecting those changes in a workflow that was handed off as a finished, static, end-to-end piece of code will be a lot of work. What is needed is a process in which each component (the code, datasets, models, metrics, and run environment) is automatically tracked and versioned, so that changes can be made quickly and easily and accountability and auditability are achieved. The fact is, most companies, even those with data science teams, do not use such a process. The result is a large amount of ad hoc work and technical debt, which costs companies time and money in wasted opportunity.
So how does DevOps for ML help? In the 1990s, software engineering was siloed and inefficient. Releases took months to ship and involved many manual steps. Now, thanks to DevOps, and processes like continuous integration and continuous delivery (CI/CD), software can be shipped in seconds, because the steps involved are automated. At present, ML models are in a similar situation to software in the 1990s: Their creation is siloed and inefficient, they take months to ship into production, and they require many manual steps. At Dotscience, we believe that the same transformation that has taken place as a result of DevOps for software engineering can be achieved for ML, and our tool helps to lower the barriers to this transformation for businesses seeking to get more value from AI.
Through our collaboration with Red Hat, we’ve delivered a Dotscience MLOps pipeline on top of Red Hat OpenShift. This solution enables you to accelerate the development and delivery of ML models and AI-powered intelligent applications across data-center, edge, and public clouds.
The difference between DevOps and MLOps (DevOps for ML)
DevOps for ML, also known as MLOps, is different from the original DevOps because the data science and machine learning process is intrinsically complex in ways different from software engineering and contains elements that software DevOps does not. While software engineering is by no means easy or simple, data science and ML require the user to track several conceptually new parts of their activity that are fundamental to the workflow. These include data provenance, datasets, models, model parameters and hyperparameters, metrics, and the outputs of models in production. This is in addition to code versioning, compute environment, CI/CD, and general production monitoring. Table 1 summarizes this:
Table 1: Extra requirements of DevOps for ML versus DevOps for software
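To make the extra requirements concrete, the sketch below shows the kind of metadata an MLOps run record ties together. This is not Dotscience's API (which captures these automatically); all names here are hypothetical and illustrative:

```python
# Illustrative sketch of the metadata an MLOps run record ties together.
import hashlib
import json
import time

def fingerprint(data: bytes) -> str:
    """Content hash, so any change to code or data yields a new version."""
    return hashlib.sha256(data).hexdigest()[:12]

def record_run(code, dataset, hyperparams, metrics):
    """Bundle everything needed to reproduce and audit one training run."""
    return {
        "timestamp": time.time(),
        "code_version": fingerprint(code.encode()),
        "data_version": fingerprint(dataset.encode()),
        "hyperparameters": hyperparams,   # e.g. learning rate, epochs
        "metrics": metrics,               # e.g. accuracy, loss
    }

run = record_run(
    code="def train(): ...",
    dataset="id,label\n1,0\n2,1\n",
    hyperparams={"learning_rate": 0.01, "epochs": 10},
    metrics={"accuracy": 0.93},
)
print(json.dumps(run, indent=2))
```

Because code and data are content-hashed, two runs with identical hashes and hyperparameters should be reproducible, and any difference in metrics can be traced back to a specific change.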
How can DevOps for ML be implemented?
So let’s say we are convinced of the need for DevOps for ML and would like to implement it. How can this be done?
Dotscience – MLOps platform for collaboration, deployment & tracking
Dotscience is an MLOps platform which delivers all of the requirements described above out of the box:
- Data provenance
- Data versioning
- Model versioning
- Hyperparameter and metric tracking
- Workflows
Red Hat OpenShift – ML with the production-ready power of Enterprise Kubernetes
Containers and Kubernetes help accelerate the machine learning lifecycle, as these technologies give data scientists and software developers the agility, flexibility, portability, and scalability to train, test, and deploy ML models and associated intelligent applications in production. Red Hat OpenShift is the industry’s most comprehensive Kubernetes hybrid cloud platform. It provides the necessary benefits for machine learning by leaning on Kubernetes Operators, integrating DevOps capabilities, and integrating with GPU hardware accelerators. Red Hat OpenShift enables better collaboration between data scientists and software developers, accelerating the roll-out of intelligent applications across the hybrid cloud.
Kubernetes Operators codify operational knowledge and workflows to automate the installation and lifecycle management of containerized applications with Kubernetes. For further details on Red Hat OpenShift Kubernetes Platform for accelerating AI/ML workflows, please visit the AI/ML on OpenShift webpage.
By configuring Dotscience to deploy ML models to OpenShift with the Dotscience OpenShift Operator, you combine Dotscience's powerful, data-scientist-friendly workflows with Red Hat's reliable, scalable enterprise Kubernetes platform and its integrated DevOps capabilities.
For example, from within Dotscience you can edit an ML model deployment and increase the number of replicas, and OpenShift will automatically scale the model horizontally across multiple GPU-capable servers. If a server fails, OpenShift can automatically reschedule the ML model pods onto other available servers, reducing downtime for your critical AI-enabled business applications. And all of this happens without the need to engage IT operations for scaling in and out.
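Under the hood, the replica count being edited corresponds to a standard Kubernetes Deployment field. The fragment below is hypothetical (the deployment name, image, and labels are illustrative, not Dotscience defaults), but it sketches what such a scaled, GPU-backed model deployment looks like:

```yaml
# Hypothetical Deployment fragment for a served ML model.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-model-server
spec:
  replicas: 3                 # OpenShift schedules 3 pods across available nodes
  selector:
    matchLabels:
      app: ml-model-server
  template:
    metadata:
      labels:
        app: ml-model-server
    spec:
      containers:
      - name: model
        image: registry.example.com/ml-model:v1
        resources:
          limits:
            nvidia.com/gpu: 1   # one GPU per replica, on GPU-capable nodes
```

If a node running one of these pods fails, the Deployment controller recreates the missing pod on another node to restore the declared replica count.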
DevOps for ML is appropriate for any real-world data science project, especially those that drive business value in production. Regular software engineering DevOps tools cannot be used as-is, because DevOps for ML must track several intrinsically new concepts. While it is possible to implement MLOps oneself using open source and/or available cloud tools, many businesses lack the time or expertise to do so. Products such as Dotscience on OpenShift can help such companies bridge the gap and derive greater value from their data via AI and machine learning.
Try Dotscience on OpenShift today
If you are convinced that an MLOps pipeline is critical to your business's successful adoption of AI/ML, contact the Dotscience product solutions team for a demo. In the coming weeks, we'll publish a full how-to blog here, so stay tuned!