Written by Jens Eriksvik & Peter Wahlgren

Building powerful AI models is just the first step. To unlock their full potential and ensure robust implementations, ongoing management and maintenance of AI models are crucial. This involves a multifaceted approach that combines well-known best practices for application management and maintenance with elements of data governance, data engineering, and MLOps.

Data is the foundation for trustworthy AI

Data is the very foundation for AI. Effective data governance establishes a framework for ensuring data quality, security, and compliance with regulations. This includes:

Data quality management: Processes and tools to maintain data accuracy, completeness, and consistency over time. Regular data cleansing and validation are essential to prevent model degradation.
Data security measures: Implementing robust security protocols to safeguard sensitive data from unauthorized access, breaches, or manipulation. This becomes particularly important as AI models are increasingly deployed in real-world applications.
Data lineage and auditing: Tracking the origin, transformation, and usage of data throughout its lifecycle. This transparency fosters trust and facilitates regulatory compliance (e.g., GDPR, AI Act).
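To make the data quality point concrete, here is a minimal sketch of a batch-level quality check that scores accuracy (type validity), completeness, and consistency (duplicate keys). The schema, field names, and the `quality_report` function are assumptions made up for this illustration, not part of any specific tool.

```python
from typing import Any

# Hypothetical schema for illustration: field name -> (expected type, required?)
SCHEMA: dict[str, tuple[type, bool]] = {
    "customer_id": (int, True),
    "age": (int, True),
    "country": (str, False),
}


def quality_report(records: list[dict[str, Any]]) -> dict[str, float]:
    """Return the share of complete, type-valid, and uniquely keyed records."""
    total = len(records)
    if total == 0:
        return {"completeness": 0.0, "validity": 0.0, "uniqueness": 0.0}

    complete = 0
    valid = 0
    for row in records:
        # Completeness: every required field is present and not None.
        if all(row.get(name) is not None
               for name, (_, required) in SCHEMA.items() if required):
            complete += 1
        # Validity: every value that is present has the expected type.
        if all(row.get(name) is None or isinstance(row[name], expected)
               for name, (expected, _) in SCHEMA.items()):
            valid += 1

    # Uniqueness: distinct keys relative to the number of records.
    unique_ids = len({row.get("customer_id") for row in records})
    return {
        "completeness": complete / total,
        "validity": valid / total,
        "uniqueness": unique_ids / total,
    }


batch = [
    {"customer_id": 1, "age": 34, "country": "SE"},
    {"customer_id": 2, "age": None, "country": "NO"},  # missing required value
    {"customer_id": 2, "age": "41", "country": "DK"},  # wrong type, duplicate id
    {"customer_id": 4, "age": 29},                     # optional field omitted
]
report = quality_report(batch)
print(report)
```

Running a check like this on every incoming batch, and alerting when a ratio falls below an agreed threshold, is one simple way to catch data problems before they degrade a model.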

The heavy reliance on a robust data foundation and data infrastructure creates some key differences in how to approach AI model management, as compared to more classical approaches:

Configuration management (CM) tools track and control changes to application configurations to prevent unexpected behaviour. The AI counterpart is data governance, which establishes policies and controls for data collection, storage, usage, and access, ensuring data quality and consistency for reliable model performance.

CI/CD pipelines automate building, testing, and deploying applications. Data engineering practices play the same role for data: they automate acquiring, preprocessing, and integrating data, ensuring a continuous flow of clean data to AI models.

Application monitoring and logging tools track the performance and health of deployed applications. MLOps practices do the same for models, tracking metrics such as accuracy, fairness, and data drift to allow for proactive model retraining and optimization.

The Deming cycle emphasizes iterative development, testing, and feedback loops. The AI lifecycle management process follows the same pattern: planning (data governance), building models (data engineering), deploying and monitoring (MLOps), and refining based on results.

Classical application management and AI model management both benefit from version control systems for tracking changes, reverting if needed, and ensuring reproducibility. In addition, clear documentation of data pipelines, model architecture, and MLOps practices is crucial for both, facilitating troubleshooting, knowledge sharing, and collaboration within teams. (There are also links to ITIL, DMBoK, XAI principles and, to a lesser extent, agile and SAFe, but these will not be covered extensively in this post.)

Key considerations to establish the basis for ongoing AI model management

While each of the links outlined above warrants its own deep-dive, there are immediate take-aways for businesses looking to establish high-performing AI model management.

Data governance as configuration management

Data governance practices that mirror best practices in configuration management for application deployments ensure consistent and reliable data for AI models.

Establish clear policies and procedures for data collection, storage, access, and usage.
Implement data quality checks and data lineage tracking to ensure data integrity and reproducibility of results.
Regularly review and update data governance policies to adapt to evolving regulations and organizational needs.
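As an illustration of lineage tracking, the sketch below records a content hash of the data before and after each transformation, so a result can be traced back through the steps that produced it. The `LineageLog` class, the step functions, and the record fields are hypothetical names chosen for this example; a production setup would typically rely on a dedicated lineage or metadata tool.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable

Rows = list[dict]


def fingerprint(rows: Rows) -> str:
    """Deterministic SHA-256 hash of a dataset (keys sorted for stability)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class LineageLog:
    entries: list[dict] = field(default_factory=list)

    def apply(self, step_name: str, transform: Callable[[Rows], Rows], rows: Rows) -> Rows:
        """Run one transformation and record what went in and what came out."""
        input_hash = fingerprint(rows)  # hash before the step runs
        result = transform(rows)
        self.entries.append({
            "step": step_name,
            "input_hash": input_hash,
            "output_hash": fingerprint(result),
            "rows_in": len(rows),
            "rows_out": len(result),
        })
        return result


def drop_missing_age(rows: Rows) -> Rows:
    return [r for r in rows if r.get("age") is not None]


def add_age_band(rows: Rows) -> Rows:
    return [{**r, "age_band": "40+" if r["age"] >= 40 else "under_40"} for r in rows]


log = LineageLog()
raw = [{"id": 1, "age": 34}, {"id": 2, "age": None}, {"id": 3, "age": 52}]
clean = log.apply("drop_missing_age", drop_missing_age, raw)
features = log.apply("add_age_band", add_age_band, clean)

for entry in log.entries:
    print(entry["step"], entry["rows_in"], "->", entry["rows_out"])
```

Because the output hash of one step equals the input hash of the next, the log forms a chain that can be audited or used to reproduce a given training set.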

Data engineering as a CI/CD pipeline

Data engineering practices that align with best practices for CI/CD pipelines in software development ensure a continuous flow of clean data, similar to how CI/CD automates building and deploying applications.

Automate data pipelines for data acquisition, preprocessing, and feature engineering.
Implement version control for data pipelines to track changes and facilitate rollbacks if needed.
Continuously monitor data pipelines for errors and potential biases in the data.
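The points above can be sketched as a small automated pipeline that runs preprocessing steps in order, sets aside records that fail instead of stopping the run, and reports how groups are represented in the data that gets through. The step functions, the `region` field, and the helper names are assumptions for illustration only.

```python
from collections import Counter
from typing import Callable

Record = dict
Step = Callable[[Record], Record]


def parse_amount(rec: Record) -> Record:
    # Raises KeyError if the field is missing, ValueError if it is not numeric.
    return {**rec, "amount": float(rec["amount"])}


def add_is_large(rec: Record) -> Record:
    return {**rec, "is_large": rec["amount"] > 100.0}


def run_pipeline(records: list[Record], steps: list[Step]) -> tuple[list[Record], list[dict]]:
    """Apply every step to every record; collect failures for monitoring."""
    processed, rejected = [], []
    for original in records:
        current = original
        try:
            for step in steps:
                current = step(current)
            processed.append(current)
        except (KeyError, ValueError, TypeError) as exc:
            rejected.append({"record": original, "error": repr(exc)})
    return processed, rejected


def group_shares(records: list[Record], key: str) -> dict[str, float]:
    """Share of records per group, a crude signal for representation bias."""
    counts = Counter(r.get(key, "unknown") for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}


raw = [
    {"amount": "250.0", "region": "north"},
    {"amount": "n/a", "region": "south"},  # not numeric, will be rejected
    {"amount": "40", "region": "north"},
    {"region": "north"},                   # missing field, will be rejected
]
processed, rejected = run_pipeline(raw, [parse_amount, add_is_large])
shares = group_shares(processed, "region")
print(len(processed), "processed,", len(rejected), "rejected")
print(shares)
```

In this toy batch the only record from the `south` region is rejected, so the cleaned data contains a single region. Comparing group shares before and after cleaning is one way to notice when a pipeline step quietly introduces bias.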

MLOps as application monitoring and logging

MLOps practices that mirror best practices for application monitoring and logging track model performance just as application monitoring tracks application health. Additionally, XAI techniques in MLOps provide insights similar to application logs that aid in troubleshooting and debugging.

Implement MLOps tools to monitor model performance metrics like accuracy, precision, and recall.
Track model drift and data drift to identify potential performance degradation or changes in the underlying data distribution.
Integrate explainable AI (XAI) techniques to understand model behavior and mitigate potential biases.
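As a rough sketch of what such monitoring computes, the example below derives precision and recall from binary predictions and uses the population stability index (PSI) as one possible data drift score. PSI is a choice made for this illustration; the post does not prescribe a drift metric. The bin count, the smoothing floor of `1e-6`, and the 0.2 alert threshold (a common rule of thumb) are likewise assumptions.

```python
import math


def precision_recall(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """Precision and recall for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def psi(expected: list[float], actual: list[float], bins: int = 4) -> float:
    """Population stability index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor each share so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
metrics = precision_recall(y_true, y_pred)

baseline = [i / 100 for i in range(100)]       # feature values at training time
shifted = [0.5 + i / 200 for i in range(100)]  # production values, shifted upward
drift = psi(baseline, shifted)

print("precision, recall:", metrics)
print("drift alert:", drift > 0.2)
```

A score of zero means the two distributions fall into the bins identically; larger values indicate a bigger shift and can be used to trigger a closer look or a retraining run.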

The Deming cycle for overall AI lifecycle management

The Deming cycle (Plan-Do-Check-Act) is a cornerstone of continuous improvement processes. Applying this cycle to AI model management ensures ongoing optimization and adaptation, similar to how it's used in software development.

Adopt an iterative development approach for building and refining AI models.
Continuously evaluate model performance and identify areas for improvement.
Retrain models with fresh data to address data drift and maintain performance.
Foster cross-functional collaboration between data scientists, data engineers, and MLOps specialists throughout the AI lifecycle.
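The "Check" and "Act" steps of the cycle can be expressed as a simple decision rule that turns monitoring results into a next action. The sketch below is one hypothetical way to encode it; the target accuracy, the drift limit, and the action names are assumptions that each team would set for itself.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    accuracy: float     # measured on recent labelled data
    drift_score: float  # for example a PSI value from data monitoring


def act(check: CheckResult, target_accuracy: float = 0.90, drift_limit: float = 0.2) -> str:
    """Map the outcome of the Check step to an action for the next iteration."""
    below_target = check.accuracy < target_accuracy
    drifting = check.drift_score > drift_limit
    if below_target and drifting:
        return "retrain_with_fresh_data"
    if below_target:
        return "review_model_and_features"
    if drifting:
        return "investigate_data_drift"
    return "keep_current_model"


for result in [CheckResult(0.93, 0.05), CheckResult(0.84, 0.35), CheckResult(0.92, 0.31)]:
    print(result, "->", act(result))
```

Writing the rule down explicitly, even in this simplified form, gives data scientists, data engineers, and MLOps specialists a shared definition of when a model needs attention.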

Tracking and measuring performance

Effectively managing AI models requires a multifaceted approach that combines best practices from established frameworks. Exhibit 1 above highlighted these connections.

To measure the success of an AI model, a combination of metrics is crucial, akin to a balanced performance scorecard. Performance metrics like accuracy and error rates assess how well the model performs its task. Business value metrics track the model's impact on revenue, cost savings, or efficiency. Fairness and bias metrics ensure responsible AI development by identifying and mitigating potential biases in model outputs. Finally, user satisfaction and adoption rates gauge the model's real-world usability and value. By monitoring these metrics, businesses can ensure their AI models deliver long-term value and contribute to achieving strategic business goals.
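Of the four metric families, fairness is perhaps the least familiar to compute. One commonly used measure is the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. The sketch below is a minimal illustration with made-up predictions and group labels; which fairness metric suits a given use case is a separate decision.

```python
def selection_rates(preds: list[int], groups: list[str]) -> dict[str, float]:
    """Share of positive predictions (1) within each group."""
    totals: dict[str, int] = {}
    positives: dict[str, int] = {}
    for pred, group in zip(preds, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(preds: list[int], groups: list[str]) -> float:
    """Gap between the highest and lowest selection rate; 0.0 means parity."""
    rates = selection_rates(preds, groups)
    return max(rates.values()) - min(rates.values())


# Illustrative data: model approvals (1) and rejections (0) for two groups.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates)
print("demographic parity difference:", gap)
```

Here group `a` receives a positive prediction 75% of the time and group `b` only 25%, a gap of 0.5 that would warrant investigation before relying on the model.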

"AI is not only a sprint. Building models is the starting line. The real work involves data governance and engineering, MLOps, explainability, and continuous improvement. It's a team effort, where data scientists, engineers, MLOps specialists and business functions work together to ensure accuracy, value, fairness, and happy users."

**Jens Ekberg, CEO Algorithma**

A collaborative effort for long-term AI success

Data governance, data engineering, and MLOps function as interconnected components within the AI lifecycle management system. Data governance policies guide data collection and management practices, ensuring high-quality data for model training. Data engineers build and maintain data pipelines that deliver clean and informative data to AI models. MLOps tools and techniques optimize model deployment, monitoring, and retraining, ensuring models perform effectively over time. Finally, continuous monitoring identifies issues with data quality, model performance, or potential biases, prompting corrective actions from data engineers and data scientists.

Effective management and maintenance of AI models require an ongoing collaborative effort. Data scientists, data engineers, MLOps specialists, and data governance experts all play crucial roles in ensuring the success of AI initiatives. By establishing a robust framework that combines data governance, data engineering, and MLOps practices, organizations can reap the long-term benefits of AI while maintaining responsible and trustworthy operations.
