ML Cloud – the First National AI/ML Cloud Platform

Development and Launch of an AI/ML Cloud Service in Partnership with De Novo

Our engineering team has brought together in a single product the tools AI engineers use most, convenient billing, and support throughout AI/ML projects, from Proof of Concept (PoC) to production. De Novo provides resources for PoC at prices significantly lower than those of international hyperscalers. We guide clients from idea and business analysis through to implementation.

The platform is also available for deployment within organizations and in other data centers. Various methods are available for restricting access to datasets used to train third-party models.

Introduction

Our client

De Novo – a leading Ukrainian cloud service provider (the country’s first national cloud operator, in operation since 2008), offering IaaS, BaaS/DRaaS, Kubernetes-as-a-Service, and other cloud solutions. De Novo’s goal was to build a secure, high-performance cloud platform for artificial intelligence and machine learning (AI/ML) workloads in Ukraine, leveraging cutting-edge NVIDIA hardware and MLOps software tools.

Value of cooperation

Characteristics

•  1st AI/ML cloud platform in Ukraine
•  2× lower service cost compared to hyperscalers
•  100% data residency in Ukraine
•  5 integrated MLOps tools (Kubeflow, MLFlow, MinIO, CVAT, Grafana)

Technologies

The ML Cloud platform architecture integrates infrastructure components (GPUs, virtualization) with a suite of software tools covering the end-to-end ML lifecycle. The diagram shows how various open-source solutions are combined: Kubeflow (notebooks, pipelines, model serving), MLFlow (experiment tracking and model registry), MinIO (object storage for data), CVAT/Doccano (data labeling tools), as well as components for distributed computing and monitoring (Ray, Feast, Redis, Grafana).
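As one illustration of how these tools interlock, an MLflow client can be pointed at MinIO for artifact storage through standard environment variables. This is a minimal sketch: the endpoint URLs and credentials below are placeholders, not the platform's actual addresses.

```python
import os

# Placeholder endpoints -- illustrative only, not ML Cloud's real addresses.
MINIO_ENDPOINT = "http://minio.example.internal:9000"
MLFLOW_TRACKING = "http://mlflow.example.internal:5000"

def mlflow_minio_env(s3_endpoint: str = MINIO_ENDPOINT,
                     tracking_uri: str = MLFLOW_TRACKING) -> dict:
    """Environment variables an MLflow client needs in order to log
    artifacts to an S3-compatible MinIO bucket instead of AWS S3."""
    return {
        "MLFLOW_TRACKING_URI": tracking_uri,        # where runs and metrics are recorded
        "MLFLOW_S3_ENDPOINT_URL": s3_endpoint,      # redirects artifact I/O to MinIO
        "AWS_ACCESS_KEY_ID": "minio-access-key",    # placeholder MinIO credentials
        "AWS_SECRET_ACCESS_KEY": "minio-secret-key",
    }

env = mlflow_minio_env()
os.environ.update(env)
# With these set, `import mlflow; mlflow.log_artifact(path)` inside a
# Kubeflow notebook would store the artifact in the MinIO bucket.
```

In a platform like this, such variables are typically injected into notebook pods automatically, so users never configure storage by hand.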

ML Cloud – Ukraine’s First AI/ML Cloud

Contractor

De Novo LLC (Ukraine)

Service Period

2023 – 2024 (development and launch; ongoing support thereafter)

Goal

•  Pioneer a national AI/ML cloud platform: Build Ukraine’s first cloud platform for AI/ML engineers, providing a ready-to-use workspace preloaded with all necessary software and tools to manage the machine learning lifecycle. The platform needed to eliminate the heavy lifting of infrastructure setup for data scientists, lowering the barrier to entry into AI/ML.
•  High-performance infrastructure: Implement a cloud environment accelerated by state-of-the-art NVIDIA Tensor Core GPUs (H100 and L40S) to deliver unparalleled performance for training models and running AI workloads. This required integrating the latest GPU hardware with a robust virtualization and container orchestration stack to achieve world-class computation speeds for machine-learning training and inference tasks.
•  Data sovereignty and security: Ensure the solution is hosted in a reliable Ukrainian data center and complies with domestic security standards, so that government and enterprise clients can process sensitive data entirely within Ukraine. The platform needed to meet strict Ukrainian regulations (e.g. DSTZI data-protection certification) to enable AI projects for public-sector bodies that cannot use foreign cloud services.
•  Cost efficiency: Offer AI/ML cloud resources at a significantly lower price point than global cloud providers (hyperscalers), with transparent and predictable pricing. The challenge was to optimize the infrastructure and pricing model to achieve up to a 2× cost reduction while maintaining high quality of service.
•  Scalability and multi-GPU support: Design the platform to support flexible scaling and multi-GPU workloads for large-scale models. This included enabling distributed training across multiple GPUs and easy expansion of resources as user demands grow, all while providing integration with popular ML frameworks and orchestration tools (Kubernetes, Kubeflow, etc.).
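The multi-GPU goal above rests on data-parallel training, in which each GPU processes a disjoint shard of the dataset. A framework-free sketch of the sharding idea (PyTorch's DistributedSampler performs an equivalent split):

```python
def shard_indices(dataset_size: int, world_size: int, rank: int) -> list:
    """Indices of the samples assigned to worker `rank` out of
    `world_size` data-parallel workers (round-robin split)."""
    return list(range(rank, dataset_size, world_size))

# Example: a 1000-sample dataset split across 8 GPUs.
world_size = 8
shards = [shard_indices(1000, world_size, r) for r in range(world_size)]

# Every sample lands in exactly one shard, so one pass over all shards
# covers the full dataset exactly once, spread across the GPUs.
all_indices = sorted(i for shard in shards for i in shard)
assert all_indices == list(range(1000))
```

After each worker computes gradients on its shard, the workers average the gradients (an all-reduce) so every GPU applies the same update, which is what makes the split mathematically equivalent to single-GPU training.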

Results

•  Launch of ML Cloud platform: Successfully developed and deployed the first-of-its-kind AI/ML cloud service in Ukraine. ML Cloud is offered in both a public multi-tenant cloud and private cloud format, delivering a high-performance tensor computing environment for AI/ML. The platform combines the world’s most powerful NVIDIA H100/L40S GPUs with an integrated VMware virtualization and Kubernetes container stack, resulting in a scalable, enterprise-grade solution for advanced AI tasks.
•  Out-of-the-box MLOps environment: Users of ML Cloud have access to a fully configured ecosystem of popular open-source ML tools. The platform comes pre-integrated with Kubeflow (to manage the ML workflow from experimentation to deployment), MLFlow (to track experiments, metrics, and models), MinIO (S3-compatible on-premise object storage for datasets), CVAT/Doccano (visual and text data annotation tools), Grafana (monitoring dashboards), and more. This comprehensive environment enables ML engineers to start developing and training models immediately, without the overhead of setting up and managing infrastructure.
•  Unmatched performance and scalability: By utilizing NVIDIA H100 and L40S GPUs, the platform provides computation power previously out of reach for most local organizations. Intensive machine learning tasks that typically require specialized hardware can now be run in the cloud with near-hardware-level performance. ML Cloud supports dynamic scaling of resources to accelerate complex AI workflows.
•  Cost savings and predictability: ML Cloud’s pricing strategy offers cloud AI/ML infrastructure at substantially lower cost than comparable global cloud services. Thanks to infrastructure optimizations and local deployment, the service is up to 2 times cheaper than hyperscaler alternatives, providing approximately 35%+ cost savings with 100% price predictability (fixed rates in local currency). This makes cutting-edge AI resources financially accessible to Ukrainian businesses and government agencies, eliminating the cost barrier for AI innovation.
•  Successful pilot projects on ML Cloud: The platform has already powered several pioneering AI applications. Notably, the Ministry of Youth and Sports of Ukraine deployed an AI assistant service for grant application evaluation that runs entirely on ML Cloud infrastructure, ensuring all computations and data stay within Ukraine. This solution – an LLM-based application built with MK Consulting’s technical support – became the first use of large language models in Ukrainian state e-services. The project was delivered in about 3.5 months and demonstrated the platform’s capabilities in practice: it provided automatic application analysis with high accuracy for Ukrainian-language text, all while keeping data secure on domestic servers. The success of this and other early projects proved the effectiveness of ML Cloud in accelerating AI deployments and validated that sensitive workloads can be executed locally without reliance on foreign clouds.
•  Foundation for scaling AI innovation: The launch of ML Cloud and its initial use cases have set an important precedent for scaling sovereign AI solutions in Ukraine. This platform demonstrates that world-class AI/ML services can be built and run domestically, fostering trust in local cloud capabilities. It opens the door for wider adoption of AI in both the public and private sectors – from government services to enterprises – under a model where data remains under Ukrainian jurisdiction. MK Consulting and De Novo are already exploring partnerships to deploy ML Cloud in other data centers and organizations, both in Ukraine and internationally, offering a template for how AI/ML platforms can be implemented with full compliance and cost-efficiency around the globe.

Services provided

•  Cloud platform architecture design: Designed the optimal architecture combining hardware (NVIDIA GPU infrastructure) and software components. This included planning the virtualization environment (VMware vSphere clusters) and container orchestration (Kubernetes via Tanzu) needed to support AI/ML workloads at scale, as well as incorporating NVIDIA’s AI Enterprise suite for GPU virtualization.
•  Infrastructure development and deployment: Built and configured the cloud environment for ML Cloud. This involved setting up the VMware virtualization layers, deploying Kubernetes clusters, and integrating the necessary network, storage, and security configurations in De Novo’s data center. We implemented GPU sharing technology (vGPU) to allow flexible partitioning of GPU resources (e.g. 1/8, 1/4 of a GPU) for different user workloads. The team also established CI/CD pipelines for automating environment updates and ensured high availability and redundancy for critical components.
•  MLOps tools integration: Installed and integrated a suite of MLOps and data science tools into the platform. This included Kubeflow (for managing notebooks, pipelines, and model serving using KServe), MLFlow (for experiment tracking and model registry), MinIO (for scalable object storage of datasets and model artifacts), and annotation tools like CVAT and Doccano for computer vision and text data labeling. We configured these tools to work seamlessly together – for example, enabling Kubeflow notebooks to spawn with GPU access, linking CVAT to MinIO storage, and connecting MLFlow to Kubeflow pipelines – creating a unified ML workflow environment.
•  Testing and performance tuning: Conducted extensive testing of the ML Cloud platform. This included stress-testing with typical AI workloads (training neural network models, running inference on large datasets) to ensure the GPU scheduling and Kubernetes orchestration performed reliably under load. We fine-tuned resource allocation, optimized container configurations, and validated that multi-GPU training jobs and high memory workloads run efficiently. Security tests were also performed to verify data isolation between tenants and compliance with governmental standards.
•  Deployment and knowledge transfer: Collaborated with De Novo’s team to launch the platform into production. We prepared comprehensive documentation and provided training sessions for De Novo’s engineers on operating and supporting the ML Cloud environment. This knowledge transfer included how to onboard new customers, manage GPU quotas, monitor system health (using Grafana dashboards, Prometheus alerts), and troubleshoot common issues in the ML toolchain.
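The fractional-GPU allocation described above surfaces to users as an ordinary Kubernetes resource request. A sketch, built as a Python dict for brevity; the `nvidia.com/mig-1g.10gb` resource name is an illustrative MIG profile, since the profiles actually exposed depend on how the NVIDIA device plugin is configured on the cluster:

```python
import json

def gpu_pod_manifest(name: str, image: str,
                     gpu_resource: str = "nvidia.com/mig-1g.10gb") -> dict:
    """Minimal Kubernetes Pod manifest requesting one GPU partition.
    The MIG resource name is illustrative; real profile names depend on
    the device-plugin configuration of the cluster."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "trainer",
                "image": image,
                # Requesting 1 unit of the partitioned-GPU resource gives the
                # pod one slice of a shared card (e.g. 1/8 or 1/4 of a GPU).
                "resources": {"limits": {gpu_resource: 1}},
            }],
        },
    }

manifest = gpu_pod_manifest("train-job", "nvcr.io/nvidia/pytorch:24.01-py3")
print(json.dumps(manifest, indent=2))  # feed to `kubectl apply -f -`
```

Because the request is just a named resource limit, the same mechanism lets the scheduler pack many small training jobs onto one physical H100 without users knowing how the card is partitioned.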
