AI Platforms with Kubeflow
Our DevOps resources impact
Customer workforce augmentation
Complex integration with authentication tools
Designed to scale to thousands of projects with a shared networking scheme
The Challenge
A mobile and fixed network operator in Europe, managing thousands of projects in Google Cloud, created a Machine Learning project with the requirement to take existing Machine Learning solutions and rebuild them in a secure, integrated, multi-tenant environment that allows resources to be shared and workloads to be distributed and scaled.
With deadlines approaching, the solution needed fast iterations to reach production quality and to become compliant with restricted production environments.
The Solution
Solid Potential DevOps engineers worked within an existing customer team, expanding its capabilities and bringing expertise in integration testing and security restrictions for production-ready solutions.
As part of this fast-paced project, we adapted a custom Kubeflow-based solution, shortened its build time, made deployment more efficient, and improved maintainability by utilising advanced jsonnet and Kaptain features.
As this was a large-scale Kubernetes deployment, it required rigorous performance testing and support for existing Machine Learning solutions. The Kubernetes clusters included large GPU machines, and part of the Solid Potential engineers' job was to optimise workloads, on a per-use-case basis, to fully utilise the clusters' capacity.
A last-stage integration push was needed to onboard the solution into IaC automation paradigms and to merge the custom deployment stack into the existing SDLC automation in a coherent way. The large deployment required complex integration with authentication tools and the customer's secret storage to provide secure, code-driven secret delivery and in-cluster secret access.
Parts of Google Vertex AI were used in the final solution, and Solid Potential engineers were responsible for integrating the added features into the deployed environment. The installation was described as a common Terraform template and prepared for independent use in any customer project.
Finally, the solution was adapted to and tested within a VPC Service Controls (VPC SC) restricted environment.