Solutions Architect - Cloud and MLOps solutions
Nebius
- Ottawa, ON
- Permanent
- Full-time
Nebius is headquartered in the Netherlands, with hubs in Finland, Serbia, and Israel.
Data center in Europe:
Our own data center in Finland features server racks designed in-house for ML-specific high load, with power-efficient solutions, including a free-cooling system.
500+ professionals:
Our mature team of engineers has a proven track record in developing sophisticated cloud and ML solutions and designing cutting-edge hardware.
The role
We are seeking a highly skilled and customer-focused professional to join our team as a Solutions Architect specializing in Cloud and MLOps. As a Solutions Architect, you will play a pivotal role in designing and implementing cutting-edge solutions for our clients, leveraging cloud technologies for ML/AI teams and becoming a trusted technical advisor for building their pipelines.
You’re welcome to work remotely from Canada.
In this position, your responsibility will be to:
- Act as a trusted advisor to our clients, providing technical expertise and guidance throughout the engagement. Conduct PoCs, workshops, presentations, and training sessions to educate clients on GPU cloud technologies and best practices.
- Collaborate with clients to understand their business requirements and develop solution architectures that align with their needs: design and document Infrastructure as Code solutions, and produce documentation and technical how-tos in collaboration with support engineers and technical writers.
- Help customers to optimize pipeline performance and scalability to ensure efficient utilization of cloud resources and services powered by Nebius AI.
- Act as a single point of expertise on customer scenarios for the product, technical support, and marketing teams.
We expect you to have:
- 5+ years of experience as a cloud solutions architect, system/network engineer, developer, or in a similar technical role with a focus on cloud computing
- Strong hands-on experience with IaC and configuration management tools (preferably Terraform/Ansible) and Kubernetes, plus Python coding skills
- Solid understanding of GPU computing practices for ML training and inference workloads, and of GPU software stack components, including drivers and libraries (e.g. CUDA, OpenCL)
- Excellent communication skills
- Customer-centric mindset
- Fluent English
- Hands-on experience with HPC/ML orchestration frameworks (e.g. Slurm, Kubeflow)
- Hands-on experience with deep learning frameworks (e.g. TensorFlow, PyTorch)
- Solid understanding of cloud ML tools landscape from industry leaders (NVIDIA, AWS, Azure, Google)