ML Platform Engineer - GPU Infrastructure

Warren, MI, US | Mid Level | MLOps Engineer

Skills & Technologies

AWS, Azure, Docker, GCP, Kubernetes, Python

About This Role

Job Title: ML Platform Engineer - GPU Infrastructure

Job Summary

Support the team by designing, implementing, and maintaining the automation and ML workload enablement layer of the GPU cluster platform. This role focuses on optimizing GPU compute environments for AI/ML training and Isaac Sim simulation workloads, integrating GPU jobs into CI/CD pipelines, standardizing runtime environments, and supporting reliable storage and artifact management.

Required Experience

3+ years of experience in ML Platform Engineering, DevOps, Infrastructure Engineering, or a related field

Bachelor's or Master's degree in Systems Engineering, Computer Science, Computer Engineering, or a related discipline

Responsibilities

Support GPU cluster platforms for AI/ML and simulation workloads

Optimize GPU compute environments for ML training and Isaac Sim execution

Integrate GPU workload execution into CI/CD pipelines

Standardize runtime environments using containers and automation tools

Manage storage, artifacts, and workload outputs

Troubleshoot and improve platform reliability, scalability, and performance

Collaborate with ML, infrastructure, and engineering teams

Required Skills

Experience with Linux, Kubernetes, Docker, and GPU infrastructure

Knowledge of CI/CD tools and automation scripting (Python/Bash)

Experience supporting AI/ML workloads and distributed systems

Familiarity with NVIDIA GPU technologies and containerized environments

Strong troubleshooting and performance optimization skills

Preferred Skills

Experience with Isaac Sim or simulation workloads

Exposure to cloud platforms (AWS, Azure, or GCP)

Knowledge of monitoring and observability tools such as Grafana or Prometheus

Role Details

Company Optimal Inc.
Title ML Platform Engineer - GPU Infrastructure
Location Warren, MI, US
Category MLOps Engineer
Experience Mid Level
Salary Not disclosed
Remote No

About This Role

MLOps Engineers build the infrastructure that keeps ML models running in production. They own CI/CD pipelines for model deployment, monitoring for data drift and model degradation, and the tooling that lets data scientists ship faster. If ML Engineers build the models, MLOps Engineers build the roads those models travel on.

The job is fundamentally about reliability and velocity. Data scientists want to iterate fast. Product teams want stable predictions. Your job is to make both happen simultaneously. That means building deployment pipelines that catch regressions before they hit production, monitoring systems that alert on data drift before it degrades model performance, and self-service tooling that lets data scientists deploy without filing a ticket.
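As a rough illustration of the drift monitoring described above, the Python sketch below computes a Population Stability Index (PSI) between a reference sample and a live sample of a single numeric feature. The function names, the bin count, and the 0.2 alert threshold are assumptions chosen for illustration (0.2 is a common rule of thumb, not something this posting specifies); a production setup would more likely rely on an existing monitoring tool.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of one feature."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the reference range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_alert(
    expected: Sequence[float],
    actual: Sequence[float],
    threshold: float = 0.2,  # assumed rule-of-thumb cutoff
) -> bool:
    """Return True when the PSI exceeds the alert threshold."""
    return population_stability_index(expected, actual) > threshold
```

A check like this would typically run on a schedule against recent inference inputs, with the alert routed to whatever paging or dashboard system the team already uses.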

Across the 3,823 AI roles we're tracking, MLOps Engineer positions make up 1% of the market. At Optimal Inc., this role fits into their broader AI and engineering organization.

MLOps demand tracks closely with production ML adoption. As more companies move models from notebooks to production, the need for MLOps grows. The role is well-established at large tech companies and growing fast at mid-stage startups that are hitting the 'our models work in notebooks but break in production' phase.

What the Work Looks Like

A typical week involves: debugging a model deployment that's serving stale predictions, building a new monitoring dashboard for a feature team, writing Terraform for GPU-enabled inference clusters, reviewing pull requests for the ML platform's CI/CD pipeline, and meeting with data scientists to understand their pain points. You're the bridge between ML and infrastructure.

Skills Required

AWS (31% of roles), Azure (24% of roles), Docker (11% of roles), GCP (19% of roles), Kubernetes (12% of roles), Python (52% of roles)

Kubernetes, Docker, and cloud infrastructure are baseline. Most roles want experience with ML-specific tooling: MLflow, Kubeflow, Weights & Biases, or similar. Strong DevOps fundamentals matter more than ML theory. You need to understand model serving (TorchServe, Triton, vLLM), monitoring (Prometheus, Grafana), and infrastructure-as-code (Terraform, Pulumi).

GPU infrastructure knowledge is increasingly valuable as LLM inference becomes a major cost center. Understanding GPU scheduling, multi-node training setups, and inference optimization (quantization, batching, caching) puts you in the top tier. Experience with model registries and feature stores rounds out the profile.

Good MLOps postings specify their ML stack, infrastructure scale, and the problems they're solving (deployment velocity, cost optimization, monitoring gaps). Red flag: companies that want MLOps but don't have any models in production yet. You'll end up doing general DevOps instead.

Compensation Benchmarks

MLOps Engineer roles pay a median of $217,200 based on 87 positions with disclosed compensation. Mid-level AI roles across all categories have a median of $165,000.

Across all AI roles, the market median is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. For comparison, the highest-paying categories include AI Engineering Manager ($275,000) and AI Safety ($274,200). By seniority level: Entry: $97,880; Mid: $165,000; Senior: $227,400; Director: $247,800; VP: $250,000.

Optimal Inc. AI Hiring

Optimal Inc. has 3 open AI roles right now. They're hiring across Research Engineer, MLOps Engineer, and AI/ML Engineer roles. Based in Warren, MI, US.

Location Context

Across all AI roles, 15% (590 positions) offer remote work, while 3,217 require on-site or hybrid attendance. Top AI hiring metros: New York (2,643 roles, $211,000 median); San Francisco (2,168 roles, $253,000 median); Los Angeles (1,792 roles, $191,580 median).

Career Path

Common paths into MLOps Engineer roles include DevOps Engineer, Platform Engineer, and Data Engineer.

From here, career progression typically leads toward ML Platform Lead, Infrastructure Architect, or Engineering Manager.

DevOps engineers with ML curiosity have the shortest path. You already understand deployment, monitoring, and infrastructure. Add ML-specific knowledge (model serving, data pipelines, experiment tracking) and you're competitive. The career ceiling is high: ML Platform Lead roles at top companies pay well because the infrastructure complexity is enormous.

What to Expect in Interviews

Interviews emphasize infrastructure and reliability. Expect questions about CI/CD for ML models, monitoring for data drift, and how you'd design a model serving platform that handles 10K requests per second. Coding rounds focus on Python and infrastructure-as-code (Terraform, Helm). Be ready to discuss tradeoffs between different model serving frameworks and how you'd handle rollback when a new model degrades performance.
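For the rollback question in particular, one way to frame an answer is as an explicit gate that compares a newly deployed model against the current baseline. The Python sketch below is a minimal example under stated assumptions: the `ModelMetrics` fields, the `should_rollback` function, and both tolerance values are hypothetical and would be tuned to the service in question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetrics:
    """Hypothetical summary of a model version's live behavior."""
    error_rate: float       # fraction of failed or incorrect predictions
    p95_latency_ms: float   # 95th-percentile serving latency in milliseconds


def should_rollback(
    baseline: ModelMetrics,
    candidate: ModelMetrics,
    max_error_increase: float = 0.01,  # assumed absolute tolerance
    max_latency_ratio: float = 1.2,    # assumed relative tolerance
) -> bool:
    """Return True if the candidate regresses beyond either tolerance."""
    error_regressed = (
        candidate.error_rate - baseline.error_rate > max_error_increase
    )
    latency_regressed = (
        candidate.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio
    )
    return error_regressed or latency_regressed
```

In practice a gate like this would sit behind a canary or shadow deployment, so the candidate only sees a slice of traffic while the comparison runs.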

AI Hiring Overview

The AI job market has 3,823 open positions tracked in our dataset. By seniority: 112 entry-level, 1,798 mid-level, 1,516 senior, and 397 leadership roles (Director, VP, C-Level). Remote roles make up 15% of the market (590 positions). The remaining 3,217 roles require on-site or hybrid attendance.

The market median for AI roles is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. Highest-paying categories: AI Engineering Manager ($275,000 median, 41 roles); AI Safety ($274,200 median, 55 roles); Research Engineer ($260,000 median, 434 roles).

The AI Job Market Today

The AI job market spans 3,823 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (2,629), Data Scientist (322), AI Software Engineer (279). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.

The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (112) are outnumbered by mid-level (1,798) and senior (1,516) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 397 positions, representing the bottleneck between technical execution and organizational strategy.

Remote work availability sits at 15% of all AI roles (590 positions), with 3,217 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.

AI compensation is structured in clear tiers. The market median sits at $200,100. Top-quartile roles start at $253,500, and the 90th percentile reaches $307,500. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.

Category matters for compensation. AI Engineering Manager roles lead at $275,000 median, while Prompt Engineer roles sit at $140,000. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.

The most in-demand skills across all AI postings: Python (1,979 postings), AWS (1,190 postings), Azure (899 postings), RAG (839 postings), GCP (726 postings), PyTorch (595 postings), Prompt Engineering (595 postings), Claude (540 postings). Python dominates, appearing in about half of all postings (1,979 of 3,823) regardless of category. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.

Frequently Asked Questions

Based on 87 roles with disclosed compensation, the median salary for MLOps Engineer positions is $217,200. Actual compensation varies by seniority, location, and company stage.

Kubernetes, Docker, and cloud infrastructure are baseline. Most roles want experience with ML-specific tooling: MLflow, Kubeflow, Weights & Biases, or similar. Strong DevOps fundamentals matter more than ML theory. You need to understand model serving (TorchServe, Triton, vLLM), monitoring (Prometheus, Grafana), and infrastructure-as-code (Terraform, Pulumi).

About 15% of the 3,823 AI roles we track offer remote work. Remote availability varies by company and seniority level, with senior and leadership roles more likely to offer location flexibility.

Optimal Inc. is among the companies actively hiring for AI and ML talent. Check our company profiles for detailed breakdowns of open roles, salary ranges, and hiring trends.

Common next steps from MLOps Engineer positions include ML Platform Lead, Infrastructure Architect, and Engineering Manager. Progression depends on whether you lean toward technical depth, people management, or product strategy.
