Skills & Technologies
About This Role
Application Notice:
We encourage you to apply thoughtfully by selecting one position that best matches your qualifications and interests. You may submit up to two active applications at a time. Please consider your location choice carefully—we recommend applying where you envision building your future.
The Firm:
A New Era of AI-Driven, Multidimensional Consulting
Today’s businesses face complex global challenges that demand more than conventional consulting services. Andersen Consulting offers a seamless, multidimensional approach that combines expertise in business transformation, artificial intelligence, cybersecurity, sustainability, and digital strategy with Andersen Global’s established tax and legal capabilities. This integration of emerging technologies positions Andersen Consulting as the partner of choice in the $1 trillion consulting industry.
The Role:
At Andersen Consulting, AI/LLMOps forward-deployed engineers (FDEs) own the technical delivery of production AI systems end-to-end and sit across the table from engineering teams at global enterprises. The practice is early. The engineers who join now will define its architecture, its standards, and what world-class AI delivery looks like for clients who trust us with their most consequential systems.
Enterprise AI programs follow a predictable arc: a successful POC, a budget approval, and then months of stalled production work before the initiative quietly dies. The gap between a notebook that impresses stakeholders and a system that runs the business under real enterprise load is an engineering problem. Its name is LLMOps.
LLMs have been serious enterprise tools for roughly three years. We are not looking for someone who has mastered a stable field. We are looking for someone who has been in the room while the field was being invented: someone who built RAG pipelines that broke in production, debugged agent loops that silently degraded after staging, and designed eval suites from scratch because nothing off the shelf measured what actually mattered.
What You'll Do
Day-to-day, you embed directly with client teams and own a defined technical workstream from design through production:
- Design and deploy RAG pipelines using pgvector, Pinecone, Qdrant, or Weaviate, with deliberate chunking strategies, hybrid search, and re-ranking layers.
- Build multi-step agent workflows using LangChain, LangGraph, LlamaIndex, or the Anthropic Claude SDK, with tool use, structured outputs, and memory.
- Implement guardrails at the correct system boundary: output schema enforcement, PII detection, content filtering, and agent behavior constraints calibrated for each client's regulatory context.
- Design eval suites using off-the-shelf tools or custom Python harnesses that measure retrieval quality, answer faithfulness, hallucination rates, and latency under load.
- Instrument observability from day one: LLM call latency, retrieval quality metrics, agent decision traces, and cost per query.
- Write production Python: typed, tested, linted, and deployable by another engineer.
- Deploy and serve models on AWS (Bedrock, SageMaker, EKS), Azure (Azure OpenAI Service, AKS), or GCP (Vertex AI, GKE) under latency-sensitive enterprise conditions.
- Treat prompt and context engineering as an engineering discipline: versioned system prompts, few-shot libraries, chain-of-thought elicitation, and context window budgeting, all of it tracked, tested, and iterated rather than adjusted informally.
- Build and deploy MCP servers to expose enterprise data sources, internal APIs, and tools to LLM agents in a standardized, auditable way.
- Contribute reusable accelerators, reference architectures, and internal tooling during bench time. Every asset you build should make the next engagement start faster.
- Travel is a real part of this role. Client work regularly requires on-site presence during discovery, architecture reviews, and go-live.
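To make the hybrid search and re-ranking work above concrete, here is a minimal, self-contained sketch of the fusion step. Everything in it is a stand-in: the bag-of-words "embedding" and lexical overlap score substitute for a real embedding model and a BM25 index, and the document IDs are invented. Reciprocal rank fusion is the piece that carries over to real systems.

```python
from collections import Counter
import math

# Toy corpus: in production these documents would live in a vector store
# (pgvector, Qdrant, etc.); the IDs and text here are illustrative only.
DOCS = {
    "d1": "reinsurance treaty terms and claim settlement process",
    "d2": "quarterly revenue report for the manufacturing division",
    "d3": "claim intake workflow and settlement approval policy",
}

def keyword_score(query: str, doc: str) -> float:
    """Naive lexical overlap, standing in for BM25."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding', standing in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge lexical and dense rankings into one."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def hybrid_search(query: str) -> list[str]:
    lexical = sorted(DOCS, key=lambda d: keyword_score(query, DOCS[d]), reverse=True)
    qv = embed(query)
    dense = sorted(DOCS, key=lambda d: cosine(qv, embed(DOCS[d])), reverse=True)
    return rrf_fuse([lexical, dense])

results = hybrid_search("claim settlement policy")
```

In a real pipeline a cross-encoder re-ranker would then rescore the fused top-k before anything reaches the prompt.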
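The guardrail item, output schema enforcement at the system boundary, can be sketched with nothing but the standard library. In practice you would reach for a schema library and a real PII detector; the required fields and the SSN regex below are illustrative assumptions, not a real compliance rule set.

```python
import json
import re

# Illustrative schema and PII rule; not a real compliance configuration.
REQUIRED_FIELDS = {"answer": str, "sources": list}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def enforce_output(raw: str) -> dict:
    """Validate raw LLM output at the system boundary before anything
    downstream consumes it; raise ValueError so the caller can retry
    or fall back instead of passing malformed or unsafe output along."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(parsed.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field!r}")
    if SSN_PATTERN.search(parsed["answer"]):
        raise ValueError("answer contains an SSN-like string")
    return parsed

ok = enforce_output('{"answer": "Coverage starts on day 1.", "sources": ["doc-7"]}')

try:  # a response leaking PII is rejected, not passed through
    enforce_output('{"answer": "SSN 123-45-6789", "sources": []}')
    blocked = False
except ValueError:
    blocked = True
```

The important design point is placement: the check sits between the model and every consumer, so no caller can bypass it.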
What You'll Build
- Production RAG pipelines with hybrid search, re-ranking, and query-time monitoring, designed to hold up under real query distributions and corpus drift, not just benchmark datasets.
- Multi-step agent orchestration systems with tool use, memory, and structured output validation, built to be reliable at enterprise load, not just impressive in a demo environment.
- Eval frameworks designed from scratch to measure what the client's system actually needs to get right: faithfulness, groundedness, latency percentiles, and failure mode frequency.
- Guardrail infrastructure positioned correctly in the call stack: input validation, output schema enforcement, and behavioral constraints for the specific regulatory context of each engagement.
- Observability stacks instrumented at the LLM call, retrieval, and agent decision layers, giving clients operational visibility instead of log files they can't act on.
- MCP servers exposing internal enterprise systems (databases, document stores, internal APIs) to LLM agents through a standardized, auditable interface.
- CI/CD pipelines for LLM systems with automated eval runs on prompt changes, model version regression testing, and deployment gating before anything reaches production.
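The eval-framework and deployment-gating items above share a shape: compute metrics over a batch of eval results, then fail the pipeline when a threshold regresses. A minimal sketch, with made-up eval records and illustrative thresholds:

```python
import statistics

# Toy eval results; in a real harness these would come from running the
# full eval suite against a candidate prompt or model version.
RESULTS = [
    {"latency_ms": 420, "grounded": True},
    {"latency_ms": 510, "grounded": True},
    {"latency_ms": 380, "grounded": False},
    {"latency_ms": 2900, "grounded": True},
    {"latency_ms": 450, "grounded": True},
]

def summarize(results: list[dict]) -> dict:
    latencies = sorted(r["latency_ms"] for r in results)
    return {
        "p50_ms": statistics.median(latencies),
        # Crude p95 for a tiny sample: index into the sorted list.
        "p95_ms": latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))],
        "groundedness": sum(r["grounded"] for r in results) / len(results),
    }

def gate(summary: dict, max_p95_ms: int = 3000, min_groundedness: float = 0.75) -> bool:
    """Deployment gate: CI fails when an eval run regresses past thresholds."""
    return summary["p95_ms"] <= max_p95_ms and summary["groundedness"] >= min_groundedness

summary = summarize(RESULTS)
passed = gate(summary)
```

Wired into CI, this runs on every prompt change and model version bump, so a regression blocks the deploy instead of surfacing in production.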
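On the MCP server item: the core idea is exposing tools to agents through declared schemas with an audit trail. The sketch below is not the MCP protocol itself; it is a hypothetical in-process registry showing the shape of the idea (schema validation, dispatch, audit log), with invented tool and parameter names.

```python
# Hypothetical tool registry: schema-validated dispatch with an audit log.
# Names ("lookup_policy", "policy_id") are invented for illustration.
AUDIT_LOG: list[dict] = []

TOOLS = {
    "lookup_policy": {
        "description": "Fetch a policy record by ID from the policy store.",
        "params": {"policy_id": str},
        "handler": lambda policy_id: {"policy_id": policy_id, "status": "active"},
    },
}

def call_tool(name: str, args: dict) -> dict:
    """Validate arguments against the tool's declared schema, run it, log it."""
    tool = TOOLS[name]
    for param, ptype in tool["params"].items():
        if not isinstance(args.get(param), ptype):
            raise TypeError(f"{name}: parameter {param!r} must be {ptype.__name__}")
    result = tool["handler"](**args)
    # Every call is recorded, which is what makes the interface auditable.
    AUDIT_LOG.append({"tool": name, "args": args, "result": result})
    return result

out = call_tool("lookup_policy", {"policy_id": "P-1138"})
```

A real MCP server moves this registry behind the protocol's transport and capability negotiation, but the validate-dispatch-log loop is the same.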
The Requirements:
- 7+ years of total experience in data engineering, MLOps, software engineering, or closely adjacent infrastructure roles, with evidence of end-to-end ownership: architecture through production, including the post-launch work.
- 2–3 years of LLM-specific engineering experience is credible at this level. Given when the field started, that is the ceiling, not a floor. What distinguishes senior candidates is judgment: knowing when an agent architecture is the right tool, how to design for observability before it is needed, and how to have the hard conversation when a POC is not production-ready.
- Production Python: typed, tested, and structured for multi-engineer deployment. We will ask about your testing practices and code review standards.
- At least one production LLM system shipped: a RAG pipeline, agent workflow, or LLM-powered application that handled real enterprise load, not just internal demos. The key question is what broke after launch and what you did about it.
- Hands-on vector retrieval experience with pgvector, Pinecone, Qdrant, Weaviate, or Chroma, including hybrid search design, embedding quality diagnosis, and re-ranking strategy selection, not just initial setup.
- Working knowledge of at least one agentic SDK (Anthropic Claude SDK, LangChain, LangGraph, LlamaIndex, or OpenAI SDK) in a non-trivial use case involving tool use, memory, or structured output handling.
- Cloud AI infrastructure depth in at least one of AWS (Bedrock, SageMaker, EKS), Azure (Azure OpenAI Service, AKS), or GCP (Vertex AI, GKE).
- LLM observability experience: you have instrumented a system before, not just read the documentation.
- Enough communication clarity to explain an architecture decision to an engineering team and a business stakeholder in the same meeting. These are client-facing roles.
- Travel up to 50%.
Preferred Qualifications
- Model serving experience in latency\-constrained environments.
- Guardrail implementation experience in regulated data contexts: PHI, PII, or MNPI.
- RAGAS or custom eval harness experience.
- MCP server development. Still rare enough that hands\-on experience is high signal.
- Palantir Foundry or AIP experience, a meaningful differentiator as Andersen's Palantir practice scales alongside the LLMOps practice.
- Prior client-facing or consulting experience. FDEs who understand stakeholder management and expectation-setting ramp faster and de-risk engagements earlier.
- Domain exposure in financial services (insurance, reinsurance, credit), healthcare, supply chain, or manufacturing.
Compensation and Benefits
Our firm offers a competitive base salary and comprehensive benefits package designed to support the well-being, growth, and long-term success of our people. We are committed to recognizing individual contributions and providing resources that enable our employees to thrive both personally and professionally.
Salary Range: For individuals hired to work in Chicago, the expected salary range for this role is $195,875–$266,171. Actual compensation will be determined based on the candidate's qualifications, experience, and skill set.
Benefits: Employees (and their families) are eligible for medical, dental, vision, and basic life insurance coverage. Employees may enroll in the firm's 401(k) plan upon hire. We offer 200 hours of paid time off annually, along with twelve paid holidays each calendar year. For a full listing of benefit offerings, please visit https://www.andersen.com/careers.
*Applicants must be currently authorized to work in the United States on a full-time basis upon hire. Andersen will not consider candidates for this position who require sponsorship for employment visa status now or in the future (e.g., H-1B status).*
*Andersen Tax is an equal opportunity employer committed to fostering an inclusive workplace. We evaluate all applicants and employees without regard to race, color, religion, national origin, ancestry, sex (including pregnancy, childbirth, and related medical conditions), sexual orientation, gender identity or expression, age, disability, genetic information, marital status, military or veteran status, or any other characteristic protected under applicable federal, state, or local law. All qualified applicants, including those with criminal histories, will be considered in a manner consistent with applicable law. We provide reasonable accommodations to qualified individuals with disabilities as required by law.*
Salary Context
This $195K–$266K range is above the 75th percentile for AI/ML Engineer roles in our dataset (median: $100K across 15,465 roles with salary data).
Role Details
About This Role
AI/ML Engineers build and deploy machine learning models in production. They work across the full ML lifecycle: data pipelines, model training, evaluation, and serving infrastructure. The role has evolved significantly over the past two years. Where ML Engineers once spent most of their time on model architecture, the job now tilts heavily toward inference optimization, cost management, and integrating LLM capabilities into existing systems. Companies want engineers who can ship production systems, and the experimenter-only role is fading fast.
Day-to-day, you're writing training pipelines, debugging data quality issues, setting up evaluation frameworks, and figuring out why your model performs differently in staging than it did on your dev set. The best ML engineers are obsessive about reproducibility and measurement. They instrument everything. They know that a model is only as good as the data feeding it and the infrastructure serving it.
Across the 26,159 AI roles we're tracking, AI/ML Engineer positions make up 91% of the market. At Andersen, this role fits into their broader AI and engineering organization.
Demand for AI/ML Engineers has been strong and consistent. Unlike some AI roles that spike with hype cycles, ML engineering is a foundational need. Every company deploying AI models needs people who can keep them running, and the gap between research prototypes and production systems keeps growing.
What the Work Looks Like
A typical week might include: debugging a data pipeline that's silently dropping 3% of training examples, running A/B tests on a new model version, writing documentation for a feature flag system that lets you roll back model deployments, and reviewing a junior engineer's PR for a new evaluation metric. Meetings tend to be cross-functional since ML touches product, engineering, and data teams.
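The "silently dropping 3% of training examples" scenario deserves a concrete illustration: the defense is a row-count reconciliation check between pipeline stages, so drops fail loudly instead of silently. A toy sketch (the buggy filter and the 1% tolerance are invented for illustration):

```python
def transform(rows: list[dict]) -> list[dict]:
    """A stage with a subtle bug: rows with a falsy label (e.g. label 0)
    are filtered out unintentionally along with genuinely missing labels."""
    return [r for r in rows if r.get("label")]

def check_drop_rate(rows_in: list[dict], rows_out: list[dict],
                    max_drop: float = 0.01) -> tuple[bool, float]:
    """Reconcile row counts across a stage; fail if drops exceed tolerance.
    In a real pipeline this would run as an orchestrator-level asset check."""
    drop = 1 - len(rows_out) / len(rows_in)
    return drop <= max_drop, drop

rows = [{"label": 1}] * 97 + [{"label": 0}] * 3  # 3% carry a falsy label
out = transform(rows)
ok, drop_rate = check_drop_rate(rows, out)
```

Without the check, the 3% loss only shows up later as a skewed label distribution; with it, the pipeline run fails at the stage that caused it.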
Skills Required
Python and PyTorch dominate the requirements. Most roles expect experience with cloud platforms (AWS, GCP, or Azure) and familiarity with ML frameworks like TensorFlow or JAX. RAG (Retrieval-Augmented Generation) has become a top-3 skill requirement as companies integrate LLMs into their products. Docker and Kubernetes show up in about a third of postings, reflecting the production focus of the role.
Beyond the core stack, employers increasingly want experience with experiment tracking tools (MLflow, Weights & Biases), feature stores, and vector databases. Fine-tuning experience is valuable but less common than you'd think from reading Twitter. Most production LLM work is RAG and prompt engineering, not fine-tuning. If you have both, you're in a strong position.
Companies that are serious about AI/ML hiring tend to post specific infrastructure details in the job description: the frameworks they use, their model serving stack, their data pipeline tools. Vague postings that just say 'ML experience required' without specifics are often companies that haven't figured out what they need yet.
Compensation Benchmarks
AI/ML Engineer roles pay a median of $166,983 based on 13,781 positions with disclosed compensation. For comparison, mid-level AI roles across all categories have a median of $131,300. This role's midpoint ($231K) sits 38% above the AI/ML Engineer median. Disclosed range: $195K to $266K.
Across all AI roles, the market median is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. For comparison, the highest-paying categories include AI Engineering Manager ($293,500) and AI Architect ($292,900). By seniority level: Entry: $76,880; Mid: $131,300; Senior: $227,400; Director: $244,288; VP: $234,620.
Andersen AI Hiring
Andersen has 9 open AI roles right now, all in the AI/ML Engineer category. Positions span San Francisco, CA; Chicago, IL; and New York, NY. Compensation range: $202K–$266K.
Location Context
AI roles in San Francisco pay a median of $244,000 across 1,059 tracked positions. That's 33% above the national median.
Career Path
Common paths into AI/ML Engineer roles include Data Scientist, Software Engineer, and Research Engineer.
From here, career progression typically leads toward ML Architect, AI Engineering Manager, or Principal ML Engineer.
The fastest path into ML engineering is through software engineering with a self-directed ML education. A CS degree helps, but production engineering skills matter more than academic credentials. Build something that works, deploy it, and measure it. That portfolio project is worth more than a Coursera certificate. For career growth, the fork comes around the senior level: go deep on technical complexity (staff/principal track) or move into managing ML teams.
What to Expect in Interviews
Expect system design questions around ML pipelines: how you'd build a training pipeline for a specific use case, handle data drift, or design A/B testing infrastructure for model deployments. Coding rounds typically involve Python, with emphasis on data manipulation (pandas, numpy) and algorithm implementation. Take-home assignments often ask you to build an end-to-end ML pipeline from raw data to deployed model.
AI Hiring Overview
The AI job market has 26,159 open positions tracked in our dataset. By seniority: 2,416 entry-level, 16,247 mid-level, 5,153 senior, and 2,343 leadership roles (Director, VP, C-Level). Remote roles make up 7% of the market (1,863 positions). The remaining 24,200 roles require on-site or hybrid attendance.
The market median for AI roles is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. Highest-paying categories: AI Engineering Manager ($293,500 median, 28 roles); AI Architect ($292,900 median, 108 roles); AI Safety ($274,200 median, 19 roles).
The AI Job Market Today
The AI job market spans 26,159 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (23,752), AI Software Engineer (598), AI Product Manager (594). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.
The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (2,416) are outnumbered by mid-level (16,247) and senior (5,153) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 2,343 positions, representing the bottleneck between technical execution and organizational strategy.
Remote work availability sits at 7% of all AI roles (1,863 positions), with 24,200 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.
AI compensation is structured in clear tiers. The market median sits at $184,000. Top-quartile roles start at $244,000, and the 90th percentile reaches $309,400. These figures include base salary with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.
Category matters for compensation. AI Engineering Manager roles lead at $293,500 median, while Prompt Engineer roles sit at $122,200. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.
The most in-demand skills across all AI postings: RAG (16,749 postings), AWS (8,932), Rust (7,660), Python (3,815), Azure (2,678), GCP (2,247), Prompt Engineering (1,469), OpenAI (1,269). Python dominates, appearing in the vast majority of role descriptions regardless of category. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.