AI safety. The NIST AI Safety framework, created under the 2023 Executive Order, defines the standards this field works toward. AI safety went from a niche academic concern to a funded corporate priority in under three years. Dedicated safety roles grew 67% year-over-year. Anthropic, OpenAI, and DeepMind now employ hundreds of safety researchers and engineers. The EU AI Act created compliance requirements that enterprises can't ignore. US executive orders on AI safety formalized government expectations.
The result: a new category of AI careers focused specifically on making AI systems safe, reliable, and aligned with human values. Here's what the landscape looks like and how to get in.
The Five Pillars of AI Safety Careers
AI safety work splits into five distinct areas. Each has different skill requirements, compensation ranges, and entry paths.
1. AI Alignment Research
The most technical and theoretical pillar. Alignment researchers work on ensuring AI systems do what we intend them to do, especially as models become more capable.
What the work involves: Developing and testing alignment. arXiv AI safety papers track the latest research from labs focused on alignment and interpretability. Alignment techniques (RLHF, constitutional AI, debate, iterated amplification), studying model behavior and failure modes, building tools for understanding what models know and why they produce specific outputs, and publishing research that advances the field. Where the jobs are: Anthropic, OpenAI, DeepMind, and academic labs (MATS, Redwood Research, ARC, MIRI). Some AI-native companies like Cohere and AI21 have smaller alignment teams. Compensation: $160K-$280K base salary. At AI labs, total compensation including equity can reach $400K-$500K for senior researchers. Academic positions: $80K-$160K. Requirements: Nearly always requires a PhD or equivalent research experience. Strong ML fundamentals, mathematical maturity, and publication record. This is the hardest pillar to enter without advanced academic credentials.2. Red Teaming and Adversarial Testing
The offense-minded side of AI safety. Red teamers try to break AI systems, finding failure modes, biases, and vulnerabilities before they affect users.
What the work involves: Designing adversarial attacks against language models (jailbreaks, prompt injection, data extraction), building automated testing frameworks that probe model boundaries, testing for bias across demographic groups, evaluating models for harmful output generation, and stress-testing safety filters. Where the jobs are: AI labs (all major ones have dedicated red teams), Big Tech companies shipping AI products, government agencies (NIST, AISI), and specialized security firms. Compensation: $140K-$230K base salary. Total comp at AI labs: $220K-$400K for senior roles. Government positions: $100K-$160K. Requirements: Strong ML engineering skills, creativity in adversarial thinking, and systematic documentation ability. No PhD required. Security background is valuable. This is the most accessible technical safety pillar for engineers without research backgrounds.3. Safety Engineering
Building the guardrails, monitoring systems, and safety infrastructure that keeps AI systems behaving correctly in production.
What the work involves: Implementing content filtering and safety classifiers, building monitoring systems that detect model drift and unsafe outputs, designing fallback mechanisms for when models fail, creating testing infrastructure for safety evaluation, and developing tools for prompt injection detection and prevention. Where the jobs are: Any company deploying AI at scale. AI labs, Big Tech, and enterprise AI companies all need safety engineers. This is the fastest-growing pillar by job count. Compensation: $150K-$250K base salary. Total comp at Big Tech: $230K-$420K for senior roles. AI labs: $220K-$400K. Requirements: Strong software engineering skills with ML knowledge. Experience with production ML systems. Understanding of security principles. No PhD required. This pillar is most accessible to ML engineers and software engineers.4. AI Policy and Governance
The regulatory and institutional side of AI safety. Policy roles shape how AI is governed at company, national, and international levels.
What the work involves: Analyzing and interpreting AI regulations (EU AI Act, US state laws), developing corporate AI governance frameworks, creating AI risk assessment methodologies, advising policymakers on AI capabilities and risks, and writing technical standards for AI safety. Where the jobs are: Government agencies (NIST, AISI, EU AI Office), think tanks (RAND, Brookings, CSET), AI labs (policy teams), Big Tech (government affairs and responsible AI teams), and consulting firms. Compensation: Government: $100K-$180K base. Think tanks: $90K-$150K. Corporate: $140K-$220K base. Consulting: $160K-$280K. Requirements: Policy, legal, or technical background. Technical literacy is essential but you don't need to be an engineer. Strong writing and analytical skills. Many successful policy professionals have JDs or public policy degrees combined with technical AI understanding.5. AI Ethics and Fairness
Evaluating AI systems for bias, discrimination, and societal impact. Overlaps with safety engineering but focuses more on social implications.
What the work involves: Auditing AI systems for demographic bias, designing fairness metrics and evaluation frameworks, conducting impact assessments for AI deployments, developing ethical guidelines for AI use, and working with stakeholders (regulators, community groups, users) on responsible deployment. Where the jobs are: Big Tech responsible AI teams, government agencies, academic institutions, nonprofit organizations (AI Now Institute, Partnership on AI), and consulting firms specializing in AI audit. Compensation: Corporate: $130K-$210K base. Nonprofit: $70K-$130K. Government: $90K-$160K. Consulting: $140K-$240K. Requirements: Varies widely. Technical roles need ML engineering skills. Research roles need social science or ethics backgrounds. Audit roles benefit from accounting or compliance backgrounds. The most effective practitioners combine technical understanding with social science perspectives.Breaking Into AI Safety
For ML Engineers
You have the strongest starting position for safety engineering and red teaming roles.
Step 1: Audit your existing skills against safety requirements. Production ML experience, evaluation methodology, and system design directly transfer. Step 2: Add safety-specific knowledge. Study adversarial ML techniques, bias evaluation methods, and alignment research basics. The MATS (ML Alignment Theory Scholars) program and AI Safety Camp offer structured learning. Step 3: Build safety projects. Create an adversarial testing framework for an open-source model. Build a bias evaluation pipeline. Develop a content safety classifier. These portfolio pieces directly demonstrate relevant skills. Step 4: Contribute to open-source safety work. Projects like Guardrails AI, NeMo Guardrails, and various red teaming tools welcome contributions. Step 5: Apply for safety engineering or red teaming roles. Your ML production experience combined with safety-specific projects makes a strong application.Timeline: 3-6 months of focused preparation while employed.
For Software Engineers (Non-ML)
Safety engineering is the most natural entry point. The role requires strong engineering skills with ML knowledge added on top.
Step 1: Build ML fundamentals. Complete a foundational ML course and understand how language models work at a conceptual level. Step 2: Study AI safety from a systems perspective. How do you build monitoring for model outputs? How do you implement content filters? How do you design failover systems for AI features? These are engineering problems. Step 3: Build safety infrastructure projects. A monitoring dashboard for model behavior, an automated evaluation pipeline, or a prompt injection detection system.Timeline: 6-12 months.
For Policy/Legal Professionals
AI governance and policy roles need your expertise, with technical literacy added.
Step 1: Develop technical literacy. You don't need to train models. You need to understand what AI systems can and can't do, what risks they create, and what technical mitigations exist. Courses from Fast.ai and DeepLearning.AI provide appropriate depth. Step 2: Study the regulatory landscape. The EU AI Act, NIST AI Risk Management Framework, and US state-level AI laws form the foundation. Step 3: Write. Policy briefs, regulatory analyses, or blog posts about AI governance topics. Publishing demonstrates your ability to analyze and communicate AI policy issues. Step 4: Target organizations where your policy expertise plus technical literacy creates unique value. Government agencies, think tanks, and corporate governance teams all need this combination.Timeline: 3-6 months of technical study while leveraging existing policy expertise.
For Researchers (Non-AI)
Alignment research draws from philosophy, cognitive science, mathematics, and other fields. If you have research training in adjacent fields, alignment research is accessible.
Step 1: Study alignment fundamentals. Read "Superintelligence" by Bostrom, "The Alignment Problem" by Christian, and current alignment research from Anthropic, DeepMind, and independent labs. Step 2: Build ML skills sufficient to conduct alignment experiments. This doesn't require the depth of an ML engineer, but you need to understand model training, evaluation, and behavior. Step 3: Participate in structured programs. MATS, AI Safety Camp, and Redwood Research programs provide mentored entry into alignment research. Step 4: Publish. Even informal blog posts analyzing alignment problems demonstrate thinking ability. Formal publications are better.Timeline: 12-18 months.
Where AI Safety Is Headed
Regulatory Acceleration
The EU AI Act is being enforced. The US is expanding AI executive orders into concrete requirements. China has its own AI regulations. Every major economy is creating AI governance frameworks. This means sustained, growing demand for compliance, governance, and policy professionals.
Safety Engineering as Standard Practice
Safety engineering is following the path of security engineering. It started as an afterthought, became a specialty, and is now integrated into standard engineering practice. In 2-3 years, "AI safety" won't be a separate team at most companies. It'll be a required competency for all AI engineers, with safety specialists setting standards and building tools.
Scaling with Model Capability
AI safety budgets at major labs doubled in 2025. As models become more capable, safety work becomes more critical. Unlike some AI specializations that may consolidate, safety requirements grow with model capability. A more powerful model needs more safety testing, not less.
Automated Safety Tools
Manual red teaming doesn't scale. The field is moving toward automated safety evaluation: tools that can test models for thousands of failure modes, monitor production outputs in real-time, and flag concerning patterns without human review of every interaction. Engineers who build these tools are in high demand.
Salary Growth Trajectory
AI safety compensation is growing faster than general AI engineering compensation. Safety-specific roles saw 18-22% salary growth from 2024 to 2026, compared to 12-15% for general AI engineering roles. The premium reflects three factors: the supply of qualified safety professionals is limited, regulatory pressure creates urgent demand, and AI labs compete aggressively for safety talent.
For early-career professionals choosing a specialization, AI safety offers strong compensation today with above-average growth projections. The field is young enough that entering now positions you as a senior professional in 3-5 years, when demand will be significantly higher than it is today.
Is AI Safety a Good Career Long-Term?
The trajectory is strongly positive. Regulatory requirements are expanding. Corporate AI safety budgets are growing. Model capabilities are increasing, making safety more important. And public attention to AI risks means executives allocate resources to safety teams.
The main career risk is that AI safety becomes so mainstream that it stops being a distinct specialty and gets absorbed into general AI engineering. That's not really a risk, though. If safety becomes a standard part of every AI engineer's job, the people who specialized in it early will be the ones teaching everyone else. That's a strong position to be in.
Resources for Getting Started
Courses and Programs
- MATS (ML Alignment Theory Scholars): The premier structured program for alignment research. Competitive admission, mentored research projects.
- AI Safety Camp: Intensive program focused on safety research projects. Lower barrier to entry than MATS.
- Redwood Research REMIX: Research fellowship focused on interpretability and alignment.
- AGI Safety Fundamentals (BlueDot Impact): Free course covering alignment theory and technical safety.
Key Research Groups
- Anthropic Safety Team: Leading work on constitutional AI, interpretability, and RLHF
- OpenAI Superalignment: Focused on aligning future AI systems
- DeepMind Safety Team: Broad safety research including evaluation and red teaming
- ARC (Alignment Research Center): Independent alignment research
- MIRI (Machine Intelligence Research Institute): Foundational alignment theory
Open-Source Safety Projects
- Guardrails AI: Framework for adding safety constraints to LLM outputs
- NeMo Guardrails (NVIDIA): Programmable safety rails for LLM applications
- Inspect (UK AISI): AI safety evaluation framework
- Anthropic's model evals: Open-source evaluation suites for dangerous capabilities
Essential Reading
- "The Alignment Problem" by Brian Christian: Accessible overview of alignment challenges
- "Superintelligence" by Nick Bostrom: Foundational text on existential AI risk
- Anthropic's research blog: Current alignment and interpretability research
- DeepMind safety research publications: Broad coverage of technical safety
- NIST AI Risk Management Framework: The practical governance standard