About This Role
Evaluations Partner Manager (Youth AI Safety Institute)
Common Sense Media is the leading nonprofit organization dedicated to improving the lives of kids and families by providing the research-backed information, education, and independent voice they need to thrive in the age of apps, algorithms, and AI. We rate, educate, and advocate for policies to protect and prepare kids online. Our ratings, research, and resources reach more than 150 million users globally, over 1.4 million educators, and more than 100,000 schools worldwide every year. Learn more at commonsense.org.
The Opportunity
Launched in May 2026, the Youth AI Safety Institute is Common Sense Media's newest addition to its programmatic pillar. The Institute establishes safety standards, builds open-source evaluations that AI developers can run against their models, independently tests AI products, and publishes the results to provide transparency and accountability. It is an independent research and testing ground dedicated to ensuring that the AI used by children is safe and developmentally appropriate.
The Evaluations Partner Manager will organize the operational execution of the Institute's evaluation work: managing the day-to-day workflow between the Institute and its external evaluator partners, ensuring that standards are received, rubrics are developed, evaluations are run, and results make their way back to the Institute in usable form.
This role sits at the operational center of the Institute's work, with a critical feedback loop to the Standards Analyst and a working interface with Common Sense Media's internal product and data teams. The Evaluations Partner Manager keeps the evaluation pipeline running: tracking progress, surfacing issues, routing feedback, and ensuring nothing falls through the cracks across multiple concurrent evaluation tracks.
Location: San Francisco, California
Reports To: Head of AI & Digital Assessments
Salary: $90,000–$110,000
Type: Full-time, exempt
What You'll Do
Evaluation Workflow Coordination
- Serve as the Institute's primary staff-to-staff operational contact for external evaluator partners, managing the flow of work across 3–4 concurrent evaluation partnerships at any given time.
- Track the status of each evaluation track (from standards handoff through rubric development, evaluation execution, and results delivery), maintaining clear visibility into where each project stands.
- Ensure evaluator partners have what they need from the Institute to proceed, and that the Institute receives deliverables on time and in expected formats.
- Develop and maintain systems for tracking evaluation timelines, deliverables, and open items across all active partnerships.
Standards-to-Rubric Handoff & Feedback Management
- Coordinate the handoff of Institute standards to evaluator partners, ensuring materials are complete, clearly documented, and appropriately contextualized for evaluation use.
- Surface and triage feedback from evaluators when standards raise questions of testability, scope, or clarity, routing issues to the Standards Analyst with enough context to act on them efficiently.
- Work closely with the Standards Analyst to track open feedback items, document how they are resolved, and ensure revised standards or clarifications make their way back to evaluators in a timely manner.
- Maintain a clear record of the standards-to-rubric translation process across all evaluation partnerships, including issues raised and how they were addressed.
Data & Results Management
- Coordinate the receipt of evaluation results and data from external partners, ensuring outputs arrive in formats that are usable by the Institute and its internal product and data teams.
- Understand the Institute's data systems and integration mechanisms well enough to identify when something isn't working as expected and know who to loop in.
- Serve as the operational liaison between external evaluators and Common Sense Media's internal product and data teams, translating between workflow needs on both sides.
- Track and organize evaluation outputs to support the Institute's reporting, publication, and standards-refinement work.
Program Tracking & Reporting
- Maintain real-time visibility into the status of all active evaluation projects, with clear documentation of milestones, blockers, and upcoming deadlines.
- Produce regular status updates for the Head of AI & Digital Assessments, flagging issues that require escalation or decision.
- Identify and proactively surface operational bottlenecks before they delay evaluation timelines.
What We're Looking For:
Required Qualifications
- Education: Bachelor's degree in a relevant field (project management, public policy, computer science, research methods, or related).
- Experience: 3–5 years of experience in program coordination, research operations, or project management, preferably in a technical, research, or policy context.
- Workflow Management: Demonstrated ability to manage multiple concurrent project tracks with external partners, keeping complex workflows on schedule and well-documented.
- Technical Comprehension: Comfortable working with data systems and technical workflows: able to understand how data moves between systems, follow technical documentation, and communicate clearly with technical teams.
- Communication: Strong written and verbal communication skills, with the ability to translate between technical and non-technical stakeholders clearly and efficiently.
- Organizational Excellence: Exceptional attention to detail and follow-through, with a track record of keeping complex, multi-party projects running smoothly.
Preferred Qualifications
- Experience coordinating research or evaluation workflows in an academic, policy, or technology context.
- Familiarity with AI evaluation, platform safety research, or technology risk assessment.
- Experience working with external research or technical partners in a coordination or liaison capacity.
- Ability to work directly with data formats, API documentation, and evaluation pipelines; capable of identifying integration issues and communicating precisely with engineering or data teams.
- Background in child safety, educational technology, or youth\-focused research.
- Experience working in mission-driven or nonprofit organizations.
Core Competencies
- Operational Rigor: Tracks the details across multiple concurrent workstreams without losing sight of the bigger picture.
- Proactive Problem-Solving: Identifies issues early, surfaces them clearly, and follows through until they're resolved.
- Clear Communication: Translates between external evaluators, internal staff, and technical teams with precision and without creating confusion.
- Collaborative Approach: Works fluidly across a range of internal and external partners, serving as a reliable operational hub without needing to own strategic decisions.
- Systems Thinking: Understands how the pieces connect (standards, rubrics, evaluations, data, publishing) and manages their own work accordingly.
- Adaptability: Comfortable operating in a fast\-moving environment where evaluation methods, partners, and priorities continue to evolve.
What We Offer
- The chance to work with talented, passionate professionals.
- A great health and welfare benefits package, including medical, dental, vision, a matching 401(k), and other key benefits.
- An organization that offers work/life balance.
- The opportunity to really make a difference in the lives of kids and families!
*Common Sense Media provides equal employment opportunities to all qualified individuals and prohibits discrimination and harassment of any type without regard to race, color, religion, sex, gender identity, sexual orientation, pregnancy, age, national origin, physical or mental disability, military or veteran status, genetic information, or any other protected classification or characteristic protected by federal, state, or local laws.*
*Common Sense Media will also consider for employment qualified applicants with arrest and conviction records. However, job offers are made on the condition that the applicant subsequently passes a criminal background check. If the background check indicates a prior criminal conviction, we will conduct an individualized assessment to determine whether the conviction should result in denial of employment. Pursuant to the San Francisco Fair Chance Ordinance, we will consider employment for qualified applicants with arrest and conviction records.*
Salary Context
This $90K-$110K range is below the median for AI Safety roles in our dataset (median: $209K across 13 roles with salary data).
Role Details
About This Role
This role sits at the intersection of AI and engineering, building systems that bring machine learning capabilities into production environments. The scope varies by company, but the common thread is applying AI technology to solve real business problems at scale. Most AI roles today require a combination of software engineering fundamentals and domain-specific ML knowledge, with the exact mix depending on the team's maturity and the product they're building.
The AI job market is evolving fast. New role categories emerge as companies figure out what they need to ship AI-powered products. What matters most is the ability to learn quickly, build working systems, and iterate based on real-world performance data. The specific title matters less than the skills you bring and the problems you can solve. Companies are past the experimentation phase and want engineers who can deliver production-quality systems that work reliably at scale.
Across the 3,823 AI roles we're tracking, AI Safety positions make up about 1% of the market (55 roles). At Common Sense Media, this role fits into their broader AI and engineering organization.
AI hiring keeps growing across industries. Companies in tech, finance, healthcare, and retail are all building AI teams. The strongest demand is for people who can bridge the gap between AI research and production engineering. The shift toward generative AI has created new role types (LLM Engineer, Prompt Engineer, AI Agent Developer) that didn't exist three years ago, while traditional roles (Data Scientist, ML Engineer) have evolved to incorporate LLM capabilities.
What the Work Looks Like
Day-to-day work involves a mix of building, debugging, and collaborating. You'll write code, review pull requests, participate in design discussions, and work with cross-functional teams (product, design, data) to define what AI features should do and how they should behave. Expect to spend time on both technical implementation and communication. Most AI teams operate in two-week sprint cycles, with regular demos and retrospectives. The ratio of heads-down coding to meetings and reviews varies by seniority, with senior roles spending more time on architecture decisions and mentorship.
Skills in Demand for This Role
Python and cloud platform experience are common requirements. Specific skill needs vary by company and focus area, but familiarity with ML frameworks, data pipelines, and API design covers the basics for most roles. RAG (Retrieval-Augmented Generation), vector databases, and LLM API integration are increasingly standard requirements across role types.
Beyond the core stack, communication skills matter more than many technical candidates realize. The ability to explain AI capabilities and limitations to non-technical stakeholders is a differentiator at every level. Technical writing, documentation, and clear thinking about tradeoffs are underrated skills in AI roles. Experience with evaluation methodology (how to measure whether an AI system is working well) is becoming a core requirement, especially for roles that involve LLM integration.
Look for job postings that specify the problems you'll work on, the tech stack, and the team structure. Vague postings that list every AI buzzword are often a sign the company hasn't figured out what they need. Strong postings describe the product context, the team you'd join, and the specific challenges you'd tackle.
Compensation Benchmarks
AI Safety roles pay a median of $274,200 based on 55 positions with disclosed compensation. Mid-level AI roles across all categories have a median of $165,000. This role's midpoint ($100K) sits 64% below the category median. Disclosed range: $90K to $110K.
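The comparison above can be reproduced from the quoted figures. The following is a minimal sketch for checking the arithmetic, not the site's actual methodology; the helper name `pct_below` is hypothetical, and the inputs are simply the numbers stated in this section.

```python
def pct_below(value: float, reference: float) -> float:
    """Return how far `value` sits below `reference`, as a percentage."""
    return (reference - value) / reference * 100


# Figures quoted above: the disclosed range and the AI Safety category median.
low, high = 90_000, 110_000
category_median = 274_200

midpoint = (low + high) / 2
gap = pct_below(midpoint, category_median)

print(f"midpoint=${midpoint:,.0f}, {gap:.0f}% below category median")
# prints: midpoint=$100,000, 64% below category median
```

The same helper applies to any other benchmark in this section, for example the mid-level median of $165,000.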
Across all AI roles, the market median is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. For comparison, the highest-paying categories include AI Engineering Manager ($275,000) and Research Engineer ($260,000). By seniority level: Entry: $97,880; Mid: $165,000; Senior: $227,400; Director: $247,800; VP: $250,000.
Common Sense Media AI Hiring
Common Sense Media has 4 open AI roles right now, all in AI Safety and based in San Francisco, CA, US. Compensation range: $82K - $110K.
Location Context
AI roles in San Francisco pay a median of $253,000 across 2,168 tracked positions. That's 26% above the national median.
Career Path
Common paths into AI Safety roles include Software Engineer, Data Scientist, Data Analyst.
From here, career progression typically leads toward Senior Engineer, AI Architect, Engineering Manager, Principal Engineer.
Focus on building things that work. A deployed project that solves a real problem is worth more than any certification. Contribute to open-source, build portfolio projects, and invest in fundamentals (software engineering, statistics, systems design) rather than chasing the latest framework. The AI field moves fast, but the engineers who succeed long-term are the ones with strong fundamentals who can adapt to new tools and paradigms as they emerge.
What to Expect in Interviews
AI interviews typically combine coding challenges (Python-focused), system design questions tailored to the role, and discussions about your experience with relevant tools and frameworks. Strong candidates demonstrate both technical depth and the ability to make pragmatic engineering tradeoffs. Prepare portfolio projects that demonstrate end-to-end capability rather than isolated skills.
AI Hiring Overview
The AI job market has 3,823 open positions tracked in our dataset. By seniority: 112 entry-level, 1,798 mid-level, 1,516 senior, and 397 leadership roles (Director, VP, C-Level). Remote roles make up 15% of the market (590 positions). The remaining 3,217 roles require on-site or hybrid attendance.
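As a rough illustration of how the shares above are derived, the sketch below recomputes them from the seniority counts quoted in this section. The variable names are made up for the example, and the calculation assumes each tracked role falls into exactly one seniority bucket.

```python
# Seniority counts as quoted above (leadership = Director, VP, C-Level).
seniority = {"entry": 112, "mid": 1_798, "senior": 1_516, "leadership": 397}
remote = 590

total = sum(seniority.values())
remote_share = remote / total * 100

# Share of the market held by each seniority level.
for level, count in seniority.items():
    print(f"{level:<10} {count:>5}  {count / total:6.1%}")

print(f"total={total:,}, remote share={remote_share:.0f}%")
```

The counts sum to the 3,823 total, and 590 remote roles work out to about 15% of it.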
The market median for AI roles is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. Highest-paying categories: AI Engineering Manager ($275,000 median, 41 roles); AI Safety ($274,200 median, 55 roles); Research Engineer ($260,000 median, 434 roles).
The AI Job Market Today
The AI job market spans 3,823 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (2,629), Data Scientist (322), AI Software Engineer (279). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.
The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (112) are outnumbered by mid-level (1,798) and senior (1,516) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 397 positions, representing the bottleneck between technical execution and organizational strategy.
Remote work availability sits at 15% of all AI roles (590 positions), with 3,217 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.
AI compensation is structured in clear tiers. The market median sits at $200,100. Top-quartile roles start at $253,500, and the 90th percentile reaches $307,500. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.
Category matters for compensation. AI Engineering Manager roles lead at $275,000 median, while Prompt Engineer roles sit at $140,000. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.
The most in-demand skills across all AI postings: Python (1,979 postings), AWS (1,190 postings), Azure (899 postings), RAG (839 postings), GCP (726 postings), PyTorch (595 postings), Prompt Engineering (595 postings), Claude (540 postings). Python dominates, appearing in roughly half of all postings regardless of category. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.