Data Scientist Engineer II - Data Lab

Washington, DC, US | Mid Level | Data Scientist

Skills & Technologies

Demandtools, Python, RAG

Job Description

DEPARTMENT MARKETING STATEMENT

Are you a forward-thinking Data Engineer with a passion for building scalable and efficient data solutions? Do you want to make a meaningful impact in your city? Join the Data Lab, WMATA's cross-functional data science and engineering team on a mission to transform public transit in the DC metropolitan area through data-informed solutions.

We're excited to add a data engineer to our team who can take proof-of-concept ETL/ELT pipelines from prototype to full production deployment. In this role, you will build data applications that transform raw operational data (e.g. vehicle locations, faregate transactions, or passenger counts) into clean sources of truth used throughout the Authority. You will be responsible for optimizing the performance and efficiency of pipelines while ensuring data quality and integrity through robust testing and validation. We value strong attention to detail, proactive ownership, and professional accountability; the ideal candidate learns new concepts and adapts to new workflows without extensive guidance, has a demonstrated history of applying engineering best practices, and has hands-on experience deploying and operating machine learning services.

Join a team dedicated to both innovation and public service. At WMATA's Data Lab, your work will directly contribute to making transit smarter, more accessible, and efficient for everyone.

MINIMUM QUALIFICATIONS

Minimum Education

  • Bachelor's degree in Civil/Transportation Engineering, Data Science, Economics, Operations Research, Computer Science, or a related field

  • In lieu of a Bachelor's Degree, a High School Diploma/GED and four (4) years of data science experience, in addition to the experience stated below, will be considered

Minimum Experience

  • Minimum of four (4) years of relevant data science experience. Experience must include at least two (2) years of experience using programming languages or database systems to develop data flows, data models, and/or data marts.

Minimum Certification/Licensure

  • N/A

Preferred Qualifications

  • Master's degree in Civil/Transportation Engineering, Data Science, Economics, Operations Research, Computer Science, or a related field
  • Advanced experience using R, Python, SQL, or similar tools to extract, manipulate, and clean data; advanced experience in relational database management and/or application development.

Medical Group

Satisfactorily complete the medical examination for this position, if required. The incumbent must be able to perform the essential functions of this position either with or without reasonable accommodations.

SUMMARY

The Data Scientist/Engineer II contributes to research, data science, and data engineering activities of the Data Lab within the Department of Performance, Data and Research. The incumbent supports internal task teams and academic research partners executing a work program of quantitative analysis and research, and development and maintenance of data infrastructure. The primary duties of the Data Scientist/Engineer II include contributing to development of data modeling and analysis, the development of data pipelines, and data documentation. Additional duties include supporting development of documentation and supporting project teams with integration and analysis of large and complex datasets. These critical data assets support peer offices and other teams developing data-informed, action-oriented, and equity-focused recommendations and insights for executive leaders and managers.

ESSENTIAL FUNCTIONS

Principal Job Duties

  • Participates in project teams of data scientists, data engineers, consultants, and academic research partners collaboratively executing a portfolio of quantitative research and development initiatives to support critical business functions including planning, performance, scheduling, operations, budgeting, maintenance, and safety, and to enable data-informed decision-making throughout the Authority.
  • Supports the integration and maintenance of third-party datasets, data transformation services, and data software platforms to ensure agency analysts have the best data and tools to support data-informed decision making.

Other Duties

  • Implements, maintains, supports, and develops documentation for internal projects and helps empower analysts and other data users to develop actionable intelligence confidently and consistently for specific business uses.
  • Implements and supports data pipelines, including the capture, transformation, aggregation, and publication of new and existing data sets to provide analyst teams and other internal customers with advanced tools to support business needs.
  • Assists senior staff with the design and development of advanced data models using a variety of statistical and data science methodologies to provide analytics for a variety of real-time, performance, maintenance, and planning applications.
  • Aids in the creation and maintenance of architecture diagrams, system documentation, data models, mapping documents, business rules, data flow diagrams, and other design-related artifacts, consistent with team standards, so that teammates and, as needed, other technical and non-technical audiences can understand them.
  • Contributes to the development and deployment of tools and datasets for partner teams within Planning and Performance and others to support advanced analytics and data-informed decision making throughout the agency.
  • Develops automated test scripts to monitor and report on data quality, validity, accuracy, and usability; conducts root cause analyses in response to data issues; and implements cost-effective fixes for data anomalies so that internal customers have the highest quality data for decision making.
  • Maintains and promotes awareness and accountability with safety policies and procedures while performing job functions. Promotes a positive safety culture and encourages reporting of safety concerns consistent with our Agency Safety Plan, other regulatory requirements within the Safety Management System and just culture principles.
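
As an illustration of the automated data quality test scripts mentioned in the duties above, here is a minimal sketch in Python using only the standard library. It is not WMATA's actual tooling: the record layout (transaction_id, station_id, fare_cents, tapped_at) and the check names are hypothetical, chosen only to resemble faregate transaction data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CheckResult:
    """Outcome of one data quality check (hypothetical structure)."""
    name: str
    failed_rows: list  # indexes of rows that failed the check

    @property
    def passed(self) -> bool:
        return not self.failed_rows


def run_quality_checks(rows):
    """Run simple row-level checks on hypothetical faregate records."""
    missing_station, invalid_fare = [], []
    bad_timestamp, duplicate_id = [], []
    seen_ids = set()

    for i, row in enumerate(rows):
        # Station identifier must be present and non-empty.
        if not row.get("station_id"):
            missing_station.append(i)

        # Fare must be a non-negative integer number of cents.
        fare = row.get("fare_cents")
        if not isinstance(fare, int) or fare < 0:
            invalid_fare.append(i)

        # Timestamp must parse as ISO 8601.
        try:
            datetime.fromisoformat(row.get("tapped_at"))
        except (TypeError, ValueError):
            bad_timestamp.append(i)

        # Transaction IDs must be unique; flag every repeat after the first.
        tx_id = row.get("transaction_id")
        if tx_id in seen_ids:
            duplicate_id.append(i)
        seen_ids.add(tx_id)

    return [
        CheckResult("missing_station", missing_station),
        CheckResult("invalid_fare", invalid_fare),
        CheckResult("bad_timestamp", bad_timestamp),
        CheckResult("duplicate_id", duplicate_id),
    ]


if __name__ == "__main__":
    sample = [
        {"transaction_id": "t1", "station_id": "A01",
         "fare_cents": 225, "tapped_at": "2024-05-01T08:15:00"},
        {"transaction_id": "t1", "station_id": "",
         "fare_cents": -5, "tapped_at": "not-a-date"},
    ]
    for check in run_quality_checks(sample):
        status = "PASS" if check.passed else f"FAIL rows {check.failed_rows}"
        print(f"{check.name}: {status}")
```

Returning the indexes of failing rows, instead of a single pass/fail flag, gives a root cause analysis a concrete starting point. In a production pipeline the same checks would more likely be expressed in a dedicated validation framework or as SQL tests.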

The functions listed are not intended to limit specific duties and responsibilities of any particular position. Nor are they intended to limit in any way the right of managers and supervisors to assign, direct and control the work of employees under their supervision.

Evaluation Criteria

Consideration will be given to applicants whose resumes demonstrate the required education and experience. Applicants should include all relevant education and work experience.

Evaluation criteria may include one or more of the following:

  • Skills and/or behavioral assessment
  • Personal interview
  • Verification of education and experience (including certifications and licenses)
  • Criminal Background Check (a criminal conviction is not an automatic bar to employment)
  • Medical examination including a drug and alcohol screening (for safety sensitive positions)
  • Review of a current motor vehicle report

Closing

WMATA is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, national origin, disability, status as a protected veteran, or any other status protected by applicable federal law.

This posting is an announcement of a vacant position under recruitment. It is not intended to replace the official job description. Job descriptions are available upon confirmation of an interview.

Role Details

Title: Data Scientist Engineer II - Data Lab
Location: Washington, DC, US
Category: Data Scientist
Experience: Mid Level
Salary: Not disclosed
Remote: No

About This Role

Data Scientists extract insights and build predictive models from data. In the AI era, many roles now include LLM-powered analytics, automated reporting, and integration with generative AI tools. The role has evolved from 'the person who runs SQL queries' to 'the person who builds AI-powered data products.'

Modern data science roles fall into two camps: analytics-focused (insights, dashboards, experimentation) and ML-focused (building predictive models, recommendation systems, NLP features). The best data scientists can operate in both modes. The AI shift means that even analytics-focused roles now involve building automated insight pipelines using LLMs, going well beyond one-off reports.

Across the 26,159 AI roles we're tracking, Data Scientist positions make up 2% of the market. At Washington Metropolitan Area Transit Authority, this role fits into their broader AI and engineering organization.

Data Scientist roles remain in high demand, though the definition keeps shifting. Companies increasingly want candidates who can bridge traditional statistics with modern ML and LLM capabilities. The 'pure insights' data scientist role is consolidating into analytics engineering, while the 'build models' data scientist role is merging with ML engineering.

What the Work Looks Like

A typical week includes: analyzing experiment results for a product feature launch, building a predictive model for customer churn, creating an automated reporting pipeline using LLM-powered summarization, presenting insights to stakeholders, and cleaning data (always cleaning data). The ratio of analysis to engineering varies by company, but expect both.

Skills Required

Demandtools; Python (15% of roles); RAG (64% of roles)

Python, SQL, and statistical modeling are the foundation. Increasingly, roles want experience with LLMs for data analysis, automated insight generation, and building AI-powered data products. Familiarity with cloud data platforms (Snowflake, BigQuery, Databricks) and ML frameworks (scikit-learn, PyTorch) covers most job requirements.

Experimentation design and causal inference are underrated skills that separate strong candidates. Companies care about whether their product changes cause improvements, and they value candidates who can distinguish causation from correlation. A/B testing methodology, Bayesian statistics, and the ability to communicate uncertainty to non-technical stakeholders are high-value skills.

Good postings specify the data stack, the types of problems you'll work on, and the team structure. Look for companies that differentiate between analytics and ML data science. Vague 'data scientist' postings that list every skill under the sun usually mean the company doesn't know what they need.

Compensation Benchmarks

Data Scientist roles pay a median of $204,700 based on 441 positions with disclosed compensation. Mid-level AI roles across all categories have a median of $131,300.

Across all AI roles, the market median is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. For comparison, the highest-paying categories include AI Engineering Manager ($293,500) and AI Architect ($292,900). By seniority level: Entry: $76,880; Mid: $131,300; Senior: $227,400; Director: $244,288; VP: $234,620.

Washington Metropolitan Area Transit Authority AI Hiring

Washington Metropolitan Area Transit Authority has 2 open AI roles right now. They're hiring for AI/ML Engineer and Data Scientist roles. Positions are located in MD, US and Washington, DC, US.

Location Context

Across all AI roles, 7% (1,863 positions) offer remote work, while 24,200 require on-site attendance. Top AI hiring metros: Los Angeles (1,695 roles, $178,000 median); New York (1,670 roles, $200,000 median); San Francisco (1,059 roles, $244,000 median).

Career Path

Common paths into Data Scientist roles include Data Analyst, Statistician, and Quantitative Researcher.

From here, career progression typically leads toward Senior Data Scientist, ML Engineer, or AI Product Manager.

Start with statistics and SQL. Build a real analysis project on public data that demonstrates insight generation alongside model building. The market values data scientists who can communicate findings clearly to business stakeholders. If you want to move toward ML engineering, invest in software engineering fundamentals and production deployment skills.

What to Expect in Interviews

Interviews combine statistics, coding, and business acumen. SQL is almost always tested, often with complex joins and window functions. Expect a case study round where you're given a business problem and asked to design an analysis plan. Coding rounds focus on pandas, statistical modeling, and visualization. The strongest differentiator is how well you communicate insights to non-technical stakeholders during presentation rounds.

AI Hiring Overview

The AI job market has 26,159 open positions tracked in our dataset. By seniority: 2,416 entry-level, 16,247 mid-level, 5,153 senior, and 2,343 leadership roles (Director, VP, C-Level). Remote roles make up 7% of the market (1,863 positions). The remaining 24,200 roles require on-site or hybrid attendance.

The market median for AI roles is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. Highest-paying categories: AI Engineering Manager ($293,500 median, 28 roles); AI Architect ($292,900 median, 108 roles); AI Safety ($274,200 median, 19 roles).

The AI Job Market Today

The AI job market spans 26,159 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (23,752), AI Software Engineer (598), AI Product Manager (594). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.

The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (2,416) are outnumbered by mid-level (16,247) and senior (5,153) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 2,343 positions, representing the bottleneck between technical execution and organizational strategy.

Remote work availability sits at 7% of all AI roles (1,863 positions), with 24,200 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.

AI compensation is structured in clear tiers. The market median sits at $184,000. Top-quartile roles start at $244,000, and the 90th percentile reaches $309,400. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.

Category matters for compensation. AI Engineering Manager roles lead at $293,500 median, while Prompt Engineer roles sit at $122,200. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.

The most in-demand skills across all AI postings: RAG (16,749 postings), AWS (8,932 postings), Rust (7,660 postings), Python (3,815 postings), Azure (2,678 postings), GCP (2,247 postings), Prompt Engineering (1,469 postings), OpenAI (1,269 postings). RAG leads by a wide margin, appearing in about 64% of the 26,159 tracked postings, while Python appears in about 15%. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.

Frequently Asked Questions

Based on 441 roles with disclosed compensation, the median salary for Data Scientist positions is $204,700. Actual compensation varies by seniority, location, and company stage.
About 7% of the 26,159 AI roles we track offer remote work. Remote availability varies by company and seniority level, with senior and leadership roles more likely to offer location flexibility.
Washington Metropolitan Area Transit Authority is among the companies actively hiring for AI and ML talent. Check our company profiles for detailed breakdowns of open roles, salary ranges, and hiring trends.
Common next steps from Data Scientist positions include Senior Data Scientist, ML Engineer, AI Product Manager. Progression depends on whether you lean toward technical depth, people management, or product strategy.
