Senior AI & Data Engineering Lead - Senior Vice President

$176K - $265K Jersey City, NJ, US Senior Data Engineer


Skills & Technologies

AWS, Azure, GCP, Python

About This Role


Posted Date

6/05/2026

Description

This job description outlines a senior\-level role for a data architect or lead data engineer within a Data Services team. The position is centered on building and managing the data infrastructure required to support large\-scale Generative AI and Machine Learning initiatives. Below is a detailed breakdown of the responsibilities and the skills required for such a role.

#### Expanded Responsibilities

This role combines deep technical expertise in data engineering with strategic thinking and leadership. The core responsibilities can be broken down into three main pillars:

#### 1\. Strategic AI Enablement

This goes beyond just building databases; it's about designing the entire data foundation for the company's AI strategy.

  • Data Ecosystem Architecture: You will be responsible for the high\-level design of the data platform. This includes:

+ Data Lake/Lakehouse Design: Implementing a central repository to store vast amounts of structured, semi\-structured, and unstructured data from various sources. This could involve technologies like AWS S3, Azure Data Lake Storage, or Google Cloud Storage.

+ Federated Querying: Leveraging technologies like Starburst (commercial Trino) to create a virtual data warehouse. This allows data consumers (analysts, data scientists, AI models) to query data across different sources (e.g., data lakes, relational databases, NoSQL databases) with a single SQL query, without needing to move or copy the data.

+ Scalability and Performance: Ensuring the architecture can scale horizontally to handle petabytes of data and a high volume of concurrent queries, which is critical for pre\-training large language models (LLMs).
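
The federated querying idea above can be illustrated with a small, runnable stand-in. This is only a sketch: it uses SQLite's `ATTACH` to simulate two independent sources, not Trino or Starburst, and the source names `lake` and `crm` and their tables are hypothetical. In Trino the same pattern would address tables as `catalog.schema.table`.

```python
import sqlite3

# Stand-in for a federated engine: two separate in-memory databases
# play the role of two independent sources (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS lake")
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.execute("CREATE TABLE lake.events (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, segment TEXT)")
conn.executemany(
    "INSERT INTO lake.events VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 7.5)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "retail"), (2, "corporate")],
)

# One SQL statement joins both sources without copying rows between them.
rows = conn.execute(
    """
    SELECT c.segment, SUM(e.amount) AS total
    FROM lake.events AS e
    JOIN crm.customers AS c ON c.customer_id = e.customer_id
    GROUP BY c.segment
    ORDER BY c.segment
    """
).fetchall()

for segment, total in rows:
    print(segment, total)
```

The point of the sketch is the shape of the query: consumers write a single join, and the engine resolves each table against its own source.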

#### 2\. Advanced AI Ops \& Data Pipelines

This is the hands\-on engineering aspect of the role, focused on the movement and processing of data.

  • High\-Throughput Data Pipelines: You will lead the development of the data "plumbing" that powers the AI systems. This includes:

+ Batch Processing: Using Apache Spark for large\-scale data transformation, cleaning, and feature engineering on historical data.

+ Real\-time Stream Processing: Using Apache Kafka as a messaging bus to ingest real\-time data from sources like application logs, IoT devices, or clickstreams. Apache Flink would be used for complex event processing on these streams (e.g., fraud detection, real\-time recommendations).

  • Optimization and Reliability: Your pipelines must be not only fast but also resilient. This involves:

+ Low Latency: Tuning jobs and infrastructure to minimize the time it takes for data to travel from source to destination.

+ High Availability: Implementing failover mechanisms, monitoring, and alerting to ensure the data pipelines are always running and the AI models have uninterrupted access to fresh data.

+ CI/CD for Data: Implementing DevOps and AI Ops best practices for data pipelines, including automated testing, deployment, and data quality checks.
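
As a rough illustration of the stream processing described above, the sketch below counts events per account in fixed (tumbling) time windows and flags bursts, a simplified form of the fraud-detection example. It is plain Python rather than Flink or Kafka code; the event shape, function names, and threshold are assumptions, and it ignores late or out-of-order events, which a real engine would handle with event-time watermarks.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (timestamp in seconds, account id).
Event = Tuple[int, str]


def tumbling_counts(events: Iterable[Event], window_s: int) -> Dict[Tuple[int, str], int]:
    """Count events per account in fixed, non-overlapping time windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for ts, account in events:
        # Align each timestamp to the start of its window.
        window_start = ts - (ts % window_s)
        counts[(window_start, account)] += 1
    return dict(counts)


def flag_bursts(events: Iterable[Event], window_s: int, threshold: int) -> List[Tuple[int, str]]:
    """Return (window_start, account) pairs with at least `threshold` events."""
    counts = tumbling_counts(events, window_s)
    return sorted(key for key, n in counts.items() if n >= threshold)


sample = [(0, "a"), (10, "a"), (50, "a"), (61, "a"), (70, "b")]
bursts = flag_bursts(sample, window_s=60, threshold=3)
print(bursts)
```

Account `a` has three events in the window starting at 0 and one in the window starting at 60, so only the first window is flagged.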

#### 3\. AI Governance \& Leadership

This pillar focuses on the "people" and "process" aspects of the role, ensuring data is used responsibly and effectively.

  • Data Governance for AI: As AI systems become more critical, the data they use must be trustworthy. You will establish frameworks for:

+ Data Quality: Implementing automated checks and monitoring to ensure data is accurate, complete, and consistent.

+ Data Provenance \& Lineage: Creating systems to track where data comes from, how it has been transformed, and how it is used. This is crucial for debugging models and for regulatory compliance.

+ Data Security: Working with security teams to implement access controls, data masking, and encryption to protect sensitive information, especially in the context of training AI models.

  • Team Leadership and Mentorship: This is a leadership role where you will be expected to:

+ Mentor Data Engineers: Guide junior and mid\-level engineers, conduct code reviews, and establish best practices for the team.

+ Foster Innovation: Stay up\-to\-date with the latest technologies and methodologies in the data and AI space and encourage a culture of experimentation and continuous improvement.

+ Cross\-functional Collaboration: Work closely with data scientists, ML engineers, platform engineers, and business stakeholders to understand their needs and deliver effective data solutions.
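
To make the Data Quality item above concrete, here is a minimal sketch of an automated check that could run as a gate in a pipeline. The metric definitions, function names, and thresholds are illustrative assumptions, not a framework named in the posting.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def quality_report(rows: List[Row], required: List[str], unique_key: str) -> Dict[str, float]:
    """Compute simple completeness and uniqueness ratios for a batch of rows."""
    total = len(rows)
    if total == 0:
        return {"completeness": 0.0, "uniqueness": 0.0}
    # A row is complete when every required column is present and not None.
    complete = sum(all(r.get(col) is not None for col in required) for r in rows)
    # Uniqueness is the share of distinct key values among all rows.
    distinct = len({r.get(unique_key) for r in rows})
    return {"completeness": complete / total, "uniqueness": distinct / total}


def passes(report: Dict[str, float], min_completeness: float = 0.99, min_uniqueness: float = 1.0) -> bool:
    """Gate a batch on hypothetical thresholds; tune these per dataset."""
    return report["completeness"] >= min_completeness and report["uniqueness"] >= min_uniqueness


sample_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]
report = quality_report(sample_rows, required=["id", "email"], unique_key="id")
print(report, passes(report))
```

In this sample one row has a missing email and one `id` is repeated, so both ratios fall below the thresholds and the batch would be rejected.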

Qualifications:

  • 10\+ years of relevant experience
  • Experience in implementing projects
  • Experience in systems analysis and programming of software applications
  • Demonstrated Subject Matter Expert (SME) in area(s) of Applications Development
  • Demonstrated knowledge of client core business functions
  • Demonstrated leadership, project management, and development skills
  • Relationship and consensus building skills

Education:

  • Bachelor’s degree/University degree or equivalent experience
  • Master’s degree preferred

#### Required Skills

To succeed in this role, a candidate would need a blend of technical depth, strategic vision, and leadership qualities.

Big Data Technologies:

  • Processing Frameworks: Expert\-level knowledge of Apache Spark. Strong experience with Apache Flink and Apache Kafka.
  • Query Engines: Deep understanding and hands\-on experience with Trino (Starburst).
  • Orchestration: Experience with workflow management tools like Airflow or Prefect.

Data Architecture:

  • Data Modeling: Strong understanding of data modeling concepts for both analytical and operational systems.
  • Platform Design: Proven experience designing and building scalable data lakes, data warehouses, and lakehouse architectures.
  • Cloud Expertise: Proficiency with at least one major cloud provider (AWS, GCP, Azure) and their data services (e.g., S3, Glue, EMR, BigQuery, Databricks).

Governance \& Security:

  • Data Governance: Experience implementing data quality frameworks, data lineage solutions, and data cataloging tools.
  • Security: Knowledge of data security best practices, including encryption, masking, and role\-based access control (RBAC).

Programming:

  • Python: Expert\-level proficiency.
  • SQL: Expert\-level proficiency for complex analytical queries.
  • Scala/Java: Often beneficial for deep work in Spark or Flink.

Soft Skills:

  • Leadership: Proven ability to lead complex technical projects and mentor engineers.
  • Strategic Thinking: Ability to connect data strategy to broader business and technology objectives.
  • Communication: Excellent verbal and written communication skills to articulate complex technical concepts to both technical and non\-technical audiences.
  • Problem\-Solving: Strong analytical and troubleshooting skills.


#### Job Family Group:

Technology


#### Job Family:

Applications Development


#### Time Type:

Full time


#### Primary Location:

Jersey City, New Jersey, United States


#### Primary Location Full Time Salary Range:

$176,720\.00 \- $265,080\.00

In addition to salary, Citi’s offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental \& vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.


#### Most Relevant Skills

Please see the requirements listed above.


#### Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.


#### Anticipated Posting Close Date:

Jun 11, 2026


*Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.*

*If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.*

*View Citi’s EEO Policy Statement and the Know Your Rights poster.*

Salary

176,720\.00 \- 265,080\.00 Annual

Type

Full\-time

Salary Context

This $176K-$265K range is above the 75th percentile for Data Engineer roles in our dataset (median: $168K across 41 roles with salary data).

Role Details

Title: Senior AI & Data Engineering Lead - Senior Vice President
Location: Jersey City, NJ, US
Category: Data Engineer
Experience: Senior
Salary: $176K - $265K
Remote: No

About This Role

Data Engineers build the pipelines that feed AI models. They design ETL workflows, manage data lakes, and ensure training and inference data is clean, timely, and accessible. Without good data engineering, AI projects fail. It's that simple.

The AI era has expanded the data engineer's scope far beyond batch ETL jobs. You're building real-time embedding pipelines for RAG systems, managing vector databases, ensuring training data quality at scale, and building the infrastructure that lets ML teams iterate on data as fast as they iterate on models. Data quality is the biggest predictor of model quality, and you're the person responsible for it.

Across the 3,824 AI roles we're tracking, Data Engineer positions make up 2% of the market. At Information Technology Senior Management Forum, this role fits into their broader AI and engineering organization.

Data Engineer demand in AI contexts is strong and growing. Every company building AI needs clean, reliable data pipelines. The shift toward real-time AI applications (chatbots, recommendation engines, agent systems) means data engineering is more critical than ever. Companies are willing to pay premium salaries for data engineers with AI/ML pipeline experience.

What the Work Looks Like

A typical week includes: debugging a data pipeline that's producing stale embeddings for the RAG system, optimizing a Spark job that processes training data, building a data quality monitoring dashboard, meeting with the ML team to understand their next data requirements, and writing dbt models that transform raw event data into ML-ready features. The work is deeply technical and high-impact.

Skills Required

AWS (31% of roles), Azure (23% of roles), GCP (19% of roles), Python (51% of roles)

SQL, Python, and distributed systems (Spark, Airflow, dbt) are core. Cloud data platforms (Snowflake, BigQuery, Redshift) are increasingly standard. Many AI-focused roles also want familiarity with vector databases and embedding pipelines. Understanding data modeling, pipeline orchestration, and data quality frameworks covers the essentials.

AI-specific data engineering skills include: building feature stores, managing training data versioning, implementing data lineage tracking, and building real-time embedding pipelines. Experience with streaming systems (Kafka, Flink) is valuable for real-time AI applications. Understanding ML data requirements (balanced datasets, data augmentation, evaluation set construction) makes you much more effective working with ML teams.

Strong postings specify the data stack, mention ML pipeline work, and describe the scale of data you'll be working with. Look for companies that understand the connection between data quality and model quality. Avoid roles that conflate data engineering with data analysis.

Compensation Benchmarks

Data Engineer roles pay a median of $208,300 based on 254 positions with disclosed compensation. This role's midpoint ($220K) sits 6% above the category median. Disclosed range: $176K to $265K.
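
The midpoint comparison above can be reproduced from the posted figures. The short sketch below assumes the midpoint is the simple average of the disclosed low and high ends of the range; the variable names are illustrative.

```python
# Disclosed range and category median, taken from the figures in this posting.
low, high = 176_720, 265_080
category_median = 208_300

# Assumption: "midpoint" means the arithmetic mean of the range endpoints.
midpoint = (low + high) / 2
pct_above = (midpoint / category_median - 1) * 100

print(f"midpoint: ${midpoint:,.0f}")
print(f"above category median: {pct_above:.1f}%")
```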

Across all AI roles, the market median is $200,000. Top-quartile compensation starts at $253,000. The 90th percentile reaches $307,500. For comparison, the highest-paying categories include AI Engineering Manager ($293,500) and AI Safety ($274,200). By seniority level: Entry: $97,380; Mid: $160,000; Senior: $227,400; Director: $243,000; VP: $250,000.

Information Technology Senior Management Forum AI Hiring

Information Technology Senior Management Forum has 33 open AI roles right now. They're hiring across Data Scientist, Data Engineer, AI Software Engineer, and AI/ML Engineer. Positions span McLean, VA, US; Jersey City, NJ, US; and Irving, TX, US. Compensation range: $126K - $392K.

Location Context

Across all AI roles, 16% (613 positions) offer remote work, while 3,187 require on-site attendance. Top AI hiring metros: New York (2,448 roles, $210,000 median); San Francisco (1,990 roles, $253,000 median); Los Angeles (1,686 roles, $189,000 median).

Career Path

Common paths into Data Engineer roles include Backend Engineer, Database Administrator, and Analytics Engineer.

From here, career progression typically leads toward Senior Data Engineer, ML Engineer, or Data Platform Lead.

Master SQL and Python first. Then learn a distributed processing framework (Spark or its modern alternatives) and a pipeline orchestrator (Airflow, Dagster, Prefect). Build a portfolio project that demonstrates end-to-end pipeline construction: ingest, transform, validate, serve. If you want to specialize in AI data engineering, add vector databases and embedding pipelines to your skill set.

What to Expect in Interviews

Expect SQL deep-dives (query optimization, partitioning strategies, data modeling), Python coding focused on data pipeline patterns, and system design questions about building scalable ETL workflows. Companies with ML teams will ask about feature stores, embedding pipelines, and training data management. Be ready to discuss data quality monitoring, pipeline orchestration, and how you'd handle schema evolution in a production data lake.

AI Hiring Overview

The AI job market has 3,824 open positions tracked in our dataset. By seniority: 119 entry-level, 1,813 mid-level, 1,472 senior, and 420 leadership roles (Director, VP, C-Level). Remote roles make up 16% of the market (613 positions). The remaining 3,187 roles require on-site or hybrid attendance.

The market median for AI roles is $200,000. Top-quartile compensation starts at $253,000. The 90th percentile reaches $307,500. Highest-paying categories: AI Engineering Manager ($293,500 median, 31 roles); AI Safety ($274,200 median, 51 roles); Research Engineer ($260,000 median, 401 roles).

The AI Job Market Today

The AI job market spans 3,824 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (2,702), Data Scientist (281), AI Software Engineer (258). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.

The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (119) are outnumbered by mid-level (1,813) and senior (1,472) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 420 positions, representing the bottleneck between technical execution and organizational strategy.

Remote work availability sits at 16% of all AI roles (613 positions), with 3,187 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.

AI compensation is structured in clear tiers. The market median sits at $200,000. Top-quartile roles start at $253,000, and the 90th percentile reaches $307,500. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.

Category matters for compensation. AI Engineering Manager roles lead at $293,500 median, while Prompt Engineer roles sit at $142,800. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.

The most in-demand skills across all AI postings: Python (1,968 postings), AWS (1,203 postings), Azure (882 postings), RAG (877 postings), GCP (735 postings), Prompt Engineering (587 postings), PyTorch (586 postings), Claude (554 postings). Python dominates, appearing in about half of the 3,824 role descriptions regardless of category. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.

Frequently Asked Questions

Based on 254 roles with disclosed compensation, the median salary for Data Engineer positions is $208,300. Actual compensation varies by seniority, location, and company stage.
About 16% of the 3,824 AI roles we track offer remote work. Remote availability varies by company and seniority level, with senior and leadership roles more likely to offer location flexibility.
Information Technology Senior Management Forum is among the companies actively hiring for AI and ML talent. Check our company profiles for detailed breakdowns of open roles, salary ranges, and hiring trends.
Common next steps from Data Engineer positions include Senior Data Engineer, ML Engineer, Data Platform Lead. Progression depends on whether you lean toward technical depth, people management, or product strategy.
