Principal Data Engineer, LLM/AI Platforms (Remote)

$195K - $290K | Remote | Senior | Data Engineer

Skills & Technologies

AWS, Docker, GCP, Kubernetes, LangChain, LlamaIndex, MLflow, Prompt Engineering, Python, RAG

About This Role

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

-------------------

CrowdStrike is looking for a Principal Data Engineer with deep expertise in Large Language Models (LLMs) and AI platforms to join our growing Data Science Platform Engineering Team. You will be a key leader, responsible for designing, building, and deploying cutting-edge data infrastructure that powers our next generation of AI-driven security products. This role requires significant hands-on experience in LLM integration, agentic workflows, and agent harnessing to deliver high-impact, scalable solutions. You will champion engineering excellence, focusing on shipping fast, writing elegant, high-quality code, and actively mentoring and strengthening the team's technical knowledge and capabilities.

The scale of our systems and data is approaching exabytes. Experience with extremely large-scale systems, including DevSecOps patterns, practices, and standards, is important for this work.

What You'll Do:

-------------------

  • Architect, implement, and optimize data platforms and pipelines specifically designed to support LLMs, Retrieval-Augmented Generation (RAG), and sophisticated AI agentic systems at exabyte scale.
  • Drive the adoption and deployment of agentic workflows and agent harnessing techniques to create autonomous, data-driven security features.
  • Design and implement highly scalable, fault-tolerant, and cost-effective data solutions, emphasizing rapid iteration and high-quality deployment.
  • Write elegant, production-ready code with a focus on performance, maintainability, and testing rigor, ensuring the ability to ship fast without compromising quality.
  • Provide technical leadership and deep expertise in data modeling, normalization, and semantic cataloging for AI/ML workloads.
  • Establish best practices for MLOps/DataOps surrounding LLMs, including monitoring, observability, and zero-touch recovery mechanisms for AI services.
  • Actively mentor engineers, conducting technical workshops, leading design reviews, and strengthening the team's knowledge in cutting-edge AI platform technologies.
  • Collaborate across the organization with Data Scientists, Product Managers, and other engineering teams to transform research prototypes into robust, production-grade services.
  • Own the end-to-end lifecycle of critical data services: development, testing, deployment, and monitoring.

Tech Stack *(Expertise in several key areas is expected):*

------------------------------------------------------------------

  • MLOps tools (MLflow, SageMaker, Vertex AI).
  • Experience with common agentic workflow frameworks (e.g., LangChain, LlamaIndex).
  • Expert-level proficiency in a high-level coding language (Python or JVM technologies).
  • Deep experience with distributed data processing frameworks (e.g., Spark, Dask, Flink).
  • Strong expertise with cloud platforms (AWS, GCP, or OCI) and related data services.
  • Containerization and orchestration mastery (Docker, Kubernetes).
  • Message queuing and streaming technologies (Kafka, Pulsar).
  • Data Warehousing (Snowflake, BigQuery) and Data Orchestration (Airflow, Kubeflow).

What You'll Need:

---------------------

  • Master’s degree or PhD in Computer Science, Data Engineering, or a related STEM field, or equivalent practical experience.
  • 10+ years of progressive experience in Data Engineering/Platform Engineering, with at least 3 years focused on architecting and building platforms for AI/ML or Data Science at massive scale.
  • Demonstrable hands-on experience in LLM engineering (fine-tuning, prompt engineering, deployment), RAG, and developing agentic workflows.
  • Proven track record of designing and delivering large-scale distributed systems (sharding, partitioning, concurrency).
  • Exceptional ability to write clean, elegant, performant, and well-tested code, coupled with a proactive mindset for delivering results quickly.
  • A thorough understanding of engineering practices, including effective peer code reviews, resilient architecture design, and comprehensive testing paradigms.
  • Prior experience in a Principal or Staff level engineering role, demonstrating technical leadership and mentorship capabilities.

Bonus Points:

-----------------

  • Direct experience building, deploying, and managing LLMs in a production environment.
  • Prior experience in the cybersecurity, intelligence, or high-compliance industries.
  • Contributions to open-source projects related to data or AI/ML.

Benefits of Working at CrowdStrike:

  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees regardless of level or role
  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
  • Vibrant office culture with world class amenities
  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions (including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs) on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.

Notice of E-Verify Participation

Right to Work

CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off.

For detailed information about the U.S. benefits package, please click here.

Expected Close Date of Job Posting is: 07-28-2026

Salary Context

This $195K-$290K range is above the 75th percentile for Data Engineer roles in our dataset (median: $160K across 37 roles with salary data).

Role Details

Company: CrowdStrike
Title: Principal Data Engineer, LLM/AI Platforms (Remote)
Location: Remote, US
Category: Data Engineer
Experience: Senior
Salary: $195K - $290K
Remote: Yes

About This Role

Data Engineers build the pipelines that feed AI models. They design ETL workflows, manage data lakes, and ensure training and inference data is clean, timely, and accessible. Without good data engineering, AI projects fail. It's that simple.

The AI era has expanded the data engineer's scope far beyond batch ETL jobs. You're building real-time embedding pipelines for RAG systems, managing vector databases, ensuring training data quality at scale, and building the infrastructure that lets ML teams iterate on data as fast as they iterate on models. Data quality is the biggest predictor of model quality, and you're the person responsible for it.

Across the 3,823 AI roles we're tracking, Data Engineer positions make up 1% of the market. At CrowdStrike, this role fits into their broader AI and engineering organization.

Data Engineer demand in AI contexts is strong and growing. Every company building AI needs clean, reliable data pipelines. The shift toward real-time AI applications (chatbots, recommendation engines, agent systems) means data engineering is more critical than ever. Companies are willing to pay premium salaries for data engineers with AI/ML pipeline experience.

What the Work Looks Like

A typical week includes: debugging a data pipeline that's producing stale embeddings for the RAG system, optimizing a Spark job that processes training data, building a data quality monitoring dashboard, meeting with the ML team to understand their next data requirements, and writing dbt models that transform raw event data into ML-ready features. The work is deeply technical and high-impact.

Skills Required

AWS (31% of roles), Docker (11% of roles), GCP (19% of roles), Kubernetes (12% of roles), LangChain (11% of roles), LlamaIndex (4% of roles), MLflow (4% of roles), Prompt Engineering (16% of roles), Python (52% of roles), RAG (22% of roles)

SQL, Python, and distributed systems (Spark, Airflow, dbt) are core. Cloud data platforms (Snowflake, BigQuery, Redshift) are increasingly standard. Many AI-focused roles also want familiarity with vector databases and embedding pipelines. Understanding data modeling, pipeline orchestration, and data quality frameworks covers the essentials.

AI-specific data engineering skills include: building feature stores, managing training data versioning, implementing data lineage tracking, and building real-time embedding pipelines. Experience with streaming systems (Kafka, Flink) is valuable for real-time AI applications. Understanding ML data requirements (balanced datasets, data augmentation, evaluation set construction) makes you much more effective working with ML teams.

Strong postings specify the data stack, mention ML pipeline work, and describe the scale of data you'll be working with. Look for companies that understand the connection between data quality and model quality. Avoid roles that conflate data engineering with data analysis.

Compensation Benchmarks

Data Engineer roles pay a median of $208,300 based on 266 positions with disclosed compensation. Senior-level AI roles across all categories have a median of $227,400. This role's midpoint ($242K) sits 16% above the category median. Disclosed range: $195K to $290K.
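The midpoint and premium quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated in this section; the variable names are illustrative, and the exact midpoint works out to $242,500, which the text rounds to $242K.

```python
# Figures quoted in this section (base salary, USD per year).
range_low = 195_000
range_high = 290_000
category_median = 208_300  # Data Engineer median, 266 disclosed roles

# Midpoint of the disclosed range.
midpoint = (range_low + range_high) / 2

# Relative premium of the midpoint over the category median.
premium = midpoint / category_median - 1

print(f"Midpoint: ${midpoint:,.0f}")
print(f"Premium over category median: {premium:.0%}")
```

The same calculation against the all-roles median ($200,100) or the senior-level median ($227,400) only requires swapping the value of `category_median`.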

Across all AI roles, the market median is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. For comparison, the highest-paying categories include AI Engineering Manager ($275,000) and AI Safety ($274,200). By seniority level: Entry: $97,880; Mid: $165,000; Senior: $227,400; Director: $247,800; VP: $250,000.

CrowdStrike AI Hiring

CrowdStrike has 7 open AI roles right now. They're hiring across Data Scientist, AI/ML Engineer, and Data Engineer. Positions span "Remote, US" and "CA, US". Compensation range: $180K - $290K.

Remote Work Context

Remote AI roles pay a median of $170,000 across 1,926 positions. About 15% of all AI roles offer remote work.

Career Path

Common paths into Data Engineer roles include Backend Engineer, Database Administrator, Analytics Engineer.

From here, career progression typically leads toward Senior Data Engineer, ML Engineer, Data Platform Lead.

Master SQL and Python first. Then learn a distributed processing framework (Spark or its modern alternatives) and a pipeline orchestrator (Airflow, Dagster, Prefect). Build a portfolio project that demonstrates end-to-end pipeline construction: ingest, transform, validate, serve. If you want to specialize in AI data engineering, add vector databases and embedding pipelines to your skill set.
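As a rough illustration of the four stages named above (ingest, transform, validate, serve), here is a minimal sketch using only the Python standard library. The sample data, field names, and the quality rule are all hypothetical; a real portfolio project would swap in an actual source, an orchestrator, and a proper serving layer.

```python
import csv
import io
import json

# Hypothetical raw input; a real project would read from a file, queue, or API.
RAW_CSV = """event_id,user,latency_ms
1,alice,120
2,bob,not_a_number
3,carol,95
"""

def ingest(text):
    """Read raw CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cast fields to typed values; an unparseable latency becomes None."""
    typed = []
    for row in rows:
        try:
            latency = int(row["latency_ms"])
        except ValueError:
            latency = None
        typed.append({
            "event_id": int(row["event_id"]),
            "user": row["user"],
            "latency_ms": latency,
        })
    return typed

def validate(rows):
    """Split rows into valid and rejected using a simple quality rule."""
    valid, rejected = [], []
    for row in rows:
        ok = row["latency_ms"] is not None and row["latency_ms"] >= 0
        (valid if ok else rejected).append(row)
    return valid, rejected

def serve(rows):
    """Serialize rows as JSON Lines, a stand-in for a real serving layer."""
    return "\n".join(json.dumps(row) for row in rows)

valid, rejected = validate(transform(ingest(RAW_CSV)))
output = serve(valid)
print(output)
print(f"rejected {len(rejected)} row(s)")
```

Keeping rejected rows instead of silently dropping them is a deliberate choice here: it gives the project something concrete to monitor, which ties into the data quality themes that come up in interviews.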

What to Expect in Interviews

Expect SQL deep-dives (query optimization, partitioning strategies, data modeling), Python coding focused on data pipeline patterns, and system design questions about building scalable ETL workflows. Companies with ML teams will ask about feature stores, embedding pipelines, and training data management. Be ready to discuss data quality monitoring, pipeline orchestration, and how you'd handle schema evolution in a production data lake.

AI Hiring Overview

The AI job market has 3,823 open positions tracked in our dataset. By seniority: 112 entry-level, 1,798 mid-level, 1,516 senior, and 397 leadership roles (Director, VP, C-Level). Remote roles make up 15% of the market (590 positions). The remaining 3,217 roles require on-site or hybrid attendance.

The market median for AI roles is $200,100. Top-quartile compensation starts at $253,500. The 90th percentile reaches $307,500. Highest-paying categories: AI Engineering Manager ($275,000 median, 41 roles); AI Safety ($274,200 median, 55 roles); Research Engineer ($260,000 median, 434 roles).

The AI Job Market Today

The AI job market spans 3,823 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (2,629), Data Scientist (322), AI Software Engineer (279). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.

The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (112) are outnumbered by mid-level (1,798) and senior (1,516) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 397 positions, representing the bottleneck between technical execution and organizational strategy.

Remote work availability sits at 15% of all AI roles (590 positions), with 3,217 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.

AI compensation is structured in clear tiers. The market median sits at $200,100. Top-quartile roles start at $253,500, and the 90th percentile reaches $307,500. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.

Category matters for compensation. AI Engineering Manager roles lead at $275,000 median, while Prompt Engineer roles sit at $140,000. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.

The most in-demand skills across all AI postings: Python (1,979 postings), AWS (1,190 postings), Azure (899 postings), RAG (839 postings), GCP (726 postings), PyTorch (595 postings), Prompt Engineering (595 postings), Claude (540 postings). Python dominates, appearing in the vast majority of role descriptions regardless of category. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.

Frequently Asked Questions

Based on 266 roles with disclosed compensation, the median salary for Data Engineer positions is $208,300. Actual compensation varies by seniority, location, and company stage.
SQL, Python, and distributed systems (Spark, Airflow, dbt) are core. Cloud data platforms (Snowflake, BigQuery, Redshift) are increasingly standard. Many AI-focused roles also want familiarity with vector databases and embedding pipelines. Understanding data modeling, pipeline orchestration, and data quality frameworks covers the essentials.
About 15% of the 3,823 AI roles we track offer remote work. Remote availability varies by company and seniority level, with senior and leadership roles more likely to offer location flexibility.
CrowdStrike is among the companies actively hiring for AI and ML talent. Check our company profiles for detailed breakdowns of open roles, salary ranges, and hiring trends.
Common next steps from Data Engineer positions include Senior Data Engineer, ML Engineer, Data Platform Lead. Progression depends on whether you lean toward technical depth, people management, or product strategy.
