Senior Data Scientist - Big Data R&D, Identity Graph & KYC

$170K - $200K | New York, NY, US | Senior Data Scientist

Skills & Technologies

AWS, Python, PyTorch, TensorFlow

About This Role

Why Socure?

---------------

Socure is building the identity trust infrastructure for the digital economy — verifying 100% of good identities in real time and stopping fraud before it starts. The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.

We hire people who want that level of responsibility. People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision. If you want predictability or narrow scope, this won’t be your place. If you want to help build the future of identity with a team that holds a high bar for itself — keep reading.

About the Role

------------------

The Big Data R\&D team develops cutting‑edge big data and graph‑based solutions for entity search, entity resolution, and identity matching that power Socure’s KYC and compliance products.

As a Senior Data Scientist I, you will lead the design and deployment of advanced ML and graph algorithms on large\-scale PII datasets, own end‑to‑end projects from problem definition through production validation, and serve as a key technical partner to Product, Engineering, and Client‑facing teams. You will help define standards for feature engineering, experimentation, and data quality across our identity graph stack, with substantial impact on coverage, accuracy, and fairness.

What You'll Do

------------------

  • Own the design, development, and evaluation of machine learning, statistical, and graph\-based algorithms for entity\-resolution, identity trust scoring, and anomaly detection on massive datasets.
  • Architect and optimize graph\-based identity representations (identity graph structure, linkage rules, clustering) to improve match rates, reduce false positives/negatives, and support downstream fraud and KYC models.
  • Build and maintain scalable data pipelines and feature stores in Spark/PySpark (or Scala), including data normalization, deduplication, and feature computation across large PII datasets in AWS/Databricks environments.
  • Lead A/B tests and offline/online experimentation for new models, features, and data sources; define success metrics, design experiments, and ensure rigorous validation before rollout.
  • Evaluate new internal and external data sources: explore signal quality, design backtests, quantify incremental value, and provide clear recommendations on vendor selection and integration.
  • Partner closely with product managers and engineers to translate ambiguous business and regulatory requirements (e.g., KYC coverage, watchlist matching) into concrete modeling and data roadmaps.
  • Provide deep analytical support to Socure’s compliance and regulatory product suite, including investigative analyses, root‑cause analysis for anomalies, and clear narratives for internal and external stakeholders.
  • Contribute to model governance and documentation: clearly explain model logic, data dependencies, limitations, and monitoring plans to internal risk/compliance stakeholders.
  • Mentor junior data scientists and engineers on best practices in data exploration, feature engineering, experimentation, and code quality.
  • Communicate complex technical concepts and trade‑offs in a concise, structured way to both technical and non‑technical audiences (e.g., product reviews, customer meetings, internal briefings).
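
To make the entity-resolution and clustering responsibilities above more concrete, here is a minimal sketch in plain Python. It is an illustration only, not Socure's method: the linkage rule (two records match if they share any exact field value), the function name `resolve_entities`, and the sample records are all hypothetical. Records are linked into a graph and grouped by connected components using union-find.

```python
from collections import defaultdict


def resolve_entities(records):
    """Cluster record ids that share any (field, value) pair.

    Toy linkage rule for illustration; real systems use fuzzy matching,
    normalization, and scored links rather than exact equality.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the root so results are deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    # Link each record to the first record seen with the same (field, value).
    first_seen = {}
    for rid, fields in records.items():
        for key in fields.items():
            if key in first_seen:
                union(rid, first_seen[key])
            else:
                first_seen[key] = rid

    # Group record ids by their root to form the resolved entities.
    clusters = defaultdict(list)
    for rid in records:
        clusters[find(rid)].append(rid)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical sample data: r1-r2 share an email, r2-r3 share a phone.
records = {
    "r1": {"email": "a@example.com", "phone": "555-0100"},
    "r2": {"email": "a@example.com", "phone": "555-0199"},
    "r3": {"email": "b@example.com", "phone": "555-0199"},
    "r4": {"email": "c@example.com", "phone": "555-0142"},
}
clusters = resolve_entities(records)
print(clusters)  # [['r1', 'r2', 'r3'], ['r4']]
```

Note that r1 and r3 share no field directly and end up in the same cluster only through r2. That transitive linking is what makes the linkage rules and false-positive control mentioned in the list above matter at scale.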

What You Bring

------------------

  • Master’s degree with 3\+ years of relevant industry experience, or Ph.D. with 1\+ years of experience in applied ML / data science roles; background in Computer Science, Statistics, Mathematics, or related quantitative fields preferred.
  • Strong proficiency in Python (preferred) or Scala, including experience with ML libraries such as scikit‑learn, XGBoost, TensorFlow or PyTorch.
  • Extensive experience with Spark or PySpark and distributed data systems (e.g., AWS EMR, Databricks) working on very large, messy datasets.
  • Deep understanding of supervised and unsupervised learning, feature engineering, model evaluation, and experiment design (A/B testing, holdout strategies, stratification).
  • Experience developing production\-quality data pipelines and automated workflows using Airflow or similar orchestration tools.
  • Practical familiarity with graph databases and/or graph frameworks (Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms for clustering, link prediction, and community detection is strongly preferred.
  • Solid SQL skills and experience working with large\-scale analytical data stores.
  • Experience in at least one of: identity verification, fraud detection, credit risk, or adjacent high‑stakes domains is a plus.
  • Demonstrated ability to lead medium‑to‑large projects end‑to‑end, make sound trade‑off decisions under ambiguity, and influence cross‑functional stakeholders with data and clear reasoning.

*Please note that sponsorship is not available at this time, and that you must be located within 45 miles of a talent hub to be considered.*

*Socure is an equal opportunity employer that values diversity in all its forms within our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.*

*If you need an accommodation during any stage of the application or hiring process—including interview or onboarding support—please reach out to your Socure recruiting partner directly.*

Compensation Range: $170K \- $200K

Salary Context

This $170K-$200K range is above the median for Data Scientist roles in our dataset (median: $160K across 245 roles with salary data).

Role Details

Company: Socure
Title: Senior Data Scientist - Big Data R&D, Identity Graph & KYC
Location: New York, NY, US
Category: Data Scientist
Experience: Senior
Salary: $170K - $200K
Remote: No

About This Role

Data Scientists extract insights and build predictive models from data. In the AI era, many roles now include LLM-powered analytics, automated reporting, and integration with generative AI tools. The role has evolved from 'the person who runs SQL queries' to 'the person who builds AI-powered data products.'

Modern data science roles fall into two camps: analytics-focused (insights, dashboards, experimentation) and ML-focused (building predictive models, recommendation systems, NLP features). The best data scientists can operate in both modes. The AI shift means that even analytics-focused roles now involve building automated insight pipelines using LLMs, going well beyond one-off reports.

Across the 4,133 AI roles we're tracking, Data Scientist positions make up 8% of the market. At Socure, this role fits into their broader AI and engineering organization.

Data Scientist roles remain in high demand, though the definition keeps shifting. Companies increasingly want candidates who can bridge traditional statistics with modern ML and LLM capabilities. The 'pure insights' data scientist role is consolidating into analytics engineering, while the 'build models' data scientist role is merging with ML engineering.

What the Work Looks Like

A typical week includes: analyzing experiment results for a product feature launch, building a predictive model for customer churn, creating an automated reporting pipeline using LLM-powered summarization, presenting insights to stakeholders, and cleaning data (always cleaning data). The ratio of analysis to engineering varies by company, but expect both.

Skills Required

AWS (32% of roles), Python (51% of roles), PyTorch (16% of roles), TensorFlow (13% of roles)

Python, SQL, and statistical modeling are the foundation. Increasingly, roles want experience with LLMs for data analysis, automated insight generation, and building AI-powered data products. Familiarity with cloud data platforms (Snowflake, BigQuery, Databricks) and ML frameworks (scikit-learn, PyTorch) covers most job requirements.

Experimentation design and causal inference are underrated skills that separate strong candidates. Companies care about whether their product changes actually cause improvements, so they value people who can distinguish causation from correlation. A/B testing methodology, Bayesian statistics, and the ability to communicate uncertainty to non-technical stakeholders are high-value skills.

Good postings specify the data stack, the types of problems you'll work on, and the team structure. Look for companies that differentiate between analytics and ML data science. Vague 'data scientist' postings that list every skill under the sun usually mean the company doesn't know what they need.

Compensation Benchmarks

Data Scientist roles pay a median of $198,000 based on 868 positions with disclosed compensation. Senior-level AI roles across all categories have a median of $227,400. This role's midpoint ($185K) sits 7% below the category median. Disclosed range: $170K to $200K.
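
The midpoint comparison above can be reproduced with a few lines of arithmetic. This is a small sketch using only the figures quoted in the paragraph; the variable names are illustrative.

```python
# Figures quoted in the paragraph above.
low, high = 170_000, 200_000
category_median = 198_000

# Midpoint of the disclosed range and its gap to the category median.
midpoint = (low + high) / 2
gap_pct = (category_median - midpoint) / category_median * 100

print(f"midpoint = ${midpoint:,.0f}")
print(f"gap to category median = {gap_pct:.1f}%")
```

The gap comes out at about 6.6%, which rounds to the 7% stated above.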

Across all AI roles, the market median is $200,700. Top-quartile compensation starts at $254,000. The 90th percentile reaches $307,500. For comparison, the highest-paying categories include AI Safety ($274,200) and AI Engineering Manager ($268,700). By seniority level: Entry: $97,760; Mid: $165,778; Senior: $227,400; Director: $250,000; VP: $250,000.

Socure AI Hiring

Socure has 6 open AI roles right now. They're hiring across the Data Scientist and AI/ML Engineer categories. Positions span New York, NY, US and Carson City, NV, US. Compensation range: $170K - $300K.

Location Context

AI roles in New York pay a median of $211,000 across 2,760 tracked positions. That's 5% above the national median.

Career Path

Common paths into Data Scientist roles include Data Analyst, Statistician, and Quantitative Researcher.

From here, career progression typically leads toward Senior Data Scientist, ML Engineer, or AI Product Manager.

Start with statistics and SQL. Build a real analysis project on public data that demonstrates insight generation alongside model building. The market values data scientists who can communicate findings clearly to business stakeholders. If you want to move toward ML engineering, invest in software engineering fundamentals and production deployment skills.

What to Expect in Interviews

Interviews combine statistics, coding, and business acumen. SQL is almost always tested, often with complex joins and window functions. Expect a case study round where you're given a business problem and asked to design an analysis plan. Coding rounds focus on pandas, statistical modeling, and visualization. The strongest differentiator is how well you communicate insights to non-technical stakeholders during presentation rounds.

AI Hiring Overview

The AI job market has 4,133 open positions tracked in our dataset. By seniority: 106 entry-level, 1,901 mid-level, 1,663 senior, and 463 leadership roles (Director, VP, C-Level). Remote roles make up 14% of the market (583 positions). Another 3,532 roles require on-site or hybrid attendance.
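
As a quick check on the headline figures above, the seniority counts and the remote share can be recomputed from the quoted numbers. This is a short sketch in which the dictionary keys and variable names are illustrative.

```python
# Counts quoted in the paragraph above.
seniority = {"entry": 106, "mid": 1_901, "senior": 1_663, "leadership": 463}
remote = 583

# The seniority buckets should add up to the tracked total.
total = sum(seniority.values())

# Share of each seniority level and of remote roles, in percent.
shares = {level: n / total * 100 for level, n in seniority.items()}
remote_share = remote / total * 100

print(f"total roles = {total:,}")
for level, pct in shares.items():
    print(f"{level}: {pct:.1f}%")
print(f"remote share = {remote_share:.1f}%")
```

The buckets sum to the 4,133 tracked positions, and the remote share works out to roughly 14%.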

The market median for AI roles is $200,700. Top-quartile compensation starts at $254,000. The 90th percentile reaches $307,500. Highest-paying categories: AI Safety ($274,200 median, 57 roles); AI Engineering Manager ($268,700 median, 42 roles); Research Engineer ($260,000 median, 442 roles).

The AI Job Market Today

The AI job market spans 4,133 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (2,865), Data Scientist (339), AI Software Engineer (313). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.

The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (106) are outnumbered by mid-level (1,901) and senior (1,663) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 463 positions, representing the bottleneck between technical execution and organizational strategy.

Remote work availability sits at 14% of all AI roles (583 positions), with 3,532 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.

AI compensation is structured in clear tiers. The market median sits at $200,700. Top-quartile roles start at $254,000, and the 90th percentile reaches $307,500. These figures reflect base salary from postings with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.

Category matters for compensation. AI Safety roles lead at $274,200 median, while Prompt Engineer roles sit at $140,000. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.

The most in-demand skills across all AI postings: Python (2,128 postings), AWS (1,324 postings), Azure (1,003 postings), RAG (916 postings), GCP (817 postings), PyTorch (655 postings), Prompt Engineering (639 postings), Claude (571 postings). Python leads by a wide margin, appearing in roughly half of all tracked postings. Cloud platform experience (AWS, GCP, Azure) is the second most common requirement. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.

Frequently Asked Questions

Based on 868 roles with disclosed compensation, the median salary for Data Scientist positions is $198,000. Actual compensation varies by seniority, location, and company stage.
Python, SQL, and statistical modeling are the foundation. Increasingly, roles want experience with LLMs for data analysis, automated insight generation, and building AI-powered data products. Familiarity with cloud data platforms (Snowflake, BigQuery, Databricks) and ML frameworks (scikit-learn, PyTorch) covers most job requirements.
About 14% of the 4,133 AI roles we track offer remote work. Remote availability varies by company and seniority level, with senior and leadership roles more likely to offer location flexibility.
Socure is among the companies actively hiring for AI and ML talent. Check our company profiles for detailed breakdowns of open roles, salary ranges, and hiring trends.
Common next steps from Data Scientist positions include Senior Data Scientist, ML Engineer, and AI Product Manager. Progression depends on whether you lean toward technical depth, people management, or product strategy.
