About This Role
### About Fusemachines
Founded in 2013, Fusemachines is a global provider of enterprise AI products and services, on a mission to democratize AI. Leveraging its proprietary AI Studio and AI Engines, the company helps drive clients’ AI Enterprise Transformation, regardless of where they are in their Digital AI journeys. With offices in North America, Asia, and Latin America, Fusemachines provides a suite of enterprise AI offerings and specialty services that allow organizations of any size to implement and scale AI. Fusemachines serves companies in industries such as retail, manufacturing, and government.
Fusemachines continues to actively pursue the mission of democratizing AI for the masses by providing high\-quality AI education in underserved communities and helping organizations achieve their full potential with AI.
Salary Range: US$ 140,000\-190,000/year
### Role Overview
We’re hiring a mid\-to\-senior Machine Learning Engineer / Data Scientist to build and deploy machine learning solutions that drive measurable business impact. You’ll work across the ML lifecycle—from problem framing and data exploration to model development, evaluation, deployment, and monitoring—often in partnership with client stakeholders and internal delivery teams.
You should be strong in core data science and applied machine learning, comfortable working with real\-world data, and capable of turning modeling work into production\-ready systems.
### Key Responsibilities
- Problem Framing \& Stakeholder Partnership
+ Translate business questions into ML problem statements (classification, regression, time series forecasting, clustering, anomaly detection, recommendation, etc.).
+ Collaborate with stakeholders to define success metrics, evaluation plans, and practical constraints (latency, interpretability, cost, data availability).
- Data Analysis \& Feature Engineering
+ Use SQL and Python to extract, join, and analyze data from relational databases and data warehouses.
+ Perform data profiling, missingness analysis, leakage checks, and exploratory analysis to guide modeling choices.
+ Build robust feature pipelines (aggregation, encoding, scaling, embeddings where appropriate) and document assumptions.
- Model Development (Core ML)
+ Train and tune supervised learning models for tabular data (e.g., logistic/linear models, tree\-based methods, gradient boosting such as XGBoost/LightGBM/CatBoost, and neural nets for structured data).
+ Apply strong tabular modeling practices: handling missing data, categorical encoding, leakage prevention, class imbalance strategies, calibration, and robust cross\-validation.
+ Build time series models (statistical and ML/DL approaches) and validate with proper backtesting.
+ Apply clustering and segmentation techniques (k\-means, hierarchical, DBSCAN, Gaussian mixtures) and evaluate stability and usefulness.
+ Apply statistics in practice (hypothesis testing, confidence intervals, sampling, experiment design) to support inference and decision\-making.
- Deep Learning
+ Build and train deep learning models using PyTorch or TensorFlow/Keras.
+ Use best practices for training (regularization, calibration, class imbalance handling, reproducibility, sound train/val/test design).
- Evaluation, Explainability, and Iteration
+ Choose appropriate metrics (AUC/F1/PR, RMSE/MAE/MAPE, calibration, lift, and business KPIs) and create evaluation reports.
+ Perform error analysis and interpretation (feature importance/SHAP, cohort slicing) and iterate based on evidence.
- Productionization \& MLOps (Project\-Dependent)
+ Package models for deployment (batch scoring pipelines or real\-time APIs) and collaborate with engineers on integration.
+ Implement practical MLOps: versioning, reproducible training, automated evaluation, monitoring for drift/performance, and retraining plans.
- Documentation \& Communication
+ Communicate tradeoffs and recommendations clearly to technical and non\-technical stakeholders.
+ Create documentation and lightweight demos that make results actionable.
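As an illustration of the backtesting practice listed under Model Development, here is a minimal sketch of expanding-window splits in plain Python. The function name `expanding_window_splits` and its parameters are hypothetical, chosen for this example rather than taken from any Fusemachines tooling. Each fold trains only on observations that precede its test window, which is one common way to keep future data out of training.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_obs: int, n_folds: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in time order.

    Hypothetical helper: every test window comes strictly after its
    training window, so no future observation is used for fitting.
    """
    first_test_start = n_obs - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        train_idx = list(range(test_start))  # everything before the test window
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Toy series of 12 time-ordered observations (illustrative sizes only).
splits = list(expanding_window_splits(n_obs=12, n_folds=3, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

In practice a model would be refit on each training window and scored on the matching test window, with the per-fold metrics reported together.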
### Success in This Role Looks Like
- You deliver models that perform well and move business metrics (revenue lift, cost reduction, risk reduction, improved forecast accuracy, operational efficiency).
- Your work is reproducible and production\-aware: clear data lineage, robust evaluation, and a credible path to deployment/monitoring.
- Stakeholders trust your judgment in selecting methods and communicating uncertainty honestly.
### Required Qualifications
- 3–8 years of experience in data science, machine learning engineering, or applied ML (mid\-to\-senior).
- Strong Python skills for data analysis and modeling (pandas/numpy/scikit\-learn or equivalent).
- Strong SQL skills (joins, window functions, aggregation, performance awareness).
- Solid foundation in statistics (hypothesis testing, uncertainty, bias/variance, sampling) and practical experimentation mindset.
- Hands\-on experience across multiple model types, including:
+ Classification \& regression
+ Time series forecasting
+ Clustering/segmentation
- Experience with deep learning in PyTorch or TensorFlow/Keras.
- Strong problem\-solving skills: ability to work with ambiguous goals and messy data.
- Clear communication skills and ability to translate analysis into decisions.
### Preferred Qualifications
- Experience with Databricks for applied ML (e.g., Spark, Delta Lake, MLflow, Databricks Jobs/Workflows).
- Experience deploying models to production (APIs, batch pipelines) and maintaining them over time (monitoring, retraining).
- Experience with orchestration tools (Airflow, Prefect, Dagster) and modern data stacks (Snowflake/BigQuery/Redshift/Databricks).
- Experience with cloud platforms (AWS/GCP/Azure/IBM) and containerization (Docker).
- Experience with responsible AI and governance best practices (privacy/PII handling, auditability, access controls).
- Consulting or client\-facing delivery experience.
Certifications (Strong Plus)
Candidates with at least one relevant certification are especially encouraged to apply:
- Cloud certifications: AWS, Google Cloud, Microsoft Azure, or IBM (data/AI/ML tracks)
- Databricks certifications (Data Scientist, Data Engineer, or related)
### Nice\-to\-Have
- Causal inference experience (e.g., quasi\-experimental methods, propensity scores, uplift/heterogeneous treatment effects, experimentation beyond A/B tests).
- Agentic development experience: designing and evaluating agentic workflows (tool use, planning, memory/state, guardrails) and integrating them into products.
- Deep familiarity with agentic coding tools and workflows for accelerated product development (e.g., AI\-assisted IDEs, code agents, automated testing/refactoring, repo\-aware assistants), including strong judgment on quality, security, and maintainability.
*Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.*
Important: Immigration Sponsorship Policy
Fusemachines is unable to proceed with candidates who require any form of work authorization or immigration support from the company. This restriction applies to all types of support, including:
- Direct Company Sponsorship: Such as H\-1B, J\-1, or TN visas.
- Employer of Record: Listing Fusemachines as the immigration employer on any government documentation.
- Written Documentation: Providing letters or other support for any work authorization (e.g., OPT, STEM OPT, CPT).
Salary Context
This $140K-$190K range is below the median for Data Scientist roles in our dataset (median: $166K across 345 roles with salary data).
About This Role
Data Scientists extract insights and build predictive models from data. In the AI era, many roles now include LLM-powered analytics, automated reporting, and integration with generative AI tools. The role has evolved from 'the person who runs SQL queries' to 'the person who builds AI-powered data products.'
Modern data science roles fall into two camps: analytics-focused (insights, dashboards, experimentation) and ML-focused (building predictive models, recommendation systems, NLP features). The best data scientists can operate in both modes. The AI shift means that even analytics-focused roles now involve building automated insight pipelines using LLMs, going well beyond one-off reports.
Across the 26,159 AI roles we're tracking, Data Scientist positions make up 2% of the market. At Fusemachines, this role fits into their broader AI and engineering organization.
Data Scientist roles remain in high demand, though the definition keeps shifting. Companies increasingly want candidates who can bridge traditional statistics with modern ML and LLM capabilities. The 'pure insights' data scientist role is consolidating into analytics engineering, while the 'build models' data scientist role is merging with ML engineering.
What the Work Looks Like
A typical week includes: analyzing experiment results for a product feature launch, building a predictive model for customer churn, creating an automated reporting pipeline using LLM-powered summarization, presenting insights to stakeholders, and cleaning data (always cleaning data). The ratio of analysis to engineering varies by company, but expect both.
Skills Required
Python, SQL, and statistical modeling are the foundation. Increasingly, roles want experience with LLMs for data analysis, automated insight generation, and building AI-powered data products. Familiarity with cloud data platforms (Snowflake, BigQuery, Databricks) and ML frameworks (scikit-learn, PyTorch) covers most job requirements.
Experimentation design and causal inference are underrated skills that separate strong candidates. Companies care about whether their product changes actually cause improvements, and they value people who can distinguish causation from correlation. A/B testing methodology, Bayesian statistics, and the ability to communicate uncertainty to non-technical stakeholders are high-value skills.
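To make the A/B testing point concrete, here is a minimal sketch of a two-proportion z-test using only the Python standard library. The function name and the conversion counts are made up for illustration; a real analysis would also consider sample-size planning and multiple comparisons.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: 200/2000 conversions for control, 250/2000 for variant.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value here only says the observed difference is unlikely under equal conversion rates; whether the change caused it still depends on how the experiment was randomized.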
Good postings specify the data stack, the types of problems you'll work on, and the team structure. Look for companies that differentiate between analytics and ML data science. Vague 'data scientist' postings that list every skill under the sun usually mean the company doesn't know what they need.
Compensation Benchmarks
Data Scientist roles pay a median of $204,700 based on 441 positions with disclosed compensation. Mid-level AI roles across all categories have a median of $131,300. This role's midpoint ($165K) sits 19% below the category median. Disclosed range: $140K to $190K.
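The midpoint and percentage gap quoted above can be reproduced with a few lines of arithmetic. This sketch uses only the figures stated in this section; the variable names are illustrative.

```python
low, high = 140_000, 190_000   # disclosed range for this role
category_median = 204_700      # Data Scientist median quoted above

midpoint = (low + high) / 2
gap_pct = (category_median - midpoint) / category_median * 100

print(f"midpoint: ${midpoint:,.0f}")
print(f"below category median by: {gap_pct:.0f}%")
```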
Across all AI roles, the market median is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. For comparison, the highest-paying categories include AI Engineering Manager ($293,500) and AI Architect ($292,900). By seniority level: Entry: $76,880; Mid: $131,300; Senior: $227,400; Director: $244,288; VP: $234,620.
Fusemachines AI Hiring
Fusemachines has 1 open AI role right now, in the Data Scientist category. Based in New York, NY, US. Compensation range: $140K - $190K.
Location Context
AI roles in New York pay a median of $200,000 across 1,670 tracked positions. That's 9% above the national median.
Career Path
Common paths into Data Scientist roles include Data Analyst, Statistician, and Quantitative Researcher.
From here, career progression typically leads toward Senior Data Scientist, ML Engineer, or AI Product Manager.
Start with statistics and SQL. Build a real analysis project on public data that demonstrates insight generation alongside model building. The market values data scientists who can communicate findings clearly to business stakeholders. If you want to move toward ML engineering, invest in software engineering fundamentals and production deployment skills.
What to Expect in Interviews
Interviews combine statistics, coding, and business acumen. SQL is almost always tested, often with complex joins and window functions. Expect a case study round where you're given a business problem and asked to design an analysis plan. Coding rounds focus on pandas, statistical modeling, and visualization. The strongest differentiator is how well you communicate insights to non-technical stakeholders during presentation rounds.
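As an example of the kind of window-function question that can come up, here is a small sketch that computes a running total per customer. It uses Python's built-in `sqlite3` module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The `orders` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_day INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("ana", 1, 20.0), ("ana", 2, 35.0), ("ben", 1, 10.0), ("ben", 3, 15.0)],
)

# Running total per customer, ordered by day: a typical window-function task.
rows = conn.execute(
    """
    SELECT customer,
           order_day,
           SUM(amount) OVER (
               PARTITION BY customer ORDER BY order_day
           ) AS running_total
    FROM orders
    ORDER BY customer, order_day
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

The `PARTITION BY` clause restarts the sum for each customer, and the `ORDER BY` inside `OVER` makes it cumulative rather than a per-customer grand total.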
AI Hiring Overview
The AI job market has 26,159 open positions tracked in our dataset. By seniority: 2,416 entry-level, 16,247 mid-level, 5,153 senior, and 2,343 leadership roles (Director, VP, C-Level). Remote roles make up 7% of the market (1,863 positions). The remaining 24,200 roles require on-site or hybrid attendance.
The market median for AI roles is $184,000. Top-quartile compensation starts at $244,000. The 90th percentile reaches $309,400. Highest-paying categories: AI Engineering Manager ($293,500 median, 28 roles); AI Architect ($292,900 median, 108 roles); AI Safety ($274,200 median, 19 roles).
The AI Job Market Today
The AI job market spans 26,159 open positions across 15 role categories. The largest categories by volume: AI/ML Engineer (23,752), AI Software Engineer (598), AI Product Manager (594). These three account for the majority of open positions, though smaller categories often have higher per-role compensation because of specialized skill requirements.
The seniority mix tells a story about where AI teams are in their maturity. Entry-level roles (2,416) are outnumbered by mid-level (16,247) and senior (5,153) positions, reflecting that most companies are past the 'build a team from scratch' phase and need experienced engineers who can ship production systems. Leadership roles (Director, VP, C-Level) total 2,343 positions, representing the bottleneck between technical execution and organizational strategy.
Remote work availability sits at 7% of all AI roles (1,863 positions), with 24,200 requiring on-site or hybrid attendance. The remote share has stabilized after the post-pandemic correction. Senior and specialized roles (Research Scientist, ML Architect) are more likely to be remote-eligible than entry-level positions, partly because experienced hires have more negotiating power and partly because these roles require less hands-on mentorship.
AI compensation is structured in clear tiers. The market median sits at $184,000. Top-quartile roles start at $244,000, and the 90th percentile reaches $309,400. These figures reflect base salary for roles with disclosed compensation. Total compensation (including equity, bonuses, and sign-on) runs 20-40% higher at companies that offer those components.
Category matters for compensation. AI Engineering Manager roles lead at $293,500 median, while Prompt Engineer roles sit at $122,200. The spread between highest and lowest-paying categories reflects the premium on specialized technical skills versus broader analytical roles.
The most in-demand skills across all AI postings: RAG (16,749 postings), AWS (8,932 postings), Rust (7,660 postings), Python (3,815 postings), Azure (2,678 postings), GCP (2,247 postings), Prompt Engineering (1,469 postings), OpenAI (1,269 postings). RAG leads these counts by a wide margin, and cloud platform experience (AWS, Azure, GCP) is the second most common requirement. Python remains a foundational skill across role categories. The newer entrants to the top skills list (RAG, vector databases, LLM APIs) reflect the shift from traditional ML toward generative AI applications.