Machine Learning Engineer – MLOps Lead

New Jersey, NJ · Information Technology

Job Title: Machine Learning Engineer – MLOps Lead
Duration: Contract role
Location: Remote, United States

Role Mission
You are being hired to productionize machine learning at scale — eliminating fragile pilot models, building hardened MLOps pipelines, and delivering compliant, monitored, and continuously improving ML systems that directly support business operations.
Your success is measured not by “knowing tools,” but by deploying, stabilizing, and scaling real ML systems in production.

First-Year Outcomes (What You Must Deliver)
Within the First 30 Days

Fully assess the current ML pipelines, data flows, and deployment architecture
Identify the top 3 reliability, security, and performance risks in the current ML lifecycle
Produce a documented MLOps modernization roadmap

Within 90 Days
You will:

Stand up standardized CI/CD pipelines for model training, validation, and deployment
Implement automated monitoring, alerting, and versioning across active production models
Deploy at least one business-critical ML model into hardened production pipelines
Establish security, audit, and compliance controls for model governance
Reduce model deployment cycle time by 30–50%

Within 180 Days
You will:

Operate a fully standardized enterprise MLOps framework (MLflow/Kubeflow/Airflow-based)
Enable continuous retraining and automated rollback capability
Achieve ≥ 99.5% model uptime
Establish a retraining cadence that improves model accuracy and reliability quarter-over-quarter
Mentor junior engineers and codify ML engineering standards

Ongoing Success Metrics

Metric	Target
Production model uptime	≥ 99.5%
Model deployment cycle time	↓ 30–50%
Automated pipeline coverage	100%
Compliance audit readiness	Continuous
Model accuracy improvement	Measurable quarter-over-quarter gains
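
To make the uptime target concrete, here is a minimal sketch that converts an uptime percentage into an allowed-downtime budget. The function name `downtime_budget_hours` and the 30-day and 365-day measurement windows are illustrative assumptions; the posting does not say how uptime is measured or over what period.

```python
# Illustrative only: converts an uptime target into an allowed-downtime budget.
# The 30-day and 365-day windows are assumptions, not figures from this posting.

def downtime_budget_hours(uptime_target: float, period_hours: float) -> float:
    """Return the hours of downtime allowed in a period at the given uptime target."""
    if not 0.0 <= uptime_target <= 1.0:
        raise ValueError("uptime_target must be a fraction between 0 and 1")
    return (1.0 - uptime_target) * period_hours


monthly = downtime_budget_hours(0.995, 30 * 24)   # 30-day window
yearly = downtime_budget_hours(0.995, 365 * 24)   # 365-day window
print(f"30-day budget: {monthly:.1f} h, 365-day budget: {yearly:.1f} h")
```

Under these assumptions, a 99.5% target leaves roughly 3.6 hours of downtime per 30 days, or about 43.8 hours per year, across planned and unplanned outages combined.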

What You Will Build

End-to-end MLOps pipelines (data → training → testing → deployment → monitoring → retraining)
Kubernetes-based model serving platforms
Cloud ML platforms (Vertex AI / SageMaker / Azure ML)
CI/CD automation for ML systems
Model observability and alerting using Prometheus / Grafana
Secure, version-controlled ML governance frameworks

Required Experience (Performance Evidence)
You must have:

Proven delivery of production ML pipelines (not just experiments)
Experience building CI/CD for ML models in Kubernetes environments
Experience implementing monitoring, retraining, and version governance
At least one delivered enterprise-scale ML deployment
Hands-on experience with MLflow / Kubeflow / Airflow
Cloud ML production deployment experience (AWS, GCP, or Azure)
A strong Python engineering background
