Building scalable, reproducible AI systems with MLOps.
I build production ML systems end-to-end: data ingestion → training → evaluation → packaging → deployment → monitoring-ready handoff. My focus is on reproducibility (DVC + Git), automated delivery (CI/CD), containerized runtimes (Docker), orchestration (Airflow), and cloud object storage (AWS S3).
About (hands-on)
Practical ML engineering: versioned data, automated pipelines, deployable artifacts.
- Reproducibility over one-off runs
- Automation over manual runbooks
- Artifacts over screenshots
- Clear inputs/outputs per step
- GitHub (reviews, CI, releases)
- DVC (data/model versioning)
- Docker (portable runtime)
- Airflow (orchestration)
- AWS S3 (artifact storage)
Skills
Each skill reflects what I’ve built, shipped, or automated; no tool-only lists.
Core AI & Machine Learning
Modeling, training, evaluation, and deployment fundamentals applied in real pipelines.
MLOps & Production ML
Reproducible experiments, versioned data/model artifacts, automated delivery, and cloud-ready runtimes.
Data Science & Analytics
From raw data to insight: cleaning, EDA, visualization, and automated reporting.
Programming & Development
Engineer-first implementation skills for shipping reliable AI systems.
AI Applications & Intelligent Systems
Applied AI systems for automation, assistants, and real workflows.
Web & Application Development
Interfaces and integrations that ship AI into real user workflows.
Tools & Libraries
Daily drivers for data work, analytics, and integration.
Academic & Professional Strengths
How I work when building real systems.
Projects
Detailed, pipeline-first projects showing implementation, architecture, and outcomes.
End-to-End MLOps Pipeline Template (DVC + CI/CD + Docker + S3)
A reproducible ML project template that turns experiments into shippable artifacts.
ML projects often fail to move beyond notebooks because data/model versions drift, environments change, and deployment artifacts are inconsistent.
- Define a repeatable project structure (data/ → src/ → models/ → reports/)
- Version datasets and model artifacts with DVC; store remotes in S3
- Train + evaluate via a deterministic, config-driven pipeline entrypoint (see the sketch after this list)
- Build an inference image with Docker for identical local/CI/cloud runtime
- Automate checks and releases with GitHub Actions (lint, tests, build, artifacts)
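To make the config-driven entrypoint concrete, here is a minimal sketch of what that train/evaluate step can look like. Everything specific in it (the `params.yaml` keys, the `models/` and `reports/` paths, the scikit-learn model) is an assumption for illustration, not the template's actual layout.

```python
# Hypothetical sketch of the config-driven entrypoint; paths, config keys,
# and the model choice are illustrative, not the template's actual API.
import argparse
import json
import random
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
import yaml
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def main() -> None:
    parser = argparse.ArgumentParser(description="Deterministic train/evaluate step")
    parser.add_argument("--config", default="params.yaml")
    args = parser.parse_args()
    cfg = yaml.safe_load(Path(args.config).read_text())

    # Seed everything so the same commit + DVC state reproduces the same run.
    random.seed(cfg["seed"])
    np.random.seed(cfg["seed"])

    df = pd.read_csv(cfg["data"]["train_csv"])  # DVC-tracked input
    X = df.drop(columns=[cfg["data"]["target"]])
    y = df[cfg["data"]["target"]]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=cfg["eval"]["test_size"], random_state=cfg["seed"]
    )

    model = RandomForestClassifier(
        n_estimators=cfg["model"]["n_estimators"], random_state=cfg["seed"]
    )
    model.fit(X_tr, y_tr)

    # Explicit outputs (model artifact + metrics) that a DVC stage can track.
    Path("models").mkdir(exist_ok=True)
    joblib.dump(model, "models/model.joblib")
    Path("reports").mkdir(exist_ok=True)
    metrics = {"accuracy": float(accuracy_score(y_te, model.predict(X_te)))}
    Path("reports/metrics.json").write_text(json.dumps(metrics, indent=2))


if __name__ == "__main__":
    main()
```

Because the script declares its inputs and outputs explicitly, a DVC stage can wrap it and reproduce the run from any commit plus the matching DVC state.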
- Structured the pipeline around explicit inputs/outputs so a run can be reproduced from a single commit + DVC state.
- Added CI gates for formatting/linting and a build step that produces a deployment-ready Docker image.
- Stored data/model versions in S3-backed DVC remotes to support collaboration and rollback.
- Reproducible experiments across machines and CI runners.
- Clear path from training code to a deployable inference artifact.
- Maintainable repo layout recruiters can audit quickly.
Automated ML Workflow Orchestration with Apache Airflow
A scheduled DAG that runs training, evaluation gates, and batch scoring with retries and observability.
Manual runs don’t scale: teams need reliable scheduling, retries, and clear visibility into each step of the ML workflow.
- Ingest data from sources (API/files) and validate schema
- Preprocess + feature engineer to a versioned dataset snapshot
- Train candidate models and compute evaluation metrics
- Gate promotion based on evaluation thresholds (see the DAG sketch after this list)
- Run batch inference and publish outputs (tables/reports/artifacts)
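As a sketch of the DAG shape (assuming Airflow 2.4+), the snippet below wires stub callables through a ShortCircuitOperator gate; the task names, accuracy threshold, and stub bodies are placeholders, not the production DAG.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator, ShortCircuitOperator


# Stub callables standing in for the real pipeline steps.
def ingest_and_validate(): ...
def preprocess(): ...

def train_and_evaluate():
    metrics = {"accuracy": 0.90}  # placeholder; the real task computes this
    return metrics  # return value lands in XCom for the gate to read

def batch_score(): ...

def evaluation_gate(**context):
    # ShortCircuitOperator skips all downstream tasks when this returns
    # False, so a regressed candidate model is never promoted to scoring.
    metrics = context["ti"].xcom_pull(task_ids="train_and_evaluate")
    return bool(metrics) and metrics.get("accuracy", 0.0) >= 0.85


with DAG(
    dag_id="ml_training_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # unified `schedule` argument (Airflow 2.4+)
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    t_ingest = PythonOperator(task_id="ingest_and_validate", python_callable=ingest_and_validate)
    t_prep = PythonOperator(task_id="preprocess", python_callable=preprocess)
    t_train = PythonOperator(task_id="train_and_evaluate", python_callable=train_and_evaluate)
    t_gate = ShortCircuitOperator(task_id="evaluation_gate", python_callable=evaluation_gate)
    t_score = PythonOperator(task_id="batch_score", python_callable=batch_score)

    t_ingest >> t_prep >> t_train >> t_gate >> t_score
```

Because the gate short-circuits rather than fails, a regressed candidate simply skips batch scoring, which keeps retries safe and the run history readable.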
- Designed DAG tasks with clean boundaries and idempotent behavior so retries are safe.
- Used artifact versioning to ensure each run is traceable back to the exact dataset + params.
- Implemented evaluation gates to prevent regressions from being promoted.
- Reliable scheduled ML runs with clear step-by-step visibility.
- Reduced human error by removing manual ‘runbook’ steps.
- Easier handoff to production environments due to containerized execution.
Business Automation AI Agents (Lead Gen + Support + Reporting)
Tool-using automation agents that connect APIs, data pipelines, and structured outputs.
Business workflows often rely on repetitive manual steps: collecting leads, triaging support, and producing periodic reports from raw data.
- Collect signals from APIs/forms and normalize into a structured store
- Enrich and qualify records using deterministic rules + model outputs (a normalization/qualification sketch follows this list)
- Generate structured outputs (summaries, action lists, CSV/PDF exports)
- Automate recurring reporting (including CBC report generation)
- Log runs and provide clear handoff/escalation paths for edge cases
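The snippet below sketches the normalization and rule-based qualification step; the `Lead` schema, field names, and thresholds are hypothetical stand-ins for whichever fields the real workflow uses.

```python
# Hypothetical sketch: coerce messy API/form payloads into one fixed schema,
# then apply deterministic qualification rules. Fields are illustrative.
from dataclasses import dataclass


@dataclass
class Lead:
    email: str
    company: str
    employees: int
    source: str


def normalize(raw: dict) -> Lead:
    # One predictable shape so every downstream step (enrichment, scoring,
    # export) sees the same schema regardless of the source's quirks.
    return Lead(
        email=str(raw.get("email", "")).strip().lower(),
        company=str(raw.get("company", raw.get("org", ""))).strip(),
        employees=int(raw.get("employees") or 0),
        source=str(raw.get("source", "unknown")),
    )


def qualify(lead: Lead) -> bool:
    # Deterministic rules first; a model score could be blended in here.
    return bool(lead.email) and "@" in lead.email and lead.employees >= 10


raw_records = [
    {"email": " Jane@Example.com ", "company": "Acme", "employees": "42", "source": "webform"},
    {"org": "Tiny Co", "employees": 3},
]
leads = [normalize(r) for r in raw_records]
qualified = [lead for lead in leads if qualify(lead)]
print(qualified)  # [Lead(email='jane@example.com', company='Acme', employees=42, ...)]
```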
- Designed agent workflows around verifiable steps (inputs → transformations → outputs), not free-form automation.
- Implemented robust data normalization to keep downstream automation predictable.
- Added export-ready reporting for stakeholders (tables + visuals) and automated scheduling hooks.
- Less manual effort for recurring operational tasks.
- Faster turnaround for reports and qualified lead lists.
- More consistent support triage with clearer audit trails.
Clinical Chatbot + AI-powered Web App Integration
A user-facing clinical assistant integrated into a web experience with safe UX patterns.
Clinical assistants must be reliable and easy to use: users need clear flows, consistent answers, and safe escalation when uncertainty is high.
- User interface designed for fast, clear clinical workflows
- Backend logic to route intents, log interactions, and apply safety checks
- Model-assisted responses with guardrails and fallback behavior (see the routing sketch after this list)
- Deployment-ready packaging and environment configuration
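A minimal sketch of the routing and guardrail layer follows; the intents, confidence threshold, templates, and stub classifier are illustrative, and the real app calls its own model where `fake_classify` appears.

```python
# Hypothetical routing/guardrail layer: low-confidence or unknown intents
# never get a model-generated answer; they are logged and escalated instead.
from typing import Callable

RESPONSE_TEMPLATES = {
    "appointment": "I can help you schedule an appointment. Which day works for you?",
    "medication_info": "Here is general information about that medication. For dosing, please consult your clinician.",
}

FALLBACK = (
    "I'm not confident I understood that. I've logged your question and "
    "a member of the clinical team will follow up."
)


def log_escalation(message: str, intent: str, confidence: float) -> None:
    # Stand-in for structured interaction logging with an audit trail.
    print(f"ESCALATE intent={intent} confidence={confidence:.2f} message={message!r}")


def route(message: str, classify: Callable[[str], tuple[str, float]],
          threshold: float = 0.75) -> str:
    """Return a templated answer, or escalate when confidence is low."""
    intent, confidence = classify(message)
    if confidence < threshold or intent not in RESPONSE_TEMPLATES:
        log_escalation(message, intent, confidence)
        return FALLBACK
    return RESPONSE_TEMPLATES[intent]


# Stub classifier for the sketch; the real app would call its model here.
def fake_classify(message: str) -> tuple[str, float]:
    return ("appointment", 0.92) if "appointment" in message.lower() else ("unknown", 0.30)


print(route("Can I book an appointment?", fake_classify))
print(route("My chest hurts badly", fake_classify))
```

The key property is that the AI layer sits behind a small, swappable interface (`classify`), so the model can change without rewriting the UI or the guardrails.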
- Implemented structured intent handling and response templates to keep interactions consistent.
- Added logging and clear UI affordances for uncertainty and escalation.
- Built the integration to be modular so the AI layer can be swapped without rewriting the UI.
- Improved usability for clinical-style interactions through clear UX and predictable flows.
- Better engineering hygiene: modular backend integration and deployment-ready structure.
- A demonstrable AI app with real-world constraints (safety, traceability, UX).
GitHub & Code Quality
Clean structure, reproducibility, and real pipelines that reviewers can run.
The goal is simple: repos that are easy to run, easy to audit, and clearly show production ML thinking.
- Repos structured for reproducibility: deterministic entrypoints, pinned dependencies, and clear configuration.
- Pipelines are real: data → train → evaluate → package → deliver (not notebook-only).
- Readable folder structure and documentation so reviewers can quickly run and audit results.
- Preference for automation: CI checks, scripted workflows, and containerized execution (see the runner sketch below)
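As one example of what "scripted workflows" means in practice, here is a hypothetical runner script; the exact commands are assumptions, but the idea is that local runs and CI execute the identical checks.

```python
# Hypothetical workflow runner (e.g. a tasks.py) so local and CI runs share
# one command list. The specific tools here are illustrative assumptions.
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "src"],                            # lint/format gate
    ["pytest", "-q"],                                    # test gate
    ["docker", "build", "-t", "inference:local", "."],   # deployable image
]


def main() -> int:
    for cmd in CHECKS:
        print(f"$ {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Fail fast so CI reports the first broken gate clearly.
            return result.returncode
    print("All checks passed.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```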
Certifications & Education
AI foundation + applied coursework for production ML and analytics.
BS Artificial Intelligence
AI systems, ML, data, and applied engineering
Strong foundation across ML/DL, intelligent systems, and practical project development.
Contact
Open to AI/ML engineering roles focused on production systems and automation.
- MLOps pipelines and orchestration
- Production ML + automation
- Data workflows and analytics systems
- AI apps with backend integration