AI & Data Pipeline Engineer · NYC
Building ML-driven data systems for financial services.
I'm Abir, an AI & Data Pipeline Engineer focused on credit risk modeling, explainable AI, and cloud-native ETL architecture.
Previously: Fixnox, HKexcel Education. MS in Financial Technology from UConn.

Impact
What I deliver
Credit risk classifier on 10K synthetic records
Realistic FIX protocol messages for ETL testing
Consistent scores across 70+ IB students
End-to-end AWS systems shipped to production
How I support engineering teams
- Build end-to-end ML pipelines with explainable AI for regulatory transparency in financial services.
- Design cloud-native ETL architectures on AWS following medallion patterns for progressive data refinement.
- Deliver production-grade data systems with structured logging, validation, and error handling.
Experience
Professional experience
AutoShopIQ
AI & Data Pipeline Engineer
Sept 2025 – Present · Stamford, CT
- Architected an end-to-end, event-driven data ingestion pipeline using AWS S3, Lambda, and IAM to capture, process, and store automotive repair documents.
- Built Document AI workflows leveraging Amazon Textract, OCR tooling, and Python to extract and normalize unstructured repair data into canonical JSON schemas.
- Designed AI-ready data pipelines integrating validation, transformation, and PostgreSQL storage for downstream ML and recommendation systems.
- Improved pipeline reliability with structured logging, error handling, and retry mechanisms for real shop environments.
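The retry and structured-logging pattern in the last bullet can be sketched roughly as below. This is an illustrative sketch only, not the production code: the `with_retries` and `log_event` helpers, the default attempt count and delay, and the log field names are all hypothetical.

```python
import json
import logging
import time

logger = logging.getLogger("pipeline")


def log_event(level, event, **fields):
    """Emit one JSON object per log line so logs stay machine-parseable."""
    logger.log(level, json.dumps({"event": event, **fields}, sort_keys=True))


def with_retries(func, *args, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(*args), retrying on any exception with exponential backoff.

    Hypothetical helper: attempts, base_delay, and the log fields are
    illustrative defaults, not values from a real deployment.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = func(*args)
            log_event(logging.INFO, "step_succeeded",
                      step=func.__name__, attempt=attempt)
            return result
        except Exception as exc:
            log_event(logging.WARNING, "step_failed",
                      step=func.__name__, attempt=attempt, error=str(exc))
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.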
Fixnox
Data Engineer Intern
Jan 2024 – Dec 2024 · Sydney, Australia (Remote)
- Architected a batch ETL pipeline on AWS (S3, Glue, Athena) to ingest and analyze FIX protocol trade messages following a medallion architecture.
- Developed a Python-based FIX message data generator producing ~23,000 realistic trade logs partitioned in Hive-style format.
- Engineered ETL transformations with PySpark and Pandas to parse raw CSV trade logs into optimized Parquet with derived trading metrics.
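For readers unfamiliar with the two formats mentioned above, here is a minimal sketch of a FIX tag=value message and a Hive-style partition path. It is not the generator described in the bullets: the function names, the `bronze/trades` prefix, the year/month/day partition columns, and the sample order fields are assumptions for illustration. The BodyLength (tag 9) and CheckSum (tag 10) rules follow the standard FIX tag=value encoding.

```python
SOH = "\x01"  # FIX field delimiter


def build_fix_message(fields, begin_string="FIX.4.2"):
    """Assemble a FIX tag=value message with BodyLength (9) and CheckSum (10)."""
    # Body: every field after tag 9, each terminated by SOH.
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    # CheckSum: byte sum of everything before the 10= field, modulo 256.
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


def parse_fix_message(raw):
    """Split a raw message into a {tag: value} dict (repeating groups not handled)."""
    return dict(pair.split("=", 1) for pair in raw.split(SOH) if pair)


def hive_partition_key(prefix, trade_date, filename):
    """Hive-style layout: partition columns appear as key=value path segments."""
    year, month, day = trade_date.split("-")
    return f"{prefix}/year={year}/month={month}/day={day}/{filename}"


# Hypothetical new-order message: 35=D (NewOrderSingle), 55=Symbol, 54=Side, 38=OrderQty.
message = build_fix_message([(35, "D"), (55, "AAPL"), (54, "1"), (38, "100")])
fields = parse_fix_message(message)
key = hive_partition_key("bronze/trades", "2024-03-01", "part-0000.csv")
```

Query engines such as Athena can prune partitions laid out this way, so a filter on the partition columns avoids scanning unrelated files.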
HKexcel Education
Physics & Mathematics Instructor
Nov 2019 – Jul 2023 · Hong Kong
- Delivered instruction to 70+ IB Mathematics and Physics students, resulting in consistent scores above 80%.
- Translated complex concepts into digestible modules, increasing student comprehension and achievement by 25%.
Projects
Featured work
Credit Risk Scoring Engine with Explainable AI
End-to-end credit risk classifier using XGBoost on 10,000 synthetic credit records with SHAP explainability for regulatory transparency.
- ROC AUC of 0.77 on 2,000-record test set
- SHAP waterfall plots, force plots, and beeswarm visualizations
- Addresses ECOA, FCRA, and GDPR Article 22 requirements
- Interactive Streamlit dashboard with real-time scoring
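As context for the ROC AUC figure above, the metric can be read as the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. It is a standalone illustration with a hypothetical `roc_auc` function and toy inputs, not the project's evaluation code, and the pairwise loop is only practical for small samples.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Runs in O(n_pos * n_neg), fine for illustration only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

Under this reading, an AUC of 0.77 means the model ranks a positive record above a negative one in roughly 77% of such pairs.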
Monte Carlo Portfolio Optimization
Simulated buy-and-hold versus monthly rebalancing strategies on S&P 500 data (2017–2022); monthly rebalancing improved the risk-return tradeoff.
- Analyzed stock investment strategies and risk-return tradeoffs
- Monthly rebalancing improved portfolio performance
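The comparison between the two strategies can be sketched as follows. This is a simplified illustration, not the project code: it draws synthetic monthly returns instead of using S&P 500 data, and the two-asset setup, the 60/40 weights, and the mean and volatility parameters are made-up inputs. It makes no claim about which strategy comes out ahead.

```python
import random


def simulate(returns, weights, rebalance):
    """Grow a portfolio of value 1.0 through per-period asset returns.

    returns: list of periods, each a list of simple returns per asset.
    weights: target allocation, summing to 1.
    rebalance: if True, reset holdings to the target weights after every period.
    """
    holdings = list(weights)
    for period in returns:
        holdings = [h * (1.0 + r) for h, r in zip(holdings, period)]
        if rebalance:
            total = sum(holdings)
            holdings = [total * w for w in weights]
    return sum(holdings)


def monte_carlo(n_paths=1000, n_months=72, seed=0):
    """Average final value of each strategy over randomly drawn return paths."""
    rng = random.Random(seed)
    params = [(0.008, 0.05), (0.003, 0.02)]  # hypothetical (mean, stdev) per asset
    weights = [0.6, 0.4]                     # hypothetical target allocation
    totals = {"buy_and_hold": 0.0, "rebalanced": 0.0}
    for _ in range(n_paths):
        path = [[rng.gauss(mu, sigma) for mu, sigma in params]
                for _ in range(n_months)]
        totals["buy_and_hold"] += simulate(path, weights, rebalance=False)
        totals["rebalanced"] += simulate(path, weights, rebalance=True)
    return {name: total / n_paths for name, total in totals.items()}
```

Both strategies are evaluated on the same simulated paths, so any difference in the averages comes from the rebalancing rule and not from different random draws.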
Predicting Problematic Internet Use
Classification model for problematic internet use in adolescents, using the Healthy Brain Network (HBN) dataset from the Child Mind Institute.
- Built on the Healthy Brain Network (HBN) dataset
- F1 score of 0.73 for internet addiction risk prediction
Education
Academic background
University of Connecticut
Aug 2023 – May 2025 · Storrs, CT
MS Financial Technology
Key Coursework
City University of Hong Kong
Sept 2015 – Oct 2019 · Hong Kong
BEng Mechatronic Engineering
Certifications
Associate Data Engineer in SQL
DataCamp
Python & Statistics for Financial Analysis
Coursera
Python for Data Science, AI & Development
Coursera
Multivariate Calculus for Machine Learning
Coursera
Skills
Technical stack
Tools and platforms I use most in data engineering and ML work.