Available for new opportunities

Avvaru Surya Teja

ML & Data Engineer with 5+ years building production-grade pipelines, deploying predictive models, and turning messy data into decisions — across fraud detection, supply chain analytics, and real-time systems.

Python · SQL · Spark
SAP MM · Tableau · Power BI
XGBoost · TensorFlow · scikit-learn
AWS · Azure · GCP
Apache Kafka · Flink
M.S. Data Science, UMBC (GPA 3.88)

Two modes. One engineer.

My background bridges two domains that most people choose between: data engineering and supply chain operations. I bring both together.

ML & Data Engineer

Building real-time fraud detection platforms, deploying XGBoost classifiers into streaming pipelines, and engineering scalable data infrastructure at DXC Technologies and Dell. I ship models to production — not just notebooks.

Fraud Detection · Model Deployment · Real-Time Pipelines · MLOps
📦

Supply Chain Analyst

3+ years managing procurement operations at Panther Motocorp and Cyient: building inventory dashboards in SAP MM and Excel and designing demand forecasting models, work that identified $120K+ in annual supply chain savings.

SAP MM · Demand Forecasting · Inventory Mgmt · S&OP
🎓

Data Scientist (Academic)

M.S. Data Science from UMBC (GPA 3.88). Built CNNs, NLP pipelines, reinforcement learning agents, and statistical models across a wide range of domains. Strong quantitative foundation meets engineering execution.

Statistical Modeling · Deep Learning · NLP · Optimization

Where I've built things.

5+ years across ML engineering, supply chain operations, and data analytics.

Jul 2025 – Present · Current
ML & Data Engineer
DXC Technologies — Maryland, USA
  • Architected a real-time fraud detection platform on Apache Flink, Kafka & Cassandra, delivering sub-second ML scoring of millions of financial transactions daily.
  • Deployed XGBoost & gradient boosting classifiers into live streaming pipelines; tuned precision-recall tradeoffs on highly imbalanced fraud datasets.
  • Built automated model drift monitoring with Datadog & Grafana, including threshold-based retraining triggers, cutting incident response time by 40%.
  • Partnered with data science, product, and compliance teams to align model outputs with regulatory requirements.
Jan 2025 – May 2025 · Internship
Data Engineer Intern
DXC Technologies — Maryland, USA
  • Built a serverless data pipeline framework using AWS Glue & BigQuery, with reusable ML feature engineering components.
  • Implemented dbt transformations and lineage tracking; containerized modules with Docker and provisioned infrastructure with Terraform.
  • Integrated Grafana & Datadog dashboards for real-time pipeline anomaly detection.
Nov 2019 – May 2023 · Supply Chain
Assistant Manager — Strategic Sourcing & Vendor Development
Panther Motocorp Pvt Ltd — Hyderabad, India
  • Managed supply chain planning for automotive components across a multi-supplier network in India & Southeast Asia.
  • Built Excel (VBA macros) & SAP MM dashboards for PR-to-PO tracking, lead times, and vendor KPIs.
  • Designed replenishment models identifying $120K+ in annual savings through optimized inventory positioning.
  • Built Python regression & forecasting models on SAP-exported data to project quarterly spend by commodity.
  • Reduced vendor onboarding cycle time by 18% through structured scorecard systems.
Jun 2021 – Jun 2022 · Supply Chain
P2P Analyst
Cyient Ltd — Remote / Hyderabad, India
  • Managed end-to-end Procure-to-Pay in SAP MM across APAC & North America, covering PRs, the PO lifecycle, and inventory confirmations.
  • Built Excel trackers that reduced procurement cycle time by 12%; ran Power BI Pareto analysis to inform replenishment strategy.
  • Enhanced SAP extract reports to improve visibility in supply analytics dashboards.
May 2018 – Jul 2023 · Data Engineering
Software Engineer II — Data & ML Engineering
Dell Technologies — India
  • Led development of a Unified Customer Data Lake consolidating records across business units using Hadoop, Spark & AWS S3.
  • Optimized ETL/ELT pipelines in Spark & Airflow, improving batch ingestion throughput by 45% across 25+ sources.
  • Integrated Kafka for real-time ingestion; built Redshift models for customer segmentation & behavioral analytics.
  • Deployed Prometheus & ELK Stack monitoring, making incident detection 60% faster.

The full toolkit.

Spanning ML engineering, data infrastructure, supply chain systems, and analytics.

Machine Learning & AI
scikit-learn · TensorFlow · XGBoost · Random Forests · Logistic Regression · CNNs / RNNs · Model Validation · Drift Detection · AUC-ROC
Languages
Python · SQL · Scala · Bash · R (basic) · VBA / Excel Macros
Big Data & Streaming
Apache Spark · Apache Kafka · Apache Flink · Hadoop · Airflow · dbt · PySpark
Cloud Platforms
AWS S3 / Glue / EMR · Azure Databricks · GCP BigQuery · GCP Dataflow · Snowflake · Redshift
Supply Chain & ERP
SAP MM · PR / PO Lifecycle · Safety Stock Modeling · Demand Forecasting · S&OP · Inventory Mgmt · BOM · Vendor Scorecards
Analytics & BI
Tableau · Power BI · Looker · Grafana · Datadog · Prometheus · ELK Stack
DevOps & MLOps
Docker · Kubernetes · Terraform · Jenkins · GitHub Actions · CI/CD
Statistical Methods
Regression Analysis · Linear Programming · Six Sigma · Theory of Constraints · A/B Testing · Feature Engineering

Things I've shipped.

Production systems, academic research, and everything in between.

🔍
Production
Fraud Model Validation & Drift Monitoring Framework
Automated validation and drift detection for a real-time fraud pipeline — tracking AUC-ROC, precision, recall, F1 over time. Evaluated XGBoost classifiers for feature importance, bias, and fairness metrics for regulatory compliance.
Python · XGBoost · scikit-learn · Kafka · Datadog · Grafana
40% faster incident response
Production
Real-Time Clickstream & Behavioral Analytics Pipeline
Production-grade real-time pipeline for e-commerce clickstream data using Flink & Cassandra. Schema-optimized Avro storage on GCP Dataflow. Looker dashboards surfacing conversion trends for product and marketing teams.
Apache Flink · Cassandra · GCP Dataflow · Avro · Looker · Scala
Data Quality
Automated Data Quality Framework for ML Feature Pipelines
Reusable dbt + SQL validation framework detecting schema drift, null spikes, and integrity errors in Snowflake feature tables. Datadog alerts notify model owners before degradation reaches production.
dbt · Snowflake · SQL · Kubernetes · Terraform · Datadog
📦
Supply Chain
Spend Classification & Inventory Analytics Automation
Automated supplier spend classification using Python on SAP MM-exported data. Regression models for quarterly commodity spend forecasting and supply anomaly detection, reducing manual processing by 70%.
Python · Pandas · SAP MM · Excel VBA · Power BI
$120K+ annual savings identified
🧠
Academic
Malware Detection with ML — EMBER 2018
Binary classification on EMBER 2018 dataset using engineered static features. Trained XGBoost and Logistic Regression; evaluated with confusion matrix, F1, and ROC curve analysis.
Python · XGBoost · scikit-learn · EMBER 2018
🌊
Academic
Real-Time Twitter Sentiment Pipeline
Real-time NLP pipeline using Apache Spark Streaming and Tweepy for tweet polarity classification via TextBlob. Processed and stored results for downstream trend analysis.
Apache Spark · Tweepy · TextBlob · NLP · Python

Academic foundation.

Master of Professional Studies
Data Science
University of Maryland, Baltimore County (UMBC)
Baltimore, MD, USA  ·  Aug 2023 — May 2025
GPA: 3.88 / 4.0
Key Coursework
Machine Learning · Statistical Modeling · Deep Learning · Big Data Analytics · Linear Programming · Operations Research · Data Mining
Bachelor of Technology
Mechanical Engineering
Godavari Institute of Engineering & Technology
India  ·  2015 — 2019
GPA: 3.3 / 4.0
Foundation Areas
Manufacturing Systems · Production Planning · Logistics · Operations Mgmt · CAD / CAM

Let's build something.

Open to ML engineering, data engineering, and supply chain analytics roles. Based in Maryland — open to relocation and remote.