Senior Data Analyst

Sanket Chavan

โ— OPEN TO WORK

I build data systems that go from raw pipeline to boardroom dashboard, and I don't stop at 5pm.

๐Ÿ“ Austin, TX


About Me

I'm Sanket, a data engineer and analyst who genuinely enjoys turning messy, complex data into something a business can actually act on. I hold a Master of Science in Data Science from the University at Buffalo, and I've spent the last 5 years at Deutsche Bank building production-grade compliance data systems that real teams depend on every day.

At Deutsche Bank, I serve as Client Service Manager (CSM-1) for two enterprise compliance applications, owning them end-to-end across APAC, Europe, and the Americas. My work spans everything from building real-time ETL pipelines that monitor 131 daily broker feeds, to designing Power BI dashboards used by compliance officers and executive stakeholders, to engineering the automation frameworks that eliminated an entire category of manual work. I'm also the primary bridge between Compliance Officers and the Engineering team, translating regulatory requirements into data architecture and delivering ad-hoc analysis when it matters most.

Outside of work, I don't switch off. I independently designed and built MktMoves, a production-grade SEC filing intelligence platform that processes 2.6M+ institutional investment filings through a fault-tolerant Airflow pipeline and surfaces them through a clean Next.js frontend. It's the kind of financial intelligence that was previously only accessible through expensive data vendors. I built it solo, end-to-end, because I wanted to and because I could.

I'm currently relocating to Austin, TX, and looking for my next challenge as a Senior Data Analyst, Business Intelligence Engineer, or Data Engineer. Long-term, my goal is to run a team of data enthusiasts who build the kind of data infrastructure and products that drive real business outcomes, not just reports nobody reads.

When I'm not deep in a SQL query or debugging a pipeline at midnight, you'll find me on a soccer field, gaming, planning the next road trip, or unwinding somewhere near the water. I believe the same qualities that make a good data person (curiosity, attention to detail, and comfort with ambiguity) also make for pretty good company.

Skills

Languages & Querying: SQL, Python, Shell Scripting, R, PySpark
Databases & Cloud: Oracle, PostgreSQL, MongoDB, MS-SQL, GCP, AWS
Data Engineering & ETL: Apache Airflow, Control-M, Kafka, FastAPI, Spark
Data Modeling: Star Schema, Materialized Views, Indexing, DAGs, Normalization, Workflow Automation
BI & Visualization: Power BI, Tableau, Matplotlib, Plotly, Pandas, NumPy
ML & Analytics: Regression, Classification, NLP, RAG, Scikit-learn, NLTK
Other Tools: Ansible, Docker, Git, ServiceNow, JIRA, Postman, Jupyter

Experience

Deutsche Bank

Jan 2021 – Present

Senior Data Analyst

Employee Trading Compliance Applications

SQL, Python, Power BI, GCP
  • As Client Service Manager (CSM-1), owned full end-to-end operational responsibility for two enterprise compliance applications, overseeing broker onboarding, infrastructure changes, capacity planning, and release management across APAC, Europe, and the Americas.
  • Engineered a real-time broker feed monitoring system using Control-M job scheduling, Oracle SQL validation, and hash-based integrity checks to ingest and validate 131 daily feeds from 52 brokers. Integrated automated incident creation via ServiceNow APIs, cutting resolution time from 4–5 days to under 2 hours and eliminating 100% of manual tracking previously flagged in regulatory audits.
  • Designed and deployed an automated reporting pipeline using Shell scripting and Oracle stored procedures, eliminating manual generation of 150+ weekly compliance reports and reducing L2 workload by 73% and report delivery time by 85%. The pipeline was scaled bank-wide across 18+ applications.
  • Replaced manual SSH-based production access with an Ansible playbook framework implementing RBAC, multi-level approvals, and full execution logging, eliminating 100% of privileged logins and providing full audit traceability across all compliance applications.
  • Built a Power BI analytics platform on top of ServiceNow Oracle data using Materialized Views, incremental refresh, and optimized SQL indexing, delivering 13 KPIs that helped cut L2 resolution time by 80% (5 days → 24 hours). The platform was adopted org-wide.
  • Acted as the technical bridge between Compliance Officers and the Engineering team, translating regulatory requirements into data architecture decisions and delivering ad-hoc analysis to support executive and audit stakeholders.
  • Led enablement of a 6-person L2 support team across 3 regions: authored KB articles, conducted KT sessions, and tracked performance via live dashboards, sustaining resolution time improvements and reducing repeated incidents.
  • Currently contributing to enterprise GCP migration, working hands-on with BigQuery, GKE, and Cloud Storage to modernize compliance data infrastructure from on-prem Oracle to cloud-native architecture.

People Tech Group

Sep 2019 – Nov 2020

Junior Data Scientist

Elliptica Data Platform

Python, Kafka, SQL, Tableau, MongoDB, MS-SQL
  • Engineered fault-tolerant data connectors for relational and non-relational databases to capture real-time changes using Python and Kafka.
  • Tracked and streamed database changes to Tableau dashboards through Redshift Data Warehouse, enabling real-time visualization across multiple use cases.
  • Performed stress testing on data pipelines to ensure scalability and reliability under various load conditions.
  • Coordinated with cross-functional remote teams for system design and sprint planning.

Candidate Screening Chatbot

Rasa, Python, NLP, PostgreSQL
  • Led backend development for an AI-powered Candidate Screening Chatbot using the Rasa open-source ML framework, designed to conduct preliminary interview rounds autonomously.
  • Integrated NLP-driven intent recognition and entity extraction with a PostgreSQL database for candidate data management.
  • Delivered a fully functional prototype within two months, capable of understanding user messages and generating context-appropriate responses.

University at Buffalo

Dec 2018 – Sep 2019

Data Science Research Assistant

Spatial-Temporal Analysis on GPS Data (NIH-Funded)

Python, R, Tableau, SSIS, MS-SQL
  • Conducted spatial-temporal analysis on GPS data collected through an NIH-funded survey on travel behavior and social influence.
  • Built an ETL pipeline to clean, transform, and load data into MS-SQL using SSIS.
  • Constructed interactive maps in R to visualize user GPS trajectories and cluster most-visited locations.
  • Analyzed and visualized social influence on travel behavior at both individual and household levels.
  • Built predictive models using Markov Chain and Conditional Probabilistic Models to forecast user activity by day and time.
  • Designed a co-location detection program in Python and visualized the resulting social network as a Tableau dashboard.

Niagara Falls Bridge Commission

Sep 2017 – Dec 2017

Data Analyst Intern

Traffic Analysis: Lewiston-Queenston Bridge

Python, ML, Excel
  • Performed exploratory data analysis on bridge crossing data to identify traffic patterns and seasonal trends.
  • Conducted a survey to study traveler behavior and sensitivity to toll cost changes.
  • Achieved a 13.11% reduction in peak-hour traffic using a Multinomial Logistic Regression model to recommend optimal toll pricing strategies.

Projects

Python, Azure Synapse, SQL, Apache Spark, Power BI, Cosmos DB

NYC Taxi Data Analytics: Azure Synapse

Processed three years and 100M+ records of NYC Yellow Taxi data using Azure Synapse dedicated SQL pools and Spark notebooks, then surfaced actionable insights through Power BI.

  • Configured dedicated SQL pools and Spark pools to manage and analyze 100M+ records across 2021–2023
  • Ingested, transformed, and loaded 1TB+ of data using Serverless SQL Pool, Spark Pool, and automated Synapse pipelines
  • Enabled real-time analytics by integrating Synapse Link with Cosmos DB
  • Identified that 60% of trips were paid by credit card, with Queens showing a unique pattern of higher cash transactions
  • Built Power BI dashboards revealing Manhattan accounted for 40% of overall taxi demand, with peak demand on Fridays

Python, PyTorch, Scikit-learn, Collaborative Filtering, RBM, Autoencoder

Movie Recommender System

Built and benchmarked multiple recommendation approaches, including collaborative filtering, deep learning, and matrix factorization, to predict user movie preferences.

  • Developed Item-Based and User-Based Collaborative Filtering models with Cosine, Pearson, and MSD similarity metrics
  • Trained KNN and SVD++ models to predict user ratings for unseen movies
  • Achieved 75% accuracy in user preference prediction using a Restricted Boltzmann Machine (RBM)
  • Enhanced recommendation quality with an Autoencoder model, maintaining an average error of ~1 star rating

Python, XGBoost, LightGBM, Random Forest, Feature Engineering, Bayesian Optimization

Movie Revenue Prediction

Predicted global movie revenue using advanced ensemble ML models and extensive feature engineering on a real-world dataset with significant missing data challenges.

  • Performed EDA to remove anomalies, engineered 39 new features, and imputed missing values using predictions from a Bayesian-optimized XGBoost model
  • Built and compared Linear Regression, Bayesian-optimized Random Forest, XGBoost, and LightGBM models
  • Achieved an RMSLE score of 0.96

Certifications

Microsoft Certified: Power BI Data Analyst Associate

Microsoft

Certified Tableau Desktop Specialist

Tableau / Salesforce

Education

Master of Science in Engineering (Data Science Focus)

University at Buffalo, SUNY

Buffalo, NY · Feb 2019

GPA: 3.73 / 4.0

Bachelor of Engineering in Mechanical Engineering

North Maharashtra University

Jalgaon, India · May 2016

GPA: 8.10 / 10.0

Testimonials

I was impressed with Sanket's ability to understand and solve problems efficiently. His choice of visualization library was very good, and I was impressed with the way he designed the probabilistic model and Tableau dashboard. Sanket would be a true asset for any positions requiring R, Python, and Tableau.

Dr. Qing He

Professor, University at Buffalo

Sanket's creative thinking, expertise, and positive can-do attitude make him an absolute pleasure to work with. He continually delivers results and goes above and beyond. His strengths in staying across issues, proactively offering solutions, and being adept at all aspects of communications make him a valuable contributor.

Prachi Sharma

Former Colleague