Data Scientist
- Technical Skills: Python, SQL, Tableau, R, Rust, JavaScript, HTML5, CSS
- Concepts: Regression, Classification, Clustering, Advanced NLP, Neural Networks, RAG Pipelines, LLMs
- Data Visualization: Tableau, Power BI, Streamlit, Plotly, R Shiny
- Frameworks: HuggingFace, TensorFlow, Keras, Transformers
- AI: ChatGPT prompt engineering, Claude, Grok, Copilot, Perplexity
- Cloud: Microsoft Azure (Azure Fundamentals Certified), Docker, Databricks, Azure DevOps, CI/CD Pipelines
- Soft Skills: Clear communication, Leadership, People management, Agile frameworks, Public speaking
- Tools: Excel, VBA, PowerPoint, Word, Jira, SAP HANA
Education
- M.S., Computer Science, Symbiosis International University (April 2020)
- B.S., Computer Applications, Symbiosis International University (April 2018)
Work Experience
Senior Data Scientist @ Accenture for San Diego Gas & Electric (May 2024 - Present)
- Created end-to-end CI/CD pipelines that stitch together multiple data sources and surface actionable insights from 8 million rows, improving effectiveness by identifying key drivers for Diverse Business Enterprise.
- Balanced covariates using advanced statistical techniques and applied K-Means clustering to cost probabilities from XGBoost and Random Forest propensity models, producing reliable forecasts that saved up to 2 FTEs of work (~$200,000).
- Built 18 operational Power BI and Tableau dashboards backed by NLP on open-text data, using base LLMs such as RoBERTa and BERT layered with RAG pipelines; deployed with Azure DevOps and Docker containers.
Data Science Manager @ HSBC (June 2023 - May 2024)
- Built an end-to-end Employee Lifecycle tool showcasing Employee Skills (Selenium-based GitHub scraping to extract job skills), Employee Mood (quantifying open-text surveys with advanced NLP techniques such as zero-shot classification, custom embeddings tailored to organization-specific lingo, and dimensionality reduction), Workforce Churn (Linear Regression, Lasso Regression, optimization techniques, time series forecasting, SQL), and a Median Pay indicator (SHAP for explainability).
- Improved employee survey submission rates from 60% to 75% by collaborating with senior management across customer success and finance teams, identifying and addressing customer pain points to drive engagement.
- Deployed a gray-box NLP tool, built from scratch on Azure, that lets anyone in the organization extract accurate and reliable insights from their open-text corpora in any readable file format.
- Played a key role in mentoring the 28-person onsite team to adopt advanced Python in day-to-day work and to use machine learning algorithms to enrich their reporting with impactful forecasts.
- Secured 1 Regional award and 1 Global award along with 1 promotion.
Senior Data Scientist @ HSBC (July 2020 - June 2023)
- Led the initiative to refactor existing reporting techniques and helped automate 835 PowerPoint cuts for each business using python-pptx, APIs, and in-memory SQL for faster querying, cutting reporting time from 2 months to 22 hours, with ready reports delivered to the inboxes of employees ranging from People Managers to Business Heads.
- Created statistical and machine learning models to forecast FTE and cost at the employee level, which served as an organization-wide benchmark for tracking churn in monthly executive reports presented to the board. The board acted on these numbers, which were A/B tested and audit-screened before any decision was taken.
- Worked cross-functionally with Data Science Leads at Microsoft, Glassdoor, and TechWolf to enhance in-house Employee Engagement capability and benchmark internal data against market data (Market Comparative Analysis, Employee Mood Analysis, Glassdoor and TechWolf APIs).
- Secured 3 Global awards and 2 promotions.
Projects and Recent Contributions
- Kaggle: Detect AI Generated Texts using LLMs (repo)
- Neural Network & Deep Learning (repo)
- Machine Learning (repo)
- OpenCV Projects (repo)
- Autoencoders (repo)
- Reinforcement Learning (repo)
- Full Stack Data Science Webapps (repo)