Data Scientist with 4+ years of experience in R&D analytics, specializing in leveraging healthcare datasets and data science techniques to deliver insights and solve business challenges for leading pharmaceutical firms.
Data Scientist
Experience
JULY 2023 – PRESENT
Data Science Associate Consultant | ZS Associates | Bengaluru
- Developing a GAN-based framework for Patient Digital Twins that simulates synthetic patient journeys for clinical trial monitoring and risk prediction. The project uses Generative Adversarial Networks (GANs) and deep learning (PyTorch) to analyze multivariate, temporal health data and predict patient retention or dropout. The goal is to enhance early risk detection in clinical trials by creating high-fidelity synthetic patient data for simulation and intervention planning.
- Created a risk stratification model using Cox proportional hazards regression to forecast adverse outcomes (ICU admission, IMV usage, readmission, mortality) in a COVID-19 patient cohort. Emphasized feature selection, handling class imbalance, and deriving clinically significant insights to inform early intervention strategies. The findings are currently under review for publication in partnership with clinicians and a scientific review board.
- Developed a pipeline to identify fabrications and anomalies in clinical trial data, helping the internal audit team detect risk signals and increasing confidence in the data submitted for clinical trials.
- Created an LLM-powered SQL translation framework to automate query adaptation across various real-world healthcare datasets, facilitating seamless integration of new data sources without depending on standardized data models. Implemented an adaptive context-reduction approach to enhance query determinism, minimizing LLM hallucinations by dynamically filtering metadata relevant to schema domains.
- Implemented a solution utilizing a transformer-based model and K-Means clustering to identify complex patient journeys within a therapeutic area, revealing insights on nuanced pathways that can assist pharmaceutical brands in targeting the most pertinent patient populations.
- Achieved second place globally in the patient prediction challenge hosted by precisionFDA for predicting heart failure and hospital readmissions among US veterans. The initial phase of the challenge was assessed using synthetic data, while the top solutions were evaluated on real-world data behind FDA firewalls.
- Employed time-series models (Exponential Smoothing, ARIMA, and LSTM) to forecast demand quantities, optimizing market costs and enhancing supply chain efficiency.
FEBRUARY 2023 – JUNE 2023
Data Science Associate | ZS Associates | Bengaluru
- Developed a solution using hierarchical density-based clustering and BERT embeddings to extract key attributes from user reviews for multiple beer brands. Implemented a scoring metric using sentiment analysis to identify popular brands based on the reviews.
- Analyzed claims data to generate metrics for identifying key healthcare providers and organizations that the brand can target to enable tailored digital therapies for rare mental disorders.
- Worked on EMR data to build a data quality framework combining business rules and anomaly detection modules. This helped identify data quality issues across univariate, multivariate, and temporal spaces. Implemented a subspace-monitoring-based algorithm to detect temporal anomalies in sparse, irregular time series.
NOVEMBER 2020 – JANUARY 2023
Decision Analytics Associate | ZS Associates | Bengaluru
- Worked on the proof of concept for a “prospective support arm”: a synthetic arm generated from real-world EMR data to prospectively monitor patient characteristics based on the usage of different medications. This could save millions of dollars during clinical trials and expedite the drug development phases by providing real-time insights into patient characteristics for primary care medications within the therapeutic area.
- Conducted a comprehensive analysis of disease prevalence, comorbid conditions, treatment landscape, and physician specialties using claims data to identify lucrative market opportunities for a top pharmaceutical brand.
- Developed a disambiguation methodology using fuzzy matching, NER tagging, and business rules to identify influential key opinion leaders within different therapeutic areas for strategic decision-making and marketing efforts.
JUNE 2019 – AUGUST 2020
Application Development Associate (SAP ABAP) | Accenture | Kolkata
- Developed tools for large data migration from client to SAP systems using LSMW and LTMC.
- Implemented ABAP report programs for analyzing work breakdown structures in SAP systems.
- Developed a report program for error identification during mass data transfers.
Skills and Certifications
Technical Skills
- Python, PySpark, SQL, C++
- Amazon Redshift, Amazon EC2, Git
Courses and Certifications
- Udacity Machine Learning Nanodegree
- Udacity Data Analytics Nanodegree
Awards and Recognitions
Received the InGenius award (2023) for spearheading an anomaly detection framework, winning global second position in precisionFDA’s VCHAMPS challenge, and playing a pivotal role in securing a $0.6M project through a compelling proposal.
Education
- JULY 2012 – JULY 2013
Secondary Examination | South Point High School | 89%
- JULY 2014 – JULY 2015
Higher Secondary Examination | South Point High School | 84.6%
- JULY 2015 – MAY 2019
Bachelor of Computer Science and Engineering | Kalinga Institute of Industrial Technology | 8.55 / 10