Senior Research Scientist
Data Science Research
- Developing an interpretability system for large language models used in customer-facing Gen-AI applications and regulatory compliance models.
Data Science Modeler
Generative AI Team
- Designed the data map to pull the DCC Root Cause flags and notes from the Snowflake environment and performed exploratory data analysis (EDA) to understand the distribution of root causes.
- Replicated the UDAAP regulation compliance model using the GPT-2 framework to detect agents’ adherence to regulations during customer interactions.
- Supported the development of the No Contact model, leveraging the LLaMA framework.
Anti-Money Laundering (AML) Team
- Developed a hybrid machine learning model combining an NLP component (RoBERTa) with a non-NLP component (XGBoost) for fraud detection and bias control.
- Automated the data preparation workflow in SAS, optimizing the ETL in Snowflake and improving efficiency.
- Validated the text preprocessing pipeline using RoBERTa and implemented bias control mechanisms, including the removal of sensitive business domain information from texts to mitigate model bias.
- Applied Principal Component Analysis (PCA) to transform imbalanced numeric data, used SMOTE to oversample under-represented classes and balance the data, and used Information Value (IV) to weight the features.