About me
Machine Learning Researcher (NLP) || Data Science Modeler
Izunna Okpala
Senior ML Research Scientist
My research intersects knowledge acquisition and inference generation with machine learning and natural language processing. Specifically, I study how an automated system can extract information from data to help avert crises and detect fraudulent activities.
Published work
Ongoing projects
Presentations
Working hours
Resume
My resume is summarized below
Discover
2023-Present: Senior Research Scientist
Data Science Research
- Developing an interpretability system for large language models used in customer-facing Gen-AI applications and regulatory compliance models.
Data Science Modeler
Generative AI Team
- Designed the data map to pull the DCC Root Cause flags/notes from the Snowflake environment and performed exploratory data analysis (EDA) to understand the distribution of the various root causes.
- Replicated the UDAAP regulation compliance model using GPT-2 to detect agents' adherence to regulations during customer interactions.
- Supported the development of the No Contact model, leveraging LLaMA.
Anti-Money Laundering (AML) Team
- Developed a hybrid machine learning model, combining an NLP component (RoBERTa) with a non-NLP component (XGBoost), for fraud detection and bias control.
- Automated the data preparation workflow in SAS, optimizing the ETL in Snowflake and improving efficiency.
- Validated the text preprocessing pipeline using RoBERTa and implemented bias control mechanisms, including the removal of sensitive business domain information from texts to mitigate model bias.
- Applied Principal Component Analysis (PCA) to transform imbalanced numeric data samples, used SMOTE to oversample the minority class and balance the data, and used Information Value (IV) to weight the features.
University of Cincinnati
2020-2023: Research Fellow/Data Scientist
Digital Futures (Smart Synergies Lab)
- Led the research and development of a human perception algorithm to gauge attitudes during crises, leveraging leading NLP frameworks such as Transformers, spaCy, CoreNLP, Gensim, and NLTK.
- Conducted in-depth research on recommender systems, using collaborative filtering and two-tower approaches for X post and user recommendation.
- Designed fuzzy inference logic and incorporated it into the human perception pipeline, leveraging Sentence-BERT (SBERT) to score similarities and enhance text prediction with comprehensive context understanding.
- Engineered a scalable, end-to-end data pipeline to harmonize data from diverse sources, including X, Dimensions.ai, and LexisNexis.
School of IT (Civic Tech Lab)
- Conducted machine learning research to detect similarities among 911, 311, and social media data to ease crisis response, using Cincinnati open data and the Twitter Academic API.
- Developed, with the Django framework, an API module for PIVOT, an operational picture tool used to map local beliefs about the COVID-19 pandemic.
- Developed an NER and situational awareness flag from the CrisisNLP dataset to understand patterns in crises.
Para Systems
2015-2020: Data Engineer/Solution Architect
I deployed data-centric systems for performance evaluation and prediction with the Naïve Bayes classifier. I also worked with a team that converted existing BI reports into machine-learning-ready formats such as JSON, using the Pandas and NumPy packages, and used ChartJS for growth analysis. As a BI Developer, I designed and deployed BI solutions in Tableau, Qlik Sense, Power BI, and SAP Lumira. My team and I also integrated data from PostgreSQL, MySQL, and OracleDB into BI platforms.
Para Systems LTD is a leading BI development company in Nigeria, having delivered solutions for clients such as NNPC Group, First Bank of Nigeria Plc, Union Bank of Nigeria Plc, City Securities Ltd, BT, Norweb, Telecom, MTN, Zain and Visafone.
- Data Analysis
- Software Development
- Business Intelligence (BI) Development
NINLAN
2016-2020: Researcher (National Institute for Nigerian Languages)
I conducted academic research on indigenous language archiving, with a focus on language data mining, to help prevent language extinction in Nigeria, which has over 450 indigenous languages. I also developed software for digital language archiving and community involvement in Nigeria.
Research carried out:
- Development of Language Mining Platform for Nigerian Languages
- Machine Learning Algorithm for Nigerian Languages
- Text-to-Speech Synthesis in Nigerian Languages
Information Stash
2014-2018: Senior Software Developer
Information Stash is the leading media platform for tech-related news feeds. The company provides web and software development and management services, and is at the forefront of providing research analytics to organizations and persons of interest. It began operating in 2013 and has grown over the years through its engagement with clients.
- University Portals
- Mobile App integration
- Tech Blog
Agro Stash
2017-2020: Senior Software Developer
Agro Stash, as the name suggests, is an information-technology-driven agricultural platform with the objective of empowering young people in Nigeria through agriculture. It aims to achieve this by offering land at a greatly reduced rate to individuals who are ready to work it. At harvest season, the produce from the various farms can be sold through the Agro Stash platform and the proceeds given to the farmers.
- Youth Empowerment
- Land Space provision
- E-commerce system
Services
Data Analysis
App Development
Branding
Web Design
Data Analysis
Collects, cleans, preprocesses, visualizes and studies data sets to help solve problems...
- Mobile App
- Desktop App
- Database Driven App
- Model
- Develop / User interface
- Testing
- Integration
App Development
We offer BI Development, Software Development, Network Administration, Database Development, Database Management, and the transition from many manual platforms to digital systems.
Branding
Are you influential, or do you run an organization? Do you want to move from the backdoor into the limelight? We're offering you an opportunity to join our program. We will brand your personality or business into a marketable solution.
Ready to order your project?
Skills
Researcher
Certifications and Training
- IBM Data Science Professional V2
- IBM Data Science Professional
- Clinical Research Conduct – CITI Program
- Responsible Conduct of Research – CITI Program
- HSR CORE for 2017 Program – CITI Program
- Applied Data Science II: Machine Learning & Statistical Analysis (with honors)
- Applied Data Science I: Scientific Computing & Python (with honors)
- Machine Learning (Stanford University)
- Certified ScrumMaster by Scrum Alliance.