Sentiment Analysis with SetFit on SST‑5

In this project, I leveraged SetFit, Hugging Face’s prompt‑free few‑shot classification framework, to tackle the challenging SST‑5 task: fine-grained sentiment analysis across five classes (very negative to very positive). What makes SetFit so compelling is its efficiency and simplicity. This project highlights how evolving NLP techniques, from static embeddings to contextual transformers, impact performance and interpretability. Fine-tuned transformer … Read more

Time Series Forecasting for Book Sales: My Journey with Nielsen BookScan Data

In this project, I dive into the practical world of time series analysis to forecast sales and demand—a crucial skill in today’s data-driven business environment. By working with real sales data from Nielsen BookScan, I have the opportunity to bridge the gap between theory and real-world application. My goal is to turn historical data into … Read more

Analysis of PureGym Reviews: Data Cleaning, Sentiment Analysis, and Topic Modeling

With over 2 million members and 600 gyms across the UK, Denmark, and Switzerland, PureGym has established itself as one of the world’s leading value fitness operators. Since its founding in 2008, the company has appealed to a broad customer base by offering flexible, affordable, and high-quality fitness facilities. Key to its success is a … Read more

Applying supervised learning to predict student dropout

Student retention is critical for educational institutions, impacting financial sustainability and academic success. High dropout rates can lead to revenue losses and reputational damage. Study Group, a global education provider, aims to enhance student success by identifying at-risk students early and implementing proactive interventions. This study applies supervised machine learning techniques to predict dropout risks, enabling Study Group … Read more

Customer segmentation with clustering

Customer segmentation enables a business to group customers based on demographics (e.g., age, gender, education, occupation, marital status, and family size), geographics (e.g., country, time zone, language, and location), psychographics (e.g., lifestyle, values, personality, and attitudes), behaviour (e.g., purchase history, brand loyalty, and response to marketing activities), technographics (e.g., device type, browser type, and original source), … Read more

Ship Engine Anomaly Detection Model

A poorly maintained ship engine in the supply chain industry can lead to inefficiencies, increased fuel consumption, a higher risk of malfunction, and potential safety hazards. Engine problems can also cause downtime and delayed deliveries, resulting in the breakdown of a ship’s overall functionality and, consequently, impacting the business, such … Read more

Credit Risk Analysis and Model Prediction for Personal Loan Applications

In this project, I performed a comprehensive analysis of a loan application dataset to create a model predicting the probability of loan default. The dataset includes applicant information such as age, income, employment background, and loan-specific details like loan amount, interest rate, and purpose. The goal was to develop a model that would enable financial … Read more

Vehicle Price Prediction Portfolio

In this project, the aim is to build a predictive model to estimate used vehicle prices based on attributes such as brand, model, mileage, engine type, and transmission. I am using a dataset of vehicle information that is readily available on Kaggle. This dataset comprises 188,532 data points, each representing a unique vehicle listing, … Read more