I was tasked with chasing down robots for an online auction site. Human bidders on the site are becoming increasingly frustrated with their inability to win auctions against their software-controlled counterparts. As a result, usage from the site's core customer base is plummeting.
In order to rebuild customer happiness, the site owners need to eliminate computer-generated bidding from their auctions. The goal of this project is to identify online auction bids that are placed by 'robots', helping the site owners easily flag these users for removal from their site to prevent unfair auction activity.
APPROACH BREAKDOWN
My approach to identify these bids included:
The machine learning models I used were: Logistic Regression, Random Forest, Gbtree, XGBoost, LightGBM.
The most useful features I identified were:
Grammatical error correction is a common topic nowadays, and many tools are available for checking spelling and grammar in English, Spanish, or Russian. For Vietnamese, however, there are no tools or research addressing this topic.
Vietnamese is complicated and not easy to learn; both native speakers and learners often make spelling and grammatical errors in their writing. There are several types of error, such as spelling mistakes and wrong word choices.
This project is my attempt to build a machine learning tool that detects and fixes spelling and grammatical errors in Vietnamese. It is the most challenging project I have ever done.
APPROACH BREAKDOWN
Approach to build dataset:
First Approach:
Second Approach:
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. They are more likely to mention the number of bedrooms, the location, and the age of the house. Based on those criteria, we want to estimate approximately how much such a house may cost.
APPROACH BREAKDOWN
Approach to Regression Prediction:
The machine learning models I used were: Lasso, Elastic Net, Support Vector Regression, Gbtree, XGBoost, LightGBM.
The report is built to answer the following business questions:
Language Adventures is a story-based scavenger-hunt game system that consists of a mobile application and a web application.
I worked on this project with 5 teammates as a Scrum team at university. As team leader, I was responsible for researching the technology and took part in the whole software development process, from design to implementation and delivery. I developed features for the mobile application to ensure the user journey is not interrupted. I also collaborated with peers on error analysis and made improvements based on feedback from the client and end users.
PROJECT BREAKDOWN
Mobile Application
Website Application
The Technologies:
The MVP:
This dataset contains 59 posts published between 2018-01-12 and 2020-10-28 for a marketing agency. The company mainly does digital advertising, and many of its customers are promising small-to-medium brands. We will extract insights by analyzing the factors that influence the performance of these posts, such as Reach, Impression, Like, Comment, Share, etc.
APPROACH BREAKDOWN
Approach to Post Analysis:
The report is built to answer the following business questions:
In this project, I developed a machine learning tool for classifying music audio into different genres. I built two tools using two machine learning approaches and then compared their performance.
The dataset consists of 1000 audio tracks, each 30 seconds long. It contains 10 music genres, each represented by 100 tracks. The tracks are all 22050 Hz mono 16-bit audio files in .wav format.
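Before any feature extraction, it can help to confirm that every track really matches the format described above. The following is a minimal sketch using only Python's standard `wave` module; the helper names `describe_wav` and `write_test_tone` are hypothetical, and the sine tone is a synthetic stand-in for a real track, not data from the project.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic format facts for a WAV file (sample rate, channels, bit depth, duration)."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is reported in bytes
            "duration_s": frames / rate,
        }


def write_test_tone(path, seconds=1.0, rate=22050, freq=440.0):
    """Write a synthetic mono 16-bit sine tone (assumption: stands in for a dataset track)."""
    n = int(seconds * rate)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)) for i in range(n)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tone = Path(tmp) / "tone.wav"
        write_test_tone(tone)
        print(describe_wav(tone))
```

Running `describe_wav` over every file in the dataset folder and checking for 22050 Hz, 1 channel, and 16 bits would flag any track that deviates from the stated format.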
APPROACH BREAKDOWN
First Approach:
Second Approach:
This project is a movie recommender system that replicates an industry use case. I developed it while practicing multiple recommendation concepts and techniques. The system is well organized and allows customization of various supporting algorithms and filters, which you can easily modify and apply to different use cases or objectives (e.g. movies, books, shopping items, music, video games).
PROJECT BREAKDOWN
Recommender metrics: RMSE, MAE, HR, CHR, ARHR, Coverage, Diversity, Novelty.
Recommender similarity score: Cosine, Multi-dimensional Cosine, Adjusted Cosine, Time Similarity, Mean Squared Difference, Pearson, Jaccard, Spearman Rank Correlation.
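To make a few of these similarity scores concrete, here is a minimal sketch in plain Python. It is not the project's implementation: the function names are hypothetical, the rating vectors are assumed to cover only items both users have rated, and the mean squared difference is turned into a similarity with the common `1 / (msd + 1)` convention, which is an assumption.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def jaccard(items_a, items_b):
    """Overlap between two sets of rated items, ignoring the rating values."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def msd_similarity(u, v):
    """Similarity derived from the mean squared difference (assumed 1 / (msd + 1) form)."""
    msd = sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)
    return 1.0 / (msd + 1.0)


if __name__ == "__main__":
    alice = [5, 3, 4, 4]  # made-up ratings on four co-rated movies
    bob = [3, 1, 2, 3]
    print(cosine(alice, bob), pearson(alice, bob), msd_similarity(alice, bob))
```

Cosine treats ratings as raw vectors, while Pearson removes each user's average first, so it is less sensitive to users who rate everything high or low.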
Recommender filters:
The first campaign aims to acquire and retain users and convert them into frequent users of MoMo, an online wallet platform. Can we evaluate the performance of the campaign? Are there any interesting patterns in the data? And finally, can we act on what we find?
The second campaign aims to find out whether users with different MoMo ages fall into different serviceId clusters, and how we can cross-sell to customers.
APPROACH BREAKDOWN
The report is built to answer the following business questions for the first campaign:
The report is built to answer the following business questions for the second campaign:
This is a transactional data set with information on 100k orders made from 2016 to 2018 at multiple marketplaces in Brazil. Its features allow viewing an order from multiple dimensions: from order status, price, payment and freight performance to customer location, product attributes, and reviews written by customers.
APPROACH BREAKDOWN
Approach to E-Commerce Analytics:
The report is built to answer the following business questions:
A fast food chain plans to add a new item to its menu, but it is undecided among 3 possible marketing campaigns and wants to know which one has the greatest effect on sales. The available data includes MarketID, Market Size, Location, Store Age, Promotion, Week, and Sales. We will figure out which promotion is the best in this case.
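One way such a comparison could be set up is a pairwise permutation test on mean sales per promotion. The sketch below is an illustrative assumption, not necessarily the method used in the report; the function name `permutation_p_value` is hypothetical and the sales figures are made up for the example.

```python
import random
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means between two groups."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[: len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up weekly sales (in thousands) per promotion, for illustration only.
    sales = {
        1: [58, 61, 55, 63, 60, 57],
        2: [47, 49, 45, 50, 46, 48],
        3: [55, 59, 54, 60, 57, 56],
    }
    for p, q in combinations(sales, 2):
        print(f"promotion {p} vs {q}: p = {permutation_p_value(sales[p], sales[q]):.4f}")
```

A small p-value for a pair suggests the difference in mean sales is unlikely to be due to chance alone. With three promotions there are three pairwise tests, so a multiple-comparison correction is worth considering.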
APPROACH BREAKDOWN
Approach to A/B Test:
The report is built to answer the following business questions:
Rossmann operates over 3,000 drug stores in 7 European countries. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance.
APPROACH BREAKDOWN
Approach to Sale Forecasting:
The report is built to answer the following business questions:
Are you tired of manually copying and pasting values into a spreadsheet? Do you want to learn how to obtain interesting, real-time and even rare information from the internet with a simple script? In data science, more and more data comes from external sources, so knowing how to extract and structure that data quickly is an essential skill that will set you apart in the job market.
APPROACH BREAKDOWN
Approach to Web Scraping:
The report is built to answer the following business questions:
DATA INSIGHTS ANALYSIS
Go is an abstract strategy board game popular in East Asian countries. It is played by two players, each aiming to surround more territory than the opponent.
This project is a mini version of Go that I developed for Windows using C++. It is the very first IT project I completed after I started learning to code. Reading back these 'naive' lines of code always brings laughter and memories.
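In Go, surrounding comes down to liberties: a connected group of stones with no empty adjacent points is captured. The sketch below shows one way to find a group and its liberties with a flood fill. It is written in Python for brevity and is not taken from the original C++ code; the board encoding (`.` for empty, `B` and `W` for stones) and the function name are assumptions.

```python
def group_and_liberties(board, row, col):
    """Return the connected group containing (row, col) and its set of liberties.

    `board` is a square grid given as a list of equal-length strings.
    """
    color = board[row][col]
    if color == ".":
        raise ValueError("no stone at that point")
    size = len(board)
    group, liberties = set(), set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        # Visit the four orthogonal neighbours; diagonals do not connect stones in Go.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == color:
                    stack.append((nr, nc))
    return group, liberties


if __name__ == "__main__":
    board = [
        ".W...",
        "WBW..",
        ".W...",
        ".....",
        "..BB.",
    ]
    group, liberties = group_and_liberties(board, 1, 1)
    print(len(group), len(liberties))  # the lone black stone is fully surrounded
```

A group whose liberty set is empty would be removed from the board; the same flood fill can be reused after each move to check the neighbouring opponent groups.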
AWS system for the e-commerce application Cadabra.com
SYSTEM BREAKDOWN
1. Order History:
2. Product Recommendations:
3. Transaction Rate Alarm:
4. Log Analysis:
5. Data Warehousing & Visualization: