Machine Learning


Evaluation of Regression Model

GitHub Repository:


Project Tasks

Task:

Employees' years of experience and salary information are given.

Years of Experience (x)    Salary (y)
5                          600
7                          900
3                          550
3                          500
2                          400
7                          950
3                          540
10                         1200
6                          900
4                          550
8                          1100
1                          460
1                          400
9                          1000
1                          380

Step 1: Write the linear regression model equation using the given bias and weight.

Step 2: Use the model equation to predict the salary for every years-of-experience value in the table.

Step 3: Calculate the MSE, RMSE, and MAE scores to measure the model's performance.
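The three steps can be sketched in plain Python as follows. The bias and weight are not stated in this text, so `bias = 275` and `weight = 90` below are placeholder assumptions for illustration only; substitute the values given with the task. The function names `predict` and `regression_metrics` are likewise hypothetical.

```python
import math

# Data from the table above: years of experience (x) and salary (y).
x = [5, 7, 3, 3, 2, 7, 3, 10, 6, 4, 8, 1, 1, 9, 1]
y = [600, 900, 550, 500, 400, 950, 540, 1200, 900, 550, 1100, 460, 400, 1000, 380]

# ASSUMPTION: placeholder parameters, not taken from this text.
bias = 275
weight = 90


def predict(x_values, bias, weight):
    """Step 1 and 2: apply the linear model y_hat = bias + weight * x."""
    return [bias + weight * xi for xi in x_values]


def regression_metrics(y_true, y_pred):
    """Step 3: return (MSE, RMSE, MAE) for paired true and predicted values."""
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e ** 2 for e in errors) / n   # mean squared error
    rmse = math.sqrt(mse)                   # root of the MSE
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    return mse, rmse, mae


y_pred = predict(x, bias, weight)
mse, rmse, mae = regression_metrics(y, y_pred)
print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

RMSE and MAE are in the same unit as the salary, which makes them easier to interpret than MSE; RMSE penalizes large errors more heavily than MAE.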

Requirements


Evaluation of Classification Model

GitHub Repository:


Project Tasks

Task 1:

A classification model has been created to predict whether a customer will churn. The actual values of 10 test observations and the probabilities predicted by the model are given below.

Observation    Actual Value    Model Probability Estimate (probability of belonging to class 1)
1              1               0.7
2              1               0.8
3              1               0.65
4              1               0.9
5              1               0.45
6              1               0.5
7              0               0.55
8              0               0.35
9              0               0.4
10             0               0.25

                               Model Prediction
                               Non-Churn (0)    Churn (1)    Total
Actual Value    Non-Churn (0)  ?                ?            ?
                Churn (1)      ?                ?            ?
                Total          ?                ?
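Filling in the confusion matrix requires turning each probability into a class label. The sketch below assumes a decision threshold of 0.5 and treats a probability equal to the threshold as class 1; neither choice is stated in this text, so adjust both if the task specifies otherwise. The function name `confusion_counts` is hypothetical.

```python
# Actual labels and predicted probabilities from the table above.
actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.7, 0.8, 0.65, 0.9, 0.45, 0.5, 0.55, 0.35, 0.4, 0.25]

# ASSUMPTION: the threshold is not given in this text.
THRESHOLD = 0.5


def confusion_counts(actual, probabilities, threshold):
    """Count TN, FP, FN, TP after labeling p >= threshold as class 1."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    counts = {"TN": 0, "FP": 0, "FN": 0, "TP": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 1 and p == 0:
            counts["FN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["TN"] += 1
    return counts


counts = confusion_counts(actual, probabilities, THRESHOLD)

# Print in the same layout as the matrix: rows are actual, columns are predicted.
print("Actual 0:", counts["TN"], counts["FP"])
print("Actual 1:", counts["FN"], counts["TP"])
```

Changing `THRESHOLD` shifts observations between the cells, so the matrix is only meaningful together with the threshold used to build it.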

Task 2:

A classification model has been created to detect fraudulent transactions made through the bank. Its 90.5% accuracy was considered sufficient, and the model was deployed to production. After deployment, however, the model's output was not as expected, and the business unit reported that the model was unsuccessful. The confusion matrix of the model's predictions is given below.

                               Model Prediction
                               Non-Fraud (0)    Fraud (1)    Total
Actual Value    Non-Fraud (0)  900              90           990
                Fraud (1)      5                5            10
                Total          905              95
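Accuracy alone says little about how the model handles the rare fraud class. As a minimal sketch, the standard accuracy, precision, recall, and F1 formulas can be applied to the four cells of the matrix above; the function name `classification_scores` is hypothetical.

```python
# Cells of the confusion matrix above (fraud is the positive class).
tn, fp = 900, 90   # actual non-fraud row
fn, tp = 5, 5      # actual fraud row


def classification_scores(tn, fp, fn, tp):
    """Return accuracy, precision, recall and F1 for the positive class."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


scores = classification_scores(tn, fp, fn, tp)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With these counts, recall on the fraud class is 5/10 and precision is 5/95. The 90.5% accuracy hides both because 990 of the 1,000 transactions are non-fraud.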

Requirements


Telco Customer Churn Prediction

GitHub Repository:


Business Problem
Develop a machine learning model that can predict customers who will leave the company.

Perform the necessary data analysis and feature engineering steps before developing the model.

Dataset Story
The Telco churn data contains information about a fictitious telecom company that provided home phone and internet services to 7,043 customers in California in the third quarter. It shows which customers have left, stayed, or signed up for the service.

Variables


📝 Notes

Requirements


House Price Prediction Model

GitHub Repository:


Business Problem
Build a machine learning model that predicts the prices of different types of houses, using a dataset that contains the features and sale price of each house.

Dataset Story
This dataset of residential homes in Ames, Iowa contains 79 explanatory variables. The project is also a Kaggle competition; the dataset and the competition page are available at the link below. Because the dataset belongs to a Kaggle competition, it comes as two separate CSV files: train and test. House prices are left blank in the test dataset, and you are expected to predict them.

https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/overview/evaluation
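The linked evaluation page scores submissions by the RMSE between the logarithm of the predicted price and the logarithm of the observed sale price. A minimal sketch of that metric follows; the function name `rmse_of_logs` and the sample prices are made up for illustration and do not come from the dataset.

```python
import math


def rmse_of_logs(actual_prices, predicted_prices):
    """RMSE between log(predicted) and log(actual); prices must be positive."""
    if len(actual_prices) != len(predicted_prices):
        raise ValueError("both lists must have the same length")
    squared_errors = [
        (math.log(p) - math.log(a)) ** 2
        for a, p in zip(actual_prices, predicted_prices)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# ASSUMPTION: made-up prices, only to show how the function is called.
actual = [200000, 150000, 320000]
predicted = [210000, 140000, 300000]
score = rmse_of_logs(actual, predicted)
print(f"RMSE of logs: {score:.4f}")
```

Taking logarithms means an error on an expensive house and the same relative error on a cheap house affect the score equally.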

Variables

Requirements


Talent Hunting Classification with Machine Learning using SCOUTIUM's Dataset

GitHub Repository:


Business Problem
Predict which class (average, highlighted) each football player belongs to, based on the scores that scouts gave to the player's attributes.

Dataset Story
The dataset comes from Scoutium and contains the attributes of football players observed in matches, together with the scores that scouts assigned to those attributes.

Variables

scoutium_attributes.csv scoutium_potential_labels.csv

Requirements


Customer Segmentation with Unsupervised Learning using FLO's Dataset

GitHub Repository:


Business Problem
FLO wants to divide its customers into segments and determine marketing strategies according to these segments. To this end, customers' behaviors will be defined and groups will be created based on clusters in these behaviors.

Dataset Story
The dataset consists of information derived from the past shopping behavior of customers who made their last purchases from FLO in 2020-2021 as OmniChannel shoppers (shopping both online and offline).

Variables

Requirements