7 kyu
Building an Adaboost model with Sklearn (Introductory Machine Learning)
272 of 273 | the_Hamster
Description:
Hi there! Welcome to "Building an Adaboost model with Sklearn (Introductory Machine Learning)".
If you get stuck at any point, I strongly recommend you read the relevant documentation: http://scikit-learn.org/stable/index.html
This kata can be broken up into the following steps:
- Import the relevant sklearn libraries (i.e. AdaBoostClassifier, train_test_split)
- Create the "train_ada_boost" function (more details below)
- Split the data (X) into a test set and a training set (you MUST USE sklearn's train_test_split for this)
- TRAIN an AdaBoostClassifier (once again, you MUST USE sklearn for this).
- Your "train_ada_boost" function must return a three-element tuple (more details below).
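As a warm-up for the import and training steps above, here is a minimal sketch of fitting an AdaBoostClassifier on its own. The toy data (`X_demo`, `y_demo`) is hypothetical and not part of the kata. scikit-learn estimators expect a 2D feature matrix, so the toy data is shaped as a single column.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical toy data: one feature column and binary labels.
X_demo = np.arange(20).reshape(-1, 1)
y_demo = (X_demo.ravel() >= 10).astype(int)

# n_estimators caps the number of boosting rounds; random_state makes runs repeatable.
model = AdaBoostClassifier(n_estimators=3, random_state=0)
model.fit(X_demo, y_demo)

# A fitted model can predict labels for any array with the same column layout.
predictions = model.predict(X_demo)
```

Note that boosting may stop early, so a fitted model can hold fewer weak learners than `n_estimators`.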
The "train_ada_boost" function accepts the following arguments:
- X -- This would normally be a dataset (but in this kata it is a 1D numpy array)
- Target -- This is a 1D numpy array consisting of 1's and 0's. This argument is the set of values we are trying to predict with our model
- estimators -- KEYWORD ARGUMENT; if no argument is passed in, the default value should be 3.
- random_seed -- KEYWORD ARGUMENT; if no argument is passed in, the default value should be 0.
Your function should return a three-element tuple consisting of the following values in this exact order:
- A TRAINED AdaBoostClassifier (return the actual model)
- A test set (1D numpy array, which was built by sklearn's train_test_split function)
- A target array (1D numpy array, which was built by sklearn's train_test_split function)
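To see where the test set and target array come from, this short sketch shows the order in which train_test_split returns its pieces. The toy arrays are hypothetical. Reading the "target array" as the test-side targets (`y_test`) is an assumption, since the description does not say which half it means.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, one feature column, alternating labels.
X_demo = np.arange(20).reshape(-1, 1)
y_demo = np.array([0, 1] * 10)

# Return order is: train features, test features, train targets, test targets.
# With default settings, a quarter of the samples go to the test side.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, random_state=0
)

# The same random_state reproduces the same split.
_, X_test_again, _, y_test_again = train_test_split(
    X_demo, y_demo, random_state=0
)
```

Fixing `random_state` is what lets a test suite compare your returned arrays against its own split.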
Details:
- Your model should be trained using the specified number of estimators, with a random state equal to random_seed.
- When splitting the data, be sure to set the random state equal to random_seed.
- ALL other parameters not mentioned here for both 'train_test_split' and 'AdaBoostClassifier' should be left at their default values.
- Even though you are handling Numpy arrays, there is no need to actually manipulate the arrays yourself (let sklearn do it for you!).
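Putting the details above together, one possible shape for the function is sketched below. It is an illustration under assumptions rather than a reference solution: returning the test-side targets as the third element is a guess at what the kata expects, and the demo data at the bottom is hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split


def train_ada_boost(X, Target, estimators=3, random_seed=0):
    # Split with defaults, fixing only the random state.
    X_train, X_test, y_train, y_test = train_test_split(
        X, Target, random_state=random_seed
    )

    # Build and train the model on the training portion only.
    model = AdaBoostClassifier(n_estimators=estimators, random_state=random_seed)
    model.fit(X_train, y_train)

    # Assumption: the "target array" is the test-side targets.
    return model, X_test, y_test


# Hypothetical demo data: one feature column and binary labels.
X_demo = np.arange(20).reshape(-1, 1)
y_demo = (X_demo.ravel() >= 10).astype(int)

result = train_ada_boost(X_demo, y_demo)
trained_model, test_set, test_targets = result
```

The returned model can then be scored against the held-out data with `trained_model.score(test_set, test_targets)`.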
Good Luck!
*Note: in actual machine learning, it is highly unlikely we would ever use AdaBoost with such a small number of estimators. However, building models can be very time consuming, and codewars.com has a 5000ms timeout.
Machine Learning
Algorithms
Tutorials
Data Science
Stats:
Created | Dec 1, 2017 |
Published | Dec 1, 2017 |
Warriors Trained | 1000 |
Total Skips | 26 |
Total Code Submissions | 1918 |
Total Times Completed | 273 |
Python Completions | 272 |
Total Stars | 67 |
% of votes with a positive feedback rating | 71% of 53 |
Total "Very Satisfied" Votes | 32 |
Total "Somewhat Satisfied" Votes | 11 |
Total "Not Satisfied" Votes | 10 |
Total Rank Assessments | 3 |
Average Assessed Rank | 7 kyu |
Highest Assessed Rank | 6 kyu |
Lowest Assessed Rank | 8 kyu |