This project provides an implementation of the Extra Trees Classifier algorithm from scratch, without libraries such as scikit-learn. We'll build the classifier component by component, working through the details of how this ensemble learning technique operates.
By constructing each piece step by step, we'll see what makes Extra Trees both simple and effective. Our journey begins with clear pseudocode, ensuring a well-structured and efficient implementation.
Let's dive into the core algorithm.
Initialization
Input: n_estimators, max_depth, min_samples_split
Output: Initialized ETC model
1. Set n_estimators, max_depth, and min_samples_split from the input parameters.
2. Initialize the array to store the decision trees.
3. Return the initialized Extra Trees model.
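The initialization steps above can be sketched as a constructor. This is a minimal illustration, not necessarily the exact code in the repo; the default values shown are assumptions.

```python
class ExtraTreesClassifier:
    def __init__(self, n_estimators=100, max_depth=None, min_samples_split=2):
        # 1. Store the hyperparameters supplied by the caller.
        self.n_estimators = n_estimators
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        # 2. Initialize the array that will hold the decision trees.
        self.trees = []
```

With this in place, `fit` only has to append trained trees to `self.trees`, and `predict` only has to iterate over it.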
Training (fit)
Input: Training data (X_train, y_train)
Output: Ensemble of decision trees
1. Loop as many times as n_estimators:
a. Take a random sample with replacement from the training data.
b. Build a decision tree on the sampled subset.
c. Add the decision tree to the ensemble.
2. Return the ensemble of decision trees.
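The training loop above can be sketched as follows. This is a hedged illustration, not the repo's actual code: `build_extra_tree` and the dict-based tree representation are assumptions, and `fit` is written as a free function over a model object for brevity. The tree builder shows the defining trait of Extra Trees, picking a random feature and a random cut-point rather than searching for the best split.

```python
import numpy as np
from collections import Counter

def build_extra_tree(X, y, max_depth, min_samples_split, depth=0):
    # Stop when the node is pure, too small, or too deep: return a leaf
    # holding the majority class (a simplified stopping rule).
    if (len(set(y)) == 1 or len(y) < min_samples_split
            or (max_depth is not None and depth >= max_depth)):
        return {"leaf": Counter(y).most_common(1)[0][0]}
    rng = np.random.default_rng()
    # Extra Trees: choose a random feature and a random threshold
    # instead of optimizing the split.
    feature = rng.integers(X.shape[1])
    lo, hi = X[:, feature].min(), X[:, feature].max()
    if lo == hi:  # feature is constant here; cannot split
        return {"leaf": Counter(y).most_common(1)[0][0]}
    threshold = rng.uniform(lo, hi)
    mask = X[:, feature] < threshold
    if not mask.any() or mask.all():  # degenerate split; make a leaf
        return {"leaf": Counter(y).most_common(1)[0][0]}
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_extra_tree(X[mask], y[mask], max_depth,
                                 min_samples_split, depth + 1),
        "right": build_extra_tree(X[~mask], y[~mask], max_depth,
                                  min_samples_split, depth + 1),
    }

def fit(model, X_train, y_train):
    X_train, y_train = np.asarray(X_train), np.asarray(y_train)
    rng = np.random.default_rng()
    for _ in range(model.n_estimators):
        # a. Draw a random sample with replacement from the training data.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        # b. Build a randomized decision tree on the sampled subset.
        tree = build_extra_tree(X_train[idx], y_train[idx],
                                model.max_depth, model.min_samples_split)
        # c. Add the tree to the ensemble.
        model.trees.append(tree)
    return model
```

Because each tree sees a different random sample and uses random splits, the individual trees are weak but decorrelated, which is what makes the ensemble effective.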
Prediction (predict)
Input: Test data (X_test)
Output: Class prediction for each sample in X_test
1. Loop for each sample in X_test:
a. Perform prediction using each tree in the ensemble.
b. Collect the prediction results from all trees.
2. Perform majority voting for each sample:
a. Calculate the frequency of each class based on the prediction results.
b. Select the class with the highest frequency as the final prediction.
3. Return the class prediction for each sample in X_test.
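The prediction steps above can be sketched as follows. Again a hedged illustration: it assumes the dict-based tree representation (a `"leaf"` key for leaves, `"feature"`/`"threshold"`/`"left"`/`"right"` for internal nodes), which may differ from the repo's actual structures.

```python
import numpy as np
from collections import Counter

def predict_tree(tree, x):
    # Walk down the tree until a leaf is reached.
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] < tree["threshold"] else tree["right"]
    return tree["leaf"]

def predict(model, X_test):
    X_test = np.asarray(X_test)
    predictions = []
    for x in X_test:
        # 1a/1b. Collect one vote per tree in the ensemble.
        votes = [predict_tree(tree, x) for tree in model.trees]
        # 2a/2b. Majority voting: the most frequent class wins.
        predictions.append(Counter(votes).most_common(1)[0][0])
    return predictions
```

Majority voting makes the ensemble robust: a single mislabeling tree is outvoted as long as most trees agree.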
- Clone the repo.
- Use the dummy dataset or the dataset you want to train on.
- Import the package:

  ```python
  from ExtraTree import ExtraTreesClassifier
  ```

- Example usage:

  ```python
  # Training
  etc = ExtraTreesClassifier(n_estimators=1, max_depth=2, min_samples_split=5)
  etc.fit(X_train, y_train)

  # Predict
  predictions = etc.predict(X_test)
  ```