AdaBoost
This blog post will provide you with a comprehensive overview of AdaBoost, exploring the theory behind this boosting algorithm and demonstrating its implementation using Python libraries. Dive in to uncover the advantages and disadvantages of AdaBoost, as well as its real-world applications across various domains. With that, enjoy your journey in QDO!
What is AdaBoost?
AdaBoost (Adaptive Boosting) is an ensemble learning technique that combines multiple weak classifiers (often decision trees) to create a strong classifier. It works by training the weak classifiers sequentially, giving more weight to misclassified instances at each step so that subsequent classifiers focus more on the harder cases. The final prediction is made by combining the weighted votes of all weak classifiers. AdaBoost is effective at reducing bias and variance, and it’s particularly good for binary classification problems. However, it can be sensitive to noisy data and outliers.
Concepts of AdaBoost
In the forest of AdaBoost, each tree is made up of only one node and two leaves.
This tree is called a "stump", and it represents a weak learner because a single stump is not great at making accurate decisions on its own.
There are a few core ideas in how AdaBoost operates that set this algorithm apart from the others:
- It combines weak learners known as stumps
- Some stumps get more say in the final classification than others
- Each stump is built by taking the errors of the previous stump into account
Let's say we have a Heart Disease dataset, displayed below.
We first add a sample weight column and assign a weight to every record.
Initially, all the records carry the same weight, but these weights will shift once the first stump is created.
The first stump is created using the attribute with the lowest Gini index, as sketched below.
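As a rough sketch of these first two steps (equal starting weights and picking the attribute with the lowest Gini index), here is a small example; the yes/no attribute and label values below are hypothetical stand-ins for the Heart Disease table, not the actual data:

import numpy as np

# Every record starts with the same sample weight (1/8 for eight records)
n_records = 8
sample_weights = np.full(n_records, 1 / n_records)

def gini_impurity(labels):
    # Gini impurity of a single leaf: 1 - sum of squared class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_index(feature, labels):
    # Weighted Gini index of a stump that splits on a yes/no attribute
    score = 0.0
    for value in np.unique(feature):
        mask = feature == value
        score += mask.sum() / len(labels) * gini_impurity(labels[mask])
    return score

# Hypothetical attribute (e.g. chest pain) and label (has heart disease)
chest_pain = np.array([1, 1, 1, 1, 0, 0, 0, 0])
has_disease = np.array([1, 1, 1, 0, 1, 0, 0, 0])
print(gini_index(chest_pain, has_disease))  # the attribute with the lowest Gini index becomes the first stump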
We first calculate the total error created by this stump, which in this context is 1/8 (one of the eight records is misclassified). We then calculate the amount of say using the formula below.
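The standard AdaBoost formula for the amount of say is one half times the natural log of (1 - Total Error) / Total Error. A minimal sketch of the calculation for a total error of 1/8:

import numpy as np

total_error = 1 / 8  # one of the eight records is misclassified by the stump

# Amount of say = 0.5 * ln((1 - total_error) / total_error)
amount_of_say = 0.5 * np.log((1 - total_error) / total_error)
print(round(amount_of_say, 3))  # roughly 0.973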
If the amount of say is plotted against the total error, with total error on the x-axis and amount of say on the y-axis, the relationship looks like this: a low total error gives a large amount of say, while a high total error gives a low (or even negative) amount of say.
As mentioned earlier, AdaBoost creates the next stump by focusing on the errors of the previous one. Hence, we increase the sample weight of the records that the stump misclassified, using the formula below.
The records that were predicted correctly are given a lower weight, using the formula below.
The new weights are then normalized so that they sum to 1, and they replace the old sample weights.
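A minimal sketch of the weight update and normalization, using the standard AdaBoost rules (multiply by e^say for misclassified records and by e^-say for correctly classified ones) and assuming a single misclassified record as in the example above:

import numpy as np

n_records = 8
weights = np.full(n_records, 1 / n_records)        # every record starts at 1/8
misclassified = np.array([False] * 7 + [True])     # assume only the last record was wrong
amount_of_say = 0.973

# Increase the weight of the misclassified record, decrease the rest
new_weights = np.where(misclassified,
                       weights * np.exp(amount_of_say),
                       weights * np.exp(-amount_of_say))

# Normalize so that the new weights sum to 1
new_weights /= new_weights.sum()
print(new_weights.round(3))  # the misclassified record now carries about half of the total weight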
Next, a new dataset of the same size is created and used to build the next stump.
A value between 0 and 1 is drawn at random. Let's say the value chosen is 0.72. Since 0.72 falls within the cumulative-weight range of the 5th record in the old dataset, the 5th record becomes the first record of the new dataset. The process repeats until the new dataset is the same size as the old one (see the sketch below).
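A sketch of this weighted resampling step, assuming the hypothetical normalized weights from the sketch above; np.searchsorted on the cumulative weights plays the role of finding which record's range each random value falls into:

import numpy as np

rng = np.random.default_rng(0)
new_weights = np.array([0.0714] * 7 + [0.5])
new_weights /= new_weights.sum()              # make sure the weights sum to exactly 1

cumulative = np.cumsum(new_weights)           # each record owns a slice of the interval [0, 1]
draws = rng.random(len(new_weights))          # one random value between 0 and 1 per new record
chosen = np.searchsorted(cumulative, draws)   # index of the record whose slice contains each draw

print(chosen)  # records with larger weights tend to be drawn more often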
When the model receives a test record, the record is passed through all of the stumps, the amount of say is accumulated for each classification, and the class with the highest total say becomes the final prediction.
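A sketch of how that final vote could be accumulated in code; the function below and its arguments are hypothetical, assuming a list of fitted stumps (each with a predict method) and their amounts of say:

def adaboost_predict(stumps, says, x):
    # Accumulate the amount of say for each predicted class and return the winner
    votes = {}
    for stump, say in zip(stumps, says):
        label = stump.predict(x)[0]            # x is a single record shaped (1, n_features)
        votes[label] = votes.get(label, 0.0) + say
    return max(votes, key=votes.get)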
Implementation of AdaBoost in Python
Importing libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
Loading the dataset
iris = load_iris()
Separating the features and the target
X = iris.data
y = iris.target
Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
Creating the AdaBoost classifier
base_estimator = DecisionTreeClassifier(max_depth=1)
adaboost = AdaBoostClassifier(estimator=base_estimator, n_estimators=30,
                              learning_rate=0.5, random_state=42)
Training the model
adaboost.fit(X_train, y_train)
Making predictions on the test set
y_pred = adaboost.predict(X_test)
Evaluating the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.4f}')
Accuracy: 0.9667
Parameters that you can tune in AdaBoost
n_estimators: The number of weak learners (usually decision trees) to be combined. Increasing this value can improve performance but may lead to overfitting.
learning_rate: A weight applied to each classifier’s contribution. A smaller learning rate requires more estimators to achieve the same performance, but it helps in preventing overfitting.
base_estimator: The weak learner to be used (by default, a decision tree with a maximum depth of 1). You can change it to other models, like deeper trees or linear models. Note that recent versions of scikit-learn call this parameter estimator, which is why the code above passes estimator=.
algorithm: Specifies the boosting algorithm, either "SAMME" (multiclass) or "SAMME.R" (real boosting, using probabilities from weak learners). "SAMME.R" is usually faster and performs better.
random_state: Controls the randomness for reproducibility.
max_depth, min_samples_split, min_samples_leaf (if using a decision tree as the base estimator): These control the complexity of the individual weak learners.
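As a sketch of tuning several of these parameters at once with scikit-learn's GridSearchCV (assuming a recent scikit-learn where the weak learner argument is called estimator, so its depth is reachable as estimator__max_depth):

from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate values for the number of stumps, the learning rate and the stump depth
param_grid = {
    'n_estimators': [25, 50, 100],
    'learning_rate': [0.1, 0.5, 1.0],
    'estimator__max_depth': [1, 2],
}

grid = GridSearchCV(
    AdaBoostClassifier(estimator=DecisionTreeClassifier(), random_state=42),
    param_grid,
    cv=5,
)
grid.fit(X, y)

print(grid.best_params_)
print(f'Best cross-validated accuracy: {grid.best_score_:.4f}')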
Advantages and disadvantages of AdaBoost
Advantages
1) High Accuracy
Combines multiple weak classifiers to build a strong model, often achieving high accuracy.
2) Adaptability
Focuses on hard-to-classify instances by giving them more weight, improving performance on challenging data points.
3) Versatile Base Learners
Can use various weak classifiers, though decision trees are most common.
Disadvantages
1) Sensitivity to Noisy Data and Outliers
Since it increases the weight of misclassified points, noisy data or outliers can overly influence the model.
2) Computationally Intensive
With many estimators, training can be slow.
3) Dependency on Weak Learner Performance
If the weak learner is too complex (like deep trees), it can lead to overfitting.
Implementation of AdaBoost in real life
1. Fraud Detection
Companies like PayPal use AdaBoost to detect fraudulent transactions. It helps identify unusual patterns in real-time by giving more focus to hard-to-classify transactions.
2. Face Detection
Apple has utilized AdaBoost in face detection algorithms, especially in earlier versions of its face recognition technology. It efficiently combines simple classifiers to detect human faces in images.
3. Customer Churn Prediction
Telecom companies like Verizon and e-commerce platforms like Amazon use AdaBoost to predict customer churn. It helps identify users likely to leave the service by analyzing historical data patterns.