Intuition
In a voting ensemble, each model is trained independently on the same training data or different subsets of the training data. Once trained, these models make predictions on unseen data. The ensemble then combines these predictions using a voting mechanism to arrive at a final prediction.
The intuition behind a voting ensemble is that individual models may have different biases, strengths, or weaknesses. By combining their predictions, the ensemble can leverage the diversity and collective knowledge of the models to make more accurate and reliable predictions. It can help mitigate the impact of individual model errors and improve overall performance.
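For instance, consider three models that each misclassify a different sample. The following minimal, self-contained sketch (with made-up predictions) shows how a majority vote can outvote each individual error:
# Toy illustration (made-up predictions): three models classify five samples.
# Each model makes a different mistake, but the majority vote recovers every true label.
from collections import Counter

true_labels = [0, 1, 1, 0, 1]
model_a =     [0, 1, 1, 0, 0]  # wrong on the last sample
model_b =     [0, 1, 0, 0, 1]  # wrong on the third sample
model_c =     [1, 1, 1, 0, 1]  # wrong on the first sample

ensemble = [Counter(votes).most_common(1)[0][0]
            for votes in zip(model_a, model_b, model_c)]
print(ensemble)  # [0, 1, 1, 0, 1] -- matches the true labels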
Underlying Assumptions
Independent Models: Each model is trained independently, without direct influence from the others, and the models make uncorrelated or weakly correlated errors. This lets the ensemble cancel out individual mistakes and improve overall accuracy (the sketch after this list makes the effect concrete).
Diversity: The ensemble benefits from diverse models with different learning strategies, architectures, or parameter settings, capturing varied aspects of the data and reducing biases and overfitting.
Competence: Individual models in the ensemble possess predictive power and accuracy, contributing to the ensemble's overall performance.
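To see why independence matters, consider a hypothetical case of three binary classifiers that are each correct 80% of the time and err independently of one another. The majority vote is correct whenever at least two of the three are correct:
# Hypothetical: three independent binary classifiers, each correct with p = 0.8.
# The majority vote is right when at least 2 of the 3 are right.
p = 0.8
p_majority = 3 * p**2 * (1 - p) + p**3   # exactly two correct, or all three correct
print(round(p_majority, 3))  # 0.896 -- higher than any single model's 0.8
If the models' errors were perfectly correlated, the majority vote would do no better than a single model, which is why independence and diversity are listed as assumptions.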
Types
Majority (hard) voting: Each model predicts a class label, and the label with the highest count among the predictions becomes the final prediction. For example, if three models predict [0, 1, 1], label 1 has the highest count, so the ensemble predicts 1. In scikit-learn: majority = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='hard')
Weighted voting: Each model's prediction is assigned a weight based on its performance or confidence level, and the final prediction is a weighted average or weighted sum of the individual predictions. For example, with predictions [0.8, 0.6, 0.9] and weights [0.4, 0.3, 0.6], the final prediction is (0.4 * 0.8) + (0.3 * 0.6) + (0.6 * 0.9) = 1.04. In scikit-learn: weighted = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='soft', weights=[2, 1, 1])
Soft voting: Used when models output class probabilities or confidence scores rather than discrete class labels. The probabilities are averaged across the models, and the class with the highest average probability is selected as the final prediction. For example, if three models' predicted probabilities for class 0 are [0.7, 0.6, 0.8] and for class 1 are [0.3, 0.4, 0.2], class 0 has the higher average probability (0.7 vs. 0.3) and is selected. In scikit-learn: soft = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='soft')
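To make the arithmetic behind soft and weighted voting concrete, here is a minimal NumPy sketch using the made-up probabilities and the [2, 1, 1] weights from the examples above; scikit-learn's VotingClassifier with voting='soft' performs essentially this averaging over each estimator's predicted probabilities.
import numpy as np

# Made-up per-model probabilities for classes [0, 1] (one row per model)
probas = np.array([[0.7, 0.3],
                   [0.6, 0.4],
                   [0.8, 0.2]])
weights = np.array([2, 1, 1])  # same weights as in the VotingClassifier example

# Soft voting: unweighted average of probabilities, then pick the largest
soft_pred = np.argmax(probas.mean(axis=0))

# Weighted voting: weighted average of probabilities, then pick the largest
weighted_pred = np.argmax(np.average(probas, axis=0, weights=weights))

print(soft_pred, weighted_pred)  # both predict class 0 here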
Code
Import the dataset, the base models, and the VotingClassifier, along with utilities for splitting, scaling, and scoring.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
Preparing data for model training
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Apply data scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
Define the three base classifiers and combine them with each of the three voting mechanisms described above.
# Define the individual classifiers
classifier1 = LogisticRegression(max_iter=1000)
classifier2 = KNeighborsClassifier()
classifier3 = DecisionTreeClassifier()
# Define the voting classifier using different voting mechanisms
majority = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='hard')
weighted = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='soft', weights=[2, 1, 1])
soft = VotingClassifier(estimators=[('lr', classifier1), ('knn', classifier2), ('dt', classifier3)], voting='soft')
# Train the voting classifiers
majority.fit(X_train_scaled, y_train)
weighted.fit(X_train_scaled, y_train)
soft.fit(X_train_scaled, y_train)
# Make predictions using the voting classifiers
y_pred_majority = majority.predict(X_test_scaled)
y_pred_weighted = weighted.predict(X_test_scaled)
y_pred_soft = soft.predict(X_test_scaled)
# Calculate and print the accuracy scores
accuracy_majority = accuracy_score(y_test, y_pred_majority)
accuracy_weighted = accuracy_score(y_test, y_pred_weighted)
accuracy_soft = accuracy_score(y_test, y_pred_soft)
print("Accuracy (Majority Voting): {:.2f}".format(accuracy_majority))
print("Accuracy (Weighted Voting): {:.2f}".format(accuracy_weighted))
print("Accuracy (Soft Voting): {:.2f}".format(accuracy_soft))