Random Forest for Binary Classification: Hands-On with Scikit-Learn
With Python and Google Colab

The Random Forest algorithm belongs to the family of ensemble methods built on Decision Trees. If you want to know more about Decision Trees, you can read my previous article here.

In a Random Forest model, many Decision Trees are built and combined into a forest, which is usually much more accurate than any single tree. While growing each tree, the Random Forest method considers only a random subset of the features when choosing the next split, which increases the diversity of the trees it creates. Because the trees are decorrelated in this way (the best overall feature is not always the one used), the combined model is much less prone to overfitting than a single tree, while maintaining classification accuracy.
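The idea above can be sketched in a few lines of scikit-learn. This is a minimal illustration on synthetic data (the dataset and parameter values here are assumptions for demonstration, not from the tutorial): `max_features` controls the random feature search at each split, and the forest's prediction is a majority vote over its individual trees.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data as an illustrative stand-in
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# max_features="sqrt": at each split, only a random subset of features
# is considered, which is what makes the individual trees different.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt",
                                random_state=0)
forest.fit(X, y)

# Each fitted tree is available individually; the forest combines
# their votes into a single, more stable prediction.
first_tree = forest.estimators_[0]
print(first_tree.predict(X[:1]), forest.predict(X[:1]))
```

Increasing `n_estimators` adds more trees to the vote, which typically stabilizes predictions at the cost of training time.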
The best way to understand the algorithm is to build our own project. If you have a dataset, you can follow along with this tutorial and change only the parts of the code that reference the dataset name and feature names. If you want to use sample data, you can download the breast-cancer.csv dataset from Kaggle, which is the one used in this article.
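As a preview of the kind of pipeline the project will build, here is a hedged sketch of training a Random Forest on breast-cancer data. It uses scikit-learn's built-in `load_breast_cancer` dataset as a convenient stand-in, since the Kaggle CSV's exact column names are not shown here; with the downloaded file you would load it with `pandas.read_csv("breast-cancer.csv")` instead.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Built-in stand-in for the Kaggle breast-cancer.csv dataset
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set to estimate generalization accuracy
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# n_estimators=100 is an illustrative default, not a tuned value
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```

The same three steps (split, fit, score) apply unchanged once you substitute your own dataset and feature columns.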