Towards AI

The leading AI community and content platform focused on making AI accessible to all. Check out our new course platform: https://academy.towardsai.net/courses/beginner-to-advanced-llm-dev



Random Forest for Binary Classification: Hands-On with Scikit-Learn

With Python and Google Colab

Carla Martins
Published in Towards AI · 7 min read · Apr 8, 2022


The Random Forest algorithm belongs to the family of ensemble methods built from Decision Trees. If you want to know more about Decision Trees, you can read my previous article here:

https://pub.towardsai.net/decision-and-classification-tree-cart-for-binary-classification-hands-on-with-scikit-learn-b59474b2c039

In a Random Forest model, multiple Decision Trees are built and combined, and the resulting "forest" is usually much more accurate than any single Decision Tree. While growing each tree, the Random Forest method searches for the next split among a random subset of the features, which increases the diversity of the trees created. When the trees are combined, the result is a model that is far less prone to overfitting (because splits are drawn from random feature subsets rather than always using the single best feature overall), while maintaining classification accuracy.
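The idea above maps directly onto scikit-learn's `RandomForestClassifier`. A minimal sketch on synthetic data (the dataset here is generated just for illustration, not the one used later in this article): `n_estimators` sets how many trees are combined, and `max_features` controls the random subset of features considered at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data, just to illustrate the API
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(
    n_estimators=100,      # number of Decision Trees in the forest
    max_features="sqrt",   # random subset of features tried at each split
    random_state=42,
)
forest.fit(X_train, y_train)
print(f"Test accuracy: {forest.score(X_test, y_test):.2f}")
```

Setting `max_features` below the total number of features is what decorrelates the trees; with all features available at every split, the trees would tend to make the same choices and the ensemble would lose much of its benefit.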

The best way to understand is to build our own project. If you have a dataset of your own, you can follow along with this tutorial and just change the necessary code to use your dataset name and feature names. If you want to try with sample data, you can download the breast-cancer.csv dataset from Kaggle that is used in this article.
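A hedged sketch of the loading step: the exact column names depend on the Kaggle file, so the `"id"` and `"diagnosis"` columns below are assumptions. As a runnable stand-in, scikit-learn also bundles the Wisconsin breast-cancer data via `load_breast_cancer()`.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# With the Kaggle CSV (column names assumed; adjust to your file):
# df = pd.read_csv("breast-cancer.csv")
# X = df.drop(columns=["id", "diagnosis"])
# y = (df["diagnosis"] == "M").astype(int)  # 1 = malignant, 0 = benign

# Stand-in using the copy bundled with scikit-learn:
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target
print(X.shape)  # 569 samples, 30 numeric features
```

Either way, the goal is the same: a numeric feature matrix `X` and a binary target `y` ready to pass to `fit()`.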


Written by Carla Martins

Compulsive learner. Passionate about technology. Speaks C, R, Python, SQL, Haskell, Java and LaTeX. Interested in creating solutions.
