ML Intro using Decision Trees and Random Forest


Using Python, Jupyter and SKlearn

Course date and duration: July 13th, 9:30 – 14:00 (30-minute break included)
Trainer: Tudor Lapusan
Location: Impact Hub Bucuresti, Timpuri Noi area
Price: 200 RON (including VAT)
Number of places: 20 (no more places left)
Languages: Python


Getting started with Machine Learning can seem like a hefty task: understanding the algorithms, learning a bit of programming, deciding which libraries to use, getting some data to learn from, and so on. In reality, if you set your expectations right and are willing to start small and learn step by step, learning the basics of ML is quite doable. After that, it is up to you to take your knowledge into the real world, apply it, and expand on what you have learned.

This workshop aims to introduce you to the world of ML and to teach you how to solve classification and regression problems using decision tree and random forest algorithms. We will go from theory to hands-on practice in just a couple of hours, aiming mostly to help you understand the main pipeline of an ML project while, of course, learning a bit of ML:

    • Software and hardware requirements for an ML project
    • Common Python libraries for data analysis
    • Feature encoding and feature preprocessing
    • Exploratory Data Analysis (EDA)
    • Model validation
    • Model hyperparameter optimization
    • Tree based models for classification and regression:
      • Decision Tree
      • Random Forest
    • Iterative model improvement
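
To give a rough idea of how the validation and hyperparameter optimization steps listed above fit together, here is a minimal sketch with scikit-learn. It is not the workshop material: the Iris dataset, the parameter grid and the variable names are assumptions chosen only for illustration, and the sketch skips feature encoding and EDA because Iris has only numeric features.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris stands in for the workshop dataset, which is not named here.
X, y = load_iris(return_X_y=True)

# Hold out a test set that is only used once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Model validation: 5-fold cross-validation of a baseline decision tree.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree_scores = cross_val_score(tree, X_train, y_train, cv=5)
print(f"Decision tree CV accuracy: {tree_scores.mean():.3f}")

# Hyperparameter optimization: a small grid search over a random forest.
# The grid values are illustrative, not recommendations.
param_grid = {"n_estimators": [25, 50], "max_depth": [2, 4, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)

# Final check of the tuned model on the held-out data.
test_accuracy = search.score(X_test, y_test)
print(f"Random forest test accuracy: {test_accuracy:.3f}")
```

Iterative model improvement then amounts to repeating this loop: change the features or the grid, re-run the cross-validation, and compare the scores before touching the test set again.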

1. ML general intro (theory)

  • What is Machine Learning (ML): supervised/unsupervised learning, best-known algorithms
  • Machine Learning use cases
  • Existing languages and libraries for ML
  • ML cloud solutions
  • Most relevant ML online courses that you should know about

2. Classification and Regression (theory and hands-on session)

    Decision Tree and Random Forest (classification and regression)

  • Intro to Python, Pandas, Sklearn
  • Dataset description
  • Preparing your data for ML
  • Model validation
  • Explain what a Decision Tree is
    • How it is built and how it works
    • Explain node information: node impurity, node samples
    • Decision Tree hyperparameters
      • Explain the most important ones
      • Tips and tricks
    • Decision Tree structure interpretation
    • Prediction interpretation
  • Explain what a Random Forest is
    • How it is built and how it works
    • Explain where the name "Random" comes from, and its advantages/disadvantages
    • Random Forest hyperparameters
      • Explain the most important ones
      • Tips and tricks
    • Random Forest structure interpretation
    • Prediction interpretation
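
As a small preview of two ideas from the outline above, the sketch below computes Gini impurity for a node (Gini is scikit-learn's default criterion for classification trees) and shows the two usual sources of randomness in a random forest: bootstrap sampling of rows and a random subset of features considered as split candidates. It uses only the Python standard library; the function names and the toy node are hypothetical and are not taken from the workshop notebooks.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum(p_k ** 2) over the class shares p_k."""
    n = len(labels)  # this is the "node samples" value shown for a tree node
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Sample-weighted impurity of the two children produced by a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def bootstrap_indices(n_samples, rng):
    """Row indices drawn with replacement, one draw per original row."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, k, rng):
    """k distinct feature indices to consider as split candidates."""
    return sorted(rng.sample(range(n_features), k))


# Hypothetical node holding 6 samples from two classes.
node = ["yes", "yes", "yes", "no", "no", "no"]
print(gini_impurity(node))                  # 0.5: evenly mixed node
print(split_impurity(node[:3], node[3:]))   # 0.0: both children are pure

# Each tree in a forest sees a different bootstrap sample and feature subset.
rng = random.Random(0)
print(bootstrap_indices(len(node), rng))
print(random_feature_subset(n_features=4, k=2, rng=rng))
```

A tree is grown by repeatedly choosing the split with the lowest weighted child impurity; a forest averages many such trees, each trained on its own bootstrap sample.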
Hands-on exercises

To better understand all the phases involved in developing an ML project, we will work with practical code examples on an open dataset. We will use the Python programming language, Jupyter notebook and the well-known Sklearn ML library.

Prior to the workshop, participants will receive everything needed for the workshop setup, including a GitHub link and the working dataset (an open dataset will be used). We intend to work in the cloud so that no local setup is required: we will use Lentiq for sharing the practical notebooks and running the exercises.

Workshop requirements
  • Python
  • Sklearn, Pandas
  • Jupyter notebook
  • GitHub account

Python is an easy language to learn, but some basic knowledge is a plus. The same goes for Sklearn, Pandas and Jupyter notebook.

The price for the workshop is 200 RON (including VAT).

1. Complete registration form: