Insights

Tue 08 Oct 2013

tags: r presentations machine learning rstats

I had the pleasure of presenting to the Dublin R User Group on machine learning. My talk source and examples are in a GitHub repo. The talk was an introduction to machine learning for the group, and I hope to follow it up with more intermediate topics.

PDF version of this talk

Overview

The techniques covered included:

Data transformation
Model building
Model assessment and selection
Interpreting a confusion matrix
Interpreting a ROC plot
Approaches to handling prediction errors
Addressing feature selection
Kaggle and data science competitions
A review of various ML techniques:
- Association rules
- Decision trees
- Random forests
- k-nearest neighbors
- Neural networks
- Support vector machines
- Naive Bayes
OpenRefine and Rattle
Useful command line tools for a data scientist

Background

The classification of objects into groups is an everyday task for humans, and this talk highlighted how to develop models that allow machines to do the same. It gave a quick overview of machine learning algorithms and how they work, so that people would know when and how best to apply them. Machine learning, and particularly feature generation for its models, is often seen as a “black art” because it requires domain expertise. The talk aimed to show how R can be used in selecting, assessing and creating models, and it used four datasets as examples of exploratory data analysis.
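
To give a flavour of that workflow, here is a minimal sketch in base R of building a model and assessing it with a confusion matrix. It is not taken from the talk: the built-in `iris` data, the 70/30 split, the choice of logistic regression and the 0.5 cut-off are all assumptions made for illustration.

```r
# Minimal sketch: fit a classifier and read its confusion matrix (base R only).
set.seed(42)

# Assumption: two iris species stand in for the datasets used in the talk.
flowers <- droplevels(subset(iris, Species != "setosa"))

# Hold out roughly 30% of the rows for testing.
test_idx <- sample(nrow(flowers), size = round(0.3 * nrow(flowers)))
train <- flowers[-test_idx, ]
test  <- flowers[test_idx, ]

# Logistic regression: models the probability that a flower is virginica
# (the second factor level is treated as the "success" class).
fit <- glm(Species ~ Petal.Length + Petal.Width, data = train, family = binomial)
prob <- predict(fit, newdata = test, type = "response")

# Assumption: a 0.5 cut-off turns probabilities into class labels.
pred <- factor(ifelse(prob > 0.5, "virginica", "versicolor"),
               levels = levels(test$Species))

# Confusion matrix: rows are predictions, columns are the true classes.
confusion <- table(predicted = pred, actual = test$Species)
print(confusion)

# Accuracy is the share of test cases on the diagonal.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The off-diagonal cells are the prediction errors. Moving the cut-off trades one kind of error for the other, which is the trade-off a ROC plot summarises.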

Introduction to Machine Learning with R

Dublin R User Group

Talk slides