This repository contains the code for the University of Edinburgh School of Informatics course Data Mining and Exploration [INFR11007].
In this course we will be using Python 3 and the interactive notebook application Jupyter for all labs. Basic knowledge of Python, NumPy, and working with notebooks in the Jupyter environment is assumed. If you haven't used Python before, you are strongly advised to familiarise yourself with basic Python syntax and with the Jupyter environment. There are many excellent tutorials available on the web, and you can choose the ones you like most. If you are not sure which to choose, these are good starting points:
- Introduction to Python for scientific computing
- Introduction to Jupyter notebooks
The main packages that we will use are the following:
- numpy: scientific computing with array objects
- pandas: data structures and data analysis tools
- scikit-learn: machine learning library implementing many learning algorithms and useful tools
- matplotlib: plotting library (similar to MATLAB's plot interface)
- seaborn: data visualisation library built on top of matplotlib
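
To confirm that the packages listed above are importable in your environment, something like the following minimal sketch may help. The helper name `check_packages` is hypothetical and is not part of the course code. The only assumption about the libraries is their import names (note that scikit-learn is imported as `sklearn`).

```python
from importlib import import_module
from importlib.util import find_spec

# Import names of the course packages; scikit-learn is imported as "sklearn".
COURSE_PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]


def check_packages(names=COURSE_PACKAGES):
    """Map each import name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration only; not part of the course code.
    """
    versions = {}
    for name in names:
        # find_spec returns None when no top-level module of that name is found.
        if find_spec(name) is None:
            versions[name] = None
            continue
        try:
            module = import_module(name)
        except ImportError:
            # Found on disk but failed to import (e.g. a broken install).
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in check_packages().items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Any package reported as `NOT INSTALLED` should be added by following the environment setup instructions for the course.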
Detailed instructions for setting up a development environment for the course are given in this file.