GitHub - fpsom/IntroToMachineLearning: Material for the [BC]2 2019 workshop "Introduction to Machine Learning: opportunities for advancing omics data analysis"

Overview of the Material for the [BC]2 2019 workshop

When: September 9th, 09:00 to 16:00

Where: University of Basel, Kollegienhaus building, Petersplatz 1, CH-4001 Basel

Room: Regenzzimmer 111

Organisers and tutors

Amel Ghouila, H3ABioNet

Fotis Psomopoulos, INAB-CERTH, ELIXIR-GR

Overview

Machine learning has emerged as a discipline that enables computers to assist humans in making sense of large and complex data sets. With the drop in the cost of sequencing technologies, large amounts of omics data are being generated and made accessible to researchers. Analysing these complex, high-volume data is not trivial, and classical tools alone cannot exploit their full potential. Machine learning can thus be very useful in mining large omics datasets to uncover new insights that can advance the field of medicine and improve health care.

The aim of this tutorial is to introduce participants to the machine learning (ML) taxonomy and common machine learning algorithms. The tutorial will cover the methods used to analyse different omics data sets, providing a practical context through basic but widely used R and Python libraries. It will comprise a number of hands-on exercises and challenges, through which participants will acquire a first understanding of the standard ML processes as well as practical skills in applying them to familiar problems and publicly available real-world data sets.

Learning objectives

Understand the ML taxonomy and the commonly used machine learning algorithms for analysing “omics” data
Understand the differences between categories of ML algorithms and the kinds of problems to which each can be applied
Understand different applications of ML in different -omics studies
Use some basic, widely used Python and R packages for ML
Interpret and visualize the results obtained from ML analyses on omics datasets
Apply the ML techniques to analyse their own datasets

Audience and requirements

This introductory tutorial is aimed at bioinformaticians (graduate students and researchers) who are familiar with different omics data technologies and are interested in applying machine learning to analyse the resulting data.

Prerequisites

Previous experience in Bioinformatics analysis
Familiarity with any programming language (especially R) is preferable but not necessary

Maximum participants: 30

Schedule

Time	Details
09:00 - 09:15	Tutorial introduction. - Get to know each other. - Setup Link to material
Part I: Background
09:15 - 10:45	Introduction to ML / DM. - Data Mining. - Machine Learning basic concepts. - Taxonomy of ML and examples of algorithms. - Deep learning overview. Link to material
11:00 - 12:30	Applications of ML in Bioinformatics. - Examples of different ML/DM techniques that can be applied to different NGS data analysis pipelines. - How to choose the right ML technique? Link to material
Part II: Hands-on
13:15 - 14:45	Loading and exploring omics data. - What is Exploratory Data Analysis (EDA) and why is it useful? - Unsupervised Learning. - How could unsupervised learning be used to analyze omics data? Link to material
15:00 - 16:30	Supervised Learning. - Classification. - How could supervised learning be used to analyze omics data? - Regression. - What if the target variable is numerical rather than categorical? Link to material
16:30	Closing, discussion and resource sharing

Other examples

If you finish all the exercises and wish to practice on more examples, here are a couple of good examples to help you get more familiar with the different ML techniques and packages.

RNASeq Analysis in R
Use the built-in R Iris data set to run clustering and some supervised classification, and compare the results obtained by the different methods.
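
As a rough illustration of the second exercise, the sketch below compares an unsupervised and a supervised method on the same measurements, using only the Python standard library. It is not part of the workshop material. The fifteen rows are a small hand-copied sample in the style of the Iris measurements and should be treated as illustrative; the function names (`kmeans`, `cluster_purity`, `loo_1nn_accuracy`) and the choice of k-means with a deterministic farthest-point start and a 1-nearest-neighbour classifier are assumptions made for this example.

```python
import math
from collections import Counter

# Fifteen rows in the style of the Iris measurements (sepal length, sepal
# width, petal length, petal width, in cm), five per species. Illustrative
# values only; the exercise itself uses the full built-in R data set.
ROWS = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((4.6, 3.1, 1.5, 0.2), "setosa"),
    ((5.0, 3.6, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((5.5, 2.3, 4.0, 1.3), "versicolor"),
    ((6.5, 2.8, 4.6, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
    ((6.3, 2.9, 5.6, 1.8), "virginica"),
    ((6.5, 3.0, 5.8, 2.2), "virginica"),
]
POINTS = [features for features, _ in ROWS]
SPECIES = [species for _, species in ROWS]


def kmeans(points, k, max_iter=100):
    """Unsupervised: return one cluster index per point (species not used)."""
    # Farthest-point initialisation keeps this sketch deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def cluster_purity(labels, species):
    """Fraction of points whose species is the majority species of their cluster."""
    matched = 0
    for cluster in set(labels):
        members = [s for lab, s in zip(labels, species) if lab == cluster]
        matched += Counter(members).most_common(1)[0][1]
    return matched / len(species)


def loo_1nn_accuracy(points, species):
    """Supervised: leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = min(others, key=lambda j: math.dist(p, points[j]))
        correct += species[nearest] == species[i]
    return correct / len(points)


if __name__ == "__main__":
    clusters = kmeans(POINTS, k=3)
    print("k-means purity:", round(cluster_purity(clusters, SPECIES), 2))
    print("1-NN leave-one-out accuracy:", round(loo_1nn_accuracy(POINTS, SPECIES), 2))
```

Purity and accuracy are not directly interchangeable measures, but putting them side by side gives a first impression of how much the species labels help when they are available to the method.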

Sources / References

The material in the workshop has been based on the following resources:

ELIXIR CODATA Advanced Bioinformatics Workshop
Machine Learning in R, by Hugo Bowne-Anderson and Jorge Perez de Acha Chavez
Practical Machine Learning in R, by Kyriakos Chatzidimitriou, Themistoklis Diamantopoulos, Michail Papamichail, and Andreas Symeonidis.
Linear models in R, by the Monash Bioinformatics Platform
Relevant blog posts from the R-Bloggers website.

Relevant literature includes:

Pattern Recognition and Machine Learning by Christopher M. Bishop.
Machine learning in bioinformatics, by Pedro Larrañaga et al.
Ten quick tips for machine learning in computational biology, by Davide Chicco
Statistics versus machine learning
Machine learning and systems genomics approaches for multi-omics data
A review on machine learning principles for multi-view biological data integration

License

This material is made available under the Creative Commons Attribution 4.0 International license. Please see LICENSE for more details.

Citation

Amel Ghouila, & Fotis E. Psomopoulos. (2019, September 9). Introduction to Machine Learning: Opportunities for advancing omics data analysis (Version v1.0.0). Zenodo. http://doi.org/10.5281/zenodo.3403768
