
This is the Repository of the book "Applied Machine Learning with Python", published in its first edition in 2019.


rldiazJitsik/Applied_Machine_Learning_with_Python


Helper functions for the book "Applied Machine Learning with Python"


This repository contains the Supplementary Material for the book "Applied Machine Learning with Python", written by Andrea Giussani. You can find details about the book on the BUP website.
The book was written with specific versions of some popular libraries.

How to use the EgeaML Library

The book provides a book-specific module called egeaML.
Make sure you have created and activated a virtualenv, then run:

pip install egeaML

Once installed, you can load a structured, labelled dataset, such as the well-known Boston dataset, as a pandas.DataFrame, as follows:

from egeaML.datareader import DataReader

raw_data = DataReader(
    filename='https://raw.githubusercontent.com/andreagiussani/datasets/master/egeaML/boston.csv',
    col_target='MEDV'
)

Please note that the code base is evolving over time; if you want to stick to the print version of the book, be sure to install the egeaML==0.2.3 version.
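As a quick sanity check, the installed version can be compared against the pinned one. The following is a minimal sketch using only the standard library; the helper name `matches_pinned_version` is hypothetical and not part of egeaML.

```python
from importlib import metadata


def matches_pinned_version(package: str, pinned: str) -> bool:
    """Return True only if `package` is installed at exactly `pinned`.

    Hypothetical helper for illustration; it is not part of egeaML.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the active environment.
        return False
    return installed == pinned


if __name__ == "__main__":
    if matches_pinned_version("egeaML", "0.2.3"):
        print("egeaML matches the print edition of the book.")
    else:
        print("egeaML is missing or at a different version; "
              "run: pip install egeaML==0.2.3")
```

Running this inside the virtualenv you created tells you whether the environment matches the print edition before you start working through the chapters.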

How to develop on egeaML

Please clone this repo on your local machine, as follows:

git clone https://github.com/andreagiussani/Applied_Machine_Learning_with_Python.git

To install it into your local environment, I recommend creating a virtualenv and adding the necessary requirements by running these commands from your favourite terminal emulator:

pip install -r requirements.txt
pip install git+https://github.com/andreagiussani/Applied_Machine_Learning_with_Python.git

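If, instead, you use Anaconda, install the requirements with conda and then the package itself with pip inside the conda environment, since conda install does not accept a Git URL:

conda install --file requirements.txt
pip install git+https://github.com/andreagiussani/Applied_Machine_Learning_with_Python.git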

If you have Python3 already installed in your local environment, you can run:

python3 -m pip install --upgrade pip
python3 -m pip install git+https://github.com/andreagiussani/Applied_Machine_Learning_with_Python.git

Unit test each method

As a developer, you should unit test your contribution. To do so, create a dedicated folder inside the tests subfolder (or extend an existing one), and test that your method does exactly what you expect. The following example may serve as inspiration:

import unittest

import pandas as pd

from egeaML.datareader import DataReader


class DataIngestionTestCase(unittest.TestCase):
    URL_STRING_NAME = 'https://raw.githubusercontent.com/andreagiussani/datasets/master/egeaML'
    FILENAME_STRING_NAME = 'boston.csv'

    def setUp(self):
        self.col_target = 'MEDV'
        # Join the URL parts with '/' explicitly: os.path.join would use '\' on Windows.
        self.filename = f'{self.URL_STRING_NAME}/{self.FILENAME_STRING_NAME}'
        self.columns = [
            'CRIM', 'ZN', 'INDUS', 'CHAS', 'NX', 'RM', 'AGE',
            'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV'
        ]
        self.raw_data = DataReader(filename=self.filename, col_target=self.col_target)

    def test__load_dataframe(self):
        # Calling the reader should return the dataset as a DataFrame.
        df = self.raw_data()
        self.assertIsInstance(df, pd.DataFrame)
        # The dataset is expected to have 506 rows and 14 columns.
        self.assertEqual(df.shape[0], 506)
        self.assertEqual(df.shape[1], 14)

The unit test above checks that the output is of type pandas.DataFrame and verifies that it has the expected shape (506 rows and 14 columns).
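The same setUp-then-assert pattern can also be exercised without network access by testing against a small in-memory fixture. The sketch below is an illustration only: `load_rows`, `SAMPLE_CSV`, and `run_suite` are hypothetical names, the sample values are made up, and it uses the standard-library csv module rather than egeaML's DataReader.

```python
import csv
import io
import unittest

# Tiny in-memory stand-in for a CSV file; the values are made up for the test.
SAMPLE_CSV = "CRIM,RM,MEDV\n0.1,6.5,24.0\n0.2,6.1,21.6\n"


def load_rows(text):
    """Hypothetical helper: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


class LoadRowsTestCase(unittest.TestCase):
    def setUp(self):
        # Build the fixture once per test, as in the example above.
        self.rows = load_rows(SAMPLE_CSV)

    def test_row_count(self):
        self.assertEqual(len(self.rows), 2)

    def test_columns(self):
        self.assertEqual(list(self.rows[0].keys()), ["CRIM", "RM", "MEDV"])

    def test_target_column_is_numeric(self):
        for row in self.rows:
            # float() raises ValueError if the target is not numeric.
            self.assertIsInstance(float(row["MEDV"]), float)


def run_suite():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoadRowsTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Keeping the fixture in memory makes the test fast and deterministic; a network-backed test like the one above remains useful as an integration check.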

Extra Stuff

If you wish to use the egeaML library in a Jupyter notebook, you first need to install the jupyter library and then register your environment as a kernel by running the following commands:

pip install jupyter
python3 -m ipykernel install --user --name=<YOUR_ENV>

where <YOUR_ENV> is the name you have assigned to your local environment. You are now ready to use all the features of this helper!

Submitting Errata

If you have errata for the book, please submit them via the BUP website. If you find a mistake in the book-specific module, you can submit a fix as a pull request to this repository.

How to Cite this Book

@book{giussani2020,
	TITLE="Applied Machine Learning with Python",
	AUTHOR="Andrea Giussani",
	YEAR="2020",
	PUBLISHER="Bocconi University Press"
}
