1.10 Summary

Slides

Notes


📚 Summary of First Session - Machine Learning Zoomcamp

  1. 🚗 Introduction to Machine Learning with Cars Data
    We start with data about cars, including characteristics (features) and prices (target). A Machine Learning (ML) model can be used to extract patterns from known information (data) about some cars in order to predict car prices based on their characteristics.

  2. 🧠 Rules-Based Systems vs. Machine Learning

    • Rules-Based Systems: Rules must be worked out by hand, translated into code in a programming language, and applied to the data. As the number of patterns grows, extracting and maintaining them manually becomes complex and challenging.
    • Machine Learning: Instead of manually coding rules, ML models automatically extract patterns from data using Mathematics and Statistics.
  3. 🔍 Supervised Machine Learning
    In supervised learning, models learn from labeled data (with known outcomes) to make predictions on unseen data.
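As a minimal sketch of this idea, the snippet below fits a straight line to a few labeled examples and then predicts the target for an unseen one. The data (car age versus price) is made up for illustration, and ordinary least squares via NumPy is just one simple choice of model, not necessarily the one used in the course.

```python
import numpy as np

# Hypothetical labeled data: one feature (car age in years) and a known price.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([18000.0, 16000.0, 14000.0, 12000.0, 10000.0])

# Add a column of ones so the model can learn an intercept term.
X_with_bias = np.column_stack([np.ones(len(X)), X])

# "Training": find weights w that minimize the squared error between
# X_with_bias @ w and the known targets y.
w, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)

# "Prediction": apply the learned weights to an unseen 6-year-old car.
x_new = np.array([1.0, 6.0])  # leading 1.0 matches the intercept column
predicted_price = float(x_new @ w)
print(predicted_price)
```

Because the toy prices drop by exactly 2000 per year, the fitted line reproduces that rule without it ever being coded by hand.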

  4. 🛠️ CRISP-DM (Cross Industry Standard Process for Data Mining)
    A structured methodology for organizing ML projects, consisting of the following steps:

    • 💼 Business Understanding
    • 🔎 Data Understanding
    • 🧹 Data Preparation
    • 🤖 Modeling (choosing and training models, then selecting the best one)
    • 📊 Evaluation
    • 🚀 Deployment
    This process is iterative, allowing for continuous improvement.
  5. 🏆 Model Selection
    Split the data into training, validation, and test sets. Train different models on the training set, compare them on the validation set, select the best-performing one, and then evaluate it on the test set to check that it generalizes to unseen data.
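The split and selection steps can be sketched roughly as follows. The helper `split_indices`, the 60/20/20 proportions, the seed, and the validation scores are all assumptions chosen for illustration; the notes do not prescribe them.

```python
import numpy as np

def split_indices(n, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle the row indices 0..n-1 and cut them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    n_train = n - n_val - n_test
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100)

# Hypothetical validation accuracies for two already-trained models.
val_scores = {"model_a": 0.66, "model_b": 0.80}

# Pick the model with the best validation score; only this one is then
# evaluated on the held-out test rows.
best_model = max(val_scores, key=val_scores.get)
print(len(train_idx), len(val_idx), len(test_idx), best_model)
```

Keeping the test rows out of the comparison is what makes the final test score an honest estimate for the selected model.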

  6. 💻 Setting Up the Environment
    Install the necessary tools: Python, NumPy, Pandas, Matplotlib, and Scikit-learn. Anaconda is the easiest option, since it bundles them together. If needed, also create an AWS account for cloud resources.

  7. 🔢 Introduction to NumPy
    NumPy is the core Python library for manipulating numerical data, providing efficient operations on arrays and matrices.
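A few typical array operations, as a small sketch; the variable names and values are arbitrary examples.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0, 40.0])

# Element-wise arithmetic: every entry is scaled without an explicit loop.
discounted = prices * 0.9

# Boolean masking: keep only the entries that satisfy a condition.
expensive = prices[prices > 15]

# Aggregation over the whole array.
total = prices.sum()

# Reshape a 1-D range into a 2x3 matrix and sum down each column.
grid = np.arange(6).reshape(2, 3)
col_sums = grid.sum(axis=0)

print(discounted, expensive, total, col_sums)
```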

  8. 🔗 Linear Algebra
    This part covers the types of multiplication involving vectors and matrices (vector-vector, matrix-vector, and matrix-matrix), as well as creating identity matrices with functions like np.eye().
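The multiplications mentioned here can be tried directly in NumPy. This is a small sketch with arbitrary example values; `@` is NumPy's matrix multiplication operator.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
A = np.array([[1, 0, 2],
              [0, 1, 1]])      # shape (2, 3)
B = np.array([[1, 2],
              [3, 4],
              [5, 6]])         # shape (3, 2)

# Vector-vector (dot product): 1*4 + 2*5 + 3*6
dot_uv = u @ v

# Matrix-vector: each row of A is dotted with u.
Au = A @ u

# Matrix-matrix: each column of B is multiplied by A.
AB = A @ B

# Identity matrix: multiplying by it leaves A unchanged.
I = np.eye(3)
A_times_I = A @ I

print(dot_uv, Au, AB, A_times_I, sep="\n")
```

Note that the inner dimensions must match: `A @ B` works because `A` has 3 columns and `B` has 3 rows.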

  9. 📊 Introduction to Pandas
    Pandas is a Python library used for processing and analyzing tabular data efficiently.
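As a brief sketch of working with tabular data, the example below builds a small DataFrame and applies a few common operations. The column names and values are invented for illustration and do not come from the course dataset.

```python
import pandas as pd

# Hypothetical table of cars: one row per car, one column per attribute.
df = pd.DataFrame({
    "make": ["toyota", "bmw", "toyota", "audi"],
    "year": [2015, 2018, 2020, 2017],
    "price": [12000, 30000, 21000, 25000],
})

# Filter rows with a boolean condition.
recent = df[df["year"] >= 2017]

# Group rows by a column and aggregate another one.
mean_price_by_make = df.groupby("make")["price"].mean()

# Derive a new column from an existing one.
df["price_thousands"] = df["price"] / 1000

print(recent)
print(mean_price_by_make)
```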


⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.
* [Notes from Maximilien Eyengue](https://github.com/maxim-eyengue/Python-Codes/blob/main/ML_Zoomcamp_2024/01_intro/Summary_Session_01.md)
