Machine Learning - Supervised Learning - Overview Tutorial

Given a Long List of Machine Learning Algorithms and a Data Set, How Do You Decide Which One to Use?

There is no master algorithm for all situations. Choosing an algorithm depends on the following questions:

  • How much data do you have, and is it continuous or categorical?
  • Is the problem related to classification, association, clustering, or regression?
  • Are the variables labeled, unlabeled, or a mix of both?
  • What is the goal?

Based on the answers to these questions, an algorithm can be chosen from the families listed in the sections below.
Cross-Validation

Cross-validation in machine learning is a statistical resampling technique that trains and tests a model on different parts of the dataset across iterations. The aim is to estimate how well the model will predict new data that was not used to train it, which helps detect and guard against overfitting.

K-Fold cross-validation is the most popular resampling technique; it divides the whole dataset into K folds of roughly equal size, using each fold once as the test set while training on the rest.
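As a minimal sketch of K-Fold cross-validation (scikit-learn is assumed here; the tutorial does not name a specific library):

```python
# K-Fold cross-validation sketch using scikit-learn (assumed library).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Split the data into K = 5 folds of equal size; each fold is used once
# as the test set while the other 4 folds train the model.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)

print(scores)         # one accuracy score per fold
print(scores.mean())  # average accuracy across the 5 folds
```

The averaged score is a more reliable estimate of generalization than a single train/test split, because every sample is tested exactly once.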

Suppose your company gives you 10 GB of data, your machine has only 4 GB of RAM, and the company has no budget for extra hardware. What do you do?

You cannot subsample (you would lose data), and you cannot use cloud computing (the company has no budget for it).

Instead, you can train the model in a Jupyter notebook on the same machine, without any cloud platform, using out-of-core ML (for example, with the Vaex library):

1] Stream the data (load it in chunks)

2] Extract features from each chunk

3] Train the model incrementally (this only works for algorithms that implement a partial_fit method, such as SGD Regressor, Naïve Bayes, and the Passive Aggressive Classifier)
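The three steps above can be sketched with scikit-learn's SGDRegressor, whose partial_fit method accepts one chunk at a time. The synthetic chunks below are an assumption for self-containment; in practice each chunk would be streamed from disk, e.g. with pandas' chunked CSV reader or Vaex:

```python
# Out-of-core training sketch: feed data to the model chunk by chunk so
# the full dataset never has to fit in RAM at once.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
model = SGDRegressor(random_state=0)

for _ in range(100):                                 # 100 chunks, streamed one by one
    X_chunk = rng.normal(size=(500, 3))              # 1] stream a chunk of rows
    y_chunk = X_chunk @ np.array([2.0, -1.0, 0.5])   # 2] features / target for the chunk
    model.partial_fit(X_chunk, y_chunk)              # 3] one incremental training step

print(model.coef_)  # should approach the true weights [2, -1, 0.5]
```

Only the current chunk is ever in memory, which is why partial_fit-capable estimators are the ones that qualify for this approach.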


Supervised Learning 


  1. Linear regression [Regression] – Simple, Multiple, Polynomial, Lasso, Ridge, and ElasticNet
  2. Logistic regression [Classification and Regression]
  3. Support Vector Machine / Support Vector Regressor [Classification and Regression]
  4. Naive Bayes [Classification] – mostly used for text data
  5. Linear discriminant analysis [Classification]
  6. Decision Tree Classifier [Classification]
  7. k-nearest neighbor (KNN) algorithm [Classification and Regression]
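One reason trying several of these algorithms is cheap: they share the same fit/predict interface. As a hedged sketch (scikit-learn assumed), any of the listed models can be swapped in behind the same two calls:

```python
# The listed algorithms share the same scikit-learn fit/predict API,
# so they can be trained and compared on held-out data interchangeably.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

for clf in (DecisionTreeClassifier(random_state=42),
            KNeighborsClassifier(),
            GaussianNB()):
    clf.fit(X_train, y_train)                    # same API for every model
    print(type(clf).__name__, clf.score(X_test, y_test))
```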

Ensemble Learning

Voting Ensemble / Voting Classifier

  1. MaxVoting [Classification]
  2. Averaging [Regression]
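A minimal max-voting sketch (scikit-learn's VotingClassifier is assumed; with hard voting, the majority class across the base models wins):

```python
# Max voting: each base model casts one vote per sample and the
# majority class becomes the ensemble's prediction.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

vote = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    voting="hard")  # voting="soft" would average predicted probabilities
vote.fit(X, y)
print(vote.score(X, y))
```

For regression, the analogous averaging ensemble is VotingRegressor, which averages the base models' numeric predictions.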

Bagging

  1. Bagging Classifier [Classification and Regression]
  2. Random Forest [Classification and Regression]
  3. Extra Trees Classifier [Classification]
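A bagging sketch (scikit-learn assumed): BaggingClassifier trains many trees on bootstrap samples of the data, and RandomForestClassifier adds random feature subsets at each split on top of that.

```python
# Bagging: train many trees on bootstrap samples and combine their votes.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Plain bagging over decision trees.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                        random_state=0)
# Random forest = bagging + a random subset of features at each split.
rf = RandomForestClassifier(n_estimators=50, random_state=0)

for model in (bag, rf):
    print(type(model).__name__, cross_val_score(model, X, y, cv=5).mean())
```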

Boosting

  1. AdaBoost Classifier [Classification]
  2. Gradient Boosting Classifier [Classification]
  3. XGB Classifier [Classification]
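A boosting sketch (scikit-learn's AdaBoostClassifier and GradientBoostingClassifier assumed; XGBoost's XGBClassifier follows the same fit/score pattern but is a separate package):

```python
# Boosting trains weak learners sequentially, with each new learner
# focusing on the samples the previous ones got wrong.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for clf in (AdaBoostClassifier(random_state=42),
            GradientBoostingClassifier(random_state=42)):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, clf.score(X_test, y_test))
```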

Stacking

  1. Neural Networks (Multilayer perceptron) 
  2. Similarity learning
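A stacking sketch (scikit-learn's StackingClassifier is an assumption; the tutorial does not name a library): the base models' cross-validated predictions become the input features of a final meta-model.

```python
# Stacking: base models' out-of-fold predictions are fed as features to
# a final meta-model (here logistic regression) that learns to combine them.
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X, y)
print(stack.score(X, y))
```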