Machine Learning: Introduction to Machine Learning for Beginners

Summary

Machine learning (ML) is an application of Artificial Intelligence (AI) that allows systems to automatically learn and improve from experience without being explicitly programmed for the task. In other words, machine learning focuses on developing programs that can access data and use it to learn for themselves.

Machine learning is an important component of the growing field of data science, and it represents a major step forward in how computers can learn.

In this article we will try to give a simple introduction for those who want to understand machine learning.

What is Machine Learning?

Machine learning (ML) is an application of artificial intelligence (AI) that allows systems to automatically learn and improve from experience without being explicitly programmed for the task.

Machine learning can be broadly defined as the capability of a machine to imitate intelligent human behaviour.

What this means is that machine learning focuses on developing programs that can access data and use it to learn for themselves.

Machine learning is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.
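
To make "learning from data instead of being explicitly programmed" more concrete, here is a minimal sketch in Python. Nobody writes the Celsius-to-Fahrenheit formula into the program; it estimates the relationship from example readings using an ordinary least-squares line fit. The readings and the `fit_line` helper are made up for this illustration.

```python
# Example readings (illustrative data): Celsius values and matching Fahrenheit values.
celsius = [0.0, 10.0, 20.0, 30.0, 40.0]
fahrenheit = [32.0, 50.0, 68.0, 86.0, 104.0]

def fit_line(xs, ys):
    """Estimate slope and intercept of the best-fitting straight line (least squares)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Training": the program works out the conversion rule from the examples.
slope, intercept = fit_line(celsius, fahrenheit)

# "Prediction": apply the learned rule to a value that was not in the examples.
prediction = slope * 25.0 + intercept
print(f"learned rule: F = {slope:.2f} * C + {intercept:.2f}")
print(f"25 C is predicted to be {prediction:.1f} F")
```

Real machine learning models are far more flexible than a straight line, but the workflow is the same: examples go in, a rule comes out, and the rule is then applied to new data.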

Why use Machine Learning?

There are several reasons to use Machine Learning, including:

  • Machine learning enables analysis of massive amounts of data. 
  • The main objective of Machine Learning is to allow the program to learn automatically without human intervention or assistance and alter activities accordingly 
  • Machine learning is an important component of the growing field of data science. Within Data Science, you use statistical methods and machine learning algorithms that are trained to make classifications or predictions with the purpose of uncovering key insights within data 
  • Companies that have data at the core of their strategy, and depend on large quantities of it, need a way to analyse and make use of that data efficiently and accurately. Machine learning is one of the best ways to build models, strategise, and plan.
  • Machine learning is the core of some companies’ business models, like in the case of Netflix’s suggestions algorithm or Google’s search engine. The practical applications of machine learning can drive positive business results

How to use Machine Learning?

Machine Learning algorithms can broadly be divided into three subtypes:

Supervised Learning

Supervised learning uses pre-labeled data to train models, which means the training data is already tagged with the correct answer. The machine has a “supervisor”, or a “teacher”, who gives the machine all the answers, like whether it’s a cat or a dog in the picture.
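
As a rough sketch of what this looks like in Python, the example below classifies an animal by finding the most similar labelled example (a 1-nearest-neighbour classifier, one of the simplest supervised methods). The measurements, the labels, and the `predict` function are invented for this illustration; real image classifiers work on pixels and use much larger models.

```python
import math

# Hypothetical labelled training data: (weight_kg, ear_length_cm) -> correct answer.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 7.0), "cat"),
    ((5.0, 6.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((12.0, 10.0), "dog"),
]

def predict(features):
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], features))
    return nearest[1]

# Two new animals the model has not seen before.
print(predict((4.2, 6.8)))
print(predict((25.0, 12.0)))
```

The labels supplied by the "teacher" are what make this supervised: without the "cat" and "dog" tags, `predict` would have nothing to return.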

Unsupervised Learning

In unsupervised learning you do not need to supervise the model; instead, the model works on its own to discover information. The machine is left on its own with, for example, a set of animal photos and the job of sorting them into groups of different animals.
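
A minimal sketch of this idea in Python is shown below, using k-means clustering, one common unsupervised technique. The data points are made up, there are no labels at all, and the choice of two groups and of the starting centroids is an assumption made for the example.

```python
import math

# Hypothetical unlabelled data: (weight_kg, ear_length_cm), with no "cat"/"dog" tags.
animals = [(4.0, 6.5), (3.5, 7.0), (5.0, 6.0), (22.0, 11.0), (30.0, 12.5), (25.0, 10.0)]

def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of the points assigned to it, and repeat."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroid  # keep the old centroid if a group is empty
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters

# Start from the first and last points as initial guesses (an arbitrary choice).
small_animals, large_animals = k_means(animals, [animals[0], animals[-1]])
print("group 1:", small_animals)
print("group 2:", large_animals)
```

The algorithm finds two groups, but it has no idea that one is "cats" and the other "dogs"; naming the groups is still up to a human.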

As you may have noticed, the main distinction between supervised and unsupervised learning is the use of labeled datasets.

I really like this visualisation from Machine Learning for Everyone, which shows the difference between supervised and unsupervised learning.

Third and finally, we have what is called

Reinforcement Learning

To get the program to do what we want, the machine learning model gets either a reward or a penalty for the actions it performs. Its goal is to maximise the total reward. This means that reinforcement learning is reward-based learning which works on the system of feedback
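
The reward-and-penalty loop can be sketched in a few lines of Python. The example below is a simplified "epsilon-greedy" agent: most of the time it picks the action it currently believes is best, and occasionally it tries a random one. The two actions, their fixed rewards, and the parameter values are all hypothetical; real reinforcement learning problems also involve states and delayed rewards.

```python
import random

# Hypothetical environment: one action earns a penalty (-1), the other a reward (+1).
REWARDS = {"touch_hot_stove": -1.0, "eat_snack": 1.0}

def train(steps=200, epsilon=0.1, seed=0):
    """Learn an estimated reward for each action from trial and error."""
    rng = random.Random(seed)
    value = {action: 0.0 for action in REWARDS}  # current reward estimates
    counts = {action: 0 for action in REWARDS}   # how often each action was tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))    # explore: try something random
        else:
            action = max(value, key=value.get)    # exploit: use the best-known action
        reward = REWARDS[action]                  # feedback from the environment
        counts[action] += 1
        # Update the running average reward for this action.
        value[action] += (reward - value[action]) / counts[action]
        total_reward += reward
    return value, total_reward

value, total_reward = train()
best_action = max(value, key=value.get)
print("learned estimates:", value)
print("preferred action:", best_action)
```

Nobody tells the agent which action is correct; it works this out purely from the feedback it receives.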

What Skills Do You Need to Use Machine Learning?

Machine learning engineering combines software engineering concepts with analytical and data science skills.

  • Data Science skills: The data science concepts that machine learning engineers use include knowledge of programming languages such as Python, SQL, and Java; hypothesis testing; and proficiency in mathematics, probability, and statistics. More about technical skills in data science in our post: Technical Skills to become a Data Scientist
  • Software Engineering skills: Some of the base concepts that machine learning engineers rely on include: writing and structuring algorithms, understanding data structures, and knowledge of computer architecture 
  • Additional skills: Machine Learning professionals also utilise deep learning, dynamic programming, natural language processing, audio and video processing, and reinforcement learning, among others

Machine Learning and Artificial Intelligence (AI): What’s the difference?

First, a short note on artificial intelligence (AI). AI refers to a computer system’s capacity to simulate human cognitive capabilities such as learning and problem-solving. A computer system that employs AI combines arithmetic and logic to imitate the reasoning that humans use to learn from new information and make decisions.

Are artificial intelligence and machine learning the same? While AI and machine learning are very closely connected, they are not the same. Machine learning is regarded as a subset of AI, that is, an application of AI.

On a general level, we can differentiate AI and machine learning as:

AI is a bigger concept to create intelligent machines that can simulate human thinking capability and behaviour, whereas, machine learning is an application or subset of AI that allows machines to learn from data without being programmed explicitly.
Javatpoint: Difference between ML and AI

So when looking at AI vs Machine Learning, you could say that you’re looking at their interrelationship.

Use Cases for Machine Learning: Examples of Applications

Here is a short introduction to a few use cases for machine learning.

Machine Learning in Finance

Finance is one of the most critical sectors in the world, and with the use of machine learning, companies can now quickly analyse finance-related matters and make better decisions.

Machine Learning has a wide range of applications and use areas in the financial sector, to name a few:

  • Fraud Detection: With the help of machine learning algorithms, companies can analyse big data and detect anomalies with higher precision and speed. The machine learning application helps with fraud detection for safe transactions 
  • Algorithmic Trading: By implementing smart machine learning applications, financial institutions can get a better understanding and make better predictions on their algorithmic trading. The machine learning algorithm is learning to make better trades.
  • Process Automation: Machine learning solutions allow finance companies to replace manual work by automating repetitive tasks through intelligent process automation. Chatbots and paperwork automation are two examples of process automation in finance using machine learning. This application is of course useful in a wide range of industries, not only finance.
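
To give a feel for the fraud detection case, here is a deliberately simple Python sketch that flags payments far outside the usual range. It uses a plain statistical rule (distance from the median, measured in median absolute deviations) rather than a trained model, and the amounts and the threshold are made up. Production fraud systems combine many more signals, such as location, merchant, and timing.

```python
import statistics

# Hypothetical card payments in euros; the last one is unusually large.
amounts = [25.0, 30.0, 27.5, 22.0, 31.0, 28.0, 26.5, 950.0]

def flag_anomalies(values, threshold=10.0):
    """Flag values that lie far from the median, measured in units of the
    median absolute deviation (MAD). The threshold is an arbitrary choice."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be scored
    return [v for v in values if abs(v - median) / mad > threshold]

suspicious = flag_anomalies(amounts)
print("payments to review:", suspicious)
```

The median is used instead of the average because a single huge payment would drag the average upwards and make itself look less unusual.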

Machine Learning in Healthcare

Machine learning is very beneficial in healthcare because it is well suited to large data sets, and patient files are exactly that: they include many data points that need thorough analysis and organising.

Some examples of use cases for machine learning in healthcare are:

  • Computer Assisted Diagnosis (CAD): Machine learning algorithms can help to determine and label the kind of disease or medical case that the medical staff are dealing with
  • Make Recommendations: Machine learning algorithms can advise and give medical information without the need to actively search for it. In other words, the application recognises patterns and can give recommendations for a patient. The system uses the patient history and can produce multiple potential treatment options. 
  • Predictive Approach to Treatment: Machine learning in healthcare can be used to predict diseases and give patients a chance of starting treatment early. For example, signs of diabetes can be predicted using a machine learning algorithm

Machine Learning for Online Sales and Marketing

The better you can understand your customers, the better you can meet their demands, and the more you will sell. In online marketing and e-commerce, marketers use machine learning to find patterns in user activities on a website.

Some examples of machine learning for marketing include:

  • Make personalised product recommendations: Popular eCommerce giants like Amazon and Netflix are using machine learning algorithms to achieve it. For example, if you scroll through Amazon you will notice that it can give you quite sophisticated recommendations on other products you might like. 
  • Forecast Targeting: Predictive machine learning models make forecasts using various data sources, including sales history, customer searches, economic indicators, and demographic data
  • Identifying Styles of Popular Products and Predicting Trends: Machine learning applications can support in identifying customer behaviour and shopping patterns. This is crucial as it helps marketers to understand what impacts consumers’ buying decisions
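
The product recommendation idea can be sketched with a simple "customers who bought this also bought" count in Python. The baskets and product names below are invented, and this co-occurrence approach is only one basic technique; it is not a description of how Amazon or Netflix actually build their recommendations.

```python
from collections import Counter

# Hypothetical purchase histories, one set of products per customer.
baskets = [
    {"laptop", "mouse", "laptop_bag"},
    {"laptop", "mouse"},
    {"laptop", "laptop_bag", "usb_hub"},
    {"phone", "phone_case"},
    {"phone", "phone_case", "charger"},
]

def recommend(product, top_n=2):
    """Rank other products by how often they were bought together with `product`."""
    bought_together = Counter()
    for basket in baskets:
        if product in basket:
            bought_together.update(basket - {product})
    return [item for item, _ in bought_together.most_common(top_n)]

print("with a laptop, suggest:", recommend("laptop"))
print("with a phone, suggest:", recommend("phone", top_n=1))
```

The more purchase data there is, the more reliable these counts become, which is one reason large online shops benefit so much from this kind of analysis.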

Machine Learning for Self-Driving Cars

Much of the technology behind self-driving cars is based on machine learning, deep learning in particular. The development of self-driving cars is one of the most trendy and popular directions in the world of machine learning.

Self-driving cars are made possible by machine learning algorithms as they make it possible for a vehicle to collect data from cameras and other sensors and then interpret it and decide the following actions to perform.

Most Used Machine Learning Tools

Several machine learning tools are available on the market. Below are some of the most used cloud services and platforms, and programming libraries and frameworks.

Cloud Services and Platforms

Microsoft Azure Machine Learning

Azure machine learning is a cloud platform that allows you to build, train, and manage the machine learning project lifecycle. You can build a model in Azure machine learning or use a library from an open-source platform, such as TensorFlow or PyTorch

IBM Watson Machine Learning

IBM Watson is a cloud service that uses data to put machine learning and deep learning models into production. The machine learning service is a set of APIs that you can call from any programming language. IBM Watson ML supports widely used machine learning frameworks, such as TensorFlow, Keras, PyTorch, Apache Spark MLlib, and more.

Amazon Machine Learning

According to Amazon Web Services (AWS), the Amazon machine learning platform is a managed service for developing machine learning models and making predictions. Additionally, AWS has other great machine learning offerings, like Amazon SageMaker, which is a platform to help developers and data scientists create and use machine learning models.

Google Cloud Machine Learning

You can use the Google Cloud ML Engine as an AI Platform to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data. It provides machine learning model training, building, deep learning and predictive modelling

Programming Libraries and Frameworks

TensorFlow

TensorFlow is an open source framework that has become a standard tool for machine learning. TensorFlow has an extensive ecosystem of tools, libraries, and community resources that lets data scientists quickly build and deploy machine learning applications. The main benefit of using TensorFlow is abstraction, which allows you to focus on the overall logic of the application rather than going into too much detail.

Read more about TensorFlow in our post on Python Top 10 Libraries

Keras

Keras is a popular library that is used extensively for deep learning and neural network modules (similar to TensorFlow). It is a powerful and easy-to-use free open source Python library for developing and evaluating deep learning models. Deep learning is one of the major subfields of machine learning. Keras supports several backends, for example TensorFlow, and acts as an interface for the TensorFlow library.

OpenNN

Open Neural Networks Library (OpenNN) is a neural network library written in the C++ programming language. The entire OpenNN library can be downloaded for free from GitHub or SourceForge.

PyTorch

PyTorch is a scientific computing package that can use graphics processing units (GPUs). It is an open source machine learning framework based on the Torch library, and it is mainly used for developing and training neural network based deep learning models. PyTorch is very popular in research labs.

Apache Mahout

Apache Mahout is an open-source project mainly used to build scalable machine learning algorithms. Mahout applies popular machine learning techniques such as classification, clustering, and recommendation. In addition, Mahout uses the Apache Hadoop library to scale effectively in the cloud.

Apache Spark MLlib

MLlib is Apache Spark’s scalable machine learning library, with APIs in Java, Scala, Python, and R. The aim is to make practical machine learning scalable and easy. In general, Apache Spark comes with a group of tools that can be used for various features, such as structured data, graph data processing, and machine learning analysis

Challenges with Machine Learning

The most important task you need to do in the machine learning process is to train the algorithm with sufficient and valid data to achieve an accurate output.

Therefore, some of the major challenges that you might face while developing your machine learning model include:

01

Poor Quality of Data

Data plays a significant role in the machine learning process. Unclean and noisy data can make the whole process extremely difficult and cause our algorithm to make inaccurate or faulty predictions.

Therefore, data quality is essential to improve the output and ensure our machine learning program can train and learn from the right data. 

02

Not Enough Training Data

It generally takes a considerable amount of data for most algorithms to function correctly. Too little training data leads to imprecise or overly biased predictions.

As a rule of thumb, even a simple task may need thousands of examples, and advanced tasks like image or speech recognition may need millions.

03

Irrelevant Features

Feature selection is one of the core concepts in machine learning, and it affects your model’s performance: irrelevant or partially relevant features can harm it.

Feature selection is the process of automatically or manually selecting the features in your data set that contribute most to the prediction variable or output you are interested in.
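
One simple way to do automatic feature selection is to measure how strongly each feature is correlated with the target and keep only the strongly related ones. The Python sketch below does this with the Pearson correlation coefficient. The housing data, the feature names, and the 0.5 cut-off are assumptions made for the example, and correlation only captures straight-line relationships, so this is a starting point rather than a complete method.

```python
import math

# Hypothetical data set: three candidate features and the target (price in thousands).
features = {
    "size_m2": [50, 60, 80, 100, 120, 150],
    "rooms": [2, 2, 3, 4, 4, 5],
    "street_number": [12, 7, 33, 3, 21, 9],  # should be irrelevant to the price
}
price = [150, 180, 240, 310, 360, 450]

def pearson(xs, ys):
    """Pearson correlation: +1 or -1 means a perfect linear relationship, 0 means none."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Score every feature against the target, then keep the strongly related ones.
scores = {name: pearson(values, price) for name, values in features.items()}
selected = [name for name, score in scores.items() if abs(score) >= 0.5]

for name, score in scores.items():
    print(f"{name}: correlation with price = {score:.2f}")
print("selected features:", selected)
```

Here the house size and number of rooms are kept, while the street number is dropped because it tells the model almost nothing about the price.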

04

Overfitting of Training Data

Overfitting refers to a model that models the training data too well. This happens when a model learns the detail and so-called noise in the training data to the extent that it negatively impacts the model’s performance on new real-life data.

Why is this a problem? The noise or random fluctuations (outliers etc.) in the training set will be learned as concepts by the model, and it will try to apply these concepts to new data, where they do not hold.

05

Underfitting of Training Data

Underfitting happens when the model is unable to establish an accurate connection between the input and output variables. This usually means that the model is too simple to capture the relationship in the data.

The good news is that underfitting is often quite easy to detect, given a good performance metric.

06

Complex Process

Machine learning is a complex process, and imperfections in the algorithm can appear as the data grows or the process changes. This creates room for error, so you need regular monitoring and maintenance to keep the algorithm working.

Underfitting vs. Overfitting of training data

A rule of thumb is that a model is underfitting when it performs badly even on the training set, because it is incapable of finding and learning the relationship between the input (often our X) and the target value (often our Y). A model is overfitting when it performs well on the training set but badly on new data.
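
The difference is easiest to see by comparing errors on training data and on new data. The Python sketch below uses three toy models on made-up data where the true relationship is roughly y = 2x: one that is too simple (it always predicts the average), one that fits a straight line, and one that simply memorises the training points, noise included. All data, names, and models are chosen for illustration only.

```python
# Hypothetical training data: y is roughly 2 * x plus a little noise.
train_x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
noise = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.6, -0.3, 0.4, -0.5]
train_y = [2 * x + n for x, n in zip(train_x, noise)]

# New, unseen data that follows the same underlying relationship.
test_x = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
test_y = [2 * x for x in test_x]

def fit_line(xs, ys):
    """Least-squares straight line through the training points."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(train_x, train_y)

def mean_model(x):
    """Too simple: ignores x and always predicts the average training target."""
    return sum(train_y) / len(train_y)

def linear_model(x):
    """Matches the shape of the data: a straight line."""
    return slope * x + intercept

def memorising_model(x):
    """Too flexible: copies the target of the closest training point, noise and all."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def mse(model, xs, ys):
    """Mean squared error: lower is better."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

for name, model in [("mean", mean_model), ("linear", linear_model),
                    ("memorising", memorising_model)]:
    print(f"{name}: train error {mse(model, train_x, train_y):.2f}, "
          f"test error {mse(model, test_x, test_y):.2f}")
```

The mean model has a high error on both sets, which is the signature of underfitting. The memorising model has zero error on the training set but a larger error on new data than the straight line, which is the signature of overfitting.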

Summary: Machine Learning Infographic

Let’s summarise some of the key points that we have looked at in this post in an infographic. Please feel free to save it for later use and share it with friends and colleagues.

FAQ: Machine Learning Introduction for Beginners

What is Machine Learning?

Machine learning (ML) is an application of artificial intelligence (AI) that allows systems to automatically learn and improve from experience without being explicitly programmed for the task. In other words, machine learning focuses on developing programs that can access data and use it to learn for themselves.

What is the difference between AI and machine learning?

Machine learning is a subset of AI that allows a machine to automatically learn from past data without being explicitly programmed. AI, on the other hand, is a wider concept: creating intelligent machines that can simulate human thinking capability and behaviour.

What are the three types of machine learning?

There are three types of machine learning algorithms: 

1. Supervised learning: The model has a “supervisor”, or a “teacher”, who gives all the answers

2. Unsupervised learning: The model will work on its own to discover information and find patterns

3. Reinforcement learning: The model gets either a reward or a penalty for the actions it performs

How is machine learning being used?

There are numerous use areas for machine learning. Some examples are:

  • Self-driving vehicles: The model interprets the data from cameras and other sensors and decides which actions to perform
  • Healthcare: Computer Assisted Diagnosis (CAD) and predictions for treatment
  • Finance: Fraud detection, algorithmic trading, process automation, etc.
  • Marketing: Personalised product recommendations and identifying trends

Eric J.

Meet Eric, the data "guru" behind Datarundown. When he's not crunching numbers, you can find him running marathons, playing video games, and trying to win the Fantasy Premier League using his predictions model (not going so well).

Eric is passionate about helping businesses make sense of their data and turning it into actionable insights. Follow along on Datarundown for all the latest insights and analysis from the data world.