What is machine learning?

We live in a time when robotics, artificial intelligence and machine learning are part of our daily lives, often without our noticing. We can teach machines how to learn, and some machines can learn on their own. In machine learning, a computer recognizes patterns from examples rather than being programmed with specific rules. These patterns are contained in the data. Put simply, a machine learning system can predict future behavior based on past data.

Machine learning is the semi-automated extraction of knowledge from data. It uses algorithms (sets of rules) that learn complex functions (patterns) from data without relying on rules-based programming. In other words, it is a method of teaching computers to make predictions based on data. In this predictive sense, machine learning is sometimes also referred to as predictive analytics.

This definition can be broken down into three main components.

  • Data analysis – machine learning always starts with data, and your goal is to extract knowledge from that data. You have a question you are trying to answer, and you believe the data may be able to answer it.
  • Automated data extraction – machine learning requires some amount of automation. You are applying some process or algorithm to the data using a computer rather than trying to gather your insights from the data manually.
  • Semi-automated process – machine learning is not a fully automated process. It requires you to make many smart decisions in order for the process to be successful.

Machine learning falls into two main categories.

  • Supervised learning – also known as predictive modeling, this is the process of making predictions using data. For example, an email service may predict whether each incoming message is spam or non-spam. This is supervised learning because there is a specific outcome we are trying to predict.
  • Unsupervised learning – the process of extracting structure from data or learning how to best represent data. It is an independent learning process in which the target values are unlabeled or unknown and no output is mapped from the input data. The system has to learn from the input data by itself and detect hidden patterns. For example, when you shop online and put flour in your cart, the store may suggest bread, yeast and similar products. Here the algorithm looks at past purchase data and finds the other products that are associated with the one in the cart.
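
To make the shopping cart example more concrete, here is a minimal Python sketch that counts which products appear together in past carts and suggests the most frequent companions of a given product. The carts, the product names and the helper functions count_pairs and suggest are hypothetical and exist only for illustration. Real recommendation systems are far more elaborate, so treat this as a rough picture of finding structure in unlabeled data, not as how any particular store does it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical past shopping carts. Note that no cart carries a label
# or target value: there is nothing here we are told to predict.
carts = [
    {"flour", "yeast", "bread"},
    {"flour", "yeast"},
    {"flour", "sugar", "yeast"},
    {"milk", "cereal"},
    {"flour", "bread"},
]


def count_pairs(carts):
    """Count how often each unordered pair of products shares a cart."""
    pair_counts = Counter()
    for cart in carts:
        # Sorting gives every pair a single, consistent key.
        for pair in combinations(sorted(cart), 2):
            pair_counts[pair] += 1
    return pair_counts


def suggest(product, pair_counts, top_n=2):
    """Return the products most often bought together with `product`."""
    partners = Counter()
    for (a, b), n in pair_counts.items():
        if a == product:
            partners[b] += n
        elif b == product:
            partners[a] += n
    return [item for item, _ in partners.most_common(top_n)]


pair_counts = count_pairs(carts)
suggestions = suggest("flour", pair_counts)
print(suggestions)  # ['yeast', 'bread']
```

Nobody told the program that flour and yeast belong together. The association comes only from how often the two products share a cart in the data.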

How does it actually work?

At a very high level, supervised learning has two main steps:

  • First – you train a machine learning model using your existing labeled data. Labeled data is data which has been labeled with the outcome, which in the case of the email example is whether each message is spam or non-spam. This is called “model training” because the model is learning the relationship between the attributes of the data and the outcome. These attributes might include the message text, the number of embedded links, the length of the message, and so on. 
  • Second – you make predictions on new data for which you don’t know the true outcome. In other words, when a new email message arrives, you want your trained model to accurately predict whether it is spam or non-spam without a human examining it.

To summarize these two steps, the model learns from past examples made up of inputs and outputs, and then applies what it has learned to future inputs in order to predict future outputs. Because you are making predictions on unseen data, that is, data that was not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize.
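
The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the emails are invented, each one is reduced to two hypothetical attributes (number of embedded links and message length in words), and the model is a very simple nearest-centroid classifier chosen for readability. It is not the method a real spam filter uses.

```python
# Hypothetical labeled data: (number of links, length in words) -> outcome.
training_data = [
    ((8, 40), "spam"),
    ((6, 55), "spam"),
    ((7, 30), "spam"),
    ((0, 220), "non-spam"),
    ((1, 180), "non-spam"),
    ((0, 300), "non-spam"),
]


def train(examples):
    """Step 1 (model training): learn the average attributes per outcome."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    # The "model" is simply one centroid (mean attribute vector) per label.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Step 2 (prediction): pick the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)

# New, unseen emails for which the true outcome is not known.
new_emails = [(9, 35), (0, 250)]
predictions = [predict(model, email) for email in new_emails]
print(predictions)  # ['spam', 'non-spam']
```

The new emails were not part of the training data, so a correct prediction here is a small instance of the model generalizing. In practice you would check this on a larger set of held-out examples and scale the attributes so that message length does not dominate the distance.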