Learning Model
time to make Saitama's head shine
The learning model consists of two components:
Hypothesis Set
Learning Algorithm
First, an ML model defines the set of hypotheses. Examples of models include:
Linear Classification
Linear Regression
Logistic Regression
Depending on the learning scenario, one model may be favored over another.
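In the credit-card approval case, the 'Perceptron Learning Model' is the preferred model for Linear Classification. Here the input $x = (x_1, \dots, x_d)$ holds d input attributes, e.g. "gender", "age", "salary", and each attribute $x_i$ has a weight $w_i$.
The bank should approve credit if
$$\sum_{i=1}^{d} w_i x_i > \text{threshold}$$
and deny credit if
$$\sum_{i=1}^{d} w_i x_i < \text{threshold}$$
with the combination of both formulas rewritten as the sign of the weighted sum minus the threshold, so that each hypothesis looks like:
$$h(x) = \text{sign}\left(\left(\sum_{i=1}^{d} w_i x_i\right) - \text{threshold}\right)$$
where +1 means approve and -1 means deny.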
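As a rough illustration of the approve/deny rule above, here is a minimal Python sketch. The function names, the choice of attributes, the weights and the threshold are all hypothetical values picked for illustration, and treating a score exactly equal to the threshold as "deny" is an assumption too.

```python
def sign(value: float) -> int:
    """Return +1 for positive values, -1 otherwise (a tie counts as deny: an assumption)."""
    return 1 if value > 0 else -1


def approve_credit(x: list[float], w: list[float], threshold: float) -> int:
    """Perceptron hypothesis: +1 (approve) if the weighted score exceeds the threshold, else -1 (deny)."""
    score = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return sign(score - threshold)


# Hypothetical applicant: age (years), salary (thousands), debt (thousands)
applicant = [23.0, 24.0, 5.0]
# Hypothetical weights: salary helps, debt hurts, age matters a little
weights = [0.1, 0.5, -1.0]

decision = approve_credit(applicant, weights, threshold=8.0)
print("approve" if decision == 1 else "deny")
```

Raising the threshold (say to 10.0) with the same weights flips this applicant to "deny", which is exactly the kind of knob the model leaves open.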
As mentioned, the model creates the hypothesis set (H). However, there are not many parameters we can control after picking a model. For example, we cannot simply alter the formula of the Perceptron Learning Model to produce multiple hypotheses, since that would change the logic of the model itself.
In fact, for a model like this there are only two kinds of parameters within our control:
Weights
Each weight is attached to one input variable (x).
The more important a variable is for approving a credit-card application, the larger its weight should be, so that it pushes the applicant's score further towards exceeding the threshold.
Threshold
This is an arbitrary constant that the sum of each candidate's weighted variables must exceed for the candidate to be approved.
It can also be used as a means to an end, for example by setting the threshold so that at most 250,000 credit-card applications are approved.
By altering these two parameters, we can create many hypotheses that together form the hypothesis set (H), from which the best hypothesis (g) is eventually picked.
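The threshold is very much like a weight. Both are parameters (values that change) to be tweaked. Hence, we can treat the threshold as a special 'weight' $w_0 = -\text{threshold}$ to greatly simplify the formula from
$$h(x) = \text{sign}\left(\left(\sum_{i=1}^{d} w_i x_i\right) - \text{threshold}\right)$$
to
$$h(x) = \text{sign}\left(\left(\sum_{i=1}^{d} w_i x_i\right) + w_0\right)$$
and similarly, by introducing a constant input $x_0 = 1$ and grouping everything into the summation notation, to
$$h(x) = \text{sign}\left(\sum_{i=0}^{d} w_i x_i\right)$$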
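To check that folding the threshold into the weights changes nothing, the sketch below evaluates both forms on a few points. The helper names `h_threshold` and `h_absorbed`, the weights, the threshold and the points are all made up for illustration; the only convention assumed is a constant input x_0 = 1 paired with w_0 = -threshold.

```python
def h_threshold(x, w, threshold):
    """Original form: sign of (weighted sum minus threshold)."""
    score = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return 1 if score - threshold > 0 else -1


def h_absorbed(x, w, threshold):
    """Simplified form: prepend x_0 = 1 and w_0 = -threshold, then take a single sum from i = 0."""
    x_aug = [1] + list(x)
    w_aug = [-threshold] + list(w)
    score = sum(w_i * x_i for w_i, x_i in zip(w_aug, x_aug))
    return 1 if score > 0 else -1


# Hypothetical weights, threshold and inputs (integers keep the arithmetic exact)
w = [2, -1]
threshold = 3
points = [(1, 0), (3, 1), (2, 1), (0, 0)]

results = [(h_threshold(x, w, threshold), h_absorbed(x, w, threshold)) for x in points]
print(results)  # both entries of every pair agree
```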
Do take a look at my GitBook Matrices and Linear Algebra Fundamentals if you do not know what matrices are!
At this juncture, the formula of the model we're using is pretty neat, but its major problem is that writing out the algebra term by term can be pretty brutal to implement. To illustrate this, let's take Alice from our example.
| Name | Age | Gender | Salary | Debt | Default |
| --- | --- | --- | --- | --- | --- |
| Alice | 23 | Female | 24000 | - | 1 |
By implementing our model's formula and slotting arbitrary weights, we'll get
However, if we vectorize both the weights and input variable x, we'll get
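OK, in all honesty it does not look that bad LOL, but it will probably look worse with more input variables. Hence, the vectorized formula is written as
$$h(x) = \text{sign}(w^{T}x)$$
where $w = (w_0, w_1, \dots, w_d)$ and $x = (x_0, x_1, \dots, x_d)$ are column vectors, with $x_0 = 1$ and $w_0$ taking the place of the threshold.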
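Here is a small sketch of the same comparison for Alice, written once term by term and once as a dot product. The weights are invented, and so is the numeric encoding: Female is encoded as 1, the missing Debt ("-") is treated as 0, and the Default column is left out on the assumption that it is a label rather than an input.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


# x_0 = 1, then age, gender (Female -> 1: assumption), salary, debt ("-" -> 0: assumption)
x_alice = [1, 23, 1, 24000, 0]
# Hypothetical weights; w[0] plays the role of -threshold
w = [-30000, 100, 0, 1, -1]

# Term-by-term version: one product per input variable
long_form = (w[0] * x_alice[0] + w[1] * x_alice[1] + w[2] * x_alice[2]
             + w[3] * x_alice[3] + w[4] * x_alice[4])

# Vectorized version: the same number from a single dot product
vectorized = dot(w, x_alice)

decision = 1 if vectorized > 0 else -1
print(long_form, vectorized, decision)
```

The term-by-term line grows with every new input variable, while the dot-product line stays the same length no matter how many attributes are added.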
The learning algorithm simply updates weights.
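Hence, the Perceptron Learning Algorithm merely corrects misclassified points, i.e. points $(x_n, y_n)$ where
$$\text{sign}(w^{T}x_n) \neq y_n$$
where $y_n$ is derived from the target function.
For example, if
$$\text{sign}(w^{T}x_n) = -1$$
from the hypothesis, while
$$y_n = +1$$
from the target function, then point n is misclassified.
For the angle $\theta$ between $w$ and $x_n$:
$\theta < 90^\circ$ (acute): the dot product $w^{T}x_n$ is positive,
$\theta = 90^\circ$ (right): the dot product is 0,
$\theta > 90^\circ$ (obtuse): the dot product is negative.
Here the dot product of the weight vector $w$ and $x_n$ is negative (the angle is obtuse), whereas it should be positive.
Hence, the Perceptron Learning Algorithm corrects the misclassified point with
$$w \leftarrow w + y_n x_n$$
Since $y_n = +1$, we are simply adding $x_n$ to the weight vector, with the new vector being
$$w_{\text{new}} = w + x_n$$
The new angle formed by $w_{\text{new}}$ and $x_n$ is smaller than before. Once it is acute ($\theta < 90^\circ$), the dot product is positive and the Perceptron Learning Algorithm has updated the weights to correctly classify point n. 😵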
The reverse is true as well.
-just skip the following if you understand the previous lol-
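Given the scenario where
$$\text{sign}(w^{T}x_n) = +1$$
from the hypothesis, while
$$y_n = -1$$
from the target function,
$\text{sign}(w^{T}x_n) \neq y_n$, thus point n is misclassified.
Here the dot product of the weight vector $w$ and $x_n$ is currently positive (the angle is acute), whereas it should be negative.
Hence, the Perceptron Learning Algorithm corrects the misclassified point with
$$w \leftarrow w + y_n x_n$$
Since $y_n = -1$, we are simply subtracting $x_n$ from the weight vector, with the new vector being
$$w_{\text{new}} = w - x_n$$
The new angle formed by $w_{\text{new}}$ and $x_n$ is larger than before. Once it is obtuse ($\theta > 90^\circ$), the dot product is negative and the Perceptron Learning Algorithm has updated the weights to correctly classify point n. 😵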
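Both cases can be checked numerically with a small sketch. The vectors are hypothetical 2-D examples (no x_0 term, just to keep the geometry easy to picture), and the helper names `angle_degrees` and `pla_update` are made up for this illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def angle_degrees(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cosine = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def pla_update(w, x, y):
    """One correction step for a misclassified point: w_new = w + y * x."""
    return [w_i + y * x_i for w_i, x_i in zip(w, x)]


x_n = [2.0, 1.0]

# Case 1: target y_n = +1, but w . x_n is negative (obtuse angle).
w_a = [-1.0, 0.5]
w_a_new = pla_update(w_a, x_n, +1)  # adds x_n

# Case 2: target y_n = -1, but w . x_n is positive (acute angle).
w_b = [1.0, 1.5]
w_b_new = pla_update(w_b, x_n, -1)  # subtracts x_n

for label, before, after in [("y = +1", w_a, w_a_new), ("y = -1", w_b, w_b_new)]:
    print(label, round(angle_degrees(before, x_n), 1), "->", round(angle_degrees(after, x_n), 1))
```

In these two examples a single update is enough to flip the sign of the dot product. In general one step only moves the weight vector in the right direction, so a point may need more than one correction.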
If the training examples are truly linearly separable, all points can eventually be classified correctly by repeating the above Perceptron Learning Algorithm update on any point that is still misclassified. This holds even though an update for one point can cause previously correct points to become misclassified.
That is, only if the data set is truly linearly separable. If not, we can attempt to transform the data to make it linearly separable.
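Putting the pieces together, the repeated correction can be sketched as a short loop. This is a minimal illustration under stated assumptions rather than a definitive implementation: the four training points are a made-up, linearly separable toy set (each with x_0 = 1 in front), `train_perceptron` and `max_updates` are hypothetical names, a score of exactly 0 is treated as -1, and the cap on updates is there because the loop would never stop on data that is not linearly separable.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sign(value):
    """+1 for positive values, -1 otherwise (0 counts as -1: an assumption)."""
    return 1 if value > 0 else -1


def train_perceptron(points, labels, max_updates=1000):
    """Repeat w <- w + y_n * x_n on the first misclassified point until none remain."""
    w = [0.0] * len(points[0])
    for _ in range(max_updates):
        wrong = next(((x, y) for x, y in zip(points, labels) if sign(dot(w, x)) != y), None)
        if wrong is None:
            return w, True  # every point is classified correctly
        x, y = wrong
        w = [w_i + y * x_i for w_i, x_i in zip(w, x)]
    # Out of budget: report whether the last update happened to finish the job
    return w, all(sign(dot(w, x)) == y for x, y in zip(points, labels))


# Toy data: x_0 = 1 followed by two hypothetical attributes
points = [[1, 2, 3], [1, 3, 3], [1, 1, 1], [1, 0, 2]]
labels = [+1, +1, -1, -1]

w, converged = train_perceptron(points, labels)
print(w, converged)
print([sign(dot(w, x)) for x in points])
```

On this toy set, one of the intermediate updates temporarily breaks a point that was already classified correctly, yet the loop still ends with every point on the right side.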