Terminologies
Before we fly, let's crawl through some commonly used terminologies.
Input variables (x)
- The information used to learn.
- There can be more than one input variable.
- For example, "Gender", "Age", and "Salary" in the credit-card example.
Output variable (y)
- The desired output after learning.
- There can only be one output variable.
- For a discrete output, this can be either defaulting or not defaulting on the credit card.
- For a continuous output, this can be the amount of credit to give.
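To make the two flavours of output variable concrete, here is a minimal sketch of the credit-card example; the feature names match the text, but the values and labels are made up for illustration:

```python
# One applicant's input variables (x) from the credit-card example,
# paired with the two possible kinds of output variable (y).
# All values here are hypothetical.

x = {"Gender": "F", "Age": 31, "Salary": 52_000}  # input variables (x)

y_discrete = "no_default"  # discrete y: defaults or does not default
y_continuous = 8_500.0     # continuous y: amount of credit to give

print(x)
print(y_discrete, y_continuous)
```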
Target function (f)
- f here refers to the ideal hypothesis.
- A hypothesis maps a sample x to its y value. With the target function, the actual training examples are 'generated'.
- f : X → Y, meaning f eats some x variables and spits out y variables.
- f is unknown.
I like analogies, especially this one from 3Blue1Brown (I believe): a function merely eats up data and spits out data. Different functions transform data differently, and such a function can be interchangeably referred to as a hypothesis.
Also, I understand that point 2 can be extremely confusing (it was for me, at least; I will elaborate on it as best I can in the next topic), but in short: the 'real' set of output variables is generated by the target function f.
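The "eats x, spits out y" idea can be sketched as code. The rule below is a toy stand-in for f, purely for illustration; in reality f is unknown:

```python
# A toy stand-in for the unknown target function f : X -> Y.
# The decision rule is hypothetical, not part of the original text.

def f(x: dict) -> str:
    """Eats the input variables x, spits out the output variable y."""
    # Pretend rule: salary below 1000 * age means the applicant defaults.
    return "default" if x["Salary"] < 1_000 * x["Age"] else "no_default"

print(f({"Gender": "M", "Age": 40, "Salary": 30_000}))  # -> default
print(f({"Gender": "F", "Age": 25, "Salary": 60_000}))  # -> no_default
```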
Dataset (D)
- Generated from the target function f(x).
- A combination of both input (x) and output (y) variables.
- Each instance is represented by a row, numbered 1 to N.
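The 'generated' idea above can be sketched directly: pair each sample x with y = f(x) to build D. The samples and the rule for f are hypothetical, and f is only pretend-known here so the generation step can be shown:

```python
# Sketch: 'generating' a dataset D of N rows from a pretend-known target
# function f. Each row is an (x, y) pair with y = f(x). Values are made up.

def f(x: dict) -> str:
    return "default" if x["Salary"] < 1_000 * x["Age"] else "no_default"

samples = [
    {"Gender": "M", "Age": 40, "Salary": 30_000},
    {"Gender": "F", "Age": 25, "Salary": 60_000},
    {"Gender": "F", "Age": 50, "Salary": 45_000},
]

D = [(x, f(x)) for x in samples]  # N = 3 rows of (x, y)

for n, (x, y) in enumerate(D, start=1):
    print(n, x, y)
```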
Hypothesis set (H)
- We can have multiple hypotheses in our hypothesis set (H).
- However, there can only be one (and only one) that we select to be called g.

To put it all together: g ∈ H, or "g is in H". g is the best hypothesis approximating f (g ≈ f).
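A minimal sketch of picking g from H, under the same hypothetical credit-card setup: H holds a few candidate threshold rules, and g is the one with the lowest error against the data generated by f. The thresholds and scoring are illustrative, not a prescribed algorithm:

```python
# A tiny, hypothetical hypothesis set H, and the selection of the single
# hypothesis g in H that best approximates f on the dataset D.

def f(x: dict) -> str:  # pretend-known target, so we can label the data
    return "default" if x["Salary"] < 1_000 * x["Age"] else "no_default"

# Three candidate hypotheses: fixed salary thresholds.
H = [
    lambda x: "default" if x["Salary"] < 20_000 else "no_default",
    lambda x: "default" if x["Salary"] < 40_000 else "no_default",
    lambda x: "default" if x["Salary"] < 60_000 else "no_default",
]

samples = [
    {"Age": 40, "Salary": 30_000},
    {"Age": 25, "Salary": 60_000},
    {"Age": 50, "Salary": 45_000},
]
D = [(x, f(x)) for x in samples]  # (x, y) rows generated from f

def error(h) -> float:
    """Fraction of rows in D that hypothesis h gets wrong."""
    return sum(h(x) != y for x, y in D) / len(D)

g = min(H, key=error)  # g is in H: the hypothesis with the lowest error
print(error(g))
```

Here the third threshold matches f on every row, so it is selected as g; with real data, g only approximates f rather than matching it exactly.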