The Machine Learning Bite Size Series explains machine learning terms in a simple one-minute read for quick reference.
Suppose you're training a machine learning model and generating predictions. You compare each predicted value with the actual target and compute a loss value that depends on how far the prediction is from the target.
The formula for computing the hinge loss (l) is

l = max(0, 1 - t*y)

where
- y = prediction
- t = actual target for the prediction; assume t is either +1 or -1
- Correct prediction (t = y): e.g. t = 1 and y = 1, loss is max(0, 1 - 1) = max(0, 0) = 0
- [Low loss] Incorrect prediction (t ≠ y): loss is max(0, 1 - t*y), e.g. t = 1 while y = 0.9, loss is max(0, 0.1) = 0.1
- [High loss] Incorrect prediction (t ≠ y): loss is max(0, 1 - t*y), e.g. t = 1 while y = -2, loss is max(0, 3) = 3
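The three cases above can be sketched in a few lines of Python. The function name `hinge_loss` is illustrative, not from the source:

```python
def hinge_loss(t, y):
    """Hinge loss for a single prediction.

    t: actual target, assumed to be +1 or -1
    y: the model's prediction score
    """
    return max(0.0, 1.0 - t * y)

# Correct prediction: t = 1, y = 1 -> loss is 0
print(hinge_loss(1, 1))

# Low loss: t = 1, y = 0.9 -> loss is about 0.1
print(hinge_loss(1, 0.9))

# High loss: t = 1, y = -2 -> loss is 3
print(hinge_loss(1, -2))
```

Note that a correct prediction only earns zero loss once t*y reaches 1; weaker correct predictions inside that margin are still penalized slightly.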
Hence, the hinge loss grows as predictions become more inaccurate.
Effectively, hinge loss encourages the model to maximize the margin of the decision boundary between the two classes being discriminated.