If you are working on AI/ML products, you might have already come across the words “precision” & “recall” along the way, we use these terms in order to optimize our AI/ML products. but do we really understand the meaning behind those numbers?
Through examples from the US justice system, let’s go over the terminology and explain the terms, in order to list and learn about the possible trade-offs between them, and get the first step into understanding how and what we want to optimize.
Imagine you are now a prosecutor, in front of you is a former-felon, convicted of multiple offenses, He had the means, motive, and opportunity, he confessed to the crime at hand and there were also eyewitnesses confirming him as the perpetrator for the crime.
True Positive
So far, in this classic “Law and Order” case- it seems like all data points lead to only one conclusion, This defended is guilty! there is no doubt in anyone’s mind. This predator deserves to be in jail. This almost-perfect scenario, considering that it really is correct, is what we would call a “True Positive”.
False Positive
In real life, it is assumed that (in the US alone), about 1 out of 10 convictions is actually a wrongful one.
The person incarcerated did not commit the crime they are accused of. as mentioned in my previous post, approximately 25% of the wrongfully convicted confess to a crime they did not commit.
This is what we would call a false positive result when we, as a society, identified them as guilty of a crime they did not commit.
True Negative
Every once in a while, the justice system is unable to convict a person for a crime. and sometimes, this person is actually innocent.
Statistically speaking, this happens about 30% of the time, since the conviction rate in the US is around 70% on average.
In this case, a jury that will find the 1 out of 10 defended as “Not guilty” as the world interested, will be correct to do so. Thus, making the free man a part of the ~30% non- convicted cases. falling directly under the True negative section.
False Negative
Sometimes, the system might decide on a non-guilty plea for the actual perpetrator, and set a guilty person free. since somehow along the way a seed of reasonable doubt was planted, whether due to very good lawyers or to a technicality, The system was unable to hold the suspect accountable for the crime. for example, in the case of OJ Simpson, it is commonly thought that the trial has missed out on the actual criminal. thinking wrongfully that he is not the person responsible for the crime. This is what we would call a “False negative”.
Precision
Precision is the fraction of relevant instances among the retrieved results.
The number of true positives/ all positives. this means that in our example if we know that 1 out of 10 convictions is a wrongful one, the precision is generally 90%.
Recall
The recall is the fraction of retrieved relevant instances among all relevant instances.
The number of relevant results received/ all relevant results,
This means that for our example if for 9 suspects we received 7 determining rightfully that the suspect is responsible for his crimes it means that the recall is 77%
Now ask yourself, In the justice system for which of those do you prefer to optimize?
Would you prefer a significantly higher precision rate? one that will make sure that in 99% of certainty a person is the perpetrator of the crime they are accused of, so that only in this level of certainty they will lose their freedom.
This will cause your overall “relevant results” number to become lower, Meaning, fewer people will be defined as guilty and as a side effect, more guilty people will walk freely out of jail.
Or do you prefer to make sure 90% of all perpetrators are behind bars?
Being “hard on crime”? cleaning the streets from all bad seeds?
In doing so, you effectively decide that you are ok with sometimes missing out on a few- wrongfully accused individuals, claiming for the better good. and you need to ask yourself, are their lives a fee you are willing to pay for a safer street?
The beautiful world of AI products are derived from data-driven tools, specifically meant to help us make decisions. In order to fully grasp the impact of our numbers on people’s lives, we always have to consider all the tradeoffs of our optimization.
Only then, we as a society can optimize for humans instead fot numbers.