BY ADITYA BHATT
THE IDEAS OF SPECIFICITY AND SENSITIVITY ARE INTEGRAL TO ASSESSING HOW GOOD YOUR CLASSIFICATION MODEL IS. THE TERMS ARE CONFUSING AND PEOPLE OFTEN MIX THEM UP, BUT AFTER THIS BLOG YOU SHOULD BE ABLE TO MAKE A CLEAR CHOICE ABOUT THE CORRECT EVALUATION METRIC TO USE.
THE TERMS BECOME CLEAR WITH AN EXAMPLE THAT ALL OF US CAN RELATE TO. SUPPOSE WE GO FOR A COVID TEST AND TEST NEGATIVE. ARE WE 100% SURE THAT WE DON’T HAVE COVID?
THE ANSWER TO IT IS NO.
SO TO UNDERSTAND SPECIFICITY AND SENSITIVITY, CONSIDER A GROUP OF 1000 PEOPLE IN WHICH WE KNOW THAT 600 PEOPLE HAVE COVID AND 400 DON’T. ALL OF THEM TAKE A COVID TEST USING TWO TOOL KITS, ‘A’ AND ‘B’.
THE RESULTS FOR A ARE-
OUT OF 600 PEOPLE WHO HAD COVID, 200 WERE CORRECTLY DIAGNOSED (400 WERE INCORRECTLY CLASSIFIED AS NEGATIVE).
OUT OF 400 PEOPLE WHO DIDN’T HAVE COVID, ALL WERE CORRECTLY DIAGNOSED AS NEGATIVE (0 WERE INCORRECTLY CLASSIFIED).
LET US MAKE A CONFUSION MATRIX FOR A-
TP = TRUE POSITIVE, I.E. THE ACTUAL (TRUE) OUTCOME IS TRUE AND THE MODEL HAS PREDICTED TRUE.
FP = FALSE POSITIVE, I.E. THE ACTUAL (TRUE) OUTCOME IS FALSE BUT THE MODEL HAS PREDICTED TRUE.
FN = FALSE NEGATIVE, I.E. THE ACTUAL (TRUE) OUTCOME IS TRUE BUT THE MODEL HAS PREDICTED FALSE.
TN = TRUE NEGATIVE, I.E. THE ACTUAL (TRUE) OUTCOME IS FALSE AND THE MODEL HAS PREDICTED FALSE.
FOR TOOL KIT ‘A’: TP = 200, FP = 0, FN = 400, TN = 400.
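THE FOUR COUNTS DEFINED ABOVE CAN BE TALLIED DIRECTLY FROM LISTS OF ACTUAL AND PREDICTED LABELS. BELOW IS A MINIMAL PYTHON SKETCH FOR TOOL KIT ‘A’. THE FUNCTION NAME confusion_counts AND THE LABEL LISTS ARE ILLUSTRATIVE ASSUMPTIONS BUILT FROM THE NUMBERS IN THE EXAMPLE, NOT PART OF ANY LIBRARY.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for boolean labels (True = has COVID)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1      # has COVID, test says positive
        elif not a and p:
            fp += 1      # no COVID, test says positive
        elif a and not p:
            fn += 1      # has COVID, test says negative
        else:
            tn += 1      # no COVID, test says negative
    return tp, fp, fn, tn


# Tool kit 'A' from the example: 600 people have COVID, 400 do not.
actual_a = [True] * 600 + [False] * 400
# 200 of the 600 positives are detected and 400 are missed;
# all 400 negatives are correctly reported as negative.
predicted_a = [True] * 200 + [False] * 400 + [False] * 400

tp, fp, fn, tn = confusion_counts(actual_a, predicted_a)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")  # TP=200 FP=0 FN=400 TN=400
```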
FOR TOOL KIT ‘B’ THE RESULTS ARE-
OUT OF 600 PEOPLE WHO HAD COVID, 500 WERE CORRECTLY DIAGNOSED (100 WERE INCORRECTLY CLASSIFIED AS NEGATIVE).
OUT OF 400 PEOPLE WHO DIDN’T HAVE COVID, ALL WERE CORRECTLY DIAGNOSED AS NEGATIVE (0 WERE INCORRECTLY CLASSIFIED).
LET US MAKE A CONFUSION MATRIX FOR B-
TP = 500, FP = 0, FN = 100, TN = 400.
NOW, USING BASIC SENSE, WE WOULD CHOOSE TOOL KIT B AS THE BETTER OPTION.
IT GIVES MORE TRUE POSITIVES, WHICH SHOULD BE OUR AIM: WE DON’T WANT TO MISS A PERSON WHO HAS COVID, AS EVEN ONE MISSED CASE CAN START A LONG CHAIN OF INFECTION.
HOPEFULLY THE EXAMPLE WAS SIMPLE AND EASY TO RELATE TO.
NOW LET’S START WITH SENSITIVITY-
SENSITIVITY=TP/(TP+FN)
FOR TOOL KIT ‘A’, SENSITIVITY = 200/(200+400) = 1/3 ≈ 0.33
SENSITIVITY IS ALSO KNOWN AS THE TRUE POSITIVE RATE AND AS RECALL.
SO IF YOUR AIM IS TO CATCH MORE OF THE ACTUAL POSITIVES, YOU MUST INCREASE SENSITIVITY.
FOR TOOL KIT ‘B’, SENSITIVITY = 500/(500+100) = 5/6 ≈ 0.83
AS THE SENSITIVITY OF TOOL KIT ‘B’ IS HIGHER, IT IS THE BETTER TEST/MODEL/CLASSIFIER TO DETECT AND TRACE COVID.
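THE SENSITIVITY ARITHMETIC FOR BOTH KITS CAN BE CHECKED WITH A FEW LINES OF PYTHON. THIS IS ONLY A SKETCH: THE FUNCTION NAME sensitivity IS AN ILLUSTRATIVE ASSUMPTION, AND THE COUNTS ARE THE ONES USED IN THE FORMULAS ABOVE.

```python
def sensitivity(tp, fn):
    """Sensitivity (true positive rate, recall) = TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined when there are no actual positives")
    return tp / (tp + fn)


# Counts taken from the worked example for each tool kit.
sens_a = sensitivity(tp=200, fn=400)
sens_b = sensitivity(tp=500, fn=100)

print(f"A: {sens_a:.2f}  B: {sens_b:.2f}")  # A: 0.33  B: 0.83
```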
LET’S MOVE TO SPECIFICITY-
SPECIFICITY = TN/(TN+FP)
NOTE THAT 1 - SPECIFICITY = FALSE POSITIVE RATE = TYPE I ERROR RATE.
SPECIFICITY FOR TOOL KIT ‘A’ = 400/(400+0) = 1
SPECIFICITY FOR TOOL KIT ‘B’ = 400/(400+0) = 1
SO BOTH KITS HAVE THE SAME SPECIFICITY.
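THE SAME CHECK WORKS FOR SPECIFICITY AND FOR THE RELATION 1 - SPECIFICITY = FALSE POSITIVE RATE. IN THE SKETCH BELOW THE FUNCTION NAMES ARE ILLUSTRATIVE ASSUMPTIONS, AND THE SECOND SET OF COUNTS (TN = 380, FP = 20) IS A MADE-UP CASE USED ONLY TO SHOW THE RELATION WHEN FP IS NOT ZERO. IT DOES NOT COME FROM THE EXAMPLE.

```python
import math


def specificity(tn, fp):
    """Specificity (true negative rate) = TN / (TN + FP)."""
    return tn / (tn + fp)


def false_positive_rate(tn, fp):
    """False positive rate = FP / (FP + TN)."""
    return fp / (fp + tn)


# Both tool kits in the example have TN = 400 and FP = 0.
spec = specificity(tn=400, fp=0)
fpr = false_positive_rate(tn=400, fp=0)
print(spec, fpr)  # 1.0 0.0

# Hypothetical counts (not from the example) with some false positives,
# to check that 1 - specificity matches the false positive rate.
hyp_spec = specificity(tn=380, fp=20)
hyp_fpr = false_positive_rate(tn=380, fp=20)
print(math.isclose(1 - hyp_spec, hyp_fpr))  # True
```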
IF CORRECTLY IDENTIFYING POSITIVES IS MORE IMPORTANT TO US, WE SHOULD CHOOSE THE MODEL WITH HIGHER SENSITIVITY. HOWEVER, IF CORRECTLY IDENTIFYING NEGATIVES IS MORE IMPORTANT, WE SHOULD CHOOSE THE MODEL WITH HIGHER SPECIFICITY.
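THIS RULE OF THUMB CAN BE SKETCHED AS A SMALL RANKING HELPER. THE NAMES kits AND rank_kits ARE HYPOTHETICAL, AND THE COUNTS ARE THE ONES USED IN THE SENSITIVITY AND SPECIFICITY FORMULAS FOR THE TWO TOOL KITS. TREAT IT AS ONE POSSIBLE WAY TO EXPRESS THE CHOICE, NOT A STANDARD PROCEDURE.

```python
# Confusion-matrix counts for both tool kits from the worked example.
kits = {
    "A": {"tp": 200, "fp": 0, "fn": 400, "tn": 400},
    "B": {"tp": 500, "fp": 0, "fn": 100, "tn": 400},
}


def sensitivity(c):
    return c["tp"] / (c["tp"] + c["fn"])


def specificity(c):
    return c["tn"] / (c["tn"] + c["fp"])


def rank_kits(kits, priority):
    """Return kit names sorted best-first by 'sensitivity' or 'specificity'."""
    metric = {"sensitivity": sensitivity, "specificity": specificity}[priority]
    return sorted(kits, key=lambda name: metric(kits[name]), reverse=True)


# Catching positives matters most: kit B ranks first.
print(rank_kits(kits, "sensitivity"))  # ['B', 'A']
# Catching negatives matters most: both kits tie at 1.0, so the order is unchanged.
print(rank_kits(kits, "specificity"))  # ['A', 'B']
```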
HOPEFULLY THE TWO TERMS ARE CLEARER NOW.
CONCLUSIONS IN THE CONTEXT OF COVID TESTING-
- RT-PCR HAS HIGH SENSITIVITY (NOT 100%) AND HIGH SPECIFICITY.
- THE RAPID TEST HAS SLIGHTLY LOWER SENSITIVITY.
THAT IS WHY RT-PCR IS THE BETTER AND MORE ACCURATE FORM OF TESTING.