Before doing a 4-year degree in AI and Software Engineering, I’d always wonder:
“If AI is so good, why don’t we use it in our day-to-day lives?”
After a lot of studying, trying my hand at computer vision and doing my own research, I realised: ML has so many flaws that we often overlook.
These are problems that need to get solved if we ever want to bring AI into the real world.
You can often see this when people say “AI is racist”, or when research papers report 90% accuracy and the model then performs terribly in real life.
1. The Gap Between Academic Data and Real-Life Data
One of the biggest problems in ML right now is that it’s focused on academics getting higher accuracy on custom datasets, not on actual results in real life.
Nothing can quite mimic the randomness of real life, but it seems the gap between academic data and real-life data is too big.
I believe this is really due to 3 things:
a.) A wrong incentive structure — academics currently get rewarded for higher accuracy rather than for results on more realistic data, so of course they choose easier datasets.
b.) Ease of using pre-made datasets — many researchers, short on time, end up using other people’s datasets even when building an ML model for a completely different context. It’s so much easier to use someone else’s data, but your model then inherits all of that dataset’s biases.
c.) Acquiring easy data — the researchers who do go out and collect data often sample from their local context (universities, for example), which greatly skews things like demographics. Furthermore, the biases data collectors insert when deciding what data to keep and what to throw away are, in my opinion, greatly overlooked. We talk about “stratified” and “random” sampling, but in reality data collectors insert bias everywhere: in deciding who to take data from, where a camera is placed, or which shots to include (a quick sketch of what stratified sampling does, and doesn’t do, follows this list).
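To make the “stratified” part concrete, here is a minimal sketch using scikit-learn’s train_test_split. The demographic_group column is a hypothetical attribute I made up purely for illustration; stratification only preserves group proportions in the split, nothing more.

```python
# Minimal sketch: stratified vs. plain random splitting with scikit-learn.
# "demographic_group" is a hypothetical attribute used only for illustration.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "feature": range(1000),
    "label": [i % 2 for i in range(1000)],
    # Imbalanced hypothetical attribute: 90% group A, 10% group B.
    "demographic_group": ["A"] * 900 + ["B"] * 100,
})

# Plain random split: the group proportions in the test set can drift by chance.
random_train, random_test = train_test_split(df, test_size=0.2, random_state=0)

# Stratified split: group proportions are preserved in both train and test.
strat_train, strat_test = train_test_split(
    df, test_size=0.2, random_state=0, stratify=df["demographic_group"]
)

print(random_test["demographic_group"].value_counts(normalize=True))
print(strat_test["demographic_group"].value_counts(normalize=True))
```

Even a perfectly stratified split can’t undo bias introduced earlier: if a group never made it in front of the camera, no sampling strategy will put them back.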
2. Fraudulent Academics (or bad ones)
Academics get rewarded with citations, and in ML you often earn more of them by demonstrating higher accuracy of a model on a dataset.
The problem with this incentive structure is that it can lead to cherry-picked results.
This is why studies find that most people who try to replicate the accuracy reported in ML research papers simply can’t, even using the same model.
If a researcher gets great accuracy once and then gets terrible accuracy after rerunning the model several times, they are likely to report the higher number.
This is awful, as it leads us down an unscientific rabbit hole: many future papers are built on past papers with high-accuracy models.
This is why I think there needs to be an academic standard of rigour for ML. For example, reporting the average (and the spread) across repeated trials of 5-fold cross-validation rather than a single lucky run; a rough sketch is below.
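As a rough illustration (not a prescription), this is what that could look like with scikit-learn; the model and dataset here are just placeholders.

```python
# Minimal sketch: report the mean and spread over repeated 5-fold cross-validation
# instead of a single lucky run. The model and dataset are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0)

# 5 folds, repeated 3 times with different shuffles = 15 scores in total.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print(f"accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
print(f"worst fold: {scores.min():.3f}")  # the number a cherry-picker never shows
```

Reporting the mean, the spread and the worst fold makes it much harder to pass off one lucky run as the result.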
Furthermore, the incentive system for academics needs a big change in order to reward genuine results that work in real life.
3. ML Research Is a Shot in the Dark
Essentially, ML research often looks like a shot in the dark.
In part because it is — whether you are picking which ML model to use, which data augmentations to apply or which hyperparameters to search, there is a lot of trial and error involved in ML.
Most researchers actually spend a lot of time trying and re-trying hyperparameters to see which fit a dataset best.
I think a big bottleneck we need to solve is automating things like the search for the best ML models, augmentations and hyperparameters.
If a PhD ML student is wasting their time on trial and error, we may as well teach a machine to do it so they can focus on delivering genuine results instead.
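Tools for this already exist; as a rough sketch (not a specific recommendation), here is a randomized hyperparameter search with scikit-learn, where the model, the search space and the dataset are all placeholders.

```python
# Minimal sketch: let the machine do the hyperparameter trial and error.
# The model, search space and dataset are placeholders for illustration.
from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

search_space = {
    "n_estimators": randint(50, 500),
    "max_depth": randint(2, 20),
    "min_samples_leaf": randint(1, 10),
}

# Try 25 random configurations, each scored with 5-fold cross-validation.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=search_space,
    n_iter=25,
    cv=5,
    scoring="accuracy",
    random_state=0,
    n_jobs=-1,
)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

Dedicated tools like Optuna and the various AutoML frameworks push this much further, but even a basic random search frees a researcher from babysitting runs by hand.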
👏 Thanks again, I hope you liked this article! Please comment any of your thoughts!