A recent xkcd comic highlighted the varying complexity of tasks in computer science, and the unrealistic expectations that some might have using object recognition in images as an example:
I mention it because it reminds me of a great anecdote in the image processing/machine learning communities that I don’t hear often enough, here it goes-
The US government wanted a way to automatically detect tanks, for early warning or automated targeting. So a team of researchers went out and took 200 pictures of a variety of tanks. The next day they took 200 pictures without tanks.
They decided to use a neural network to teach their computer to recognise tanks. So in their training phase they gave it a picture of a tank, and told it there was a tank.
They gave it a picture without a tank and told it there was no tank. They repeated this with a hundred of each type, so the computer could identify tanks in a variety of circumstances – occlusions, colour, etc.
Then they gave it another picture from the remaining (unseen) images, and asked “tank or no tank?”. It got it right. They gave it another, and it got it right. It correctly classified all 200 unseen images.
This was a great achievement after a long period of research and significant funding.
Then to prove the versatility of the neural network, they started working with new images, with and without tanks.
The computer performed miserably, no better than random guessing.
Then someone noticed, in the original training set, the 200 pictures with tanks were taken on a sunny day, then 200 pictures without tanks were taken on a cloudy day.
They weren’t detecting tanks, they were detecting weather.
I’m not sure what the original source is, I was told the story at a BMVA event, but this appears to be the favourite telling: https://neil.fraser.name/writing/tank/.
It’s a great tale about the mysteries of neural nets, but also a tangential reminder that in image processing, computer vs human perceptions can be entirely different. There’s a lot of enthusiasm for replicating human vision systems, but it’s not the only option.