Automatic Tank Recognition

A recent xkcd comic highlighted the varying complexity of tasks in computer science, and the unrealistic expectations that some might have, using object recognition in images as an example:

xkcd 1425

Someone at flickr recognised this as a great opportunity to show off some of their research.

I mention it because it reminds me of a great anecdote in the image processing/machine learning communities that I don’t hear often enough, so here it goes:

The US government wanted a way to automatically detect tanks, for early warning or automated targeting. So a team of researchers went out and took 200 pictures of a variety of tanks. The next day they took 200 pictures without tanks.

They decided to use a neural network to teach their computer to recognise tanks. So in their training phase they gave it a picture of a tank, and told it there was a tank.

They gave it a picture without a tank and told it there was no tank. They repeated this with a hundred of each type, so the computer could identify tanks in a variety of circumstances – occlusions, colour, etc.

Then they gave it another picture from the remaining (unseen) images, and asked “tank or no tank?”. It got it right. They gave it another, and it got it right. It correctly classified all 200 unseen images.

This was a great achievement after a long period of research and significant funding.

Then to prove the versatility of the neural network, they started working with new images, with and without tanks.

The computer performed miserably, no better than random guessing.

Then someone noticed that, in the original training set, the 200 pictures with tanks were taken on a sunny day, and the 200 pictures without tanks were taken on a cloudy day.
They weren’t detecting tanks, they were detecting weather.

I’m not sure what the original source is, I was told the story at a BMVA event, but this appears to be the favourite telling: https://neil.fraser.name/writing/tank/.

It’s a great tale about the mysteries of neural nets, but also a tangential reminder that in image processing, what a computer perceives and what a human perceives can be entirely different. There’s a lot of enthusiasm for replicating human vision systems, but it’s not the only option.

Technocamps Beachlab 2014

Aberystwyth Technocamps Beachlab 3D Printing Minecraft

Photo by Arvid Parry Jones

On Saturday Aberystwyth University held an “Access All Areas” event, opening the university up to the general public. As with last year, we worked together to combine it with the Technocamps Beachlab.

Aberystwyth Technocamps Beachlab 3D Printing Minecraft

Last year I got to demonstrate our relatively new 3D printing and scanning setup. This year things weren’t as exciting owing to some broken kit, but it was still a good day nonetheless. The recent news regarding 3D printing of gun components has clearly led to a major increase in awareness (some good, some bad). In many ways it was better that the machines weren’t in use, as I got to have some in-depth discussions with a variety of people.

Beachlab 2014 Dalek Doris

Aberystwyth Technocamps Beachlab 3D Printing Minecraft

Aberystwyth Technocamps Beachlab 3D Printing Minecraft

Photo by Arvid Parry Jones

A Nasty Hack for Image Landing Pages

I saw this on the Aberystwyth Comp Sci facebook group: http://lcamtuf.coredump.cx/squirrel/. It’s a little example of embedding image data within an HTML page, in a similar (but less pleasant) way to using data URIs for embedding images stored as base64 strings (see below). It’s a very hacky way to give direct-link (rather than inline/hotlinked) users a page rather than the image alone.

I was intrigued so took a quick look at the source and replicated it.

I don’t see much practical use; it’s ugly and I wouldn’t rely on it. But it’s straightforward to replicate and, as the page states, all the magic is done client side.

It should work with most images (I’ve only tried JPEG). The first few bytes (the APP0 segment) go before the tag, so that the file is recognised as an image; the page itself goes in the next few bytes, which we hope the browser ignores (the body is made hidden so the stray bytes don’t show). Lastly, we put the image data in an unclosed HTML comment. I suspect that with a longer page we’d see the image becoming corrupt.
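A rough sketch of that byte layout from the shell, using stand-in bytes rather than a real JPEG (the 20-byte header length matches a typical JFIF SOI + APP0, but offsets vary per file, and the result only mirrors the structure rather than being a strictly valid JPEG):

```shell
# Work in a scratch directory; the "image" here is stand-in bytes.
d=$(mktemp -d)
printf 'HHHHHHHHHHHHHHHHHHHH' > "$d/image.jpg"   # 20 bytes: SOI + APP0 stand-in
printf 'IMAGE-SCAN-DATA'     >> "$d/image.jpg"   # the rest of the image data

head -c 20 "$d/image.jpg" > "$d/both"            # header first, so it sniffs as an image
printf '<html><body style="display:none">page</body></html>' >> "$d/both"
printf '<!--' >> "$d/both"                       # open, but never close, a comment...
tail -c +21 "$d/image.jpg" >> "$d/both"          # ...hiding the remaining image bytes
```

Served as HTML, everything before the markup is junk the browser skips; served into an img tag, the leading header is what the content sniffer sees.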

It even preserves exif.

So what’s the use? Well, you can use the same URL for your img src attributes as you do for a landing page. But at the end of the day, you’re serving a corrupt page that shouldn’t work and can’t be relied upon.

It’s interesting, but the correct way to deal with hotlinking or image landing pages is to use mod_rewrite (in Apache). At the end of the day, file extensions exist for a reason and you shouldn’t really be serving up binary data in such a messy manner anyway. There’s simply no point in forcefully redirecting users away from data like this; those that want it will get it.

Here is an example of an image that can be copied and pasted directly as HTML. Many browsers recognise data URIs, in which we can store data (such as images) as base64:

<img src="data:image/jpeg;base64,XXXXXXXXXXXXXX">

Where “XXXXXXXXXXXXXX” is the base64 string.

Base64 is a binary-to-text encoding mechanism that allows binary data to be transmitted as printable ASCII strings. When you see “MIME” referenced in relation to email, it’s about getting attachments added, and base64 is how they’re carried. Each 3 bytes of input map to 4 output characters (6 bits per character rather than 8), so encoding data as base64 increases its size by roughly a third.
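A quick illustration from the shell (the image filename in the final comment is illustrative):

```shell
# Each 3 bytes of input become 4 base64 output characters.
printf 'hello' | base64    # prints: aGVsbG8=

# A data: URI img tag can be built the same way, e.g.:
# echo "<img src=\"data:image/jpeg;base64,$(base64 < photo.jpg | tr -d '\n')\">"
```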

Barebones Distance Function for pdist()

I had a bit of bother when adding my own distance function for use with Matlab’s knnsearch (and other functions). Surprisingly, custom functions aren’t discussed much and can be a bit troublesome the first time, so here’s the template I’m using from now on:

 


%Structure as specified by knnsearch.m:
% function D2 = DISTFUN(ZI, ZJ),
%
% taking as arguments a 1-by-N vector ZI
% containing a single row of X or Y, an
% M2-by-N matrix ZJ containing multiple
% rows of X or Y, and returning an
% M2-by-1 vector of distances D2, whose
% Jth element is the distance between the
% observations ZI and ZJ(J,:).
%
% Use as: idx = knnsearch(X, Y, 'Distance', @distanceFunction);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [ L ] = distanceFunction(sample, models)
    B = length(sample);

    if size(models, 2) ~= B
        error('Mismatched vector lengths!');
    end

    model_count = size(models, 1);
    L = zeros(model_count, 1);

    % compare the sample to each model
    for m = 1:model_count
        model = models(m, :);
        L(m) = 2.*(sum((sample.*log(sample+eps)) - (sample.*log(model+eps))));
    end
end

I’ve left the loops in for clarity; naturally, try to vectorise all that you can.

Also, eps is a useful function; it returns the distance from a number to the next larger floating point number (from 1.0 by default). In most cases, you can take this to mean a really tiny number that prevents log-of-zero or division-by-zero errors. A nasty but generally trustworthy trick.

First try with the 3Doodler

3doodler3

I backed the 3Doodler project back in March and, in what must be a Kickstarter first, the folks at WobbleWorks have delivered well ahead of schedule. Today I got an email with a tracking number; I was pleased to find it had already passed customs, and even more pleased when it arrived this afternoon.

3doodler2

Caroline and I got it running with no problems, though I think some practice is in order. It may not suit me, lacking any artistic skill, but it’s a lot easier to set up than a makerbot!

3doodler1

My only complaint is that it does seem to eat the PLA – we played with it for less than 5 minutes (just long enough to take a photo) and went through a 25cm piece. I think the flow may be slightly inconsistent. I’ll update once we’ve had a chance to play some more.

Quick and dirty automatic old file deletion

I’ve started using my VPS to store stills from my IP cameras. It’s easy and quick – I’ve got a script running on my openwrt router that fetches the stills periodically from 8 different cameras. Whilst some support FTP upload, others are less well connected (hence the need for the script).

Unfortunately, whilst I was away without Internet last month, the storage on the VPS filled up with these images, taking down most of the services running. This was the quick and dirty fix I managed to implement using my phone:
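A sketch of the approach, demonstrated on a throwaway directory (the real path would be wherever the stills land, and the schedule is illustrative):

```shell
# Simulate the stills directory with one fresh and one stale image.
STILLS_DIR=$(mktemp -d)
touch "$STILLS_DIR/fresh.jpg"                    # recent still: should survive
touch -d '15 days ago' "$STILLS_DIR/stale.jpg"   # old still: should be deleted

# Delete anything not modified in the last 10 days. -exec invokes rm
# once per file, so no giant argument list is ever built.
find "$STILLS_DIR" -type f -mtime +10 -exec rm {} \;

ls "$STILLS_DIR"    # only fresh.jpg remains

# The daily crontab entry (added via crontab -e) is along these lines:
# 30 4 * * * find /path/to/stills -type f -mtime +10 -exec rm {} \;
```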

This code adds a daily crontab entry that uses the ubiquitous find command to simply delete old files. In this case I’ve used 10 days, but you can get flexible with the -mtime option if you check the man page. We use -exec as this doesn’t expand the entire list of found files, so it avoids “argument list too long” issues.

Couple of caveats –

  • I ran it as root, because that’s what I had. Many users are ingrained with a fear of root, but I consider it a personal preference which is best debated elsewhere.
  • You should use crontab -e to modify your cron files, as it verifies the syntax.

Being able to do these quick fixes is what I love about Linux. I highly recommend following Bash One-Liners on twitter – https://twitter.com/bashoneliners – who also have a QDB-style website.