I recently received news that the paper I was working on has finally been published!

If you go to http://www.smc2013.org/human_ss_accepted, and then Ctrl+F search for “Explore New Eye Tracking and Gaze Locating Methods”, then boom.

Thank you, and goodnight!

]]>

When initialising the centre eye position, I considered 20 estimates (more would be better, but would also take longer), took the standard deviation, and then threw out everything that was more than 2 standard deviations away from the mean. I then took the mean of the remaining estimates for the overall estimate.
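A minimal sketch of that outlier rejection, assuming NumPy and assuming each estimate is an (x, y) pixel position. The function name `robust_mean` and the per-coordinate test are illustrative choices, not taken from the original program.

```python
import numpy as np

def robust_mean(estimates, n_sigma=2.0):
    """Mean of the estimates after discarding points far from the mean.

    estimates: sequence of (x, y) positions (hypothetical input format).
    A point is kept only if every coordinate lies within n_sigma standard
    deviations of that coordinate's mean.
    """
    pts = np.asarray(estimates, dtype=float)
    mean = pts.mean(axis=0)
    std = pts.std(axis=0)
    # The tiny tolerance keeps points when a coordinate has zero spread.
    keep = np.all(np.abs(pts - mean) <= n_sigma * std + 1e-12, axis=1)
    return pts[keep].mean(axis=0)

# 19 consistent estimates plus one wild one.
samples = [(100, 50)] * 19 + [(300, 50)]
centre = robust_mean(samples)
```

With these sample values the single wild estimate is discarded and the result sits on the 19 consistent points.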

For the eye estimates, I considered the last 10 estimates. I would consider more, but then we get the old eye positions influencing new ones. I didn’t use standard deviations because, if the eye does move, it might move outside of a standard deviation. I just used a simple mean. The same goes for gaze location.
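The simple mean over the last few estimates can be sketched as below. The class name `MovingAverage` is hypothetical, and the window of 10 just mirrors the number mentioned above.

```python
from collections import deque

class MovingAverage:
    """Mean of the most recent `window` (x, y) estimates."""

    def __init__(self, window=10):
        # deque with maxlen silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def update(self, x, y):
        self.samples.append((x, y))
        n = len(self.samples)
        mean_x = sum(p[0] for p in self.samples) / n
        mean_y = sum(p[1] for p in self.samples) / n
        return mean_x, mean_y

smoother = MovingAverage(window=3)
for point in [(0, 0), (3, 3), (6, 6), (9, 9)]:
    smoothed = smoother.update(*point)
```

A larger window gives a smoother estimate but more lag, which is the trade-off described above.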

So here is a video of it at work. Once again there is no match template, so we can’t really look at where the gaze estimate is, but more at how smoothly it moves. The estimate lags, but never mind!

I also looked a bit at the calibration process. Up till now I had just been guessing the “eye radius”… this is what I called the distance the eye moves to look from one side of the screen to the other. But guessing is never acceptable. Ever. So I devised a simple calibration process. First the person looks to the centre, then the left, then the right. Through all this the computer is finding the average distance moved (throwing out all the points more than 2 standard deviations away, etc). This gives us a good guess at the eye radius.

]]>

It works quite well. I have made the box around the eyes stick in place on the face, so now I really just need to make everything more efficient and accurate. Then I can do a bit of maths and work out some estimates for where the person is looking. Hopefully.

When it is a bit better I will post a video, but for now here are some pictures of the template matching working:

And here is a video of it at work! In this video I am slowly moving my head from left to right:

This is good, but notice how now, when I move my eyeballs from side to side, the estimation follows my iris imperfectly. This destroys a lot of the accuracy. So after much deliberation, and surfing through papers of people who have done similar things to me, I decided that I should search the area near the current estimations for either the local centre of mass, or the local minimum. I tested both of these, and they both seem to improve the accuracy of the estimate. I’m unsure as to which I should use, or maybe both… I also need to test different sizes of search area!
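Both refinements can be sketched in a few lines, assuming NumPy, a greyscale image where the iris is dark, and a square search window. The function name `refine_estimate` and the window half-width are illustrative assumptions.

```python
import numpy as np

def refine_estimate(gray, x, y, half=5):
    """Refine an eye-centre estimate inside a (2*half+1)-pixel window.

    Returns (local_min, centre_of_mass), both as (x, y) positions.
    """
    h, w = gray.shape
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    patch = gray[y0:y1, x0:x1].astype(float)

    # Option 1: the darkest pixel in the window.
    iy, ix = np.unravel_index(np.argmin(patch), patch.shape)
    local_min = (x0 + ix, y0 + iy)

    # Option 2: centre of mass, weighting each pixel by how dark it is.
    darkness = patch.max() - patch
    total = darkness.sum()
    if total == 0:
        # Perfectly flat window: nothing to refine.
        com = (float(x), float(y))
    else:
        ys, xs = np.mgrid[y0:y1, x0:x1]
        com = ((xs * darkness).sum() / total, (ys * darkness).sum() / total)
    return local_min, com

image = np.full((21, 21), 255, dtype=np.uint8)
image[9, 12] = 0  # one dark pixel at x=12, y=9
local_min, com = refine_estimate(image, 10, 10)
```

The local minimum reacts to single dark pixels (and so to noise), while the centre of mass averages over the whole window, which may be one way to choose between them.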

Here is a video of this:

]]>

So I open up a stream from the webcam, and then draw around the general eye area. This then remains fixed in space and does not move. The eyes are then located using my method. When the person is looking at the centre of the screen, a button is pressed and this sets the initial eye position. From then on I just calculate the approximate eye displacement, and use the proportion displaced to relate to the proportion of the screen that the gaze has travelled. Please note that this is the most inaccurate thing ever, and will be fixed as soon as I have the time and motivation. Ok I lied, it’s not the very biggest problem…
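The proportional mapping could look something like the sketch below. It assumes `eye_radius` is the eye travel (in pixels) corresponding to the full screen width, that the same scale applies vertically, and that the image is not mirrored. The function name and all of these conventions are assumptions rather than the original code.

```python
def gaze_on_screen(eye_pos, centre_pos, eye_radius, screen_w, screen_h):
    """Map an eye displacement to an approximate gaze point on the screen.

    eye_pos, centre_pos: (x, y) eye positions in image pixels.
    eye_radius: eye travel in pixels for a full screen-width gaze shift.
    """
    # Displacement as a fraction of the full travel.
    dx = (eye_pos[0] - centre_pos[0]) / eye_radius
    dy = (eye_pos[1] - centre_pos[1]) / eye_radius
    # Zero displacement corresponds to the middle of the screen.
    gx = (0.5 + dx) * screen_w
    gy = (0.5 + dy) * screen_h
    # Clamp so noisy estimates cannot leave the screen.
    gx = min(max(gx, 0.0), float(screen_w))
    gy = min(max(gy, 0.0), float(screen_h))
    return gx, gy

gaze = gaze_on_screen((105, 100), (100, 100), 20, 1920, 1080)
```

Here a 5-pixel shift with a 20-pixel eye radius moves the gaze a quarter of the screen width to the right of centre.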

The fact that the box around the eye area does not move is the BIGGEST problem! This means that if your head moves at all, the computer won’t know that it was your head and not your eye, and it is very easy for your head to twitch by the radius of your eye. I will fix this using the power of template matching. Maybe. I still need to experiment. My plan is to take the initial box around the eye area, use it as a template, and make it stay fixed on the face, rather than on the webcam image. If that makes sense.

Once the eye box is fixed on the face, we can use that to tell us how the face moves, and the eye positions to tell us how the eyes move, and then all we need is some tasty maths!
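For reference, template matching at its simplest is a brute-force search for the position where the template differs least from the frame. The sketch below uses a sum of squared differences with NumPy; it is a slow illustration of the idea under those assumptions, not the matcher actually used in this project, and `match_template` is a hypothetical name.

```python
import numpy as np

def match_template(frame, template):
    """Return the (x, y) of the top-left corner where template fits best.

    Exhaustive search minimising the sum of squared differences (SSD).
    """
    fh, fw = frame.shape
    th, tw = template.shape
    t = template.astype(float)
    best_score, best_xy = np.inf, (0, 0)
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            window = frame[y:y + th, x:x + tw].astype(float)
            score = np.sum((window - t) ** 2)
            if score < best_score:
                best_score, best_xy = score, (x, y)
    return best_xy

rng = np.random.default_rng(0)
first_frame = rng.random((30, 40))
eye_box = first_frame[5:10, 7:13]            # template: rows 5-9, cols 7-12
# Simulate the head moving 2 px down and 3 px right.
next_frame = np.roll(first_frame, shift=(2, 3), axis=(0, 1))
found = match_template(next_frame, eye_box)
```

The difference between the matched position and the original box position gives the head movement, which can then be subtracted from the eye movement.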

Here is a video of my work so far…

http://www.youtube.com/watch?v=5v9S9zfnE8k&feature=plcp

There are 3 different sensitivity levels in this.

]]>

In other, less exciting news, I have been half-heartedly working on a function to find the local maxima of an image. This would be easy (look for a change in the sign of the gradient in both the x and y directions), but the image is full of disgusting noise, leading me to search for a more creative solution. Perhaps some pre-processing is needed? Or maybe some kind of… find a maximum and then ignore any nearby maxima. I could try filling in every change-of-gradient point, then convolving with a Gaussian, and then repeating. Who knows!
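The "find a maximum and then ignore any nearby maxima" idea can be sketched as a greedy search, assuming NumPy. The function name `find_peaks` and its parameters are made up for illustration, and the image may still need smoothing first.

```python
import numpy as np

def find_peaks(image, min_distance=5, max_peaks=10, threshold=0.0):
    """Greedy peak finder: take the global maximum, blank out its
    neighbourhood, and repeat. Returns a list of (x, y) positions."""
    work = image.astype(float).copy()
    peaks = []
    for _ in range(max_peaks):
        y, x = np.unravel_index(np.argmax(work), work.shape)
        if work[y, x] <= threshold:
            break  # nothing significant left
        peaks.append((int(x), int(y)))
        # Suppress everything within min_distance of this peak.
        y0, y1 = max(0, y - min_distance), y + min_distance + 1
        x0, x1 = max(0, x - min_distance), x + min_distance + 1
        work[y0:y1, x0:x1] = -np.inf
    return peaks

img = np.zeros((20, 20))
img[5, 5] = 10.0    # a real peak
img[5, 6] = 9.0     # noisy neighbour that should be ignored
img[15, 12] = 7.0   # a second, separate peak
peaks = find_peaks(img, min_distance=3)
```

The `min_distance` setting plays the role of "nearby": too small and noise produces duplicate peaks, too large and genuinely separate maxima get merged.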

]]>

First the paper introduced the concept of an isophote. This is simply a curve of constant darkness/intensity. So if our image is described by a function f(x,y), then the isophote through a point is the set of points (x,y) where f(x,y) = f0 (f0 is constant). This is essentially describing contours. It then goes on to use the curvature at each pixel, assuming it is on an isophote, to estimate the radius of the circle it is on. Treating the isophote as a curve y(x), the curvature is y''/(1 + y'^2)^(3/2), which is just the second derivative of y with respect to x wherever the curve is flat (y' = 0). A formula for this can be derived from the definition of an isophote using total derivatives (as f = a constant, we have y(x), so we must use the total derivative, paying attention to the chain rule). The radius of the circle that it is on is given by 1/curvature, which is very convenient. To convince yourself of this, take a circle of radius r centred on the origin, plug the values x = 0, y = r into the formula for curvature, and you should get 1/r (up to sign). Easy. So now we have the distance to the centre. The direction to the centre is given by the gradient at that point. All it takes is a little trigonometry, and we have an estimate!
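The derivation described above can be written out as follows. This is a sketch using standard implicit differentiation, with subscripts denoting partial derivatives of f; the notation is chosen here for illustration and is not necessarily the paper's.

```latex
% Isophote through a point: f(x, y(x)) = f_0, with f_0 constant.
\begin{align*}
\frac{d}{dx} f(x, y(x)) &= f_x + f_y\, y' = 0
  &&\Rightarrow\quad y' = -\frac{f_x}{f_y}, \\
\frac{d^2}{dx^2} f(x, y(x)) &= f_{xx} + 2 f_{xy}\, y' + f_{yy}\, y'^2 + f_y\, y'' = 0
  &&\Rightarrow\quad y'' = -\frac{f_y^2 f_{xx} - 2 f_x f_y f_{xy} + f_x^2 f_{yy}}{f_y^3}, \\
\kappa &= \frac{y''}{(1 + y'^2)^{3/2}}, \qquad
|\kappa| = \frac{\left| f_y^2 f_{xx} - 2 f_x f_y f_{xy} + f_x^2 f_{yy} \right|}{(f_x^2 + f_y^2)^{3/2}}.
\end{align*}
% Check with f = x^2 + y^2 at (0, r): f_x = 0, f_y = 2r, f_xx = f_yy = 2, f_xy = 0
% gives |kappa| = 8 r^2 / (8 r^3) = 1/r, so the radius is 1/|kappa| = r.
```

The sign of the curvature depends on whether the circle is darker or lighter than its surroundings, which is why the magnitude is used for the radius.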

So here is a pretty picture of it at work:

It is, granted, imperfect. However I think that is more due to me using bitmaps, and the image when you zoom in being somewhat more square than it should be for a circle. When I get some proper image processing software, I foresee this method getting better.

Here I have mapped every estimate in blue onto an image of an eye. The paper suggested convolving with the Gaussian functions (Gaussian blur?) so you can see that result here as well:

Not toooo bad, although far from perfect. There is a definite cluster around the eye, but also some clusters around random other bits of skin.

One last note. The paper suggested that the curvature of the pixel is used to determine how much emphasis is put on that estimate (i.e. to weight each estimate). The formula, for an image of intensity I, is sqrt(Ixx^2 + 2Ixy^2 + Iyy^2) (Ixx is the second partial derivative with respect to x, and so on). This is just larger for smaller circles and smaller for larger circles, so smaller circles would have more effect. I am right now unsure about this. More ideas will follow (lol).
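A minimal sketch of computing that weight for every pixel, assuming NumPy and using simple finite differences for the derivatives (the paper may well use smoothed derivatives instead). The function name `curvedness` is a label chosen for this example.

```python
import numpy as np

def curvedness(image):
    """Per-pixel weight sqrt(Ixx^2 + 2*Ixy^2 + Iyy^2) from finite differences."""
    I = image.astype(float)
    # np.gradient returns derivatives along axis 0 (y) then axis 1 (x).
    Iy, Ix = np.gradient(I)
    Ixx = np.gradient(Ix, axis=1)
    Ixy = np.gradient(Ix, axis=0)
    Iyy = np.gradient(Iy, axis=0)
    return np.sqrt(Ixx**2 + 2.0 * Ixy**2 + Iyy**2)

# Sanity check on a smooth bowl I = x^2 + y^2, where Ixx = Iyy = 2 and Ixy = 0.
ys, xs = np.mgrid[0:20, 0:20]
bowl = (xs**2 + ys**2).astype(float)
weights = curvedness(bowl)
```

Away from the image border the weight for this bowl is sqrt(8) everywhere; near the border the one-sided differences are less accurate, so those pixels are best ignored.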

Over and out.

]]>

I have also altered the shape of the Gaussian function to more of an oval shape to accommodate eye shapes. This could be good, this could be bad. I have not tested it yet. The following is using an oval shape. I have also noticed that the picture gets darker with more spread out Gaussian functions…. I think I need to moderate the height so as to have a constant area under the function. Integration awaits me!
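One way to keep the brightness constant, sketched here with NumPy, is to divide the sampled kernel by its sum, so the discrete "area" is always 1 however wide the oval is. The function name `oval_gaussian` and the parameter names are illustrative assumptions.

```python
import numpy as np

def oval_gaussian(size, sigma_x, sigma_y):
    """Elliptical Gaussian kernel, normalised so its entries sum to 1.

    size: odd kernel width/height in pixels.
    sigma_x, sigma_y: spread along x and y (unequal values give an oval).
    """
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-(xs**2 / (2.0 * sigma_x**2) + ys**2 / (2.0 * sigma_y**2)))
    # Normalising removes the darkening seen with wider kernels.
    return kernel / kernel.sum()

narrow = oval_gaussian(21, sigma_x=4.0, sigma_y=2.0)
wide = oval_gaussian(41, sigma_x=8.0, sigma_y=4.0)
```

Both kernels sum to 1, so the wider one simply has a lower peak rather than changing the overall brightness of the result.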

Original image:

And now after processing, the left is using the original “Joelet” method, the right is using Gaussian blur:

A little different, but I think the average number of eyes detected would be the same.

Great! This would make detecting bigger eyes much more efficient. Although I am still relying on the user inputting an approximate eye size, when I find a fairly optimum Gaussian function shape I think I could probably do away with that.

]]>

The worst part is that I’ve seen *MY CHAIR JUST DISINTEGRATED UNDERNEATH ME!!* Alex’s method implemented in MATLAB code, but I am such a MATLAB noob that I have a lot of trouble following it. In the near future I might just sit down and try extremely hard to make it work, using the MATLAB code and everything.

In other news, I have been trying to do a literature review on circle finding techniques (my first ever lit review!). It seems to me there are a few main techniques.

- Hough Transform. This was originally used for detecting straight lines, but has been modified to detect any number of shapes, including circles.
- Wavelets. Wavelets are the basis for how JPEG 2000 works (ordinary JPEGs use a discrete cosine transform instead), and are very powerful tools for looking at a picture at different resolutions. In particular, there is a modification called “circlets” which can be used to track down circles.
- Other, misc. Lots of them similar to mine, in which some estimate is made of the centre of each pixel or group of pixels.
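To give a flavour of the first technique, here is a minimal circle Hough transform for a single known radius, assuming NumPy and a binary edge image. Every edge pixel votes for all the centres it could belong to, and the centre with the most votes wins. The function name `hough_circle` is hypothetical, and real implementations also search over the radius.

```python
import numpy as np

def hough_circle(edges, radius, n_angles=90):
    """Vote for circle centres of a known radius.

    edges: 2D array, non-zero where an edge pixel was detected.
    Returns ((cx, cy), accumulator).
    """
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=int)
    ys, xs = np.nonzero(edges)
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        # Each edge pixel votes for the centre lying `radius` away at this angle.
        cx = np.rint(xs - radius * np.cos(theta)).astype(int)
        cy = np.rint(ys - radius * np.sin(theta)).astype(int)
        ok = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    best_y, best_x = np.unravel_index(np.argmax(acc), acc.shape)
    return (int(best_x), int(best_y)), acc

# Synthetic edge image: a circle of radius 8 centred at x=20, y=15.
edges = np.zeros((40, 40), dtype=np.uint8)
for a in np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False):
    edges[int(round(15 + 8 * np.sin(a))), int(round(20 + 8 * np.cos(a)))] = 1
centre, votes = hough_circle(edges, radius=8)
```

Because of pixel rounding the detected centre may land a pixel away from the true one, which is one reason the accumulator is often blurred before taking the maximum.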

I also read an interesting paper about the relationship between head pose and eye gaze, and in particular, how to tell if someone is surprised by something they looked at, or whether they always intended to look at it. Essentially it showed that if a person looks with their eyes, and then turns their head, they are startled. But if they move their head a bit before their eyes, then they intended to look over there.

Fascinating stuff.

]]>

Previously (see Success!) I implemented a method of my own design, but noted a limitation: I needed to either input the approximate eye radius, or have massive computing power to deal with very large accidental lengths. This time I was inspired by a paper that my supervisor sent to me, in which they use a Gaussian blur effect to turn lots of estimates of a point into one estimate. In fact, it is not truly “inspired” by the paper, but more taken from the paper. Exactly.

So Gaussian blur (as far as I can make out) works by convolving the Gaussian function (essentially a single hump) with the image. This smooths out the image.

I have attempted to apply this to my method. But first I need a FFT library. The following are my installation woes:

OpenCV (http://opencv.willowgarage.com/wiki/) and FFTW (http://www.fftw.org/) are C/C++ libraries. OpenCV is for image processing, and FFTW is used to perform Fast Fourier Transforms. They would be great, and would really complement my project. However it turns out that it’s somewhat impossible to install and use them with any Windows program. I have literally wasted days of my life trying to get them to work with either Microsoft Visual C++ 2010 or NetBeans. In the end I gave up.

If anyone knows of any good (simple!) tutorials, could they possibly let me know? I think I will have to build OpenCV from the source code, but I really don’t know how I would go about that. FFTW requires me to use some lib.exe program from Visual C++ to add the libraries. But it seems to crash every time I run it.

In the meantime I am using a C program for FFT I found at http://www.tech.dmu.ac.uk/~eg/tensiometer/fft/. It appears to work… However there are some interesting results when I use it to perform a Gaussian Blur.

Here are my Gaussian Blur attempts:

Blurs to:

First thing to note, the FFT program only accepts multiples of 2, so I had to stretch the image into a square frame. Not such a big deal as I can always stretch it back afterwards.

Second thing to note. WTF is it shuffled??? It appears to have been quartered and then shuffled. I can only assume this is a fault with the FFT library, but I won’t know until I can test the same method out with a different FFT library.

Third thing to note, the Gaussian Blur worked! This is my first program using any FFTs, so I am very proud of this.
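For anyone hitting the same quartered-and-shuffled output: one common cause (an assumption here, not something confirmed for the FFT program used above) is that the Gaussian hump sits in the middle of the kernel array, whereas an FFT-based convolution treats index (0, 0) as the kernel's centre. The sketch below uses NumPy's FFT rather than that C program, and the helper names are made up for illustration.

```python
import numpy as np

def gaussian_kernel(n, sigma):
    """n x n Gaussian with its hump in the middle of the array, sum 1."""
    ax = np.arange(n) - n // 2
    xs, ys = np.meshgrid(ax, ax)
    k = np.exp(-(xs**2 + ys**2) / (2.0 * sigma**2))
    return k / k.sum()

def fft_blur(image, kernel):
    """Circular convolution via the convolution theorem."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

n = 64
image = np.zeros((n, n))
image[10:20, 10:20] = 1.0                 # a white square near one corner
kernel = gaussian_kernel(n, sigma=3.0)

# Kernel centred in the middle of the array: the output quadrants are swapped.
shuffled = fft_blur(image, kernel)
# Moving the hump to index (0, 0) first gives the blur in the right place.
fixed = fft_blur(image, np.fft.ifftshift(kernel))
```

In this sketch `shuffled` is exactly `fixed` with its quadrants swapped, so applying `np.fft.fftshift` to the output is an equivalent fix.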

]]>

As you can see… not quite what I had in mind. Although looking closely at this, it seems to be outlining white lines quite a lot. I should investigate that….

Nope! Switching the black to white makes no noticeable difference. There must be a (quite large) bug somewhere in the program. Perhaps one day I will find it. I can but hope.

]]>