For a recent project, I wanted to be able to estimate the distance and direction to a known target using a webcam. I decided to use SimpleCV, a Python computer vision library, to process the images. I wanted to use a Raspberry Pi along with a Raspberry Pi camera module so that the entire project could be mobile; the Raspberry Pi is somewhat underpowered, so I spent a lot of time trying to optimize my code to get an acceptable framerate. The full code is available on GitHub.
To start, I needed a target that would be easily and unambiguously recognizable. SimpleCV has some primitives to recognize specific colors and shapes (including circles, squares and rectangles), so I decided to put a high contrast circle against a contrasting background. I decided on the circle because it shouldn’t be affected by roll of either the camera or the target. I tried a few different things, but ended up covering a CD in bright orange duct tape and attaching it to some black posterboard. The CD worked well because it has a well known and specific diameter.
To calculate the distance to the target, we can use some trigonometry as long as we know the angle to the target and the length of some object. CDs have a diameter of 120mm, so that part’s easy. We can find the angle by calculating the radius in pixels of the identified CD in the image, dividing that number by the width of the image, and then multiplying the ratio by the horizontal field of view of the camera. According to fastmapper in the Raspberry Pi forums, the Raspberry Pi camera module’s HFOV is 53.50±0.13 degrees. Using this information we can calculate the distance by using a tangent calculation.
Raspberry Pi camera
The Raspberry Pi camera module doesn’t show up as a Video4Linux device, so accessing it through SimpleCV isn’t as easy as other webcams. There are some projects to get the module to appear as a Video4Linux device, but they looked more convoluted than I wanted. I instead used the picamera module to access the Raspberry Pi camera and saved the data directly into an io.BytesIO buffer and used PIL to open the image.
camera = picamera.PiCamera() camera.resolution = (320, 240) camera.start_preview() # Start the camera with io.BytesIO() as byte_buffer: byte_buffer = io.BytesIO() camera.capture(byte_buffer, format='jpeg', use_video_port=True) byte_buffer.seek(0) image = SimpleCV.Image(PIL.Image.open(byte_buffer))
The use_video_port takes pictures faster, but the individual pictures come out more grainy. Without it, I was getting less than one picture per second. It’s worth noting that I’m encoding the image to JPG format and then immediately decoding it, which is wasteful and unecessary; it should be possible to use the raw data and avoid the transcoding, but I haven’t figured out how to do that yet.
To process the image, I needed to find circular orange blobs in the picture. SimpleCV has a method colorDistance that transforms an image into a grayscale image that represents each pixel’s distance from a particular color. This would work well for our bright orange CD if I could control the lighting at all times, but I plan on using the target outside, and any time the target enters a shadow, the color would change. Instead, I used hueDistance which uses the hue of a color and is less susceptible to changing lighting conditions.
One problem with just using the hue is that black is generally a constant distance from any particular hue, and I was having trouble separating the black background from the orange CD. To work around this, I used colorDistance to find a black mask that I could ignore and then only run image processing on the masked portion.
SimpleCV has some functions to detect blobs in the image and then determine if the blobs are circles. Once any circles are found, it can also estimate the radius of the circle to subpixel accuracy.
black_mask = image.colorDistance(SimpleCV.Color.BLACK).binarize() distance = image.hueDistance(ORANGE_DUCT_TAPE_RGB).invert() blobs = (distance - black_mask).findBlobs() if blobs is not None: circles = [b for b in blobs if b.isCircle(0.25)] if len(circles) == 0: return None # The circles should already be sorted by area ascending, so grab the biggest one cd = circles[-1]
Using this setup with 320×240 resolution, I was only able to get about 1 frame per second. There were a few things that I tried to speed everything up. First, I knew that the blobs I was interested in were always in a particular pixel size range, so it wasn’t worth calling isCircle on ones that were too small or too large. The findBlobs function takes optional minsize and maxsize parameters, so invalid blobs can be filtered out early. This made a small improvement, but it wasn’t consistently better because the image doesn’t always have blobs that can be filtered out.
The next thing I tried was cropping the image before doing any processing. For my application, the target is always going to be in the middle half of the image, so cropping the image before doing any processing should improve performance a lot. Unfortunately, I ran into some problems where cropping the image before calling colorDistance would produce clearly incorrect results. It’s worth noting that cropping the image before calling hueDistance worked fine. I don’t know if this is a bug in SimpleCV or in my code. To work around this, I just called colorDistance and cropped the result, and then did the rest of the processing from there. With these optimizations, I was able to about double the framerate.
Using the nominal values for the CD diameter and the camera’s HFOV, I calculated the distance to the target at a known distance of 1.155m 10 times. The total framerate was 2.04 frames per second and the readings averaged out to 1.298m with a standard deviation of 0.00134m. The precision was surprisingly high, even if the accuracy was not. I believe I read that the field of view changes when using the use_video_port option, but I can’t find if or where I read that, so I’m not going to mess with that value. I was concerned that maybe the outline of the CD was bleeding into the background, so I tried changing the the diameter to 110mm and got readings of 1.157m. I repeated this at a distance of 2m and 0.5m and got readings of 2.000m with standard deviation of 0.00554m and 0.5 and 0.496m with 0.000199m standard deviation. It’s worth noting that at 2m, the framerate dropped to 1.40 FPS; I’m not sure why that is. Considering how precise these measurements were, it might be worth reducing the resolution even more in order to increase the framerate.
I did notice that when I reduced the lighting (the above tests were done with ample lighting), the detected radius of the CD dropped slightly which affected the distance calculations. It might be possible to adjust the expected CD radius in response to changing lighting conditions, but the overall error rate in lower light conditions was only a few centimeters, so I didn’t try it.