Wednesday, April 25, 2012

Fast Object Tracking in Python using OpenCV

Machine-vision competitions are being held in and around various engineering colleges, so I have decided to write a short tutorial on building a small image processing application in Python.


It tracks an object based on its color. A lot of people think of MATLAB when they hear anything remotely related to image processing. However, there are many other ways of performing image processing on a PC.


Some knowledge of Python is helpful before going into the details. It doesn't take a lot of time, and you may learn the basics from my previous post (one of the most well spent moments of your life).


For this tutorial we are going to use the Python programming language, which is actually very simple to program in, even for a beginner (which I am). Now coming to the image processing part: Python alone is not capable of performing everything. We need libraries that can handle image data, and for that purpose I prefer OpenCV, an open source image processing and machine vision library.


So, the first step is setting up the software, which is a simple two-step process repeated thrice (Download >> Install).


Download Python 2.7.2 and install it (nothing to say here, I guess). Then download the Numpy module for Python 2.7. Remember that Python modules are version specific, so be sure to choose the module meant for your version of Python; in this discussion we are going to use Python 2.7. Numpy is a Python module required for fast numerical manipulations on large arrays, image data in our case (more on Numpy soon). Install Numpy once it is downloaded.


After installing Numpy, download and install OpenCV 2.3. It ends up in the "opencv" folder on the local drive (C:\ in some cases). Copy the "cv.py" file from C:\opencv\modules\python\src2 into the Python installation folder, typically C:\Python27\Lib\site-packages\ . Once done, you should have the stage set for image processing in Python.
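
Before moving on, it may be worth checking that Python can actually see what was just installed. The following is only a minimal sketch: the helper name module_available is my own and is not part of Python, Numpy or OpenCV, and the module names "numpy" and "cv" assume the setup described above.

```python
import sys

def module_available(name):
    """Return True if the module called `name` can be imported."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False

# Report the interpreter version and which modules are importable.
print("Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
for name in ("numpy", "cv"):
    print("%s installed: %s" % (name, module_available(name)))
```

If either line reports False, the usual suspects are a module built for a different Python version or a "cv.py" that was copied to the wrong folder.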


Image processing in Python is simple compared to other popular languages like C, C++ and Java. I prefer Python over the others as it's very simple to understand and is programmer friendly, unlike C and C++.


From here on I assume you know how to program in Python (if not, visit my previous post).


Start up the Python IDLE from the desktop icon. You get a command shell; this is the Python shell, where you can enter commands just like in MATLAB (or the command prompt). The commands entered here are executed immediately.


Before using the image processing functions, we need to import the opencv module that we previously installed and copied. This brings all the functions in the opencv module into scope.


("#" usually precedes a comment in Python)
#imports cv module into the python environment
>>>import cv


The cv.LoadImage("file path") function is used to load an image in Python. Use '\\' instead of '\' in the path ('\' is Python's escape character, so '\\' stands for a single literal '\'). Here the function loads the image data into the variable 'img'.
>>>img=cv.LoadImage("C:\\Users\\Public\\Pictures\\Sample Pictures\\Koala.jpg")
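
The escaping rule is easy to check for yourself. Here is a small sketch comparing the escaped path with a raw string (the r"..." prefix), which is a standard Python alternative that this tutorial doesn't otherwise use. The variable names are only for illustration.

```python
# Two ways to write the same Windows path in Python.
escaped = "C:\\Users\\Public\\Pictures\\Sample Pictures\\Koala.jpg"
raw = r"C:\Users\Public\Pictures\Sample Pictures\Koala.jpg"

# Both spellings give the same string, with single backslashes in it.
same = (escaped == raw)
print(same)
print(escaped)
```

Either form can be passed to cv.LoadImage(); pick whichever you find easier to read.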


To show that image, we need a window. So we create a window first using the function NamedWindow("window name",mode), where mode=0 gives a normal window that the user can resize and mode=1 (cv.CV_WINDOW_AUTOSIZE) sizes the window to fit the image.
>>>cv.NamedWindow("Hello World",0) 


Tip: While typing a command, hit 'Ctrl + Space' to see an auto completion box. It gives a list of all the functions and modules loaded. Makes a programmer's life (very precious) heaven.


So we have created a window. Now to show the image in it, we use a function called ShowImage("window name",image_variable). "window name" is the name of the window that we just created ("Hello World") and image_variable is img in our case.


But the problem here is that the window we created has to be refreshed often using the ShowImage function. So we create an infinite loop that displays the image until the user hits 'Escape'. We use the WaitKey function to take input from the user. This is also a part of the opencv module. It makes the program wait for the time given in the parentheses (in ms) before continuing to execute the code that follows it.
## infinite loop
>>>while True:
        cv.ShowImage("Hello World",img)
        k=cv.WaitKey(10) #Waits for 10ms
        if k==27:
            break


Here the image is refreshed every 10ms. The first line is the typical while statement, with the 'True' condition making it execute indefinitely. cv.ShowImage() refreshes the "Hello World" window with the image 'img'. cv.WaitKey() waits 10ms for user input and returns the ASCII value of the key pressed. It is stored in 'k', and if the value of 'k' is 27 (the ASCII value of the 'Escape' key) the once infinite while loop breaks and the program ends.
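
The key check can also be pulled out into a small helper. This is just a sketch and runs without OpenCV: the name is_escape is my own, and the '& 0xFF' mask is an assumption for systems where the key code comes back with extra bits set above the low byte. A negative value is treated as "no key pressed".

```python
ESC = 27  # ASCII value of the Escape key

def is_escape(key):
    """Return True if a WaitKey-style return value means Escape."""
    if key < 0:
        # no key was pressed within the waiting time
        return False
    # compare only the low byte, which carries the ASCII value
    return (key & 0xFF) == ESC

print(ord("\x1b"))    # ASCII value of the Escape character
print(is_escape(27))  # Escape pressed
print(is_escape(-1))  # nothing pressed
```

With this in place, the test inside the loop would read 'if is_escape(k):' instead of 'if k==27:'.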


Note: The program ends but the window will not close, as Python doesn't do that by default. We need to add one more command at the end to destroy the window we created: cv.DestroyAllWindows(). Use this command to close the window instead of forcibly trying to close it. Always enter this command at the end of the program.
>>>cv.DestroyAllWindows()


Tip: Unlike C, Python doesn't require a semicolon (;) at the end of a statement. But you may use it to separate multiple statements written on a single line.
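
A tiny illustration of the semicolon tip (the variable names are arbitrary):

```python
# Three statements on one line, separated by semicolons.
a = 1; b = 2; total = a + b
print(total)
```

It works, but one statement per line is usually easier to read.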


You may enter all the commands into a new .py file (hit 'Ctrl + N' in the Python window to open a Python editor). Save and run the file to execute all commands sequentially.
Remember to put in the cv.DestroyAllWindows() at the end.




Python Editor with all the commands and the resulting window

I know it's boring to read and display an image from the hard drive. So, make sure you have a camera attached to the PC (the inbuilt camera will do in the case of laptops). Make sure its drivers are installed and the camera is working properly before running the program.
Download AMCap, a small application to test and configure your camera settings.


Once we have your camera working, we need to access it from Python. The OpenCV library once again comes to our rescue by providing a function to access the camera: cv.CaptureFromCAM(camera_id). It takes the camera hardware id number; the numbering starts with the inbuilt camera as '0' and goes on from there. It returns a capture object through which we may capture frames from the camera.
>>>capture=cv.CaptureFromCAM(0) # '0' For the default inbuilt Camera


The above function not only creates a capture object but also turns on the camera. Now, to capture image frames from the camera, we have a function called cv.QueryFrame(capture_obj). It takes a capture object associated with a camera and returns an image.
>>>img=cv.QueryFrame(capture)
>>>cv.ShowImage("cam preview",img) #to show the image.


Note: Sometimes when the camera fails, the capture object can't be created and hence cv.CaptureFromCAM() returns a '0'. The program will raise an error in that case. So we make sure that images are queried if and only if the capture object was created successfully (capture !=0).


The overall code would be:

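import cv

# '0' selects the default inbuilt camera
capture=cv.CaptureFromCAM(0)

# query frames only if the capture object was created
while capture!=0:
    img=cv.QueryFrame(capture)  # grab the next frame
    cv.ShowImage("cam",img)     # refresh the preview window
    k=cv.WaitKey(100)           # wait 100ms for a key press
    if k==27:                   # 27 is the Escape key
        break

# close the preview window
cv.DestroyAllWindows()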
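
The same loop can also be written as a function that takes the camera and display calls as arguments, which lets you try the logic without a camera attached. This is only a sketch under assumptions: preview_loop and the stand-in functions below are hypothetical names of mine, and the code assumes the frame query hands back None when no frame is available.

```python
def preview_loop(query_frame, show, wait_key, delay_ms=100):
    """Show frames until Escape is pressed or no frame arrives.

    Returns the number of frames that were shown.
    """
    shown = 0
    while True:
        frame = query_frame()
        if frame is None:  # assumed signal for "no frame available"
            break
        show(frame)
        shown += 1
        if wait_key(delay_ms) == 27:  # 27 is the Escape key
            break
    return shown

# Stand-ins so the sketch runs without a camera or OpenCV.
frames = iter(["frame1", "frame2", "frame3", None])
keys = iter([-1, -1, 27])  # no key, no key, then Escape
count = preview_loop(lambda: next(frames),
                     lambda frame: None,
                     lambda ms: next(keys))
print(count)
```

With the cv module loaded and a capture object in hand, the call would look something like preview_loop(lambda: cv.QueryFrame(capture), lambda f: cv.ShowImage("cam", f), cv.WaitKey), followed by cv.DestroyAllWindows().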


I am tired. I guess you too are drooling over the keyboard.

If not, Thank You. That's it for now. This happens to be PART 1 of the Tutorial post.
See you in PART 2.

Do comment if you need any additional info. A 'like' on this page would definitely make me happy.

3 comments:

  1. Hi,

    I am doing a project using object tracking. I am learning how to use OpenCV. You instructions are very clear. Do you have the continutes instructions on how to detect moving object? Thanks!

  2. pls can some one post or mail me source code for eye tracking and hand gesture in python. i m trying for long but doesnt seem to work ? :(

    Replies
    1. Odd that I was just discussing with my friend about the very thing, Hand Gesture Recognition. He implemented that in matlab and trained an ANN to classify the gestures using spatial moment features. I sure can help you out in coding in python if you have a sense of the approach that you want to take for either of the tasks you mentioned in the comment.
