On this page you will find the programming assignments for the Advanced Multimedia Computing course at the University of Amsterdam.
These assignments are industry-oriented and are meant to let the student deal with computer vision problems that come from real situations.
All assignments must be solved using the open-source OpenCV library.
- Experience with C++ and related IDEs is required (if you don't have an IDE, Visual C++ 2008 Express Edition is freely available from Microsoft).
- It is not possible to install OpenCV in the labs. However, you can install it somewhere else and then copy the headers, libraries and DLLs to use them in the lab.
- You may experience codec problems while reading videos. On Windows, install the K-Lite Mega Codec Pack for an extensive list of codecs.
- Please use OpenCV v1.0. OpenCV 1.1pre1 has a bug when recording videos; if you decide to use that version, you will need to replace this DLL in your installation (which can in turn create some problems for video input).
- A webcam is not necessary, but it is useful for live experimentation
- The maximum team size is 2
- Students must deliver a video with the results of their program, together with well-commented code.
- The output videos do not need to be 100% perfect to be considered valid, so don't worry too much about mistakes!!!
- No reports are required.
The final grade depends on the percentage of delivered working assignments: 60% will be a 6, 100% a 10. Non-working assignments are weighted by difficulty and progress (in other words: I decide).
It is strongly recommended to go through the samples included in the OpenCV library directory. The following is a useful list of resources for OpenCV:
- OpenCV Wiki: latest documentation, HowTos, etc.
- Seeing with OpenCV: nice tutorials, with some examples useful for the assignments.
- Introduction to OpenCV: useful as a quick reference for common operations.
- Yahoo Group: to ask questions and find solutions to OpenCV problems.
Assignment 1: Set up and running (suggested deadline: April 14th)
This is a tutorial task: it might be boring, but it is a good exercise in dealing with pointers, video input/output and simple OpenCV operations. It will give you an overview of how to use the library before you start implementing nice stuff :) .
Iteratively load frames from a video (.avi) or from a webcam (for a video, make it loop when it is over). For each frame:
- Convert the frame to grayscale.
- Scale the frame down to 1/8 of its size.
- Blur the original frame and perform some morphological operation on it.
- Copy the scaled frame to one of the corners of the grayscale image (the scaled frame is still RGB).
- Write your name at the bottom of the frame, surrounded by a filled rectangle.
- Open an output window and place a trackbar on it that is linked to the blurring parameter (1, 3 or 5).
- Display the resulting frame in the output window.
- While manually changing the blurring parameter on the trackbar, record an output video using the recording functions.
For many of the I/O functions you will need to use functions from the highgui library. Make sure that you are releasing all the structures correctly (OpenCV is famous for memory leaks) by checking the allocated memory every few minutes.
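Two bits of arithmetic in the steps above are easy to get wrong: a trackbar position starts at 0 while the blur size must be 1, 3 or 5, and the thumbnail has to land exactly in a corner. The following is a minimal sketch in plain C++ (no OpenCV calls). The names `Roi`, `blurSizeFromTrackbar` and `thumbnailRoi` are hypothetical, and it assumes that "1/8 of its size" means 1/8 of both width and height and that the top-right corner is used.

```cpp
#include <algorithm>

// Hypothetical helper type, for illustration only.
struct Roi {
    int x, y, width, height;
};

// Map a trackbar position (0, 1, 2) to an odd blur size (1, 3, 5).
// Positions outside that range are clamped.
int blurSizeFromTrackbar(int position) {
    int clamped = std::max(0, std::min(position, 2));
    return 2 * clamped + 1;
}

// Rectangle in the top-right corner that receives the thumbnail.
// Assumption: the frame is scaled by 1/divisor in each dimension.
Roi thumbnailRoi(int frameWidth, int frameHeight, int divisor = 8) {
    Roi roi;
    roi.width = frameWidth / divisor;
    roi.height = frameHeight / divisor;
    roi.x = frameWidth - roi.width;  // flush against the right border
    roi.y = 0;                       // flush against the top border
    return roi;
}
```

In an OpenCV 1.0 program, a rectangle like this would typically be set as the region of interest of the destination image (for example with `cvSetImageROI`) before copying the thumbnail into it.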
Assignment 2: Hotspot detector (suggested deadline: April 21st)
Download the movie at http://staff.science.uva.nl/~rvalenti/downloads/teaching/hand.avi
Implement a hotspot detector and write on the image which book is being touched. Optional: implement a "visual piano" and use your webcam to play it.
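One possible way to decide whether a hotspot is being touched is to compare the current frame against a background frame inside the hotspot rectangle only. The sketch below is plain C++ on raw 8-bit grayscale buffers. It is an illustration under stated assumptions, not the required method: the `Hotspot` type, the function names and both thresholds are hypothetical, and a static background image is assumed to be available.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical type: a rectangular hotspot in pixel coordinates.
struct Hotspot {
    int x, y, width, height;
};

// Fraction of pixels inside the hotspot whose gray value differs from the
// background by more than diffThreshold. Both images are 8-bit grayscale,
// stored row by row with imageWidth pixels per row.
double changedFraction(const std::vector<unsigned char>& background,
                       const std::vector<unsigned char>& frame,
                       int imageWidth, const Hotspot& spot,
                       int diffThreshold) {
    int changed = 0;
    for (int row = spot.y; row < spot.y + spot.height; ++row) {
        for (int col = spot.x; col < spot.x + spot.width; ++col) {
            std::size_t i = static_cast<std::size_t>(row) * imageWidth + col;
            int diff = std::abs(int(frame[i]) - int(background[i]));
            if (diff > diffThreshold) ++changed;
        }
    }
    int total = spot.width * spot.height;
    return total > 0 ? double(changed) / total : 0.0;
}

// A hotspot counts as touched when enough of its pixels have changed.
bool isTouched(const std::vector<unsigned char>& background,
               const std::vector<unsigned char>& frame,
               int imageWidth, const Hotspot& spot,
               int diffThreshold, double minFraction) {
    return changedFraction(background, frame, imageWidth, spot,
                           diffThreshold) >= minFraction;
}
```

Requiring a minimum fraction of changed pixels, rather than any single changed pixel, makes the decision less sensitive to image noise.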
Assignment 3: Face counting (suggested deadline: May 4th)
Download the movies at http://staff.science.uva.nl/~rvalenti/downloads/teaching/face_counting1.avi
Count the number of people that appear in the video. Use some heuristics to avoid counting the same person twice. Show the results in real time on the video.
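One simple heuristic against double-counting is to treat a detection as a new person only if no recently seen face lies close to it. The sketch below shows this in plain C++, taking face centers as input (the face detector itself is not shown). The class `FaceCounter`, its parameters and the nearest-center matching rule are assumptions made for illustration; other heuristics are equally valid.

```cpp
#include <cstddef>
#include <vector>

struct Point {
    double x, y;
};

// Hypothetical counter: matches each detection to the nearest known face.
class FaceCounter {
public:
    FaceCounter(double maxDistance, int maxMissedFrames)
        : maxDistance_(maxDistance), maxMissed_(maxMissedFrames), total_(0) {}

    // Feed the face centers of one frame; returns the running total.
    int update(const std::vector<Point>& detections) {
        for (std::size_t t = 0; t < tracks_.size(); ++t) ++tracks_[t].missed;

        for (std::size_t d = 0; d < detections.size(); ++d) {
            int best = -1;
            double bestDist2 = maxDistance_ * maxDistance_;
            for (std::size_t t = 0; t < tracks_.size(); ++t) {
                if (tracks_[t].missed == 0) continue;  // already used this frame
                double dx = tracks_[t].center.x - detections[d].x;
                double dy = tracks_[t].center.y - detections[d].y;
                double dist2 = dx * dx + dy * dy;
                if (dist2 <= bestDist2) {
                    best = static_cast<int>(t);
                    bestDist2 = dist2;
                }
            }
            if (best >= 0) {
                // Same person as before: update the position, do not count.
                tracks_[best].center = detections[d];
                tracks_[best].missed = 0;
            } else {
                // Nobody nearby: count a new person.
                Track fresh = {detections[d], 0};
                tracks_.push_back(fresh);
                ++total_;
            }
        }

        // Forget faces that have not been seen for too many frames.
        std::vector<Track> kept;
        for (std::size_t t = 0; t < tracks_.size(); ++t)
            if (tracks_[t].missed <= maxMissed_) kept.push_back(tracks_[t]);
        tracks_.swap(kept);
        return total_;
    }

private:
    struct Track {
        Point center;
        int missed;  // frames since this face was last matched
    };
    double maxDistance_;
    int maxMissed_;
    int total_;
    std::vector<Track> tracks_;
};
```

Keeping a face alive for a few missed frames helps when the detector briefly loses it, at the cost of possibly merging two different people who appear at the same spot in quick succession.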
Assignment 4: Car tracking (suggested deadline: May 14th)
Download the movie at http://staff.science.uva.nl/~rvalenti/downloads/teaching/cars.avi
Count how many cars pass on the left and on the right lane, independently. Track their location (optional: display an estimate of the speed of each car). Count how many cars on the right lane turn right at the crossing, and display the information on the image.
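Once each car is tracked as a centroid per frame, counting and speed estimation reduce to a little geometry. The sketch below is plain C++ and rests on several assumptions that may not hold for the actual video: cars are counted when they cross a horizontal line `y = lineY`, the two lanes are separated by a vertical line `x = laneSplitX`, and a single `metersPerPixel` scale is valid across the image (perspective is ignored). All names are hypothetical.

```cpp
#include <cmath>

struct Point {
    double x, y;
};

struct LaneCounts {
    int left;
    int right;
};

// True when the centroid moved across the line y = lineY between two frames.
bool crossedLine(const Point& previous, const Point& current, double lineY) {
    return (previous.y < lineY) != (current.y < lineY);
}

// Increment the counter of the lane in which the crossing happened.
void countCrossing(const Point& previous, const Point& current,
                   double lineY, double laneSplitX, LaneCounts& counts) {
    if (!crossedLine(previous, current, lineY)) return;
    if (current.x < laneSplitX) {
        ++counts.left;
    } else {
        ++counts.right;
    }
}

// Rough speed estimate from the displacement between consecutive frames.
double speedKmh(const Point& previous, const Point& current,
                double metersPerPixel, double framesPerSecond) {
    double dx = current.x - previous.x;
    double dy = current.y - previous.y;
    double pixels = std::sqrt(dx * dx + dy * dy);
    return pixels * metersPerPixel * framesPerSecond * 3.6;  // m/s to km/h
}
```

Counting on a line crossing rather than on mere presence in a region means each car is counted once, as long as its track is not lost near the line.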
Assignment 5: A simple OCR (Optional)
Let's implement a simple barcode reader/OCR:
Download an image of a bar code at http://staff.science.uva.nl/~rvalenti/downloads/teaching/upc.jpg.
- Use template matching (cvMatchTemplate) to detect the human readable numbers (you can use the numbers on the same image as template).
- Read the bar code (use http://www.howstuffworks.com/upc.htm).
- Validate the barcode using the check digit.
- If correct: highlight the digits that the OCR recognized correctly.
- If wrong: use the OCR digits to try to "patch" the code so that the check digit is valid.
- Display the final recognized code on the image, using a different color for the digits that were wrongly recognized by the OCR.
- Optional (very challenging): Record a database of a few products, scan them using a webcam and write their name on the image once recognized. Try to deal with scale and tilt!
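The validation and patching steps above can be sketched in plain C++ as follows, assuming the image contains a 12-digit UPC-A code. The rule used is the standard UPC-A check: three times the sum of the digits in odd positions plus the sum of the digits in even positions (check digit included) must be a multiple of 10. The function names and the single-digit patching strategy are hypothetical illustrations, not a prescribed solution.

```cpp
#include <cstddef>
#include <string>

// True when the 12-digit string passes the UPC-A check digit test.
bool isValidUpcA(const std::string& code) {
    if (code.size() != 12) return false;
    int sum = 0;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i] < '0' || code[i] > '9') return false;
        int digit = code[i] - '0';
        // Index 0 is position 1, so even indices are the odd positions.
        sum += (i % 2 == 0) ? 3 * digit : digit;
    }
    return sum % 10 == 0;
}

// Try to repair an invalid bar reading using the OCR reading: replace one
// differing digit at a time and return the first candidate that validates.
// Returns an empty string when no single-digit patch works.
std::string patchWithOcr(const std::string& scanned, const std::string& ocr) {
    if (isValidUpcA(scanned)) return scanned;
    if (scanned.size() != ocr.size()) return "";
    for (std::size_t i = 0; i < scanned.size(); ++i) {
        if (scanned[i] == ocr[i]) continue;
        std::string candidate = scanned;
        candidate[i] = ocr[i];
        if (isValidUpcA(candidate)) return candidate;
    }
    return "";
}
```

For example, `isValidUpcA("036000291452")` holds because 3 * (0+6+0+2+1+5) + (3+0+0+9+4+2) = 60. Patching one digit at a time keeps the OCR from overriding bars that were read correctly.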
Assignment 6: Smart shopping window (deadline: June 2nd)
Download the shopping window movie at http://staff.science.uva.nl/~rvalenti/downloads/teaching/shopping.avi
Note: the video is quite big, so you may process only the part that works best with your approach. In case you have bandwidth problems, a shorter, lower-resolution version is available at http://staff.science.uva.nl/~rvalenti/downloads/teaching/shoppingsmall.avi .
There are 4 progressive points for this assignment, each of which counts as 25% of the total completion. You should be able to complete some of them starting from the code of the previous assignments:
- Count the number of people that are passing by.
- Mark them with a bounding box and leave a trace for their path. Indicate their direction using an arrow.
- Distinguish between the people that are just passing by and the ones that actually stop for some time in front of the window. Mark them with 2 different colors.
- Learn some distinctive features (e.g. PCA on the faces) from the people that look at the window. Label them as "subject n" each time you find a new subject, and use the same label when you find them again. Collect statistics (e.g. number of returning customers, time each customer spent looking at the shopping window).
Display the obtained information on the image.
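For the third point, one way to separate people who stop from people who pass by is to look at how long a tracked person stays almost still. The sketch below is plain C++ over a list of per-frame positions. The function names, the `maxStep` movement threshold and the `minSeconds` dwell time are assumptions made for illustration and would need tuning on the actual video.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point {
    double x, y;
};

// Longest run of consecutive frame-to-frame steps shorter than maxStep pixels.
int longestStillRun(const std::vector<Point>& path, double maxStep) {
    int best = 0;
    int run = 0;
    for (std::size_t i = 1; i < path.size(); ++i) {
        double dx = path[i].x - path[i - 1].x;
        double dy = path[i].y - path[i - 1].y;
        if (std::sqrt(dx * dx + dy * dy) < maxStep) {
            ++run;
            if (run > best) best = run;
        } else {
            run = 0;  // the person moved: start counting again
        }
    }
    return best;
}

// A person "stopped" if they stayed almost still for at least minSeconds.
bool stoppedAtWindow(const std::vector<Point>& path, double maxStep,
                     double framesPerSecond, double minSeconds) {
    return longestStillRun(path, maxStep) >= framesPerSecond * minSeconds;
}
```

The same still-run length, divided by the frame rate, also gives a rough figure for how long a customer looked at the window.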