
Forgive me, for I am not an expert in multi-threading by any means and need some assistance. Here is some background before I get to my question:

Pre-Knowledge

  • Developing C++ code on the Jetson TK1
  • Jetson has 4 CPU cores (quad-core ARMv7 CPU)
  • From what I have researched, each core can run one thread at a time (4 cores -> 4 threads)
  • I am running a computer vision application which uses OpenCV
  • Capturing frames from a webcam as well as grabbing frames from a video file

Pseudo-Code

I am trying to optimize my multi-threaded code so that I get the maximum performance out of my application. Currently this is the basic layout of my code:

int HALT=0;

//Both func1 and func2 can be run in parallel for a short period of time
//but both must finish before moving to the next captured webcam frame
void func1(*STUFF){
    //Processes some stuff
}
void func2(*STUFF){
    //Processes similar stuff
}

void displayVideo(*STUFF){
    while(PLAYBACK!=DONE){
        *reads video from file and uses imshow to display the video*
        *delay to match framerate*
    }
    HALT=1;
}
main{
    //To open these I am using OpenCV's VideoCapture class
    *OPEN VIDEO FILE* 
    *OPEN WEBCAM STREAM*
    thread play(displayVideo, &STUFF);
    play.detach();
    while(HALT!=1){
        *Grab frame from webcam*
        //Process frame
        thread A(func1,&STUFF);
        thread B(func2,&STUFF);
        A.join();
        *Initialize some variables and do some other stuff*
        B.join();
        *Do some processing... more than what is between A.join and B.join*
        *Possibly display webcam frame using imshow*
        *Wait for user input to watch for terminating character*
    }
    //This while loop runs for about a minute or two so thread A and thread
    //B are being constructed many times.
}

Question(s)

What I would like to know is whether there is a way to specify which core/thread I will use when I construct a new thread. I fear that when I create threads A and B over and over again, they jump around to different cores and hamper the speed of my system and/or the reading of the video.

Although this fear is not well justified, I see very bizarre behavior on the four cores when running the code. Typically I always see one core running at around 40-60%, which I assume is either the main thread or the play thread. The load on the other cores, however, is very jumpy. Throughout playback I also see two cores go from around 60% all the way to 100%, but it is not always the same two cores: it could be the first, second, third, or even fourth core, and then they drop sharply, usually to about 20-40%. Occasionally I see a single core drop to 0% and stay that way for what appears to be another cycle through the while loop (i.e. grab frame, process, thread A, thread B, repeat). Then all four are active again, which is the more expected behavior.

I am hoping that I have not been too vague in this post. I am getting slightly unexpected behavior and I would like to understand what I might be doing incorrectly or not accounting for. Thank you to whoever can help or point me in the right direction.
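For reference, pinning a thread to a core is possible on Linux, although not through the C++ standard library itself. The following is only a minimal sketch, assuming a Linux/glibc toolchain (such as g++ on the Jetson) where `std::thread::native_handle()` yields a `pthread_t` and the GNU extension `pthread_setaffinity_np` is available. The helper name `pinToCore` is hypothetical and not part of any library. Whether pinning actually helps here is a separate question that would need to be measured.

```cpp
#include <pthread.h>  // pthread_setaffinity_np (GNU extension)
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET, CPU_SETSIZE
#include <thread>

// Hypothetical helper: restricts a running std::thread to a single CPU core.
// Assumes Linux/glibc, where native_handle() is a pthread_t.
// Returns true on success, false if the thread is not joinable, the core
// index is out of range, or the kernel rejects the request.
bool pinToCore(std::thread& t, int core) {
    if (!t.joinable()) return false;                    // no underlying thread
    if (core < 0 || core >= CPU_SETSIZE) return false;  // index out of range

    cpu_set_t set;
    CPU_ZERO(&set);       // start with an empty CPU mask
    CPU_SET(core, &set);  // allow only the requested core

    return pthread_setaffinity_np(t.native_handle(), sizeof(cpu_set_t), &set) == 0;
}
```

In the layout above this would be called right after constructing a thread and before joining it, for example `thread A(func1, &STUFF); pinToCore(A, 2);`. The affinity does not carry over to the next loop iteration, since A and B are new threads each time.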

  • [Race condition](http://stackoverflow.com/questions/34510/what-is-a-race-condition): You access `HALT` from `main` and from the `play` thread. Therefore the program has [undefined behavior](http://stackoverflow.com/questions/10369638/what-is-undefined-behavior-in-c). Do you have C++11 available? – nwp May 12 '15 at 13:18
  • Okay that makes sense. Yes I am using c++11. Would you recommend a better approach at doing this? The idea is that if an ESC key is hit during the playback in the play thread, it would cease playing... return to the main... then the while loop in the main would not continue. – John-Michael Burke May 12 '15 at 13:53
  • C++11 makes this a bit easier. Use `std::atomic<int> Halt;` (which you need to manually set to 0) to fix that race condition. Same goes for `PLAYBACK`. If `STUFF` is not `const` you have another race there. Creating and destroying threads is not cheap, and doing so in a loop is unnecessarily slow. Consider using a [threadpool](http://stackoverflow.com/questions/3988128/c-thread-pool) instead. Another possibility is to use `std::async`, which may do that for you. You must make sure you have no race condition, which requires synchronization, which is slow. Single threaded may be faster and is easier. – nwp May 12 '15 at 15:02
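
The suggestions in the comments above could be sketched roughly as follows. This is an illustration under assumptions, not the asker's actual code: `FrameData` is a hypothetical stand-in for `STUFF`, the bodies of `func1` and `func2` are placeholders, and `processFrame` and `runLoop` are made-up names. It shows an atomic halt flag plus `std::async` in place of hand-built threads. Note that with `std::launch::async` an implementation may still start a fresh thread per call, so this does not by itself guarantee the thread reuse a real pool would give.

```cpp
#include <atomic>
#include <functional>  // std::ref
#include <future>      // std::async, std::launch

// Hypothetical stand-in for the shared STUFF in the question.
struct FrameData {
    int input = 0;
    int result1 = 0;  // written only by func1
    int result2 = 0;  // written only by func2
};

// Stop flag shared between threads; std::atomic makes concurrent
// reads and writes well defined.
std::atomic<int> halt{0};

// Placeholder work: each function writes a different member, so the
// two can run at the same time without a data race.
void func1(FrameData& d) { d.result1 = d.input + 1; }
void func2(FrameData& d) { d.result2 = d.input * 2; }

// Runs func1 and func2 concurrently for one frame; both have finished
// by the time this function returns.
void processFrame(FrameData& d) {
    auto a = std::async(std::launch::async, func1, std::ref(d));
    auto b = std::async(std::launch::async, func2, std::ref(d));
    a.get();  // corresponds to A.join()
    // ... work that only needs func1's result would go here ...
    b.get();  // corresponds to B.join()
}

// Shape of the main loop: processes frames until halt is set or
// maxFrames is reached, and returns the number of frames processed.
int runLoop(int maxFrames) {
    FrameData d;
    int frames = 0;
    while (halt.load() != 1 && frames < maxFrames) {
        d.input = frames;  // stands in for grabbing a webcam frame
        processFrame(d);
        ++frames;
    }
    return frames;
}
```

The playback thread would then end with `halt.store(1);` instead of the plain assignment `HALT=1;`.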
