I have a program (say, "prog") written in C that performs many numerical operations. I want to write a "driver" utility in Python that runs "prog" with different configurations in parallel, reads its outputs and logs them. There are several issues to take into account:
- All sorts of things can go wrong at any time, so logging has to happen as soon as possible after any `prog` instance finishes.
- Several `prog`s can finish simultaneously, so logging should be centralized.
- Workers may be killed somehow, and the driver has to handle that situation properly.
- All workers and the logger must be terminated cleanly, without tons of backtraces, when a `KeyboardInterrupt` is handled.
The first two points make me think that all workers have to send their results to some centralized logger worker, for example through a `multiprocessing.Queue`. But it seems that the third point makes this solution a bad one, because if a worker is killed the queue is going to become corrupted. So the `Queue` is not suitable. Instead I can use multiple process-to-process pipes (i.e. every worker is connected to the logger through its own pipe). But then other problems arise:
- reading from a pipe is a blocking operation, so one logger can't read asynchronously from several workers (use threads?)
- if a worker is killed and its pipe is corrupted, how can the logger diagnose this?
P.S. Point #4 seems to be solvable. I have to:

- disable the default SIGINT handling in all workers and the logger;
- add a `try`/`except` block to the main process that calls `pool.terminate(); pool.join()` when a `KeyboardInterrupt` is caught.
Could you please suggest a better design approach if possible, and if not, how to tackle the problems described above?
P.S. Python 2.7.