Commits on Jul 4, 2017
-
test_mx_util: Avoid -Wparentheses warnings
Add extra parentheses around assignments evaluated as conditions, as suggested by -Wparentheses. This is to avoid

    CC test_mx_util.o
    test_mx_util.c: In function ‘test_mx_strskipwhitespaces’:
    test_mx_util.c:17:4: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
        assert(s = mx_strskipwhitespaces(" abc "));
        ^
    test_mx_util.c:20:4: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
        assert(s = mx_strskipwhitespaces("abc "));
        ^
    test_mx_util.c:23:4: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
        assert(s = mx_strskipwhitespaces(""));
        ^
    test_mx_util.c: In function ‘test_mx_strscan’:
    test_mx_util.c:311:5: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
        assert(s = strdup("123 456 -789 246 abc"));
        ^

etc.
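A minimal sketch of the pattern this commit applies. The helper `skip_whitespace` is a hypothetical stand-in for `mx_strskipwhitespaces`, not the real function; only the doubled parentheses inside `assert` are the point.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for mx_strskipwhitespaces(): skip leading blanks. */
static char *skip_whitespace(char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Returns 1 if the leading whitespace was skipped as expected. */
static int check_skip(void)
{
    char input[] = "  abc ";
    char *s;

    /* assert(s = skip_whitespace(input));   <- triggers -Wparentheses */
    assert((s = skip_whitespace(input)));  /* extra parentheses mark the assignment as intentional */

    return strcmp(s, "abc ") == 0;
}
```

Note that an assignment inside `assert` disappears when the code is built with `NDEBUG`, so this style is only suitable for test code that is always built with assertions enabled.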
-
mx_util: Add mx_sort_linked_list
This is a general-purpose routine to sort a singly linked list of objects. It requires callbacks for comparison and for locating the next pointer in the object.
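The callback-based design can be sketched as a merge sort that never looks inside the objects itself. This is an illustration only: the names `sort_linked_list`, `cmp_fn`, `next_fn` and `struct user` are assumptions, and the actual signature and algorithm of `mx_sort_linked_list` may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types (not taken from the mxq sources). */
typedef int (*cmp_fn)(void *a, void *b);   /* <0, 0, >0 like strcmp */
typedef void **(*next_fn)(void *obj);      /* address of obj's next pointer */

/* Merge two sorted lists into one sorted list; stable for equal keys. */
static void *merge_lists(void *a, void *b, cmp_fn cmp, next_fn next)
{
    void *head = NULL;
    void **tail = &head;

    while (a && b) {
        if (cmp(a, b) <= 0) {
            *tail = a;
            tail = next(a);
            a = *tail;
        } else {
            *tail = b;
            tail = next(b);
            b = *tail;
        }
    }
    *tail = a ? a : b;
    return head;
}

/* Top-down merge sort; returns the new list head. */
static void *sort_linked_list(void *list, cmp_fn cmp, next_fn next)
{
    if (!list || !*next(list))
        return list;

    /* Find the middle with slow/fast pointers and cut the list there. */
    void *slow = list;
    void *fast = *next(list);
    while (fast && *next(fast)) {
        slow = *next(slow);
        fast = *next(*next(fast));
    }
    void *second = *next(slow);
    *next(slow) = NULL;

    void *left = sort_linked_list(list, cmp, next);
    void *right = sort_linked_list(second, cmp, next);
    return merge_lists(left, right, cmp, next);
}

/* Example object; the link is a void * so the callback needs no pointer cast. */
struct user {
    int priority;
    void *next;
};

static int user_cmp(void *a, void *b)
{
    struct user *ua = a, *ub = b;
    return (ua->priority > ub->priority) - (ua->priority < ub->priority);
}

static void **user_next(void *obj)
{
    return &((struct user *)obj)->next;
}
```

A caller passes the list head and both callbacks, for example `head = sort_linked_list(head, user_cmp, user_next);`, and must use the returned pointer as the new head.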
-
mxqd_control: Use generic sort function to sort users
No functional change.
-
mxqd_config: Add server_sort_groups_by_priority
Add a utility routine to sort the groups of all users of the server by their group priority.
-
When active groups are (re-)loaded from the database, we sort them by group priority. This implements group priority (mxqsub -P).
Commits on Jul 5, 2017
-
We intend to expand the functionality which can be triggered by signals, e.g. switch log level to debug or trigger a state dump. By processing signals synchronously, we are free to call non-reentrant functions and know that our own data is not in a transient state. Block asynchronous signals and receive and process signals explicitly. Re-enable the asynchronous signals when the user process is initialized.
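The approach described above can be sketched with `sigprocmask` and `sigwaitinfo`. The function names and the selection of signals are assumptions for illustration, not the code from mxqd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Signals the daemon wants to process itself (an assumed selection). */
static const int handled_signals[] = { SIGINT, SIGTERM, SIGUSR1, SIGUSR2 };

/* Block the handled signals so they stay pending until fetched explicitly.
 * The previous mask is saved in *old_mask. */
static int block_handled_signals(sigset_t *set, sigset_t *old_mask)
{
    sigemptyset(set);
    for (size_t i = 0; i < sizeof(handled_signals) / sizeof(handled_signals[0]); i++)
        sigaddset(set, handled_signals[i]);
    return sigprocmask(SIG_BLOCK, set, old_mask);
}

/* Fetch the next pending signal synchronously. Returns the signal number,
 * or -1 on error. No handler runs, so the caller may use non-reentrant
 * functions while processing the result. */
static int next_signal(const sigset_t *set, siginfo_t *info)
{
    return sigwaitinfo(set, info);
}

/* Restore the saved mask, e.g. in a child before it becomes the user process. */
static int restore_signals(const sigset_t *old_mask)
{
    return sigprocmask(SIG_SETMASK, old_mask, NULL);
}
```

Because the signals are only ever consumed at one well-defined point, flags shared with a handler are no longer needed, which is what makes the later removal of `volatile sig_atomic_t` possible.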
-
mxqd: Remove obsolete signal() calls in reaper
Now that all signals are blocked, we don't need to ignore them.
-
It is no longer necessary to reset signals, because we no longer change them. The signals may stay blocked for the reaper process and are explicitly unblocked for the user process.
-
mxqd: Remove volatile sig_atomic_t type
We process signals synchronously now. Change "volatile sig_atomic_t" to "int" type.
-
mxqd: Dump server state on SIGUSR2 q=10
A dump of the server state can be triggered with env kill -usr2 -q 10 mxqd
-
mxqd: Change loglevel on SIGUSR2 q=20 or 21
The loglevel may be set with env kill -usr2 -q 21 mxqd or env kill -usr2 -q 20 mxqd (one value selects debug, the other info).
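The `-q` option of `kill` sends the signal with `sigqueue(3)`, which attaches an integer to it. A rough sketch of how a receiver that has SIGUSR2 blocked could decode that integer follows. The enum and function names are hypothetical, and the sketch deliberately does not say which of 20 or 21 selects which loglevel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical action codes for illustration. */
enum usr2_action { ACTION_NONE, ACTION_DUMP_STATE, ACTION_SET_LOGLEVEL };

/* Map the queued integer to an action (values taken from the commit titles). */
static enum usr2_action action_for_value(int value)
{
    switch (value) {
    case 10:
        return ACTION_DUMP_STATE;
    case 20:
    case 21:
        return ACTION_SET_LOGLEVEL;
    default:
        return ACTION_NONE;
    }
}

/* Send SIGUSR2 with an attached integer, like "kill -usr2 -q VALUE". */
static int send_usr2(pid_t pid, int value)
{
    union sigval sv;
    sv.sival_int = value;
    return sigqueue(pid, SIGUSR2, sv);
}

/* Receive SIGUSR2 synchronously (it must already be blocked) and decode it.
 * Signals that were not sent with sigqueue() carry no value and are ignored. */
static enum usr2_action receive_usr2(void)
{
    sigset_t set;
    siginfo_t info;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    if (sigwaitinfo(&set, &info) != SIGUSR2 || info.si_code != SI_QUEUE)
        return ACTION_NONE;
    return action_for_value(info.si_value.sival_int);
}
```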
Commits on Jul 6, 2017
-
mxq_log: Do not deduplicate messages
We intend to make logging on info level less verbose so that we don't need the deduplication here. At debug level we want to see all events as they happen.
-
mxqd: Start jobs without main loop iteration
Try to start as many jobs as possible without going through a main loop iteration (database roundtrip).
-
Use 10 seconds everywhere to decrease the load on the database and races between the mxq daemons a bit. At the same time this increases the chance that multiple jobs of the same group are started on the same server, which is good (better use of caches, smaller failure surface). This is the maximum time a single server will need to react to database changes (mxqsub or mxqkill). Administrative signals will get an immediate reaction. Finished user jobs will usually also get an immediate reaction. However, this is not true for jobs we picked up from a previous daemon incarnation and which are not our children. If these jobs finish, we will not get a signal, so we need to look into the spool directory from time to time. This is another reason why we need a timeout at all. Now that we want to use 10 seconds everywhere, we can make it a constant.
-
mxqd: Only process signals we expect
We don't want spurious signals (like SIGWINCH) to trigger a new evaluation of the main loop. Only wait for signals we want to be processed.
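Waiting only for expected signals, combined with a fixed timeout, might look roughly like this. The constant name `MAIN_LOOP_TIMEOUT_SEC`, the function names and the chosen signals are assumptions, not taken from mxqd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Assumed name for the fixed poll interval. */
#define MAIN_LOOP_TIMEOUT_SEC 10

/* Build the set of signals the main loop should wake up for.
 * The caller is expected to have blocked them with sigprocmask(). */
static void expected_signals(sigset_t *set)
{
    sigemptyset(set);
    sigaddset(set, SIGCHLD);
    sigaddset(set, SIGTERM);
    sigaddset(set, SIGUSR2);
}

/* Wait for one of the expected signals.
 * Returns the signal number, 0 on timeout, or -1 on any other error.
 * Signals outside the set (such as SIGWINCH) never end the wait with a result;
 * if one interrupts the call, the wait is simply restarted with a full timeout. */
static int wait_for_event(const sigset_t *set, time_t timeout_sec)
{
    struct timespec timeout = { .tv_sec = timeout_sec, .tv_nsec = 0 };
    int sig;

    do {
        sig = sigtimedwait(set, NULL, &timeout);
    } while (sig == -1 && errno == EINTR);

    if (sig == -1 && errno == EAGAIN)
        return 0;   /* timed out: time to poll the database and spool directory */
    return sig;
}
```

A main loop would call `wait_for_event(&set, MAIN_LOOP_TIMEOUT_SEC)` once per iteration and treat a return value of 0 as the periodic wake-up.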
-
mxqdctl-hostconfig: Add debug commands
Add commands killall, quitall, reloadall (=restartall), dumpall, setdebugall, setinfoall (=setnodebugall)
-
The server status is set in multiple places in the main loop to IDLE, RUNNING, BACKFILL, CPUOPTIMAL or FULL, based on server->slots_running, server->slots and server->threads_running. Factor this out into a separate function to avoid repetition. Add a call to this function before the main loop so that we immediately get a correct status after a server start. Simplify the loop code by setting the server status at the end of the loop and avoiding the continue pattern. As a side effect, this also fixes a bug where the server status was not updated in a loop iteration in which new jobs were started.
-
mxqd: Use return() instead of exit()
This is just to avoid valgrind leak warnings. We have two variables with the cleanup attribute in the scope of main, which are not freed when we leave by exit() instead of running out of scope.

    ==10696== 21 bytes in 1 blocks are still reachable in loss record 1 of 3
    ==10696==    at 0x4C29AC6: malloc (vg_replace_malloc.c:299)
    ==10696==    by 0x5E19449: strdup (strdup.c:42)
    ==10696==    by 0x5E750DD: get_current_dir_name (getdirname.c:40)
    ==10696==    by 0x409F64: main (mxqd.c:2383)
    ==10696==
    ==10696== 45 bytes in 1 blocks are still reachable in loss record 2 of 3
    ==10696==    at 0x4C29AC6: malloc (vg_replace_malloc.c:299)
    ==10696==    by 0x40CFF9: mx_strvec_to_str (mx_util.c:1059)
    ==10696==    by 0x409F58: main (mxqd.c:2382)
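The difference between the two ways of leaving a scope can be shown with a small sketch. It uses the GCC/Clang `cleanup` variable attribute. The function `run` stands in for `main`, and the counter exists only to make the effect observable; none of these names come from mxqd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cleanup_calls = 0;   /* counts handler runs, for demonstration only */

/* Cleanup handler: receives a pointer to the variable that leaves scope. */
static void free_string(char **p)
{
    free(*p);
    *p = NULL;
    cleanup_calls++;
}

/* Stand-in for main(). Leaving by return runs the cleanup handler. */
static int run(void)
{
    __attribute__((cleanup(free_string))) char *cwd = strdup("/tmp");

    if (!cwd)
        return 1;

    /* Calling exit(0) here would terminate the process without running
     * free_string(), so valgrind would report the block as still reachable. */
    return 0;   /* free_string(&cwd) runs as cwd goes out of scope */
}
```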