Issue 51 limit increase #57

mariux · 2017-05-07T16:57:11Z

first bulk-assign implementation, for testing the locking issue seen when assigning single jobs. relates to #51

mariux · 2017-05-07T16:59:11Z

this needs further optimization for fast running jobs...

donald · 2017-05-08T11:12:53Z

Thanks. I gave it a ride on the test cluster. It performed a little bit better than fix-issue-51:

master:                    1500 jobs in 49 min,  1750 jobs in 57 min
fix-issue-51:              1500 jobs in 11 min,  8270 jobs in 59 min
issue-51-limit-increase:   1500 jobs in  9 min, 11166 jobs in 60 min

The initial loading was not as fast as I expected, though. This is because mxq_load_job_from_group_for_daemon still loads at most one job before returning to the server main loop with the sleep. Was this intended?

I'm currently running a second test with some jitter in the runtime of the jobs (perl -MTime::HiRes -e 'Time::HiRes::usleep((20+rand(10)-5)*1000000)' instead of sleep 20) which might be more realistic.

donald · 2017-05-08T12:17:59Z

No, the main loop doesn't sleep... There's another problem:

2017-05-08 12:52:43 +0200 mxqd[2509]: hostname=sigill.molgen.mpg.de daemon_name=main daemon_id=9 :: MXQ server started.
2017-05-08 12:52:43 +0200 mxqd[2509]:   host_id=57b28696-f203-41c8-9ed7-b0387533efe8-2882b65-9cd
2017-05-08 12:52:43 +0200 mxqd[2509]: slots=7 memory_total=4096 memory_avg_per_slot=585 memory_limit_slot_soft=585 memory_limit_slot_hard=4096 :: server initialized.
2017-05-08 12:52:43 +0200 mxqd[2509]: cpu set available: [1-7]
2017-05-08 12:52:43 +0200 mxqd[2509]:   group=buczek(125):1 jobs_max=7 slots_max=7 memory_max=1400 slots_per_job=1 memory_per_job_thread=200.000000 :: group initialized.
2017-05-08 12:52:43 +0200 mxqd[2509]: recover: 1 running groups loaded.
2017-05-08 12:52:43 +0200 mxqd[2509]: ====================== SERVER DUMP START ======================
2017-05-08 12:52:43 +0200 mxqd[2509]:     user=buczek(125) slots_running=0 global_slots_running=0 global_threads_running=0
2017-05-08 12:52:43 +0200 mxqd[2509]:         group=buczek(125):1 test02 jobs_max=7 slots_per_job=1 jobs_in_q=200000
2017-05-08 12:52:43 +0200 mxqd[2509]: memory_used=0 memory_total=4096
2017-05-08 12:52:43 +0200 mxqd[2509]: slots_running=0 slots=7 threads_running=0 jobs_running=0
2017-05-08 12:52:43 +0200 mxqd[2509]: global_slots_running=0 global_threads_running=0
2017-05-08 12:52:43 +0200 mxqd[2509]: cpu set running: []
2017-05-08 12:52:43 +0200 mxqd[2509]: ====================== SERVER DUMP END ======================
2017-05-08 12:52:43 +0200 mxqd[2509]:   group=buczek(125):1 slots_to_start=7 slots_per_job=1 :: trying to start job for group.
2017-05-08 12:53:00 +0200 mxqd[2509]: WARNING: MySQL mysql_stmt_execute(): ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(OLD.stats_run_sec + (unix_timestamp(NEW.group_mtime) - unix_timestamp(OLD.group_mtime)))'
2017-05-08 12:53:00 +0200 mxqd[2509]: EMERGENCY: MySQL mysql_stmt_execute(): ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(OLD.stats_run_sec + (unix_timestamp(NEW.group_mtime) - unix_timestamp(OLD.group_mtime)))'
2017-05-08 12:53:00 +0200 mxqd[2509]: EMERGENCY: ERROR: mysql_stmt_execute() returned undefined error number: 1690
2017-05-08 12:53:00 +0200 mxqd[2509]: ERROR: mx_mysql_statement_execute(): Invalid exchange
2017-05-08 12:53:00 +0200 mxqd[2509]: ERROR: mx_mysql_do_statement(): Invalid exchange
2017-05-08 12:53:00 +0200 mxqd[2509]: ERROR:   group_id=1 :: mxq_assign_jobs_from_group_to_daemon(): Invalid exchange
2017-05-08 12:53:00 +0200 mxqd[2509]: Tried Hard and nobody is doing anything. Sleeping for a long while (15 seconds).
2017-05-08 12:53:15 +0200 mxqd[2509]:   group=buczek(125):1 slots_to_start=7 slots_per_job=1 :: trying to start job for group.
2017-05-08 12:53:28 +0200 mxqd[2509]:    job=buczek(125):1:99 :: new job loaded.
2017-05-08 12:53:28 +0200 mxqd[2509]: job assigned cpus: [7]
2017-05-08 12:53:28 +0200 mxqd[2525]:    job=buczek(125):1:99 host_pid=2525 pgrp=2509 :: new child process forked.
2017-05-08 12:53:28 +0200 mxqd[2525]: starting reaper process.
2017-05-08 12:53:28 +0200 mxqd[2526]: starting user process.
2017-05-08 12:53:30 +0200 mxqd[2509]:    job=buczek(125):1:99 :: added running job to watch queue.
2017-05-08 12:53:30 +0200 mxqd[2509]: slots_started=1 :: Main Loop started 1 slots.

donald · 2017-05-08T13:41:56Z

The error "BIGINT UNSIGNED value is out of range in '(OLD.stats_run_sec + (unix_timestamp(NEW.group_mtime) - unix_timestamp(OLD.group_mtime)))'" is seen with master, too. Not a problem of these patches.

donald · 2017-05-08T20:27:58Z

Same results with jitter:

master:                    1500 jobs in 51 min,  1762 jobs in 60 min
fix-issue-51:              1500 jobs in 11 min,  8100 jobs in 60 min
issue-51-limit-increase:   1500 jobs in  8 min, 12000 jobs in 60 min

donald · 2017-05-09T06:38:32Z

Can you remind me why we needed the status LOADED between ASSIGNED and RUNNING?

donald · 2017-05-10T07:15:25Z

To answer my own question: LOADED means the server decided to start the job but has not yet recorded a pid in the database. So a job recorded as 'LOADED' in the database might or might not have been started.

mariux · 2017-05-10T17:36:37Z

btw.. the commits are not complete... ;) this was mainly for benchmarking reasons..

this needs further patches to handle new issues that come with mass assign:
e.g. assigning more jobs than can be executed in multi user mode and blocking those for other servers.

assigning jobs_max is also just a quick fix and can be optimized even more like already stated in my other comments.. ;)

for the questions:

yes, LOADED is the state between assigned and started. if the server dies in the execution phase, the state of the job is unknown until it is marked running.
starting only one was intended because this meant the least change in code, and starting one should not be blocking as it is only reading..

side hint - as i know you thought about doing centralized scheduling:
you can also add an external master to help scheduling jobs with more global knowledge, e.g. from the outside by additionally assigning jobs for the cluster... mxqd will only self-assign new jobs if it does not find any assigned jobs in the database and has free resources to start some.

mariux · 2017-05-10T18:14:21Z

in addition LOADED should only be released by the daemon itself... ASSIGNED can be reset from the outside... (needs to be verified again but that was the idea: it should be possible to "steal" an already assigned job and reassign it to another host...)
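
The status rules described in the comments above can be sketched roughly as follows. This is only an illustration in Python, not mxq code: the names `JobStatus`, `advance` and `may_release` are hypothetical, only the three states discussed in this thread are modelled, and treating RUNNING as never releasable is an assumption made to keep the sketch small.

```python
from enum import Enum


class JobStatus(Enum):
    # Only the three states discussed in this thread; the real set is larger.
    ASSIGNED = "assigned"
    LOADED = "loaded"
    RUNNING = "running"


# Forward transitions as described in the thread: ASSIGNED -> LOADED -> RUNNING.
NEXT_STATUS = {
    JobStatus.ASSIGNED: JobStatus.LOADED,
    JobStatus.LOADED: JobStatus.RUNNING,
}


def advance(status: JobStatus) -> JobStatus:
    """Move a job one step forward; RUNNING has no successor in this sketch."""
    if status not in NEXT_STATUS:
        raise ValueError(f"no forward transition from {status.name}")
    return NEXT_STATUS[status]


def may_release(status: JobStatus, actor_is_owning_daemon: bool) -> bool:
    """Decide whether an actor may release (un-assign) a job.

    ASSIGNED may be reset from the outside, which is what would allow
    "stealing". LOADED may only be released by the daemon that loaded the
    job, because the job might or might not have been started already.
    """
    if status is JobStatus.ASSIGNED:
        return True
    if status is JobStatus.LOADED:
        return actor_is_owning_daemon
    # Assumption of this sketch: a RUNNING job is never released here.
    return False
```

With these rules an external scheduler could reset an ASSIGNED job, but would have to leave a LOADED job to the daemon that owns it.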

donald · 2017-05-10T19:32:39Z

> assigning more jobs than can be executed in multi user mode

yea, I guess I've merged too early.

donald · 2017-05-10T19:47:17Z

Undoing the excessive assigns when we later decide that we don't want to actually start all jobs (e.g. because of "fair share" with other users waiting, or because --max-jobs-per-node is used) would probably waste more performance than was won.

So we need to calculate the exact number of jobs to start in advance.

mariux · 2017-05-10T22:19:40Z

as said in earlier comments... this strategy should be used for fast running jobs... for long running jobs it's not needed because the pressure on the database is minimal.. so jobs with run-time > 15min (current default) should not be preassigned (limit 1 as usual)... jobs with run-time < 15min can even be preassigned with higher limits, e.g. jobs_max * 15min/(actual runtime in minutes)... or instead of jobs_max use the current number of jobs a user can execute in this daemon... (see later comment)
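
A minimal sketch of the limit heuristic from the comment above, written in Python for illustration only. The function name `preassign_limit` and its parameters are hypothetical and do not exist in mxq; the 15 minute threshold is the default mentioned in the comment, and the behaviour at exactly 15 minutes (the formula, which then yields jobs_max) is an assumption.

```python
def preassign_limit(jobs_max: int, runtime_minutes: float,
                    threshold_minutes: float = 15.0) -> int:
    """Return how many jobs to assign in one statement (the LIMIT value).

    Jobs running longer than the threshold keep LIMIT 1. Shorter jobs get
    jobs_max * threshold / runtime, rounded down and never below 1.
    """
    if jobs_max < 1:
        raise ValueError("jobs_max must be at least 1")
    if runtime_minutes <= 0:
        raise ValueError("runtime_minutes must be positive")
    if runtime_minutes > threshold_minutes:
        # Long running jobs put little pressure on the database.
        return 1
    return max(1, int(jobs_max * threshold_minutes / runtime_minutes))
```

For example, with jobs_max=7 a group of 5 minute jobs would get a limit of 21, while a group of 20 minute jobs would stay at 1.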

mariux · 2017-05-10T22:33:21Z

as far as i remember assigned jobs count towards the jobs_inq counter... and there is a place in the code where jobs_inq > 0 and assigning new jobs fails... here we could try to steal an assigned job... (or a greater number of jobs)

mariux · 2017-05-10T22:44:45Z

here we can actually calculate the number of jobs in this group that might get started (correct only if no other group of this user with equal priority has jobs_inq > 0)... but this number will be <= jobs_max and honors at least multi user environments a lot better than using jobs_max..

mariux · 2017-05-10T23:15:14Z

> So we need to calculate the exact number of jobs to start in advance.

the edge case might be easily detected: it would be something like jobs_inq <= jobs_max for this group..
if detected, switch back to LIMIT 1.. and free assignments in the main loop (if any - track in glist(?)).. this would make stealing (which can also end in endless stealing) unnecessary...
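
The edge-case check proposed above could look roughly like this. It is a Python sketch for illustration, not mxq code: `AssignPlan`, `plan_assignment` and the `jobs_assigned` counter (jobs already preassigned to this daemon for the group) are hypothetical names, and using jobs_max as the bulk limit follows the quick fix in this pull request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssignPlan:
    limit: int              # LIMIT to use for the next assign statement
    release_existing: bool  # free earlier bulk assignments in the main loop


def plan_assignment(jobs_inq: int, jobs_max: int, jobs_assigned: int) -> AssignPlan:
    """Choose between bulk assignment and the old LIMIT 1 behaviour.

    When the group has no more queued jobs than it may run at once
    (jobs_inq <= jobs_max), bulk assignment could block jobs that other
    servers would start, so fall back to LIMIT 1 and release what was
    already preassigned, if anything.
    """
    if jobs_inq <= jobs_max:
        return AssignPlan(limit=1, release_existing=jobs_assigned > 0)
    return AssignPlan(limit=jobs_max, release_existing=False)
```

With the test group from the log above (jobs_inq=200000, jobs_max=7) this sketch keeps bulk assignment; it would only switch back to LIMIT 1 once seven or fewer jobs remain in the queue.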

donald · 2017-05-19T04:22:59Z

LOADED would perhaps be more obvious if it was called STARTING. I think this would be closer to the standard idioms for state machines.

donald · 2017-06-02T10:28:25Z

Master has been reset to a state before this merge. Needs to be redone.

mariux mentioned this pull request May 7, 2017

mxqsub fail / lousy performance finding jobs to activate #51

Open

mariux added 2 commits May 7, 2017 19:03

mxq_job: allow to set limit for assigning jobs to deamon

9007b4a

no logic was changed in this commit. renamed mxq_assign_job_from_group_to_daemon() to mxq_assign_jobs_from_group_to_daemon() and added parameter limit. relates to issue #51

mxqd: always assign jobs_max jobs form group to daemon if no jobs are…

b329b7a

… assigned relates to issue #51

mariux force-pushed the issue-51-limit-increase branch from 6ce80d2 to b329b7a Compare May 7, 2017 17:03

mariux added 2 commits May 7, 2017 19:18

mxq_job: Add missing assertions

db6c3e4

mxqsub: Retry on deadlock in mysql database

72f2543

relates to issue #51

donald merged commit b2298db into master May 10, 2017

donald deleted the issue-51-limit-increase branch May 10, 2017 08:39
