Skip to content

mxqsub fail / lousy performance finding jobs to activate #51

Open
donald opened this issue Apr 28, 2017 · 17 comments
Open

mxqsub fail / lousy performance finding jobs to activate #51

donald opened this issue Apr 28, 2017 · 17 comments

Comments

@donald
Copy link
Contributor

donald commented Apr 28, 2017

User report

Start job for FAB44256.merged.fq.gz
WARNING: MySQL mysql_stmt_execute(): ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
EMERGENCY: MySQL mysql_stmt_execute(): ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
EMERGENCY: ERROR: mysql_stmt_execute() returned undefined error number: 1213
ERROR: mx_mysql_statement_execute(): Invalid exchange
ERROR: Job submission failed: Invalid exchange

Several mxqd log similar errors (but continue). Frequency of jobs getting from INQ to RUNNING severely limited.

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

For the records:
mxqprj@macheteinfach:~> mysql -e 'show processlist'
+---------+--------+--------------------------------------------------------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+
| Id      | User   | Host                                                   | db   | Command | Time | State       | Info |
+---------+--------+--------------------------------------------------------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+
| 2799943 | mxqprj | gynaekophobie.molgen.mpg.de:60836                      | mxq  | Sleep   |    5 |             | NULL |
| 3005008 | mxqprj | altmetall.molgen.mpg.de:48334                          | mxq  | Sleep   |    5 |             | NULL |
| 3011795 | mxqprj | uselessbox.molgen.mpg.de:55338                         | mxq  | Sleep   |    5 |             | NULL |
| 3036007 | mxqprj | simsalabimbambasaladusaladim.molgen.mpg.de:43074       | mxq  | Sleep   |    4 |             | NULL |
| 3037048 | mxqprj | internetguide.molgen.mpg.de:35414                      | mxq  | Execute |  331 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037150 | mxqprj | hal102.molgen.mpg.de:49024                             | mxq  | Execute |  188 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037196 | mxqprj | avaritia.molgen.mpg.de:58760                           | mxq  | Execute |  145 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037216 | mxqprj | shutupandtakemymoney.molgen.mpg.de:38190               | mxq  | Execute |  112 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037231 | mxqprj | gula.molgen.mpg.de:48256                               | mxq  | Execute |   83 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037244 | mxqprj | ira.molgen.mpg.de:47218                                | mxq  | Execute |   77 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037262 | mxqprj | esodophobie.molgen.mpg.de:41534                        | mxq  | Execute |   69 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037266 | mxqprj | mylittlepony.molgen.mpg.de:58740                       | mxq  | Execute |   67 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037288 | mxqprj | disastergirl.molgen.mpg.de:53260                       | mxq  | Execute |   57 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037291 | mxqprj | luxuria.molgen.mpg.de:37948                            | mxq  | Execute |   55 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037301 | mxqprj | hal101.molgen.mpg.de:43668                             | mxq  | Execute |   47 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037308 | mxqprj | evilplottingraccoon.molgen.mpg.de:56266                | mxq  | Execute |   42 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037310 | mxqprj | nomnomnom.molgen.mpg.de:59646                          | mxq  | Execute |   38 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037313 | mxqprj | confessionbear.molgen.mpg.de:47434                     | mxq  | Execute |   36 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037331 | mxqprj | invidia.molgen.mpg.de:36144                            | mxq  | Execute |   28 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037345 | mxqprj | rechnerzumrechnenvonrechenaufgaben.molgen.mpg.de:42706 | mxq  | Execute |   20 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037348 | mxqprj | fordprefect.molgen.mpg.de:54652                        | mxq  | Execute |   18 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037351 | mxqprj | peanutbutterjellytime.molgen.mpg.de:51024              | mxq  | Execute |   16 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037354 | mxqprj | superbia.molgen.mpg.de:59626                           | mxq  | Execute |   14 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037358 | mxqprj | foreveralone.molgen.mpg.de:46498                       | mxq  | Execute |   12 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037374 | mxqprj | grumpycat.molgen.mpg.de:50834                          | mxq  | Execute |    8 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037381 | mxqprj | acedia.molgen.mpg.de:50870                             | mxq  | Execute |    6 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037386 | mxqprj | dontpanic.molgen.mpg.de:39942                          | mxq  | Execute |    4 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037389 | mxqprj | philosoraptor.molgen.mpg.de:39564                      | mxq  | Execute |    1 | System lock | UPDATE  mxq_job SET daemon_id     = ?, host_hostname = ?, host_slots    = ?,  server_id     = ?, job |
| 3037392 | mxqprj | localhost                                              | NULL | Query   |    0 | init        | show processlist |
+---------+--------+--------------------------------------------------------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

The UPDATE is from https://github.molgen.mpg.de/mariux64/mxq/blob/master/mxq_job.c#L294

From a test instance with a copy of the database:

Ich glaube, das Problem ist die Kombination "WHERE job_status=0 [...]" mit "order by job_priority".
Ob job_priority index ist oder nicht ist egal, da die Sortierung ja über die Ergebnismenge geht.
mysql> select job_id from mxq_job where job_status=0 order by job_id limit 1;
+---------+
| job_id  |
+---------+
| 8286229 |
+---------+
1 row in set (0.00 sec)
mysql> select job_id from mxq_job  order by job_priority,job_id limit 1;
+--------+
| job_id |
+--------+
|  54012 |
+--------+
1 row in set (0.00 sec)
mysql> select job_id from mxq_job where job_status=0 order by job_priority,job_id limit 1;
+---------+
| job_id  |
+---------+
| 8286229 |
+---------+
1 row in set (24.07 sec)
mysql> select count(job_id) from mxq_job where job_status=0;
+---------------+
| count(job_id) |
+---------------+
|        120647 |
+---------------+
1 row in set (0.02 sec)
Es müssten eben 120000 ergebnis-rows erzeugt, gespeichert und sortiert werden.
Die Table verkleinern, indem obsolete jobs aufgeräumt werden, würde schon was bringen:
mysql> create table test LIKE mxq_job;
Query OK, 0 rows affected (0.13 sec)
mysql> insert into test select * from mxq_job where job_id>8000000;
Query OK, 408122 rows affected (11.74 sec)
Records: 408122  Duplicates: 0  Warnings: 0
mysql> select job_id from test where job_status=0 order by job_priority,job_id limit 1;
+---------+
| job_id  |
+---------+
| 8286229 |
+---------+
1 row in set (0.93 sec)

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

The "SQL database as queue antipattern"-fraction might have a point, although some are arguing against.

https://blog.engineyard.com/2011/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you

All problems predicted... :-)

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

Average runtime of jobs in the queue is about 2 minutes, but about 200 jobs running, so about every half second a jobs finishes. The mxqds are fully serialized grabbing jobs in queue and need much longer. We have this sort problem, which is not be speed up by an index, and we have an update on several columns which are indexed, so lots of index operations. Total cluster performance is not limited by free capacity but by this job-grabbing operation.

Pending locks are not served in order. Sometimes the lock wait time exceeds the timeout (50s).

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

As a quick experimental work-around, I suggest to temporarily disable the sorting by priority, a feature which users don't use very often.

mysql> select job_priority,count(*) from mxq_job group by job_priority;
+--------------+----------+
| job_priority | count(*) |
+--------------+----------+
|          127 |  8364087 |
|          128 |      475 |
|          130 |       18 |
+--------------+----------+
3 rows in set (8.84 sec)

Update: Would not resolve the problem. See update on comment two above.

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

On live database;

mysql> select job_id from mxq_job WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id LIMIT 1;
+---------+
| job_id  |
+---------+
| 8310461 |
+---------+
1 row in set (0.60 sec)
mysql> update mxq_job set daemon_id=daemon_id,host_hostname=host_hostname,host_slots=host_slots,server_id=server_id,job_status=job_status WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_id LIMIT 1;
Query OK, 0 rows affected (15.28 sec)
Rows matched: 1  Changed: 0  Warnings: 0

So its (no longer?) a query problem, but just the queuing for the update.

@pmenzel
Copy link
Contributor

pmenzel commented Apr 28, 2017

The buffer pool is big enough, and there is still room.

mysql> show engine innodb status;
[…]
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 35165044736; in additional pool allocated 0
Dictionary memory allocated 90099
Buffer pool size   2097136
Free buffers       1750533
Database pages     334144
Old database pages 123616
Modified db pages  112
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 34, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 332777, created 1367, written 940749
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 334144, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
[…]

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

Update only to column not indexed: still bad performance:

mysql> update mxq_job set stats_nvcsw=stats_nvcsw WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id LIMIT 1;
Query OK, 0 rows affected (0.63 sec)
Rows matched: 1  Changed: 0  Warnings: 0

select on same contraints: better performance:

mysql> select 1 from mxq_job WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id LIMIT 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.18 sec)

Select for update: bad performance again:

mysql> select 1 from mxq_job WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id LIMIT 1 for update;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.63 sec)

So the problem might be the locks.
Using another group with just 10 pending jobs (instead of 172333) : instantly:

 mysql> select 1 from mxq_job WHERE group_id      = 9597 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id LIMIT 1 for update;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

So maybe all records are locked, even if the update only touches one. In fact it takes the same time to lock without the limit:

select 1 from mxq_job WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' AND host_pid      = 0 ORDER BY job_priority, job_id for update;
+---+
120644 rows in set (0.69 sec)

Maybe first can select without lock first, then select the candidate row again by job_id plus original (unassigned) contraints for update. If other server has taken it, we restart. Otherwise we update and commit. That way we only lock a single row. Along this line:

mysql> select job_id from mxq_job WHERE group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' 
+---------+
| job_id  |
+---------+
| 8286229 |
+---------+
1 row in set (0.18 sec)
mysql> select 1 from mxq_job WHERE job_id=8286229 AND group_id      = 22930 AND job_status    = 0 AND daemon_id     = 0 AND host_hostname = '' AND server_id     = '' for update;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)
mysql> update mxq_job set job_status=job_status where job_id=8286229;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0
mysql> commit;
Query OK, 0 rows affected (0.00 sec)

@donald
Copy link
Contributor Author

donald commented Apr 28, 2017

This gives a little bit insight:

mysql> select lock_mode as LM,lock_type,lock_table,lock_index,trx_state,trx_started,trx_wait_started,left(trx_query,40),trx_operation_state,trx_rows_locked from INNODB_LOCKS,INNODB_TRX where lock_trx_id=trx_id;
+----+-----------+-----------------+------------+-----------+---------------------+---------------------+------------------------------------------+---------------------+-----------------+
| LM | lock_type | lock_table      | lock_index | trx_state | trx_started         | trx_wait_started    | left(trx_query,40)                       | trx_operation_state | trx_rows_locked |
+----+-----------+-----------------+------------+-----------+---------------------+---------------------+------------------------------------------+---------------------+-----------------+
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:46 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 873, | starting index read |               1 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:44 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 867, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:42 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 882, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | RUNNING   | 2017-04-28 19:16:41 | NULL                | UPDATE  mxq_job SET daemon_id     = 868, | starting index read |          331719 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:37 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 865, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:34 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 859, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:33 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 872, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:29 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 881, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:28 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 860, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:24 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 861, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:23 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 877, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:18 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 869, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:16 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 864, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:11 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 875, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:16:06 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 874, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:55 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 871, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:52 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 878, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:50 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 876, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:45 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 862, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:44 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 858, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:27 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 880, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:25 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 879, | starting index read |               2 |
| X  | RECORD    | `mxq`.`mxq_job` | job_status | LOCK WAIT | 2017-04-28 19:15:21 | 2017-04-28 19:16:46 | UPDATE  mxq_job SET daemon_id     = 870, | starting index read |               2 |
+----+-----------+-----------------+------------+-----------+---------------------+---------------------+------------------------------------------+---------------------+-----------------+
23 rows in set (0.00 sec)

https://dev.mysql.com/doc/refman/5.6/en/innodb-locking.html

@donald
Copy link
Contributor Author

donald commented May 4, 2017

tested with mysql-5.7.18: same timing, not faster

donald added a commit that referenced this issue May 4, 2017
When the server chose a job to assign to itself by

   UPDATE mxq_job SET ... WHERE (unassigned) ORDER BY LIMIT 1

we had a performance problem when a group contained very many
pending jobs, because the UPDATE temporarily locked all records
considered (issue #51).

Now we SELECT without any lock and then UPDATE the single
job by its primary index. In case there is a collision
with another server, the second UPDATE will have no effect,
because it still contains the original contraints. The
second server will returns to its main loop where it
will sleep for some seconds.
donald added a commit that referenced this issue May 4, 2017
When the server chose a job to assign to itself by

   UPDATE mxq_job SET ... WHERE (unassigned) ORDER BY LIMIT 1

we had a performance problem when a group contained very many
pending jobs, because the UPDATE temporarily locked all records
considered (issue #51).

Now we SELECT without any lock and then UPDATE the single
job by its primary index. In case there is a collision
with another server, the second UPDATE will have no effect,
because it still contains the original contraints. The
second server will returns to its main loop where it
will sleep for some seconds.
donald added a commit that referenced this issue May 4, 2017
When the server chose a job to assign to itself by

   UPDATE mxq_job SET ... WHERE (unassigned) ORDER BY LIMIT 1

we had a performance problem when a group contained very many
pending jobs, because the UPDATE temporarily locked all records
considered (issue #51).

Now we SELECT without any lock and then UPDATE the single
job by its primary index. In case there is a collision
with another server, the second UPDATE will have no effect,
because it still contains the original contraints. The
second server will returns to its main loop where it
will sleep for some seconds.
@donald
Copy link
Contributor Author

donald commented May 5, 2017

Test-Cluster (16 nodes, 150 cores) with patch e6bfc5d and submit of 200000 times "sleep 20" everything looked good (150 running jobs) for hours until about 145759 jobs were submitted. From then on only 60-70 jobs were running ( 116155 pending, 69 running, 2953 finished).

select 1 from mxq_job WHERE group_id = 6 AND job_status = 0 AND daemon_id = 0 AND host_hostname = '' AND server_id = '' AND host_pid = 0 ORDER BY job_priority, job_id LIMIT 1;

timings: 0.31 sec 0.36 sec, 0.35 sec, 0.44 sec 0.37 sec , average 0.366 sec

So we can start maximum 1/0.366s = 2.73 jobs per second just because of the select.
This would be a limit of 20s * 2.73 jobs/s = 54.6 jobs, which is the magnitude we see.

The patch improved a lot, mysql is running on 300% (instead 100%) but still the cluster is limited by mysql.

@donald
Copy link
Contributor Author

donald commented May 5, 2017

When WHERE group_id=6 AND job_status=0 AND daemon_id=0 AND host_hostname='' AND server_id='' AND host_pid=0 is reduced to WHERE group_id=6 AND job_status=0; query time goes down from 0.33s to 0.08 s and the test cluster runs 80 jobs in parallel.

@thomas
Copy link
Contributor

thomas commented May 5, 2017

Maybe this was thought/talked about before. What would happen if the jobs within a group will also get a sequence number? Say 'group_id,seq_nr', this pair would give an uniqe index. It's more a 'feeling', but I think such a thing might be faster. (or even weirder, one field with 'group_id*100_000+seq_nr')

Just checked the idea on a larger range, forget it!

--------------- TWIMC ---------------

get the big ones ...

MariaDB [mxq]> select group_id, count(group_id) as count from mxq_job group by group_id having count>500 order by count desc limit 12;

+----------+--------+
| group_id | count  |
+----------+--------+
|    21503 | 464392 |
|     9758 | 214157 |
|    22744 | 199233 |
|    11060 | 188695 |
|    11717 | 187519 |
|    17831 | 172352 |
|    22930 | 172333 |
|    11236 | 165573 |
|    21647 | 163772 |
|    17908 | 161862 |
|     9790 | 152084 |
|    21225 | 140312 |
+----------+--------+

find out a 'range'.

MariaDB [mxq]> select group_id,count(group_id),min(job_id),max(job_id) from mxq_job where group_id=21503;

+----------+-----------------+-------------+-------------+
| group_id | count(group_id) | min(job_id) | max(job_id) |
+----------+-----------------+-------------+-------------+
|    21503 |          464392 |     5190742 |     5655786 |
+----------+-----------------+-------------+-------------+

1 row in set (0.18 sec)

count via primary key

MariaDB [mxq]> select count(*) from mxq_job where job_id >= 5190742 and job_id <= 5655786;

+----------+
| count(*) |
+----------+
|   465045 |
+----------+

1 row in set (6.65 sec)

the other way

MariaDB [mxq]> select count(*) from mxq_job where group_id=21503;

+----------+
| count(*) |
+----------+
|   464392 |
+----------+

1 row in set (0.16 sec)

@donald
Copy link
Contributor Author

donald commented May 6, 2017

Why would WHERE group_id=xxx ORDER BY job_id be better than 'WHERE group_id=xxx ORDER BY seq_nr' (ignoring job_priority for now)? Isn't the job_id exactly the same as the seq_nr? Just that job_id may have gaps (100,102,103,108,109,110) instead of (1,2,3,4,5,6), but that shouldn't make a difference for a btree.

@donald
Copy link
Contributor Author

donald commented May 6, 2017

We've already made a BIG progress. I compared the version from the master with the version from fix-issue-51 branch. They started with exactly the same database (preloaded 200000 pending jobs, "sleep 20"), the same mysql server (theinternet) and the same test-cluster. I let each one run for a little less then about one hour.

master:       1500 jobs in 49 min  , 1750 jobs in 57min
fix-issue-51: 1500 jobs in 11 min  , 8270 jobs in 59min

So we are 4 to 5 times faster already. Plus we no longer have any any index or gap locks. So mxqsub should never need to wait when inserting new jobs and the problem reported by the user (Deadlock found when trying to get lock; try restarting transaction) should no longer be possible. So I think we can take the commits in any case. I just want to review the code to make sure, that it can't make a difference when the additional constraints are left out.

@mariux
Copy link
Contributor

mariux commented May 7, 2017

the problem is that the algorithm was not designed for such short jobs... it just not scalable for very short jobs... job times should be at least 15min... but it won't scale with 15 minutes anyway if number of cores increases dramatically....

so instead of optimizing the queries (which will not fix the problem but the symptoms) you should optimize the selection algorithm...

next iteration would be to bulk assign/start jobs... because the hosts knows how much jobs it can do in X minutes... so if it has 80 slots...and 2 minute jobs just plan for 15 minutes and assign Y jobs... so it would be 15minutes for 2minute jobs that would be 7 jobs per slot for 80 slots just assign 560 jobs for this server... so instead of doing transactional selects for update... just increase the LIMIT 1 to a higher value for fast running jobs... also since you know the total number of cores in the cluster, the runtimes and available slots just set it to a sane high number... start with number of jobs equal to number of jobs the server can start for this group...

can you benchmark the LIMIT N approach? (where N is max number of jobs the server can run from this group)

as far as i remember already assigned jobs are used first instead of assigning new jobs..

@mariux
Copy link
Contributor

mariux commented May 7, 2017

see #57 for sample implementation... for benchmarking purposes..

mariux added a commit that referenced this issue May 7, 2017
no logic was changed in this commit

renamed mxq_assign_job_from_group_to_daemon() to mxq_assign_jobs_from_group_to_daemon()
and added parameter limit

relates to issue #51
mariux added a commit that referenced this issue May 7, 2017
mariux added a commit that referenced this issue May 7, 2017
Sign in to join this conversation on GitHub.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants