
fix inconsistent database on startup #8

Closed
mariux opened this issue Mar 13, 2015 · 5 comments

Comments

@mariux
Contributor

mariux commented Mar 13, 2015

collect jobs that are still running or not running anymore and save a reasonable state in the database.

@mariux mariux added this to the 1.0 milestone Mar 13, 2015
@mariux mariux self-assigned this Mar 13, 2015
@mariux
Contributor Author

mariux commented Jun 16, 2015

depends on #14

@mariux
Contributor Author

mariux commented Jul 22, 2015

@donald said:

The background was that we still wanted to get rid of stale entries
left over after the power shutdown. (That is done.) At some point one
could also build a tool for this, or, when a new instance is started,
the new mxqd could do it itself.

@mariux
Contributor Author

mariux commented Jul 22, 2015

quick fix:

update <job table> set job_status=750 where job_status=200 and host_hostname like 'crashedhosthere%';

run this before starting mxqd. otherwise additionally add

and date_start <= <time of crash>
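
A minimal sketch of the quick fix above, run against an in-memory SQLite table so it can be tried safely. The table name `mxq_job`, the sample rows, the example hostnames, and the use of SQLite instead of the real database are all assumptions for illustration; only the column names and the status values 200 and 750 come from the statement above.

```python
import sqlite3


def mark_crashed_jobs(conn, host_prefix, crash_time):
    """Set job_status=750 for jobs still marked 200 on the crashed host.

    Only jobs started at or before crash_time are touched.
    Returns the number of rows that were changed.
    """
    cur = conn.execute(
        "UPDATE mxq_job SET job_status = 750 "  # mxq_job is a hypothetical table name
        "WHERE job_status = 200 "
        "AND host_hostname LIKE ? "
        "AND date_start <= ?",
        (host_prefix + "%", crash_time),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical sample data, not taken from a real MXQ database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mxq_job ("
    "job_id INTEGER PRIMARY KEY, job_status INTEGER, "
    "host_hostname TEXT, date_start TEXT)"
)
conn.executemany(
    "INSERT INTO mxq_job (job_status, host_hostname, date_start) VALUES (?, ?, ?)",
    [
        (200, "crashedhost.example.org", "2015-07-20 10:00:00"),  # started before the crash
        (200, "crashedhost.example.org", "2015-07-22 12:00:00"),  # started after the crash
        (200, "otherhost.example.org", "2015-07-20 10:00:00"),    # different host
        (750, "crashedhost.example.org", "2015-07-19 08:00:00"),  # already marked
    ],
)

changed = mark_crashed_jobs(conn, "crashedhost", "2015-07-21 00:00:00")
print(changed)
```

The date_start condition keeps jobs untouched that were started on the same host after the crash, which is why it matters once mxqd is already running again.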

@mariux mariux changed the title fix inkosistent database on startup fix incosistent database on startup Jul 27, 2015
mariux added a commit to mariux/mxq that referenced this issue Jul 29, 2015
fixes mariux64#8

 - unassign already assigned jobs
 - change status to unknown for loaded and/or running jobs
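
The two steps in the commit message can be sketched roughly as follows. This is not the code from the commit: the status labels, the `Job` class, the hostnames, and the choice to clear the hostname when unassigning are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical status labels; mxqd itself uses numeric codes not listed in this thread.
QUEUED, ASSIGNED, LOADED, RUNNING, UNKNOWN = (
    "queued", "assigned", "loaded", "running", "unknown",
)


@dataclass
class Job:
    job_id: int
    status: str
    hostname: Optional[str]


def recover_host(jobs: List[Job], hostname: str) -> Tuple[int, int]:
    """Clean up jobs tied to `hostname`; return (unassigned, marked_unknown)."""
    unassigned = 0
    marked_unknown = 0
    for job in jobs:
        if job.hostname != hostname:
            continue  # jobs of other hosts are left alone
        if job.status == ASSIGNED:
            # Step 1: unassign jobs that were assigned but never started.
            job.status = QUEUED
            job.hostname = None
            unassigned += 1
        elif job.status in (LOADED, RUNNING):
            # Step 2: the outcome of these jobs cannot be known anymore.
            job.status = UNKNOWN
            marked_unknown += 1
    return unassigned, marked_unknown


jobs = [
    Job(1, ASSIGNED, "deadhost.example.org"),
    Job(2, RUNNING, "deadhost.example.org"),
    Job(3, LOADED, "deadhost.example.org"),
    Job(4, RUNNING, "alive.example.org"),
]
counts = recover_host(jobs, "deadhost.example.org")
print(counts)
```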
@mariux
Contributor Author

mariux commented Jul 30, 2015

@donald @wwwutz
to fix inconsistent database of a crashed host:

  • a normal restart of mxqd on this host fixes the database.

to fix inconsistent database of a host that died forever:

  • commit a7b0180 introduces option --hostname to mxqd

so

mxqd --hostname <diedhost> -m 1 --no-log

should do the trick and clean up all dead jobs on host <diedhost>; then just press ^C and you are done.
-m 1

(be sure to use fqdn)
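
Since the hostname has to be fully qualified, a small wrapper could refuse short names before building the command line. A minimal sketch: the function name and the "contains a dot" check are assumptions and only a rough heuristic, while the options themselves are the ones from the command above.

```python
def build_cleanup_command(hostname, mxqd="mxqd"):
    """Return the argv list for a one-off cleanup run of mxqd for a dead host.

    Raises ValueError if `hostname` does not look fully qualified.
    """
    # Rough heuristic: a fully qualified name has at least one inner dot.
    if "." not in hostname.strip("."):
        raise ValueError("expected a fully qualified hostname, got %r" % hostname)
    return [mxqd, "--hostname", hostname, "-m", "1", "--no-log"]


cmd = build_cleanup_command("hal104.molgen.mpg.de")
print(" ".join(cmd))
```

The returned list could be passed to `subprocess.run`; mxqd would then have to be stopped with ^C as described above.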

root:thehawk:~/# /home/mariux/src/mxq.git/mxqd --memory=100 --max-memory-per-slot=100 -j 100 --hostname hal104.molgen.mpg.de --no-log 
2015-07-29 17:51:51 +0200 mxqd[3809]: mxqd - MXQ v0.9.0 (beta)
2015-07-29 17:51:51 +0200 mxqd[3809]:   by Marius Tolzmann <tolzmann@molgen.mpg.de> 2013-2015
2015-07-29 17:51:51 +0200 mxqd[3809]:   Max Planck Institute for Molecular Genetics - Berlin Dahlem
2015-07-29 17:51:51 +0200 mxqd[3809]: hostname=hal104.molgen.mpg.de server_id=main :: MXQ server started.
2015-07-29 17:51:51 +0200 mxqd[3809]: slots=100 memory_total=100 memory_avg_per_slot=1 memory_max_per_slot=100 :: server initialized.
2015-07-29 17:51:51 +0200 mxqd[3809]: hostname=hal104.molgen.mpg.de server_id=main :: recovered from previous crash: set job_status='unknown' for 17 jobs.
2015-07-29 17:51:51 +0200 mxqd[3809]: WARNING: total 6550464 jobs recovered from previous crash.

(warning in last line is fixed in 6d7649a)
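
To check from a script how many jobs a cleanup run touched, the "recovered from previous crash" line in the output above can be parsed. A minimal sketch, assuming the log format stays exactly as shown here; the function name is hypothetical and other mxqd versions may word the message differently.

```python
import re

# Matches the per-server recovery message as it appears in the log above.
RECOVERED_RE = re.compile(
    r"hostname=(?P<host>\S+) server_id=(?P<server>\S+) :: "
    r"recovered from previous crash: "
    r"set job_status='unknown' for (?P<count>\d+) jobs\."
)


def parse_recovery_line(line):
    """Return (hostname, server_id, job_count), or None if the line does not match."""
    match = RECOVERED_RE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("server"), int(match.group("count"))


line = (
    "2015-07-29 17:51:51 +0200 mxqd[3809]: hostname=hal104.molgen.mpg.de "
    "server_id=main :: recovered from previous crash: "
    "set job_status='unknown' for 17 jobs."
)
result = parse_recovery_line(line)
print(result)
```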

@donald
Contributor

donald commented Jul 30, 2015

Perfect.
Greetings from the pool
