
Enforce/default to tmpdir usage #113

Closed
pmenzel opened this issue Oct 14, 2021 · 11 comments · Fixed by #133

Comments

@pmenzel
Contributor

pmenzel commented Oct 14, 2021

As we have cluster nodes with small TMPDIR space (1 TB on /scratch/local2), and users sporadically fill up that temporary space and cause run-time issues, let’s enforce tmpdir usage with some default.

Example problem: Jobs in uninterruptible sleep cannot be killed:

@internetguide:~$ uname -a
Linux internetguide.molgen.mpg.de 5.10.24.mx64.375 #1 SMP Fri Mar 19 12:29:21 CET 2021 x86_64 GNU/Linux
@internetguide:~$ ps aux | grep XXX
XXX 53622 24.5  0.0   7860  3424 ?        D    15:55   4:25 awk BEGIN{OFS="\t"}NF>=11{$1=$1"/1"; print} /scratch/local2/juicer_job_tmpdir/63521/splits/mpimg_L18466-1_906-02-8_S102_R1_001.fastq.gz_sort.sam
@internetguide:~$ sudo more /proc/53622/stack
[<0>] __flush_work+0x142/0x1c0
[<0>] xfs_file_buffered_aio_write+0x2d2/0x320
[<0>] new_sync_write+0x11f/0x1b0
[<0>] vfs_write+0x218/0x280
[<0>] ksys_write+0xa1/0xe0
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
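For reference, a quick way to spot such stuck processes on a node is to list everything in uninterruptible sleep together with its kernel wait channel (a generic sketch, not specific to mxq; column widths and the <PID> placeholder are arbitrary):

# list processes in D state ("uninterruptible sleep") with their kernel wait channel
ps -eo pid,user,stat,wchan:32,cmd --no-headers | awk '$3 ~ /^D/'
# for a single suspicious PID, the kernel stack shows where it is blocked
sudo cat /proc/<PID>/stack
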
@donald
Contributor

donald commented Oct 15, 2021

So the idea is that every job is started with its own (and guaranteed) $TMPDIR, even if it is not explicitly requested with --tmpdir, right? I had this in mind from the very beginning. I don't remember if somebody was against it? Anyway, we wanted to see first whether this continuous creation and mounting of filesystems runs into problems. It didn't. So I agree, we should do that.

It wouldn't help if people got used to using /scratch/local2 directly, though.

When a job can't be killed, that is a kernel bug to me, no matter whether the disks are full or not.
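As an illustration of the "creation and mounting of filesystems" mentioned above, a per-job TMPDIR with a hard size limit could look roughly like this (a minimal sketch only; it needs root, and the image path, job id 12345 and the 10G size are made up for illustration — this is not necessarily how mxqd implements it):

# create a sparse image file with the size requested for this job
truncate -s 10G /scratch/local2/mxqd_tmp_job_12345.img
mkfs.xfs -q /scratch/local2/mxqd_tmp_job_12345.img
# mount it as the job's private TMPDIR; the job cannot use more than 10G
mkdir -p /scratch/local2/mxqd_tmp_job_12345
mount -o loop /scratch/local2/mxqd_tmp_job_12345.img /scratch/local2/mxqd_tmp_job_12345
export TMPDIR=/scratch/local2/mxqd_tmp_job_12345
# ... run the job ...
# clean up afterwards
umount /scratch/local2/mxqd_tmp_job_12345
rm -f /scratch/local2/mxqd_tmp_job_12345.img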

@donald
Contributor

donald commented Oct 15, 2021

53622 seems to be gone now. Maybe not a deadlock, but just slow?

@pmenzel
Contributor Author

pmenzel commented Oct 15, 2021

53622 seems to be gone now. Maybe not a deadlock, but just slow?

Sorry for being unclear. The user came to me with the issue that killing the group worked, but the job was still listed. After deleting files on /scratch/local2/ to free up some space, job 32522716 was successfully killed and gone.

@pmenzel
Contributor Author

pmenzel commented Oct 15, 2021

2021-10-14 16:02:32 +0200 mxqd[1844]: job=XXX(15013):291947:32522716 cancelled
2021-10-14 16:02:32 +0200 mxqd[1844]: sending signal=15 to job=XXX(15013):291947:32522716
[…]
2021-10-14 16:03:05 +0200 mxqd[1844]: sending signal=9 to job=XXX(15013):291947:32522716
[…]
2021-10-14 16:15:35 +0200 mxqd[1844]: sending signal=9 to job=XXX(15013):291947:32522716
[…]
2021-10-14 16:15:58 +0200 mxqd[1844]: job finished (via fspool) : job 32522716 pid 63520 status 15

@pmenzel
Contributor Author

pmenzel commented Oct 15, 2021

I edited the paste and added the first time KILL was signaled to the process.

@donald
Contributor

donald commented Oct 15, 2021

Hmmm. Yes, a job exiting 12 minutes after kill -9 doesn't look right. The above awk writes to stdout (which seems to go into a /scratch/local2 xfs file). awk probably does not check for write errors, so it might just keep going when the disk is full. We could simulate that to find out whether we can trigger the problem.
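One way to simulate it (a rough sketch; the sizes and paths are arbitrary, and the loop mount needs root):

# build a small xfs filesystem on a loopback file and fill it up
truncate -s 512M /tmp/full.img
mkfs.xfs -q /tmp/full.img
mkdir -p /mnt/full
mount -o loop /tmp/full.img /mnt/full
dd if=/dev/zero of=/mnt/full/filler bs=1M    # runs until "No space left on device"
# now let awk write into the full filesystem and watch whether it
# reports the write error and exits, or just keeps running
yes | awk '{print}' > /mnt/full/out.txt
echo "awk exit status: $?"
# clean up
umount /mnt/full
rm -f /tmp/full.img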

Anyway, full disks are a problem (and might even produce wrong results if applications fail to check for write errors). So we should try to avoid that as much as possible. Enforcing a TMPDIR on managed space for every job would be one way to help in that regard. What would be a good TMPDIR default? 100G?

@pmenzel
Contributor Author

pmenzel commented Oct 15, 2021

(Aren’t you on vacation? ;-))

I would have said 10 GB, to make users more aware of what their programs are doing and to not exclude too many nodes (which shouldn’t happen too often anymore once this is the default). It could always be increased if it impairs too many users. But 100 GB is also fine.

donald mentioned this issue May 11, 2022
donald mentioned this issue May 12, 2022
@arndt
Contributor

arndt commented May 12, 2022

This just crashed some of my jobs, which apparently needed more tmpdir space ...

@donald
Contributor

donald commented May 12, 2022

Sorry, but see it this way: your jobs also used to crash when $TMPDIR was filled by other users. Now, after you've set --tmpdir to what you need, you are guaranteed to have this space. If someone fills /scratch/local2, your job won't start and die on that node but will start on another node which has the requested disk space.
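For example, requesting guaranteed scratch space at submit time might look like this (a sketch only: I'm assuming the submit tool is mxqsub and that --tmpdir takes a size value; check the submit tool's --help for the exact option syntax and units):

# request job-private TMPDIR space up front; size format is assumed, ./my_pipeline.sh is a placeholder
mxqsub --tmpdir=20G ./my_pipeline.sh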

@arndt
Contributor

arndt commented May 12, 2022

All good - having this guarantee is good - but now I also have to think about how much tmp space I need. In my case I didn't even know that I needed tmp space, but apparently some internal routine needs it to store a gzip-decompressed file when I read it.
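One way to get a rough idea of how much space that decompression needs is to ask gzip for the stored uncompressed size (note the size field in the gzip header is only 32 bits, so it is unreliable for files larger than 4 GiB); input.fastq.gz here stands for whatever file is actually being read:

# print compressed and (claimed) uncompressed size
gzip -l input.fastq.gz
# or decompress to a pipe and count bytes, which is slower but exact
zcat input.fastq.gz | wc -c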

@donald
Contributor

donald commented May 12, 2022

Keep doubling until enough....
