0.30.8 #137

Merged
merged 4 commits into from Oct 28, 2022

Commits on Oct 12, 2022

  1. 2640b23
  2. Remove create_job_tmpdir helper

    This helper has been replaced by tmpdir-setup. Remove it.
    donald committed Oct 12, 2022
    4285d95

Commits on Oct 28, 2022

  1. tmpdir-setup: Don't cleanup asynchronously

    When mxqd restarts and finds finished jobs, it calls the tmpdir cleanup
    code for these jobs.
    
    As part of the recovery procedure, it later scans the system for any
    leftover mounts. When the regular tmpdir cleanup is done asynchronously,
    mxqd might discover a directory which is in the process of being
    unmounted but still exists, in which case it calls the tmpdir
    cleanup code a second time.
    
    No harm is done, as the jobs completed normally. The second
    cleanup attempt just produces some error messages in the logfile.
    
    This bug is only triggered when jobs complete while mxqd is stopped.
    
    As the "old style" tmpdir setup is going away anyway, don't invent
    something complicated here and just do the cleanup synchronously.
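
A minimal sketch of the ordering argument above, not mxqd's actual code: the function names and the directory layout are hypothetical, and the real cleanup also unmounts. When the per-job cleanup runs synchronously, the later scan for leftovers can no longer see a directory that is still being torn down.

```shell
#!/bin/sh
# Hypothetical stand-in for the per-job tmpdir cleanup.
cleanup_job_tmpdir() {
    rmdir "$1" 2>/dev/null
}

# Hypothetical recovery scan: print every job directory still present.
scan_leftovers() {
    for d in "$1"/*; do
        [ -d "$d" ] && printf '%s\n' "$d"
    done
    return 0
}

# Clean up the finished jobs first, then scan for anything left over.
recover() {
    base=$1; shift
    for job in "$@"; do
        # Synchronous: returns only after the directory is gone.
        cleanup_job_tmpdir "$base/$job"
        # Running the line above in the background ("&") would let the
        # scan below race with a cleanup that is still in progress.
    done
    scan_leftovers "$base"
}
```

With the synchronous call, `recover` reports only directories that belong to jobs not in the finished list.
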
    donald committed Oct 28, 2022
    0b1a29f
  2. tmpdir-setup: Avoid flush of TMPDIR

    Use a dm-device (linear target) between the filesystem and the loop
    device and then use this sequence for teardown:
    
    - ioctl EXT4_IOC_SHUTDOWN with EXT4_GOING_FLAGS_NOLOGFLUSH
    - dmsetup reload $dmname --table "0 $sectors zero"
    - dmsetup resume $dmname --noflush
    - umount $mountpoint
    - dmsetup remove --deferred $dmname
    - rmdir $mountpoint
    
    The zero target prevents any real writes to the block device. However,
    if the filesystem reads back some data, it will get zeros, which could
    lead to all kinds of random behaviour. For this reason, we shut down the
    filesystem, which has the additional advantage that some I/O is
    prevented at an even earlier stage. Shutdown alone, however, would not
    prevent all I/O (e.g. cache writeback or superblock writes), so we
    still need the zero target.
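
The linear mapping and the teardown sequence described above can be sketched in shell as follows. This is an illustration under assumptions, not the code from the commit: `RUN` is a hypothetical dry-run hook, and `ext4-shutdown-nologflush` is a hypothetical name for a helper that issues the `EXT4_IOC_SHUTDOWN` ioctl, for which the message names no command-line tool.

```shell
#!/bin/sh
# RUN is a hypothetical hook for this sketch: by default commands are only
# printed. Set RUN= (empty) to execute them for real, which needs root.
RUN=${RUN-echo}

# Hypothetical helper name: a small program that issues the
# EXT4_IOC_SHUTDOWN ioctl with EXT4_GOING_FLAGS_NOLOGFLUSH on a mountpoint.
SHUTDOWN_HELPER=${SHUTDOWN_HELPER-ext4-shutdown-nologflush}

# Print a dm table that maps the whole device 1:1 onto the loop device.
linear_table() {
    sectors=$1 loopdev=$2
    echo "0 $sectors linear $loopdev 0"
}

# Tear down one job tmpdir in the order given in the commit message.
teardown_tmpdir() {
    dmname=$1 mountpoint=$2 sectors=$3
    $RUN "$SHUTDOWN_HELPER" "$mountpoint"                    # stop the filesystem, no log flush
    $RUN dmsetup reload "$dmname" --table "0 $sectors zero"  # load the zero target
    $RUN dmsetup resume "$dmname" --noflush                  # activate it without flushing
    $RUN umount "$mountpoint"
    $RUN dmsetup remove --deferred "$dmname"
    $RUN rmdir "$mountpoint"
}
```

In the default dry-run mode, `teardown_tmpdir mxq_job_1 /tmp/job1 2048` prints the six steps in order without touching the system, which makes the sequence easy to review.
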
    
    Even with this setting, ext4 sometimes logs some errors
    ("ext4_writepages: jbd2_start: XXX pages, ino YYY; err -5").
    
    We've patched our kernel to avoid that message if the filesystem is shut
    down. This goes on top of the patches which avoid the usual "mounted"
    and "unmounted" messages for ext4.
    
    To support rolling upgrades of mxqd, keep support to clean up mounts
    created the old way, which is to mount a loop device directly.
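
One way to tell the two kinds of mounts apart during cleanup is to look at the source device of the mount. The sketch below is an assumption about how that could be done, not a description of what mxqd does; the function name is hypothetical.

```shell
#!/bin/sh
# Classify the block device backing a job tmpdir mount.
#   loop:    old style, filesystem mounted directly on a loop device
#   dm:      new style, filesystem mounted on a device-mapper device
#   unknown: anything else, left alone by the cleanup
classify_tmpdir_source() {
    case $1 in
        /dev/loop[0-9]*)               echo loop ;;
        /dev/mapper/*|/dev/dm-[0-9]*)  echo dm ;;
        *)                             echo unknown ;;
    esac
}
```

A cleanup routine could then run a plain unmount for `loop` sources and the shutdown, zero-target and deferred-remove sequence for `dm` sources.
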
    donald committed Oct 28, 2022
    2c29bb3