Skip to content

Commit

Permalink
xfs: check return codes when flushing block devices
Browse files Browse the repository at this point in the history
If a blkdev_issue_flush fails, fsync needs to report that to upper
levels.  Modify xfs_file_fsync to capture the errors, while trying to
flush as much data and log updates to disk as possible.

If log writes cannot flush the data device, we need to shut down the log
immediately because we've violated a log invariant.  Modify this code to
check the return value of blkdev_issue_flush as well.

This behavior seems to go back to about 2.6.15 or so, which makes this
fixes tag a bit misleading.

Link: https://elixir.bootlin.com/linux/v2.6.15/source/fs/xfs/xfs_vnodeops.c#L1187
Fixes: b5071ad ("xfs: remove xfs_blkdev_issue_flush")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
  • Loading branch information
Darrick J. Wong committed Aug 6, 2022
1 parent 5e9466a commit 7d839e3
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 10 deletions.
22 changes: 14 additions & 8 deletions fs/xfs/xfs_file.c
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,7 @@ xfs_file_fsync(
{
struct xfs_inode *ip = XFS_I(file->f_mapping->host);
struct xfs_mount *mp = ip->i_mount;
int error = 0;
int error, err2;
int log_flushed = 0;

trace_xfs_file_fsync(ip);
Expand All @@ -163,18 +163,21 @@ xfs_file_fsync(
* inode size in case of an extending write.
*/
if (XFS_IS_REALTIME_INODE(ip))
blkdev_issue_flush(mp->m_rtdev_targp->bt_bdev);
error = blkdev_issue_flush(mp->m_rtdev_targp->bt_bdev);
else if (mp->m_logdev_targp != mp->m_ddev_targp)
blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
error = blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);

/*
* Any inode that has dirty modifications in the log is pinned. The
* racy check here for a pinned inode while not catch modifications
* racy check here for a pinned inode will not catch modifications
* that happen concurrently to the fsync call, but fsync semantics
* only require to sync previously completed I/O.
*/
if (xfs_ipincount(ip))
error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
if (xfs_ipincount(ip)) {
err2 = xfs_fsync_flush_log(ip, datasync, &log_flushed);
if (err2 && !error)
error = err2;
}

/*
* If we only have a single device, and the log force about was
Expand All @@ -184,8 +187,11 @@ xfs_file_fsync(
* commit.
*/
if (!log_flushed && !XFS_IS_REALTIME_INODE(ip) &&
mp->m_logdev_targp == mp->m_ddev_targp)
blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
mp->m_logdev_targp == mp->m_ddev_targp) {
err2 = blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
if (err2 && !error)
error = err2;
}

return error;
}
Expand Down
12 changes: 10 additions & 2 deletions fs/xfs/xfs_log.c
Original file line number Diff line number Diff line change
Expand Up @@ -1925,9 +1925,17 @@ xlog_write_iclog(
* device cache first to ensure all metadata writeback covered
* by the LSN in this iclog is on stable storage. This is slow,
* but it *must* complete before we issue the external log IO.
*
* If the flush fails, we cannot conclude that past metadata
* writeback from the log succeeded. Repeating the flush is
* not possible, hence we must shut down with log IO error to
* avoid shutdown re-entering this path and erroring out again.
*/
if (log->l_targ != log->l_mp->m_ddev_targp)
blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev);
if (log->l_targ != log->l_mp->m_ddev_targp &&
blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
return;
}
}
if (iclog->ic_flags & XLOG_ICL_NEED_FUA)
iclog->ic_bio.bi_opf |= REQ_FUA;
Expand Down

0 comments on commit 7d839e3

Please sign in to comment.