Skip to content

Commit

Permalink
Revert "fs: do not prefault sys_write() user buffer pages"
Browse files Browse the repository at this point in the history
This reverts commit 998ef75.

The commit itself does not appear to be buggy per se, but it is exposing
a bug in ext4 (and Ted thinks ext3 too, but we solved that by getting
rid of it).  It's too late in the release cycle to really worry about
this, even if Dave Hansen has a patch that may actually fix the
underlying ext4 problem.  We can (and should) revisit this for the next
release.

The problem is that moving the prefaulting later now exposes a special
case with partially successful writes that isn't handled correctly.  And
the prefaulting likely isn't normally even that much of a performance
issue - it looks like at least one reason Dave saw this in his
performance tests is that he also ran them on Skylake that now supports
the new SMAP code, which makes the normally very cheap user space
prefaulting noticeably more expensive.

Bisected-and-acked-by: Ted Ts'o <tytso@mit.edu>
Analyzed-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  • Loading branch information
Linus Torvalds committed Oct 7, 2015
1 parent f670268 commit 00a3d66
Showing 1 changed file with 16 additions and 18 deletions.
34 changes: 16 additions & 18 deletions mm/filemap.c
Original file line number Diff line number Diff line change
Expand Up @@ -2473,24 +2473,30 @@ ssize_t generic_perform_write(struct file *file,
iov_iter_count(i));

again:
/*
* Bring in the user page that we will copy from _first_.
* Otherwise there's a nasty deadlock on copying from the
* same page as we're writing to, without it being marked
* up-to-date.
*
* Not only is this an optimisation, but it is also required
* to check that the address is actually valid, when atomic
* usercopies are used, below.
*/
if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
status = -EFAULT;
break;
}

status = a_ops->write_begin(file, mapping, pos, bytes, flags,
&page, &fsdata);
if (unlikely(status < 0))
break;

if (mapping_writably_mapped(mapping))
flush_dcache_page(page);
/*
* 'page' is now locked. If we are trying to copy from a
* mapping of 'page' in userspace, the copy might fault and
* would need PageUptodate() to complete. But, page can not be
* made Uptodate without acquiring the page lock, which we hold.
* Deadlock. Avoid with pagefault_disable(). Fix up below with
* iov_iter_fault_in_readable().
*/
pagefault_disable();

copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
pagefault_enable();
flush_dcache_page(page);

status = a_ops->write_end(file, mapping, pos, bytes, copied,
Expand All @@ -2513,14 +2519,6 @@ ssize_t generic_perform_write(struct file *file,
*/
bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset,
iov_iter_single_seg_count(i));
/*
* This is the fallback to recover if the copy from
* userspace above faults.
*/
if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
status = -EFAULT;
break;
}
goto again;
}
pos += copied;
Expand Down

0 comments on commit 00a3d66

Please sign in to comment.