From 275d7d44d802ef271a42dc87ac091a495ba72fc5 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu, 20 Aug 2015 10:34:59 +0930
Subject: [PATCH 001/199] module: Fix locking in symbol_put_addr()

Poma (on the way to another bug) reported an assertion triggering:

  [<ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
  [<ffffffff81150822>] __module_address+0x32/0x150
  [<ffffffff81150956>] __module_text_address+0x16/0x70
  [<ffffffff81150f19>] symbol_put_addr+0x29/0x40
  [<ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]

Laura Abbott <labbott@redhat.com> produced a patch which lead us to
inspect symbol_put_addr(). This function has a comment claiming it
doesn't need to disable preemption around the module lookup
because it holds a reference to the module it wants to find, which
therefore cannot go away.

This is wrong (and a false optimization too, preempt_disable() is really
rather cheap, and I doubt any of this is on uber critical paths,
otherwise it would've retained a pointer to the actual module anyway and
avoided the second lookup).

While its true that the module cannot go away while we hold a reference
on it, the data structure we do the lookup in very much _CAN_ change
while we do the lookup. Therefore fix the comment and add the
required preempt_disable().

Reported-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: a6e6abd575fc ("module: remove module_text_address()")
Cc: stable@kernel.org
---
 kernel/module.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index b86b7bf1be388..8f051a106676f 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1063,11 +1063,15 @@ void symbol_put_addr(void *addr)
 	if (core_kernel_text(a))
 		return;
 
-	/* module_text_address is safe here: we're supposed to have reference
-	 * to module from symbol_get, so it can't go away. */
+	/*
+	 * Even though we hold a reference on the module; we still need to
+	 * disable preemption in order to safely traverse the data structure.
+	 */
+	preempt_disable();
 	modaddr = __module_text_address(a);
 	BUG_ON(!modaddr);
 	module_put(modaddr);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(symbol_put_addr);
 

From 868e87ccda2461cafd4a0d39f1486eb8f4a9a6f9 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel@arm.linux.org.uk>
Date: Mon, 28 Sep 2015 10:31:50 +0100
Subject: [PATCH 002/199] ARM: make RiscPC depend on MMU

RiscPC fails to build if MMU is disabled:

arch/arm/mach-rpc/ecard.c: In function 'ecard_init_pgtables':
arch/arm/mach-rpc/ecard.c:229:2: error: implicit declaration of function 'pgd_offset' [-Werror=implicit-function-declaration]

arrange for RiscPC to depend on MMU.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 72ad724c67ae9..639411f73ca95 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -645,6 +645,7 @@ config ARCH_SHMOBILE_LEGACY
 
 config ARCH_RPC
 	bool "RiscPC"
+	depends on MMU
 	select ARCH_ACORN
 	select ARCH_MAY_HAVE_PC_FDC
 	select ARCH_SPARSEMEM_ENABLE

From 178b2d09afc05a46f68b190c6594f3a429bc2385 Mon Sep 17 00:00:00 2001
From: Fabio Estevam <fabio.estevam@freescale.com>
Date: Thu, 24 Sep 2015 16:18:12 -0300
Subject: [PATCH 003/199] ARM: dts: imx7d: Fix UART2 base address

The UART2 memory space starts at address 0x30890000 (UART2_URXD).

Fix it so that UART2 can be used.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Fixes: 949673450291 ("ARM: dts: add imx7d soc dtsi file")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
---
 arch/arm/boot/dts/imx7d.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi
index b738ce0f9d9bc..6e444bb873f92 100644
--- a/arch/arm/boot/dts/imx7d.dtsi
+++ b/arch/arm/boot/dts/imx7d.dtsi
@@ -588,10 +588,10 @@
 				status = "disabled";
 			};
 
-			uart2: serial@30870000 {
+			uart2: serial@30890000 {
 				compatible = "fsl,imx7d-uart",
 					     "fsl,imx6q-uart";
-				reg = <0x30870000 0x10000>;
+				reg = <0x30890000 0x10000>;
 				interrupts = <GIC_SPI 27 IRQ_TYPE_LEVEL_HIGH>;
 				clocks = <&clks IMX7D_UART2_ROOT_CLK>,
 					<&clks IMX7D_UART2_ROOT_CLK>;

From 1f744fd317dc55cadd7132c57c499e3117aea01d Mon Sep 17 00:00:00 2001
From: Thomas Hebb <tommyhebb@gmail.com>
Date: Thu, 1 Oct 2015 21:00:00 +0200
Subject: [PATCH 004/199] ARM: dts: berlin: change BG2Q's USB PHY compatible

Currently, BG2Q shares a compatible with BG2. This is incorrect, since
BG2 and BG2Q use different USB PLL dividers. In reality, BG2Q shares a
divider with BG2CD. Change BG2Q's USB PHY compatible string to reflect
that.

Cc: <stable@vger.kernel.org> # v4.2.0-
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
---
 arch/arm/boot/dts/berlin2q.dtsi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/berlin2q.dtsi b/arch/arm/boot/dts/berlin2q.dtsi
index 63a48490e2f96..d4dbd28d348c0 100644
--- a/arch/arm/boot/dts/berlin2q.dtsi
+++ b/arch/arm/boot/dts/berlin2q.dtsi
@@ -152,7 +152,7 @@
 		};
 
 		usb_phy2: phy@a2f400 {
-			compatible = "marvell,berlin2-usb-phy";
+			compatible = "marvell,berlin2cd-usb-phy";
 			reg = <0xa2f400 0x128>;
 			#phy-cells = <0>;
 			resets = <&chip_rst 0x104 14>;
@@ -170,7 +170,7 @@
 		};
 
 		usb_phy0: phy@b74000 {
-			compatible = "marvell,berlin2-usb-phy";
+			compatible = "marvell,berlin2cd-usb-phy";
 			reg = <0xb74000 0x128>;
 			#phy-cells = <0>;
 			resets = <&chip_rst 0x104 12>;
@@ -178,7 +178,7 @@
 		};
 
 		usb_phy1: phy@b78000 {
-			compatible = "marvell,berlin2-usb-phy";
+			compatible = "marvell,berlin2cd-usb-phy";
 			reg = <0xb78000 0x128>;
 			#phy-cells = <0>;
 			resets = <&chip_rst 0x104 13>;

From 7cc97d77ee8a90a6389b96a62472cddc02475ffc Mon Sep 17 00:00:00 2001
From: Adam YH Lee <adam.yh.lee@gmail.com>
Date: Tue, 4 Aug 2015 11:15:48 -0700
Subject: [PATCH 005/199] iio: adc: twl4030: Fix ADC[3:6] readings

MADC[3:6] reads incorrect values without these two following changes:

- enable the 3v1 bias regulator for ADC[3:6]
- configure ADC[3:6] lines as input, not as USB

Signed-off-by: Adam YH Lee <adam.yh.lee@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
---
 drivers/iio/adc/twl4030-madc.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/iio/adc/twl4030-madc.c b/drivers/iio/adc/twl4030-madc.c
index ebe415f106400..0c74869a540ad 100644
--- a/drivers/iio/adc/twl4030-madc.c
+++ b/drivers/iio/adc/twl4030-madc.c
@@ -45,13 +45,18 @@
 #include <linux/types.h>
 #include <linux/gfp.h>
 #include <linux/err.h>
+#include <linux/regulator/consumer.h>
 
 #include <linux/iio/iio.h>
 
+#define TWL4030_USB_SEL_MADC_MCPC	(1<<3)
+#define TWL4030_USB_CARKIT_ANA_CTRL	0xBB
+
 /**
  * struct twl4030_madc_data - a container for madc info
  * @dev:		Pointer to device structure for madc
  * @lock:		Mutex protecting this data structure
+ * @regulator:		Pointer to bias regulator for madc
  * @requests:		Array of request struct corresponding to SW1, SW2 and RT
  * @use_second_irq:	IRQ selection (main or co-processor)
  * @imr:		Interrupt mask register of MADC
@@ -60,6 +65,7 @@
 struct twl4030_madc_data {
 	struct device *dev;
 	struct mutex lock;	/* mutex protecting this data structure */
+	struct regulator *usb3v1;
 	struct twl4030_madc_request requests[TWL4030_MADC_NUM_METHODS];
 	bool use_second_irq;
 	u8 imr;
@@ -841,6 +847,32 @@ static int twl4030_madc_probe(struct platform_device *pdev)
 	}
 	twl4030_madc = madc;
 
+	/* Configure MADC[3:6] */
+	ret = twl_i2c_read_u8(TWL_MODULE_USB, &regval,
+			TWL4030_USB_CARKIT_ANA_CTRL);
+	if (ret) {
+		dev_err(&pdev->dev, "unable to read reg CARKIT_ANA_CTRL  0x%X\n",
+				TWL4030_USB_CARKIT_ANA_CTRL);
+		goto err_i2c;
+	}
+	regval |= TWL4030_USB_SEL_MADC_MCPC;
+	ret = twl_i2c_write_u8(TWL_MODULE_USB, regval,
+				 TWL4030_USB_CARKIT_ANA_CTRL);
+	if (ret) {
+		dev_err(&pdev->dev, "unable to write reg CARKIT_ANA_CTRL 0x%X\n",
+				TWL4030_USB_CARKIT_ANA_CTRL);
+		goto err_i2c;
+	}
+
+	/* Enable 3v1 bias regulator for MADC[3:6] */
+	madc->usb3v1 = devm_regulator_get(madc->dev, "vusb3v1");
+	if (IS_ERR(madc->usb3v1))
+		return -ENODEV;
+
+	ret = regulator_enable(madc->usb3v1);
+	if (ret)
+		dev_err(madc->dev, "could not enable 3v1 bias regulator\n");
+
 	ret = iio_device_register(iio_dev);
 	if (ret) {
 		dev_err(&pdev->dev, "could not register iio device\n");
@@ -866,6 +898,8 @@ static int twl4030_madc_remove(struct platform_device *pdev)
 	twl4030_madc_set_current_generator(madc, 0, 0);
 	twl4030_madc_set_power(madc, 0);
 
+	regulator_disable(madc->usb3v1);
+
 	return 0;
 }
 

From 61fd56309165d4790f99462d893b099f0b07312a Mon Sep 17 00:00:00 2001
From: Linus Walleij <linus.walleij@linaro.org>
Date: Wed, 2 Sep 2015 21:02:58 +0200
Subject: [PATCH 006/199] iio: st_accel: fix interrupt handling on LIS3LV02

This accelerometer accidentally either emits a DRDY signal or an
IRQ signal. Accidentally I activated the IRQ signal as I thought
it was analogous to the interrupt generator on other ST
accelerometers. This was wrong. After this patch generic_buffer
gives a nice stream of accelerometer readings.

Fixes: 3acddf74f807778f "iio: st-sensors: add support for lis3lv02d accelerometer"
Cc: Denis CIOCCA <denis.ciocca@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
---
 drivers/iio/accel/st_accel_core.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/iio/accel/st_accel_core.c b/drivers/iio/accel/st_accel_core.c
index ff30f88068801..fb93111104249 100644
--- a/drivers/iio/accel/st_accel_core.c
+++ b/drivers/iio/accel/st_accel_core.c
@@ -149,8 +149,6 @@
 #define ST_ACCEL_4_BDU_MASK			0x40
 #define ST_ACCEL_4_DRDY_IRQ_ADDR		0x21
 #define ST_ACCEL_4_DRDY_IRQ_INT1_MASK		0x04
-#define ST_ACCEL_4_IG1_EN_ADDR			0x21
-#define ST_ACCEL_4_IG1_EN_MASK			0x08
 #define ST_ACCEL_4_MULTIREAD_BIT		true
 
 /* CUSTOM VALUES FOR SENSOR 5 */
@@ -489,10 +487,6 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = {
 		.drdy_irq = {
 			.addr = ST_ACCEL_4_DRDY_IRQ_ADDR,
 			.mask_int1 = ST_ACCEL_4_DRDY_IRQ_INT1_MASK,
-			.ig1 = {
-				.en_addr = ST_ACCEL_4_IG1_EN_ADDR,
-				.en_mask = ST_ACCEL_4_IG1_EN_MASK,
-			},
 		},
 		.multi_read_bit = ST_ACCEL_4_MULTIREAD_BIT,
 		.bootime = 2, /* guess */

From eda7d0f38aaf50dbb2a2de15e8db386c4f6f65fc Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Sat, 8 Aug 2015 22:16:42 +0300
Subject: [PATCH 007/199] iio: accel: sca3000: memory corruption in
 sca3000_read_first_n_hw_rb()

"num_read" is in byte units but we are write u16s so we end up write
twice as much as intended.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
---
 drivers/staging/iio/accel/sca3000_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/accel/sca3000_ring.c b/drivers/staging/iio/accel/sca3000_ring.c
index 23685e74917e2..bd2c69f85949b 100644
--- a/drivers/staging/iio/accel/sca3000_ring.c
+++ b/drivers/staging/iio/accel/sca3000_ring.c
@@ -116,7 +116,7 @@ static int sca3000_read_first_n_hw_rb(struct iio_buffer *r,
 	if (ret)
 		goto error_ret;
 
-	for (i = 0; i < num_read; i++)
+	for (i = 0; i < num_read / sizeof(u16); i++)
 		*(((u16 *)rx) + i) = be16_to_cpup((__be16 *)rx + i);
 
 	if (copy_to_user(buf, rx, num_read))

From d836ace65ee98d7079bc3c5afdbcc0e27dca20a3 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sat, 3 Oct 2015 13:03:47 -0700
Subject: [PATCH 008/199] ARM: orion: Fix DSA platform device after mvmdio
 conversion

DSA expects the host_dev pointer to be the device structure associated
with the MDIO bus controller driver. First commit breaking that was
c3a07134e6aa ("mv643xx_eth: convert to use the Marvell Orion MDIO
driver"), and then, it got completely under the radar for a while.

Reported-by: Frans van de Wiel <fvdw@fvdw.eu>
Fixes: c3a07134e6aa ("mv643xx_eth: convert to use the Marvell Orion MDIO driver")
CC: stable@vger.kernel.org
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 arch/arm/plat-orion/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 2235081a04eea..8861c367d0611 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -495,7 +495,7 @@ void __init orion_ge00_switch_init(struct dsa_platform_data *d, int irq)
 
 	d->netdev = &orion_ge00.dev;
 	for (i = 0; i < d->nr_chips; i++)
-		d->chip[i].host_dev = &orion_ge00_shared.dev;
+		d->chip[i].host_dev = &orion_ge_mvmdio.dev;
 	orion_switch_device.dev.platform_data = d;
 
 	platform_device_register(&orion_switch_device);

From 1266963170f576d4d08e6310b6963e26d3ff9d1e Mon Sep 17 00:00:00 2001
From: Sasha Levin <sasha.levin@oracle.com>
Date: Wed, 7 Oct 2015 11:03:28 -0500
Subject: [PATCH 009/199] PCI: Prevent out of bounds access in numa_node
 override

63692df103e9 ("PCI: Allow numa_node override via sysfs") didn't check that
the numa node provided by userspace is valid.  Passing a node number too
high would attempt to access invalid memory and trigger a kernel panic.

Fixes: 63692df103e9 ("PCI: Allow numa_node override via sysfs")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.19+
---
 drivers/pci/pci-sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 312f23a8429cd..92618686604cb 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -216,7 +216,7 @@ static ssize_t numa_node_store(struct device *dev,
 	if (ret)
 		return ret;
 
-	if (!node_online(node))
+	if (node >= MAX_NUMNODES || !node_online(node))
 		return -EINVAL;
 
 	add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);

From a54c8f0f2d7df525ff997e2afe71866a1a013064 Mon Sep 17 00:00:00 2001
From: Cathy Avery <cathy.avery@oracle.com>
Date: Fri, 2 Oct 2015 09:35:01 -0400
Subject: [PATCH 010/199] xen-blkfront: check for null drvdata in
 blkback_changed (XenbusStateClosing)

xen-blkfront will crash if the check to talk_to_blkback()
in blkback_changed()(XenbusStateInitWait) returns an error.
The driver data is freed and info is set to NULL. Later during
the close process via talk_to_blkback's call to xenbus_dev_fatal()
the null pointer is passed to and dereference in blkfront_closing.

CC: stable@vger.kernel.org
Signed-off-by: Cathy Avery <cathy.avery@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/block/xen-blkfront.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 6d89ed35d80c0..c8fdbc77f9b17 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1968,7 +1968,8 @@ static void blkback_changed(struct xenbus_device *dev,
 			break;
 		/* Missed the backend's Closing state -- fallthrough */
 	case XenbusStateClosing:
-		blkfront_closing(info);
+		if (info)
+			blkfront_closing(info);
 		break;
 	}
 }

From dcc909d90ccdbb73226397ff6d298f7af35b0e11 Mon Sep 17 00:00:00 2001
From: Markus Pargmann <mpa@pengutronix.de>
Date: Tue, 6 Oct 2015 20:03:54 +0200
Subject: [PATCH 011/199] nbd: Add locking for tasks

The timeout handling introduced in
	7e2893a16d3e (nbd: Fix timeout detection)
introduces a race condition which may lead to killing of tasks that are
not in nbd context anymore. This was not observed or reproducable yet.

This patch adds locking to critical use of task_recv and task_send to
avoid killing tasks that already left the NBD thread functions. This
lock is only acquired if a timeout occures or the nbd device
starts/stops.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 7e2893a16d3e ("nbd: Fix timeout detection")
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/nbd.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 293495a75d3d8..1b87623381e2b 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -60,6 +60,7 @@ struct nbd_device {
 	bool disconnect; /* a disconnect has been requested by user */
 
 	struct timer_list timeout_timer;
+	spinlock_t tasks_lock;
 	struct task_struct *task_recv;
 	struct task_struct *task_send;
 
@@ -140,21 +141,23 @@ static void sock_shutdown(struct nbd_device *nbd)
 static void nbd_xmit_timeout(unsigned long arg)
 {
 	struct nbd_device *nbd = (struct nbd_device *)arg;
-	struct task_struct *task;
+	unsigned long flags;
 
 	if (list_empty(&nbd->queue_head))
 		return;
 
 	nbd->disconnect = true;
 
-	task = READ_ONCE(nbd->task_recv);
-	if (task)
-		force_sig(SIGKILL, task);
+	spin_lock_irqsave(&nbd->tasks_lock, flags);
+
+	if (nbd->task_recv)
+		force_sig(SIGKILL, nbd->task_recv);
 
-	task = READ_ONCE(nbd->task_send);
-	if (task)
+	if (nbd->task_send)
 		force_sig(SIGKILL, nbd->task_send);
 
+	spin_unlock_irqrestore(&nbd->tasks_lock, flags);
+
 	dev_err(nbd_to_dev(nbd), "Connection timed out, killed receiver and sender, shutting down connection\n");
 }
 
@@ -403,17 +406,24 @@ static int nbd_thread_recv(struct nbd_device *nbd)
 {
 	struct request *req;
 	int ret;
+	unsigned long flags;
 
 	BUG_ON(nbd->magic != NBD_MAGIC);
 
 	sk_set_memalloc(nbd->sock->sk);
 
+	spin_lock_irqsave(&nbd->tasks_lock, flags);
 	nbd->task_recv = current;
+	spin_unlock_irqrestore(&nbd->tasks_lock, flags);
 
 	ret = device_create_file(disk_to_dev(nbd->disk), &pid_attr);
 	if (ret) {
 		dev_err(disk_to_dev(nbd->disk), "device_create_file failed!\n");
+
+		spin_lock_irqsave(&nbd->tasks_lock, flags);
 		nbd->task_recv = NULL;
+		spin_unlock_irqrestore(&nbd->tasks_lock, flags);
+
 		return ret;
 	}
 
@@ -429,7 +439,9 @@ static int nbd_thread_recv(struct nbd_device *nbd)
 
 	device_remove_file(disk_to_dev(nbd->disk), &pid_attr);
 
+	spin_lock_irqsave(&nbd->tasks_lock, flags);
 	nbd->task_recv = NULL;
+	spin_unlock_irqrestore(&nbd->tasks_lock, flags);
 
 	if (signal_pending(current)) {
 		siginfo_t info;
@@ -534,8 +546,11 @@ static int nbd_thread_send(void *data)
 {
 	struct nbd_device *nbd = data;
 	struct request *req;
+	unsigned long flags;
 
+	spin_lock_irqsave(&nbd->tasks_lock, flags);
 	nbd->task_send = current;
+	spin_unlock_irqrestore(&nbd->tasks_lock, flags);
 
 	set_user_nice(current, MIN_NICE);
 	while (!kthread_should_stop() || !list_empty(&nbd->waiting_queue)) {
@@ -572,7 +587,15 @@ static int nbd_thread_send(void *data)
 		nbd_handle_req(nbd, req);
 	}
 
+	spin_lock_irqsave(&nbd->tasks_lock, flags);
 	nbd->task_send = NULL;
+	spin_unlock_irqrestore(&nbd->tasks_lock, flags);
+
+	/* Clear maybe pending signals */
+	if (signal_pending(current)) {
+		siginfo_t info;
+		dequeue_signal_lock(current, &current->blocked, &info);
+	}
 
 	return 0;
 }
@@ -1052,6 +1075,7 @@ static int __init nbd_init(void)
 		nbd_dev[i].magic = NBD_MAGIC;
 		INIT_LIST_HEAD(&nbd_dev[i].waiting_queue);
 		spin_lock_init(&nbd_dev[i].queue_lock);
+		spin_lock_init(&nbd_dev[i].tasks_lock);
 		INIT_LIST_HEAD(&nbd_dev[i].queue_head);
 		mutex_init(&nbd_dev[i].tx_lock);
 		init_timer(&nbd_dev[i].timeout_timer);

From b94e22805a2224061bb263a82b72e09544a5fbb3 Mon Sep 17 00:00:00 2001
From: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Date: Wed, 7 Oct 2015 13:10:54 +0200
Subject: [PATCH 012/199] iio: mxs-lradc: Fix temperature offset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

0° Kelvin is actually −273.15°C, not -272.15°C. Fix the temperature offset.
Also improve the comment explaining the calculation.

Reported-by: Janusz Użycki <j.uzycki@elpromaelectronics.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Marek Vasut <marex@denx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
---
 drivers/staging/iio/adc/mxs-lradc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/iio/adc/mxs-lradc.c b/drivers/staging/iio/adc/mxs-lradc.c
index 3f7715c9968b8..47fc00a3f63bc 100644
--- a/drivers/staging/iio/adc/mxs-lradc.c
+++ b/drivers/staging/iio/adc/mxs-lradc.c
@@ -915,11 +915,12 @@ static int mxs_lradc_read_raw(struct iio_dev *iio_dev,
 	case IIO_CHAN_INFO_OFFSET:
 		if (chan->type == IIO_TEMP) {
 			/* The calculated value from the ADC is in Kelvin, we
-			 * want Celsius for hwmon so the offset is
-			 * -272.15 * scale
+			 * want Celsius for hwmon so the offset is -273.15
+			 * The offset is applied before scaling so it is
+			 * actually -213.15 * 4 / 1.012 = -1079.644268
 			 */
-			*val = -1075;
-			*val2 = 691699;
+			*val = -1079;
+			*val2 = 644268;
 
 			return IIO_VAL_INT_PLUS_MICRO;
 		}

From 9babcd7929bc8967ae3bb6093f603b93c2f9958f Mon Sep 17 00:00:00 2001
From: Daniel Bristot de Oliveira <bristot@redhat.com>
Date: Thu, 8 Oct 2015 15:36:06 -0300
Subject: [PATCH 013/199] sched, tracing: Stop/start critical timings around
 the idle=poll idle loop

When using idle=poll, the preemptoff tracer is always showing
the idle task as the culprit for long latencies. That happens
because critical timings are not stopped before idle loop. This
patch stops critical timings before entering the idle loop,
starting it again after the idle loop.

This problem does not affect the irqsoff tracer because
interruptions are enabled before entering the idle loop.

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/10fc3705874aef11dbe152a068b591a7be1899b4.1444314899.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/idle.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 8f177c73ae199..4a2ef5a02fd3f 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -57,9 +57,11 @@ static inline int cpu_idle_poll(void)
 	rcu_idle_enter();
 	trace_cpu_idle_rcuidle(0, smp_processor_id());
 	local_irq_enable();
+	stop_critical_timings();
 	while (!tif_need_resched() &&
 		(cpu_idle_force_poll || tick_check_broadcast_expired()))
 		cpu_relax();
+	start_critical_timings();
 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
 	rcu_idle_exit();
 	return 1;

From 0480334fa60488d12ae101a02d7d9e1a3d03d7dd Mon Sep 17 00:00:00 2001
From: David Howells <dhowells@redhat.com>
Date: Fri, 18 Sep 2015 11:45:12 +0100
Subject: [PATCH 014/199] ovl: use O_LARGEFILE in ovl_copy_up()

Open the lower file with O_LARGEFILE in ovl_copy_up().

Pass O_LARGEFILE unconditionally in ovl_copy_up_data() as it's purely for
catching 32-bit userspace dealing with a file large enough that it'll be
mishandled if the application isn't aware that there might be an integer
overflow.  Inside the kernel, there shouldn't be any problems.

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # v3.18+
---
 fs/overlayfs/copy_up.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 84d693d374284..b1990ac8fa097 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -81,11 +81,11 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 	if (len == 0)
 		return 0;
 
-	old_file = ovl_path_open(old, O_RDONLY);
+	old_file = ovl_path_open(old, O_LARGEFILE | O_RDONLY);
 	if (IS_ERR(old_file))
 		return PTR_ERR(old_file);
 
-	new_file = ovl_path_open(new, O_WRONLY);
+	new_file = ovl_path_open(new, O_LARGEFILE | O_WRONLY);
 	if (IS_ERR(new_file)) {
 		error = PTR_ERR(new_file);
 		goto out_fput;

From ab79efab0a0ba01a74df782eb7fa44b044dae8b5 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells@redhat.com>
Date: Fri, 18 Sep 2015 11:45:22 +0100
Subject: [PATCH 015/199] ovl: fix dentry reference leak

In ovl_copy_up_locked(), newdentry is leaked if the function exits through
out_cleanup as this just to out after calling ovl_cleanup() - which doesn't
actually release the ref on newdentry.

The out_cleanup segment should instead exit through out2 as certainly
newdentry leaks - and possibly upper does also, though this isn't caught
given the catch of newdentry.

Without this fix, something like the following is seen:

	BUG: Dentry ffff880023e9eb20{i=f861,n=#ffff880023e82d90} still in use (1) [unmount of tmpfs tmpfs]
	BUG: Dentry ffff880023ece640{i=0,n=bigfile}  still in use (1) [unmount of tmpfs tmpfs]

when unmounting the upper layer after an error occurred in copyup.

An error can be induced by creating a big file in a lower layer with
something like:

	dd if=/dev/zero of=/lower/a/bigfile bs=65536 count=1 seek=$((0xf000))

to create a large file (4.1G).  Overlay an upper layer that is too small
(on tmpfs might do) and then induce a copy up by opening it writably.

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # v3.18+
---
 fs/overlayfs/copy_up.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index b1990ac8fa097..871fcb67be974 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -267,7 +267,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 
 out_cleanup:
 	ovl_cleanup(wdir, newdentry);
-	goto out;
+	goto out2;
 }
 
 /*

From 1c8a47df36d72ace8cf78eb6c228aa0f8027d3c2 Mon Sep 17 00:00:00 2001
From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 12 Oct 2015 15:56:20 +0200
Subject: [PATCH 016/199] ovl: fix open in stacked overlay

If two overlayfs filesystems are stacked on top of each other, then we need
recursion in ovl_d_select_inode().

I guess d_backing_inode() is supposed to do that.  But currently it doesn't
and that functionality is open coded in vfs_open().  This is now copied
into ovl_d_select_inode() to fix this regression.

Reported-by: Alban Crequy <alban.crequy@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay...")
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
---
 fs/overlayfs/inode.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d9da5a4e93821..ec0c2a050043a 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -363,6 +363,9 @@ struct inode *ovl_d_select_inode(struct dentry *dentry, unsigned file_flags)
 		ovl_path_upper(dentry, &realpath);
 	}
 
+	if (realpath.dentry->d_flags & DCACHE_OP_SELECT_INODE)
+		return realpath.dentry->d_op->d_select_inode(realpath.dentry, file_flags);
+
 	return d_backing_inode(realpath.dentry);
 }
 

From 0f95502ad84874b3c05fc7cdd9d4d9d5cddf7859 Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date: Mon, 24 Aug 2015 15:57:18 +0300
Subject: [PATCH 017/199] ovl: free stack of paths in ovl_fill_super

This fixes small memory leak after mount.

Kmemleak report:

unreferenced object 0xffff88003683fe00 (size 16):
  comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
  hex dump (first 16 bytes):
    20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
  backtrace:
    [<ffffffff811f8cd4>] create_object+0x124/0x2c0
    [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
    [<ffffffff811dffe6>] __kmalloc+0x106/0x340
    [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
    [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
    [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
    [<ffffffff81201ab3>] mount_fs+0x43/0x170
    [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
    [<ffffffff812233ad>] do_mount+0x22d/0xdf0
    [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
    [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: a78d9f0d5d5c ("ovl: support multiple lower layers")
Cc: <stable@vger.kernel.org> # v4.0+
---
 fs/overlayfs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7466ff339c667..3f90c43c3c4a6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1048,6 +1048,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		oe->lowerstack[i].dentry = stack[i].dentry;
 		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
 	}
+	kfree(stack);
 
 	root_dentry->d_fsdata = oe;
 

From 5ffdbe8bf1e485026e1c7e4714d2841553cf0b40 Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date: Mon, 24 Aug 2015 15:57:19 +0300
Subject: [PATCH 018/199] ovl: free lower_mnt array in ovl_put_super

This fixes memory leak after umount.

Kmemleak report:

unreferenced object 0xffff8800ba791010 (size 8):
  comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
  hex dump (first 8 bytes):
    20 1c 13 02 00 88 ff ff                           .......
  backtrace:
    [<ffffffff811f8cd4>] create_object+0x124/0x2c0
    [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
    [<ffffffff811dffe6>] __kmalloc+0x106/0x340
    [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
    [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
    [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
    [<ffffffff81201ab3>] mount_fs+0x43/0x170
    [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
    [<ffffffff812233ad>] do_mount+0x22d/0xdf0
    [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
    [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: dd662667e6d3 ("ovl: add mutli-layer infrastructure")
Cc: <stable@vger.kernel.org> # v4.0+
---
 fs/overlayfs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 3f90c43c3c4a6..8d04b86e0680d 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -544,6 +544,7 @@ static void ovl_put_super(struct super_block *sb)
 	mntput(ufs->upper_mnt);
 	for (i = 0; i < ufs->numlower; i++)
 		mntput(ufs->lower_mnt[i]);
+	kfree(ufs->lower_mnt);
 
 	kfree(ufs->config.lowerdir);
 	kfree(ufs->config.upperdir);

From 9ad18ab938375502c03cf467abecbb77264c9475 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 29 Sep 2015 12:47:50 -0400
Subject: [PATCH 019/199] writeback: laptop_mode_timer_fn() needs
 rcu_read_lock() around bdi_writeback iteration

laptop_mode_timer_fn() was using bdi_for_each_wb() without the
required RCU locking leading to the following warning.

 WARNING: CPU: 0 PID: 0 at include/linux/backing-dev.h:415 laptop_mode_timer_fn+0x106/0x170()
 ...
 Call Trace:
  <IRQ>  [<ffffffff81480cdc>] dump_stack+0x4e/0x82
  [<ffffffff81051912>] warn_slowpath_common+0x82/0xc0
  [<ffffffff81051a0a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8115f0e6>] laptop_mode_timer_fn+0x106/0x170
  [<ffffffff810ca8e3>] call_timer_fn+0xb3/0x2f0
  [<ffffffff810cad25>] run_timer_softirq+0x205/0x370
  [<ffffffff81056854>] __do_softirq+0xd4/0x460
  [<ffffffff81056d69>] irq_exit+0x89/0xa0
  [<ffffffff8185a892>] smp_apic_timer_interrupt+0x42/0x50
  [<ffffffff81858a44>] apic_timer_interrupt+0x84/0x90
 ...

Fix it by adding rcu_read_lock() around the iteration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: a06fd6b10228 ("writeback: make laptop_mode_timer_fn() handle multiple bdi_writeback's")
Reviewed-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 mm/page-writeback.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0a931cdd4f6ba..902e5f215e57e 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1965,10 +1965,12 @@ void laptop_mode_timer_fn(unsigned long data)
 	if (!bdi_has_dirty_io(&q->backing_dev_info))
 		return;
 
+	rcu_read_lock();
 	bdi_for_each_wb(wb, &q->backing_dev_info, &iter, 0)
 		if (wb_has_dirty_io(wb))
 			wb_start_writeback(wb, nr_pages, true,
 					   WB_REASON_LAPTOP_TIMER);
+	rcu_read_unlock();
 }
 
 /*

From 6fdf860f15d4a6be8f0947bad608d687fe0c7af7 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 29 Sep 2015 12:47:51 -0400
Subject: [PATCH 020/199] writeback: fix bdi_writeback iteration in
 wakeup_dirtytime_writeback()

wakeup_dirtytime_writeback() walks and wakes up all wb's of all bdi's;
unfortunately, it was always waking up bdi->wb instead of the wb being
walked.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 001fe6f617b1 ("writeback: make wakeup_dirtytime_writeback() handle multiple bdi_writeback's")
Reviewed-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 fs/fs-writeback.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 091a36444972f..d0da30668e980 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1897,8 +1897,8 @@ static void wakeup_dirtytime_writeback(struct work_struct *w)
 		struct wb_iter iter;
 
 		bdi_for_each_wb(wb, bdi, &iter, 0)
-			if (!list_empty(&bdi->wb.b_dirty_time))
-				wb_wakeup(&bdi->wb);
+			if (!list_empty(&wb->b_dirty_time))
+				wb_wakeup(wb);
 	}
 	rcu_read_unlock();
 	schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);

From b817525a4a80c04e4ca44192d97a1ffa9f2be572 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Fri, 2 Oct 2015 14:47:05 -0400
Subject: [PATCH 021/199] writeback: bdi_writeback iteration must not skip
 dying ones

bdi_for_each_wb() is used in several places to wake up or issue
writeback work items to all wb's (bdi_writeback's) on a given bdi.
The iteration is performed by walking bdi->cgwb_tree; however, the
tree only indexes wb's which are currently active.

For example, when a memcg gets associated with a different blkcg, the
old wb is removed from the tree so that the new one can be indexed.
The old wb starts dying from then on but will linger till all its
inodes are drained.  As these dying wb's may still host dirty inodes,
writeback operations which affect all wb's must include them.
bdi_for_each_wb() skipping dying wb's led to sync(2) missing and
failing to sync the inodes belonging to those wb's.

This patch adds a RCU protected @bdi->wb_list which lists all wb's
beloinging to that bdi.  wb's are added on creation and removed on
release rather than on the start of destruction.  bdi_for_each_wb()
usages are replaced with list_for_each[_continue]_rcu() iterations
over @bdi->wb_list and bdi_for_each_wb() and its helpers are removed.

v2: Updated as per Jan.  last_wb ref leak in bdi_split_work_to_wbs()
    fixed and unnecessary list head severing in cgwb_bdi_destroy()
    removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Artem Bityutskiy <dedekind1@gmail.com>
Fixes: ebe41ab0c79d ("writeback: implement bdi_for_each_wb()")
Link: http://lkml.kernel.org/g/1443012552.19983.209.camel@gmail.com
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 fs/fs-writeback.c                | 31 +++++++++++-----
 include/linux/backing-dev-defs.h |  3 ++
 include/linux/backing-dev.h      | 63 --------------------------------
 mm/backing-dev.c                 | 14 ++++++-
 mm/page-writeback.c              |  3 +-
 5 files changed, 39 insertions(+), 75 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index d0da30668e980..29e4599f6fc1c 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -778,19 +778,24 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 				  struct wb_writeback_work *base_work,
 				  bool skip_if_busy)
 {
-	int next_memcg_id = 0;
-	struct bdi_writeback *wb;
-	struct wb_iter iter;
+	struct bdi_writeback *last_wb = NULL;
+	struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,
+						struct bdi_writeback, bdi_node);
 
 	might_sleep();
 restart:
 	rcu_read_lock();
-	bdi_for_each_wb(wb, bdi, &iter, next_memcg_id) {
+	list_for_each_entry_continue_rcu(wb, &bdi->wb_list, bdi_node) {
 		DEFINE_WB_COMPLETION_ONSTACK(fallback_work_done);
 		struct wb_writeback_work fallback_work;
 		struct wb_writeback_work *work;
 		long nr_pages;
 
+		if (last_wb) {
+			wb_put(last_wb);
+			last_wb = NULL;
+		}
+
 		/* SYNC_ALL writes out I_DIRTY_TIME too */
 		if (!wb_has_dirty_io(wb) &&
 		    (base_work->sync_mode == WB_SYNC_NONE ||
@@ -819,12 +824,22 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 		wb_queue_work(wb, work);
 
-		next_memcg_id = wb->memcg_css->id + 1;
+		/*
+		 * Pin @wb so that it stays on @bdi->wb_list.  This allows
+		 * continuing iteration from @wb after dropping and
+		 * regrabbing rcu read lock.
+		 */
+		wb_get(wb);
+		last_wb = wb;
+
 		rcu_read_unlock();
 		wb_wait_for_completion(bdi, &fallback_work_done);
 		goto restart;
 	}
 	rcu_read_unlock();
+
+	if (last_wb)
+		wb_put(last_wb);
 }
 
 #else	/* CONFIG_CGROUP_WRITEBACK */
@@ -1857,12 +1872,11 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
 		struct bdi_writeback *wb;
-		struct wb_iter iter;
 
 		if (!bdi_has_dirty_io(bdi))
 			continue;
 
-		bdi_for_each_wb(wb, bdi, &iter, 0)
+		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
 			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
 					   false, reason);
 	}
@@ -1894,9 +1908,8 @@ static void wakeup_dirtytime_writeback(struct work_struct *w)
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
 		struct bdi_writeback *wb;
-		struct wb_iter iter;
 
-		bdi_for_each_wb(wb, bdi, &iter, 0)
+		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
 			if (!list_empty(&wb->b_dirty_time))
 				wb_wakeup(wb);
 	}
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index a23209b438421..1b4d69f68c33c 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -116,6 +116,8 @@ struct bdi_writeback {
 	struct list_head work_list;
 	struct delayed_work dwork;	/* work item used for writeback */
 
+	struct list_head bdi_node;	/* anchored at bdi->wb_list */
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct percpu_ref refcnt;	/* used only for !root wb's */
 	struct fprop_local_percpu memcg_completions;
@@ -150,6 +152,7 @@ struct backing_dev_info {
 	atomic_long_t tot_write_bandwidth;
 
 	struct bdi_writeback wb;  /* the root writeback info for this bdi */
+	struct list_head wb_list; /* list of all wbs */
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct radix_tree_root cgwb_tree; /* radix tree of active cgroup wbs */
 	struct rb_root cgwb_congested_tree; /* their congested states */
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index d5eb4ad1c534a..78677e5a65bf1 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -408,61 +408,6 @@ static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked)
 	rcu_read_unlock();
 }
 
-struct wb_iter {
-	int			start_memcg_id;
-	struct radix_tree_iter	tree_iter;
-	void			**slot;
-};
-
-static inline struct bdi_writeback *__wb_iter_next(struct wb_iter *iter,
-						   struct backing_dev_info *bdi)
-{
-	struct radix_tree_iter *titer = &iter->tree_iter;
-
-	WARN_ON_ONCE(!rcu_read_lock_held());
-
-	if (iter->start_memcg_id >= 0) {
-		iter->slot = radix_tree_iter_init(titer, iter->start_memcg_id);
-		iter->start_memcg_id = -1;
-	} else {
-		iter->slot = radix_tree_next_slot(iter->slot, titer, 0);
-	}
-
-	if (!iter->slot)
-		iter->slot = radix_tree_next_chunk(&bdi->cgwb_tree, titer, 0);
-	if (iter->slot)
-		return *iter->slot;
-	return NULL;
-}
-
-static inline struct bdi_writeback *__wb_iter_init(struct wb_iter *iter,
-						   struct backing_dev_info *bdi,
-						   int start_memcg_id)
-{
-	iter->start_memcg_id = start_memcg_id;
-
-	if (start_memcg_id)
-		return __wb_iter_next(iter, bdi);
-	else
-		return &bdi->wb;
-}
-
-/**
- * bdi_for_each_wb - walk all wb's of a bdi in ascending memcg ID order
- * @wb_cur: cursor struct bdi_writeback pointer
- * @bdi: bdi to walk wb's of
- * @iter: pointer to struct wb_iter to be used as iteration buffer
- * @start_memcg_id: memcg ID to start iteration from
- *
- * Iterate @wb_cur through the wb's (bdi_writeback's) of @bdi in ascending
- * memcg ID order starting from @start_memcg_id.  @iter is struct wb_iter
- * to be used as temp storage during iteration.  rcu_read_lock() must be
- * held throughout iteration.
- */
-#define bdi_for_each_wb(wb_cur, bdi, iter, start_memcg_id)		\
-	for ((wb_cur) = __wb_iter_init(iter, bdi, start_memcg_id);	\
-	     (wb_cur); (wb_cur) = __wb_iter_next(iter, bdi))
-
 #else	/* CONFIG_CGROUP_WRITEBACK */
 
 static inline bool inode_cgwb_enabled(struct inode *inode)
@@ -522,14 +467,6 @@ static inline void wb_blkcg_offline(struct blkcg *blkcg)
 {
 }
 
-struct wb_iter {
-	int		next_id;
-};
-
-#define bdi_for_each_wb(wb_cur, bdi, iter, start_blkcg_id)		\
-	for ((iter)->next_id = (start_blkcg_id);			\
-	     ({	(wb_cur) = !(iter)->next_id++ ? &(bdi)->wb : NULL; }); )
-
 static inline int inode_congested(struct inode *inode, int cong_bits)
 {
 	return wb_congested(&inode_to_bdi(inode)->wb, cong_bits);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 2df8ddcb0ca0a..e92d77937fd36 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -480,6 +480,10 @@ static void cgwb_release_workfn(struct work_struct *work)
 						release_work);
 	struct backing_dev_info *bdi = wb->bdi;
 
+	spin_lock_irq(&cgwb_lock);
+	list_del_rcu(&wb->bdi_node);
+	spin_unlock_irq(&cgwb_lock);
+
 	wb_shutdown(wb);
 
 	css_put(wb->memcg_css);
@@ -575,6 +579,7 @@ static int cgwb_create(struct backing_dev_info *bdi,
 		ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
 		if (!ret) {
 			atomic_inc(&bdi->usage_cnt);
+			list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
 			list_add(&wb->memcg_node, memcg_cgwb_list);
 			list_add(&wb->blkcg_node, blkcg_cgwb_list);
 			css_get(memcg_css);
@@ -764,15 +769,22 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi) { }
 
 int bdi_init(struct backing_dev_info *bdi)
 {
+	int ret;
+
 	bdi->dev = NULL;
 
 	bdi->min_ratio = 0;
 	bdi->max_ratio = 100;
 	bdi->max_prop_frac = FPROP_FRAC_BASE;
 	INIT_LIST_HEAD(&bdi->bdi_list);
+	INIT_LIST_HEAD(&bdi->wb_list);
 	init_waitqueue_head(&bdi->wb_waitq);
 
-	return cgwb_bdi_init(bdi);
+	ret = cgwb_bdi_init(bdi);
+
+	list_add_tail_rcu(&bdi->wb.bdi_node, &bdi->wb_list);
+
+	return ret;
 }
 EXPORT_SYMBOL(bdi_init);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 902e5f215e57e..0f1a94e9f3518 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1956,7 +1956,6 @@ void laptop_mode_timer_fn(unsigned long data)
 	int nr_pages = global_page_state(NR_FILE_DIRTY) +
 		global_page_state(NR_UNSTABLE_NFS);
 	struct bdi_writeback *wb;
-	struct wb_iter iter;
 
 	/*
 	 * We want to write everything out, not just down to the dirty
@@ -1966,7 +1965,7 @@ void laptop_mode_timer_fn(unsigned long data)
 		return;
 
 	rcu_read_lock();
-	bdi_for_each_wb(wb, &q->backing_dev_info, &iter, 0)
+	list_for_each_entry_rcu(wb, &q->backing_dev_info.wb_list, bdi_node)
 		if (wb_has_dirty_io(wb))
 			wb_start_writeback(wb, nr_pages, true,
 					   WB_REASON_LAPTOP_TIMER);

From d60d1bddd5b642711a237511845853755b25bf1f Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 29 Sep 2015 12:47:53 -0400
Subject: [PATCH 022/199] writeback: memcg dirty_throttle_control should be
 initialized with wb->memcg_completions

MDTC_INIT() is used to initialize dirty_throttle_control for memcg
domains.  It used DTC_INIT_COMMON() to initialized mdtc->wb and
->wb_completions which is incorrect as DTC_INIT_COMMON() sets the
latter to wb->completions instead of wb->memcg_completions.  This can
lead to wildly incorrect results when calculating the proportion of
dirty memory the memcg domain should get.

Remove DTC_INIT_COMMON() and update MDTC_INIT() to initialize
mdtc->wb_completions to wb->memcg_completions.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: c2aa723a6093 ("writeback: implement memcg writeback domain based throttling")
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 mm/page-writeback.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0f1a94e9f3518..56c0bffa9f49f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -145,9 +145,6 @@ struct dirty_throttle_control {
 	unsigned long		pos_ratio;
 };
 
-#define DTC_INIT_COMMON(__wb)	.wb = (__wb),				\
-				.wb_completions = &(__wb)->completions
-
 /*
  * Length of period for aging writeout fractions of bdis. This is an
  * arbitrarily chosen number. The longer the period, the slower fractions will
@@ -157,12 +154,16 @@ struct dirty_throttle_control {
 
 #ifdef CONFIG_CGROUP_WRITEBACK
 
-#define GDTC_INIT(__wb)		.dom = &global_wb_domain,		\
-				DTC_INIT_COMMON(__wb)
+#define GDTC_INIT(__wb)		.wb = (__wb),				\
+				.dom = &global_wb_domain,		\
+				.wb_completions = &(__wb)->completions
+
 #define GDTC_INIT_NO_WB		.dom = &global_wb_domain
-#define MDTC_INIT(__wb, __gdtc)	.dom = mem_cgroup_wb_domain(__wb),	\
-				.gdtc = __gdtc,				\
-				DTC_INIT_COMMON(__wb)
+
+#define MDTC_INIT(__wb, __gdtc)	.wb = (__wb),				\
+				.dom = mem_cgroup_wb_domain(__wb),	\
+				.wb_completions = &(__wb)->memcg_completions, \
+				.gdtc = __gdtc
 
 static bool mdtc_valid(struct dirty_throttle_control *dtc)
 {
@@ -213,7 +214,8 @@ static void wb_min_max_ratio(struct bdi_writeback *wb,
 
 #else	/* CONFIG_CGROUP_WRITEBACK */
 
-#define GDTC_INIT(__wb)		DTC_INIT_COMMON(__wb)
+#define GDTC_INIT(__wb)		.wb = (__wb),                           \
+				.wb_completions = &(__wb)->completions
 #define GDTC_INIT_NO_WB
 #define MDTC_INIT(__wb, __gdtc)
 

From c5edf9cdc4c483b9a94c03fc0b9f769bd090bf3e Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 29 Sep 2015 13:04:26 -0400
Subject: [PATCH 023/199] writeback: fix incorrect calculation of available
 memory for memcg domains

For memcg domains, the amount of available memory was calculated as

 min(the amount currently in use + headroom according to memcg,
     total clean memory)

This isn't quite correct as what should be capped by the amount of
clean memory is the headroom, not the sum of memory in use and
headroom.  For example, if a memcg domain has a significant amount of
dirty memory, the above can lead to a value which is lower than the
current amount in use which doesn't make much sense.  In most
circumstances, the above leads to a number which is somewhat but not
drastically lower.

As the amount of memory which can be readily allocated to the memcg
domain is capped by the amount of system-wide clean memory which is
not already assigned to the memcg itself, the number we want is

 the amount currently in use +
 min(headroom according to memcg, clean memory elsewhere in the system)

This patch updates mem_cgroup_wb_stats() to return the number of
filepages and headroom instead of the calculated available pages.
mdtc_cap_avail() is renamed to mdtc_calc_avail() and performs the
above calculation from file, headroom, dirty and globally clean pages.

v2: Dummy mem_cgroup_wb_stats() implementation wasn't updated leading
    to build failure when !CGROUP_WRITEBACK.  Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: c2aa723a6093 ("writeback: implement memcg writeback domain based throttling")
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 include/linux/memcontrol.h |  8 +++++---
 mm/memcontrol.c            | 35 +++++++++++++++++------------------
 mm/page-writeback.c        | 29 ++++++++++++++++++-----------
 3 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6452ff4c463fd..3e3318ddfc0e3 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -676,8 +676,9 @@ enum {
 
 struct list_head *mem_cgroup_cgwb_list(struct mem_cgroup *memcg);
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
-void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pavail,
-			 unsigned long *pdirty, unsigned long *pwriteback);
+void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
+			 unsigned long *pheadroom, unsigned long *pdirty,
+			 unsigned long *pwriteback);
 
 #else	/* CONFIG_CGROUP_WRITEBACK */
 
@@ -687,7 +688,8 @@ static inline struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
 }
 
 static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
-				       unsigned long *pavail,
+				       unsigned long *pfilepages,
+				       unsigned long *pheadroom,
 				       unsigned long *pdirty,
 				       unsigned long *pwriteback)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1fedbde68f595..882c10cfd0ba4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3740,44 +3740,43 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
 /**
  * mem_cgroup_wb_stats - retrieve writeback related stats from its memcg
  * @wb: bdi_writeback in question
- * @pavail: out parameter for number of available pages
+ * @pfilepages: out parameter for number of file pages
+ * @pheadroom: out parameter for number of allocatable pages according to memcg
  * @pdirty: out parameter for number of dirty pages
  * @pwriteback: out parameter for number of pages under writeback
  *
- * Determine the numbers of available, dirty, and writeback pages in @wb's
- * memcg.  Dirty and writeback are self-explanatory.  Available is a bit
- * more involved.
+ * Determine the numbers of file, headroom, dirty, and writeback pages in
+ * @wb's memcg.  File, dirty and writeback are self-explanatory.  Headroom
+ * is a bit more involved.
  *
- * A memcg's headroom is "min(max, high) - used".  The available memory is
- * calculated as the lowest headroom of itself and the ancestors plus the
- * number of pages already being used for file pages.  Note that this
- * doesn't consider the actual amount of available memory in the system.
- * The caller should further cap *@pavail accordingly.
+ * A memcg's headroom is "min(max, high) - used".  In the hierarchy, the
+ * headroom is calculated as the lowest headroom of itself and the
+ * ancestors.  Note that this doesn't consider the actual amount of
+ * available memory in the system.  The caller should further cap
+ * *@pheadroom accordingly.
  */
-void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pavail,
-			 unsigned long *pdirty, unsigned long *pwriteback)
+void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
+			 unsigned long *pheadroom, unsigned long *pdirty,
+			 unsigned long *pwriteback)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
 	struct mem_cgroup *parent;
-	unsigned long head_room = PAGE_COUNTER_MAX;
-	unsigned long file_pages;
 
 	*pdirty = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_DIRTY);
 
 	/* this should eventually include NR_UNSTABLE_NFS */
 	*pwriteback = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_WRITEBACK);
+	*pfilepages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
+						     (1 << LRU_ACTIVE_FILE));
+	*pheadroom = PAGE_COUNTER_MAX;
 
-	file_pages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
-						    (1 << LRU_ACTIVE_FILE));
 	while ((parent = parent_mem_cgroup(memcg))) {
 		unsigned long ceiling = min(memcg->memory.limit, memcg->high);
 		unsigned long used = page_counter_read(&memcg->memory);
 
-		head_room = min(head_room, ceiling - min(ceiling, used));
+		*pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
 		memcg = parent;
 	}
-
-	*pavail = file_pages + head_room;
 }
 
 #else	/* CONFIG_CGROUP_WRITEBACK */
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 56c0bffa9f49f..2c90357c34ea4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -684,13 +684,19 @@ static unsigned long hard_dirty_limit(struct wb_domain *dom,
 	return max(thresh, dom->dirty_limit);
 }
 
-/* memory available to a memcg domain is capped by system-wide clean memory */
-static void mdtc_cap_avail(struct dirty_throttle_control *mdtc)
+/*
+ * Memory which can be further allocated to a memcg domain is capped by
+ * system-wide clean memory excluding the amount being used in the domain.
+ */
+static void mdtc_calc_avail(struct dirty_throttle_control *mdtc,
+			    unsigned long filepages, unsigned long headroom)
 {
 	struct dirty_throttle_control *gdtc = mdtc_gdtc(mdtc);
-	unsigned long clean = gdtc->avail - min(gdtc->avail, gdtc->dirty);
+	unsigned long clean = filepages - min(filepages, mdtc->dirty);
+	unsigned long global_clean = gdtc->avail - min(gdtc->avail, gdtc->dirty);
+	unsigned long other_clean = global_clean - min(global_clean, clean);
 
-	mdtc->avail = min(mdtc->avail, clean);
+	mdtc->avail = filepages + min(headroom, other_clean);
 }
 
 /**
@@ -1564,16 +1570,16 @@ static void balance_dirty_pages(struct address_space *mapping,
 		}
 
 		if (mdtc) {
-			unsigned long writeback;
+			unsigned long filepages, headroom, writeback;
 
 			/*
 			 * If @wb belongs to !root memcg, repeat the same
 			 * basic calculations for the memcg domain.
 			 */
-			mem_cgroup_wb_stats(wb, &mdtc->avail, &mdtc->dirty,
-					    &writeback);
-			mdtc_cap_avail(mdtc);
+			mem_cgroup_wb_stats(wb, &filepages, &headroom,
+					    &mdtc->dirty, &writeback);
 			mdtc->dirty += writeback;
+			mdtc_calc_avail(mdtc, filepages, headroom);
 
 			domain_dirty_limits(mdtc);
 
@@ -1895,10 +1901,11 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
 		return true;
 
 	if (mdtc) {
-		unsigned long writeback;
+		unsigned long filepages, headroom, writeback;
 
-		mem_cgroup_wb_stats(wb, &mdtc->avail, &mdtc->dirty, &writeback);
-		mdtc_cap_avail(mdtc);
+		mem_cgroup_wb_stats(wb, &filepages, &headroom, &mdtc->dirty,
+				    &writeback);
+		mdtc_calc_avail(mdtc, filepages, headroom);
 		domain_dirty_limits(mdtc);	/* ditto, ignore writeback */
 
 		if (mdtc->dirty > mdtc->bg_thresh)

From 835da3f99d329b1160a1f7fc82c7ac81163d63d0 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Tue, 6 Oct 2015 22:29:48 +0200
Subject: [PATCH 024/199] nvme: fix 32-bit build warning

Compiling the nvme driver on 32-bit warns about a cast from a __u64
variable to a pointer:

drivers/block/nvme-core.c: In function 'nvme_submit_io':
drivers/block/nvme-core.c:1847:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    (void __user *)io.addr, length, NULL, 0);

The cast here is intentional and safe, so we can shut up the
gcc warning by adding an intermediate cast to 'uintptr_t'.

I had previously submitted a patch to fix this problem in the
nvme driver, but it was accepted on the same day that two new
warnings got added.

For clarification, I also change the third instance of this cast
to use uintptr_t instead of unsigned long now.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d29ec8241c10e ("nvme: submit internal commands through the block layer")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/nvme-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 6f04771f10197..fce353ba5f66c 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -1804,7 +1804,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 
 	length = (io.nblocks + 1) << ns->lba_shift;
 	meta_len = (io.nblocks + 1) * ns->ms;
-	metadata = (void __user *)(unsigned long)io.metadata;
+	metadata = (void __user *)(uintptr_t)io.metadata;
 	write = io.opcode & 1;
 
 	if (ns->ext) {
@@ -1844,7 +1844,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 	c.rw.metadata = cpu_to_le64(meta_dma);
 
 	status = __nvme_submit_sync_cmd(ns->queue, &c, NULL,
-			(void __user *)io.addr, length, NULL, 0);
+			(void __user *)(uintptr_t)io.addr, length, NULL, 0);
  unmap:
 	if (meta) {
 		if (status == NVME_SC_SUCCESS && !write) {
@@ -1886,7 +1886,7 @@ static int nvme_user_cmd(struct nvme_dev *dev, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = __nvme_submit_sync_cmd(ns ? ns->queue : dev->admin_q, &c,
-			NULL, (void __user *)cmd.addr, cmd.data_len,
+			NULL, (void __user *)(uintptr_t)cmd.addr, cmd.data_len,
 			&cmd.result, timeout);
 	if (status >= 0) {
 		if (put_user(cmd.result, &ucmd->result))

From 51a6256b00008a3c520f6f31bcd62cd15cb05960 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date: Tue, 13 Oct 2015 04:32:49 +0900
Subject: [PATCH 025/199] ARM: EXYNOS: Fix double of_node_put() when parsing
 child power domains

On each next iteration of for_each_compatible_node() the reference
counter for current device node is already decreased by the loop
iterator. The manual call to of_node_get() is required only on loop
break which is not happening here.

The double of_node_get() (with enabled CONFIG_OF_DYNAMIC) lead to
decreasing the counter below expected, initial value.

Fixes: fe4034a3fad7 ("ARM: EXYNOS: Add missing of_node_put() when parsing power domains")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
---
 arch/arm/mach-exynos/pm_domains.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/arm/mach-exynos/pm_domains.c b/arch/arm/mach-exynos/pm_domains.c
index 4a87e86dec45d..7c21760f590ff 100644
--- a/arch/arm/mach-exynos/pm_domains.c
+++ b/arch/arm/mach-exynos/pm_domains.c
@@ -200,15 +200,15 @@ static __init int exynos4_pm_init_power_domain(void)
 		args.args_count = 0;
 		child_domain = of_genpd_get_from_provider(&args);
 		if (IS_ERR(child_domain))
-			goto next_pd;
+			continue;
 
 		if (of_parse_phandle_with_args(np, "power-domains",
 					 "#power-domain-cells", 0, &args) != 0)
-			goto next_pd;
+			continue;
 
 		parent_domain = of_genpd_get_from_provider(&args);
 		if (IS_ERR(parent_domain))
-			goto next_pd;
+			continue;
 
 		if (pm_genpd_add_subdomain(parent_domain, child_domain))
 			pr_warn("%s failed to add subdomain: %s\n",
@@ -216,8 +216,6 @@ static __init int exynos4_pm_init_power_domain(void)
 		else
 			pr_info("%s has as child subdomain: %s.\n",
 				parent_domain->name, child_domain->name);
-next_pd:
-		of_node_put(np);
 	}
 
 	return 0;

From b8bb9baad27e455c467e8fac47eebadbe765c18f Mon Sep 17 00:00:00 2001
From: Alim Akhtar <alim.akhtar@samsung.com>
Date: Tue, 13 Oct 2015 04:32:53 +0900
Subject: [PATCH 026/199] ARM: dts: Fix audio card detection on Peach boards

Since commit 2fad972d45c4 ("ARM: dts: Add mclk entry for Peach boards"),
sound card detection is broken on peach boards and gives below errors:

[    3.630457] max98090 7-0010: MAX98091 REVID=0x51
[    3.634233] max98090 7-0010: use default 2.8v micbias
[    3.640985] snow-audio sound: HiFi <-> 3830000.i2s mapping ok
[    3.645307] max98090 7-0010: Invalid master clock frequency
[    3.650824] snow-audio sound: ASoC: Peach-Pi-I2S-MAX98091 late_probe() failed: -22
[    3.658914] snow-audio sound: snd_soc_register_card failed (-22)
[    3.664366] snow-audio: probe of sound failed with error -22

This patch adds missing assigned-clocks and assigned-clock-parents for
pmu_system_controller node which is used as "mclk" for audio codec.

Fixes: 2fad972d45c4 ("ARM: dts: Add mclk entry for Peach boards")
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
---
 arch/arm/boot/dts/exynos5420-peach-pit.dts | 5 +++++
 arch/arm/boot/dts/exynos5800-peach-pi.dts  | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420-peach-pit.dts b/arch/arm/boot/dts/exynos5420-peach-pit.dts
index 8f4d76c5e11c5..1b95da79293c5 100644
--- a/arch/arm/boot/dts/exynos5420-peach-pit.dts
+++ b/arch/arm/boot/dts/exynos5420-peach-pit.dts
@@ -915,6 +915,11 @@
 	};
 };
 
+&pmu_system_controller {
+	assigned-clocks = <&pmu_system_controller 0>;
+	assigned-clock-parents = <&clock CLK_FIN_PLL>;
+};
+
 &rtc {
 	status = "okay";
 	clocks = <&clock CLK_RTC>, <&max77802 MAX77802_CLK_32K_AP>;
diff --git a/arch/arm/boot/dts/exynos5800-peach-pi.dts b/arch/arm/boot/dts/exynos5800-peach-pi.dts
index 7d5b386b5ae6a..8f40c7e549bd5 100644
--- a/arch/arm/boot/dts/exynos5800-peach-pi.dts
+++ b/arch/arm/boot/dts/exynos5800-peach-pi.dts
@@ -878,6 +878,11 @@
 	};
 };
 
+&pmu_system_controller {
+	assigned-clocks = <&pmu_system_controller 0>;
+	assigned-clock-parents = <&clock CLK_FIN_PLL>;
+};
+
 &rtc {
 	status = "okay";
 	clocks = <&clock CLK_RTC>, <&max77802 MAX77802_CLK_32K_AP>;

From 7e381ec6a36aa44f15fc1a76e6efb9e2cd942e61 Mon Sep 17 00:00:00 2001
From: Tomi Valkeinen <tomi.valkeinen@ti.com>
Date: Fri, 25 Sep 2015 16:02:03 +0300
Subject: [PATCH 027/199] ARM: dts: am57xx-beagle-x15: set VDD_SD to always-on

LDO1 regulator (VDD_SD) is connected to SoC's vddshv8. vddshv8 needs to
be kept always powered (see commit 5a0f93c6576a ("ARM: dts: Add
am57xx-beagle-x15"), but at the moment VDD_SD is enabled/disabled
depending on whether an SD card is inserted or not.

This patch sets LDO1 regulator to always-on.

This patch has a side effect of fixing another issue, HDMI DDC not
working when SD card is not inserted:

Why this happens is that the tpd12s015 (HDMI level shifter/ESD
protection chip) has LS_OE GPIO input, which needs to be enabled for the
HDMI DDC to work. LS_OE comes from gpio6_28. The pin that provides
gpio6_28 is powered by vddshv8, and vddshv8 comes from VDD_SD.

So when SD card is not inserted, VDD_SD is disabled, and LS_OE stays
off.

The proper fix for the HDMI DDC issue would be to maybe have the pinctrl
framework manage the pin specific power.

Apparently this fixes also a third issue (copy paste from Kishon's
patch):

ldo1_reg in addition to being connected to the io lines is also
connected to the card detect line. On card removal, omap_hsmmc
driver does a regulator_disable causing card detect line to be
pulled down. This raises a card insertion interrupt and once the
MMC core detects there is no card inserted, it does a
regulator disable which again raises a card insertion interrupt.
This happens in a loop causing infinite MMC interrupts.

Fixes: 5a0f93c6576a ("ARM: dts: Add am57xx-beagle-x15")
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reported-by: Louis McCarthy <compeoree@gmail.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 arch/arm/boot/dts/am57xx-beagle-x15.dts | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/am57xx-beagle-x15.dts b/arch/arm/boot/dts/am57xx-beagle-x15.dts
index 568adf5efde05..d55e3ea89fda5 100644
--- a/arch/arm/boot/dts/am57xx-beagle-x15.dts
+++ b/arch/arm/boot/dts/am57xx-beagle-x15.dts
@@ -402,11 +402,12 @@
 				/* SMPS9 unused */
 
 				ldo1_reg: ldo1 {
-					/* VDD_SD  */
+					/* VDD_SD / VDDSHV8  */
 					regulator-name = "ldo1";
 					regulator-min-microvolt = <1800000>;
 					regulator-max-microvolt = <3300000>;
 					regulator-boot-on;
+					regulator-always-on;
 				};
 
 				ldo2_reg: ldo2 {

From be59b6192f43a9792f5f636b51358196ba11bbf6 Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony@atomide.com>
Date: Mon, 12 Oct 2015 16:19:54 -0700
Subject: [PATCH 028/199] memory: omap-gpmc: Fix unselectable debug option for
 GPMC
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit 63aa945b1013 ("memory: omap-gpmc: Add Kconfig option for debug")
added a debug option for GPMC, but somehow managed to keep it unselectable.

This probably happened because I had some uncommitted changes and the
GPMC option is selected in the platform specific Kconfig.

Let's also update the description a bit, it does not mention that
enabling the debug option also disables the reset of GPMC controller
during the init as pointed out by Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> and Roger Quadros <rogerq@ti.com>.

Fixes: 63aa945b1013 ("memory: omap-gpmc: Add Kconfig option for debug")
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 drivers/memory/Kconfig | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c6a644b22af44..6f3154613dc71 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -58,12 +58,18 @@ config OMAP_GPMC
 	  memory drives like NOR, NAND, OneNAND, SRAM.
 
 config OMAP_GPMC_DEBUG
-	bool
+	bool "Enable GPMC debug output and skip reset of GPMC during init"
 	depends on OMAP_GPMC
 	help
 	  Enables verbose debugging mostly to decode the bootloader provided
-	  timings. Enable this during development to configure devices
-	  connected to the GPMC bus.
+	  timings. To preserve the bootloader provided timings, the reset
+	  of GPMC is skipped during init. Enable this during development to
+	  configure devices connected to the GPMC bus.
+
+	  NOTE: In addition to matching the register setup with the bootloader
+	  you also need to match the GPMC FCLK frequency used by the
+	  bootloader or else the GPMC timings won't be identical with the
+	  bootloader timings.
 
 config MVEBU_DEVBUS
 	bool "Marvell EBU Device Bus Controller"

From fd820a1ec758bc25b0eb10ab5e88e6c61fbcc8aa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig@pengutronix.de>
Date: Tue, 6 Oct 2015 22:07:49 +0200
Subject: [PATCH 029/199] memory: omap-gpmc: dump "before" state before first
 modification
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When gpmc_cs_show_timings is called in gpmc_cs_set_timings()
gpmc_cs_program_settings() was already run which modifies the CONFIG1
register. So to be more useful do the "before" dump earlier.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 drivers/memory/omap-gpmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c
index 32ac049f2bc4d..6515dfc2b805d 100644
--- a/drivers/memory/omap-gpmc.c
+++ b/drivers/memory/omap-gpmc.c
@@ -696,7 +696,6 @@ int gpmc_cs_set_timings(int cs, const struct gpmc_timings *t,
 	int div;
 	u32 l;
 
-	gpmc_cs_show_timings(cs, "before gpmc_cs_set_timings");
 	div = gpmc_calc_divider(t->sync_clk);
 	if (div < 0)
 		return div;
@@ -1988,6 +1987,7 @@ static int gpmc_probe_generic_child(struct platform_device *pdev,
 	if (ret < 0)
 		goto err;
 
+	gpmc_cs_show_timings(cs, "before gpmc_cs_program_settings");
 	ret = gpmc_cs_program_settings(cs, &gpmc_s);
 	if (ret < 0)
 		goto err;

From d8e1f5ed11a39a68da00f05000466c4f6db4456e Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony@atomide.com>
Date: Mon, 12 Oct 2015 16:19:54 -0700
Subject: [PATCH 030/199] Documentation: ARM: List new omap MMC requirements

Earlier the PBIAS regulator was optional, not so with recent
omap_hsmmc changes. To make things easier for people with
custom .config files, let's add minimal documentation for it
as suggested by Russell King <rmk+kernel@arm.linux.org.uk>.

Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 Documentation/arm/OMAP/README | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 Documentation/arm/OMAP/README

diff --git a/Documentation/arm/OMAP/README b/Documentation/arm/OMAP/README
new file mode 100644
index 0000000000000..75645c45d14a1
--- /dev/null
+++ b/Documentation/arm/OMAP/README
@@ -0,0 +1,7 @@
+This file contains documentation for running mainline
+kernel on omaps.
+
+KERNEL		NEW DEPENDENCIES
+v4.3+		Update is needed for custom .config files to make sure
+		CONFIG_REGULATOR_PBIAS is enabled for MMC1 to work
+		properly.

From 42f2bb1c494543084b764e1ca253c73db910daf2 Mon Sep 17 00:00:00 2001
From: Vinod Koul <vinod.koul@intel.com>
Date: Tue, 13 Oct 2015 14:57:49 +0530
Subject: [PATCH 031/199] ALSA: hdac: Explicitly add io.h

Compiling the hdac extended core on arm fails with below error:

  sound/hda/ext/hdac_ext_bus.c: In function 'hdac_ext_writel':
>> sound/hda/ext/hdac_ext_bus.c:29:2: error: implicit declaration of
>> function
+'writel' [-Werror=implicit-function-declaration]
     writel(value, addr);
     ^
   sound/hda/ext/hdac_ext_bus.c: In function 'hdac_ext_readl':
>> sound/hda/ext/hdac_ext_bus.c:34:2: error: implicit declaration of
>> function
+'readl' [-Werror=implicit-function-declaration]
     return readl(addr);

This is fixed by explicitly including io.h

Fixes: 99463b3a3994 - ('ALSA: hda: provide default bus io ops extended hdac')
Reported-by: kbuild test robot <lkp@intel.com>
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/hda/ext/hdac_ext_bus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/hda/ext/hdac_ext_bus.c b/sound/hda/ext/hdac_ext_bus.c
index 4449d1a990893..2433f7c814728 100644
--- a/sound/hda/ext/hdac_ext_bus.c
+++ b/sound/hda/ext/hdac_ext_bus.c
@@ -19,6 +19,7 @@
 
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/io.h>
 #include <sound/hdaudio_ext.h>
 
 MODULE_DESCRIPTION("HDA extended core");

From e8d65a8d985271a102f07c7456da5b86c19ffe16 Mon Sep 17 00:00:00 2001
From: David Henningsson <david.henningsson@canonical.com>
Date: Tue, 13 Oct 2015 10:10:18 +0200
Subject: [PATCH 032/199] ALSA: hda - Fix inverted internal mic on Lenovo
 G50-80

Add the appropriate quirk to indicate the Lenovo G50-80 has a stereo
mic input where one channel has reverse polarity.

Alsa-info available at:
https://launchpadlibrarian.net/220846272/AlsaInfo.txt

Cc: stable@vger.kernel.org
BugLink: https://bugs.launchpad.net/bugs/1504778
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/pci/hda/patch_conexant.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index ca03c40609fcf..2f0ec7c45fc70 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -819,6 +819,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
 	SND_PCI_QUIRK(0x17aa, 0x21da, "Lenovo X220", CXT_PINCFG_LENOVO_TP410),
 	SND_PCI_QUIRK(0x17aa, 0x21db, "Lenovo X220-tablet", CXT_PINCFG_LENOVO_TP410),
 	SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo IdeaPad Z560", CXT_FIXUP_MUTE_LED_EAPD),
+	SND_PCI_QUIRK(0x17aa, 0x390b, "Lenovo G50-80", CXT_FIXUP_STEREO_DMIC),
 	SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
 	SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
 	SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),

From e797e4b71777877b19b50e3d736331c947ccffe7 Mon Sep 17 00:00:00 2001
From: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue, 6 Oct 2015 14:53:01 +0200
Subject: [PATCH 033/199] drm/i915: Fix kerneldoc for i915_gem_shrink_all

I've botched this in

commit eb0b44adc08c0be01a027eb009e9cdadc31e65a2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Mar 18 14:47:59 2015 +0100

    drm/i915: kerneldoc for i915_gem_shrinker.c

so let's fix it.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index f6ecbda2c6047..674341708033b 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -143,7 +143,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 }
 
 /**
- * i915_gem_shrink - Shrink buffer object caches completely
+ * i915_gem_shrink_all - Shrink buffer object caches completely
  * @dev_priv: i915 device
  *
  * This is a simple wraper around i915_gem_shrink() to aggressively shrink all

From 40a24488f5250d63341e74b9994159afc4589606 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri, 21 Aug 2015 16:08:41 +0100
Subject: [PATCH 034/199] drm/i915: Flush pipecontrol post-sync writes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In order to flush the results from in-batch pipecontrol writes (used for
example in glQuery) before declaring the batch complete (and so declaring
the query results coherent), we need to set the FlushEnable bit in our
flushing pipecontrol. The FlushEnable bit "waits until all previous
writes of immediate data from post-sync circles are complete before
executing the next command".

I get GPU hangs on byt without flushing these writes (running ue4).
piglit has examples where the flush is required for correct rendering.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7412caedcf7f9..29dd4488dc498 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1659,6 +1659,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	if (flush_domains) {
 		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
 		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
 	}
 
 	if (invalidate_domains) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6e6b8db996ef2..61b451fbd09e6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -347,6 +347,7 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req,
 	if (flush_domains) {
 		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
 		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
 	}
 	if (invalidate_domains) {
 		flags |= PIPE_CONTROL_TLB_INVALIDATE;
@@ -418,6 +419,7 @@ gen8_render_ring_flush(struct drm_i915_gem_request *req,
 	if (flush_domains) {
 		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
 		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
 	}
 	if (invalidate_domains) {
 		flags |= PIPE_CONTROL_TLB_INVALIDATE;

From 8e7a65aa70bcc1235a44e40ae0da5056525fe081 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com>
Date: Wed, 7 Oct 2015 22:08:24 +0300
Subject: [PATCH 035/199] drm/i915: Restore lost DPLL register write on gen2-4
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We accidentally lost the initial DPLL register write in
1c4e02746147 drm/i915: Fix DVO 2x clock enable on 830M

The "three times for luck" hack probably saved us from a total
disaster. But anyway, bring the initial write back so that the
code actually makes some sense.

Reported-and-tested-by: Nick Bowler <nbowler@draconx.ca>
References: http://mid.gmane.org/CAN_QmVyMaArxYgEcVVsGvsMo7-6ohZr8HmF5VhkkL4i9KOmrhw@mail.gmail.com
Cc: stable@vger.kernel.org
Cc: Nick Bowler <nbowler@draconx.ca>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index cf418be7d30a5..bdfac53dd9452 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1724,6 +1724,8 @@ static void i9xx_enable_pll(struct intel_crtc *crtc)
 			   I915_READ(DPLL(!crtc->pipe)) | DPLL_DVO_2X_MODE);
 	}
 
+	I915_WRITE(reg, dpll);
+
 	/* Wait for the clocks to stabilize. */
 	POSTING_READ(reg);
 	udelay(150);

From c2b63374461c0986147902f719c26412d1f26fbc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com>
Date: Wed, 7 Oct 2015 22:08:25 +0300
Subject: [PATCH 036/199] drm/i915: Enable DPLL VGA mode before P1/P2 divider
 write
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Apparently writing the DPLL register P1/P2 divider fields won't trigger
an actual change in the DPLL output unless VGA mode is enabled for
prior to the register write that changes the P1/P2 dividers. The write
with the new P1/P2 divider can itself disable VGA mode again without
problems.

I tested the behaviour on my 946GZ, and when manually frobbing the
register with the display on, the behaviour is very clear. However I
can't explain why this machine actually works. The P1/P2 divider
changes caused by normal modesets do seem to make it through to the
hardware somehow since I get a stable picture on the monitor with
any resolution. Maybe it's the "three times for luck" stuff that
somehow masks the problem, or something.

But apparently there are machines (eg. Nick Bowler's G45) where that
isn't the case and we fail to get the correct clock from the DPLL.

Things used to work because we enabled VGA mode for disabled DPLLs,
so when re-enabling the DPLL VGA mode was enabled just prior to the
first register write, and hence the P1/P2 change went through without
a hitch. That got changed in

b8afb9113c51 drm/i915: Keep GMCH DPLL VGA mode always disabled

in the name of consistency. In order to keep the consistency part,
leave VGA mode disabled for disabled DPLLs, but turn it on just prior
to updating the P1/P2 dividers to make sure the hardware picks up
on the new values.

Cc: Nick Bowler <nbowler@draconx.ca>
Reported-by: Nick Bowler <nbowler@draconx.ca>
Tested-by: Nick Bowler <nbowler@draconx.ca>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index bdfac53dd9452..96e6c41783df4 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1724,6 +1724,13 @@ static void i9xx_enable_pll(struct intel_crtc *crtc)
 			   I915_READ(DPLL(!crtc->pipe)) | DPLL_DVO_2X_MODE);
 	}
 
+	/*
+	 * Apparently we need to have VGA mode enabled prior to changing
+	 * the P1/P2 dividers. Otherwise the DPLL will keep using the old
+	 * dividers, even though the register value does change.
+	 */
+	I915_WRITE(reg, 0);
+
 	I915_WRITE(reg, dpll);
 
 	/* Wait for the clocks to stabilize. */

From cc917ab43541db3ff66d0136042686d40a1b4c9a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue, 13 Oct 2015 14:22:26 +0100
Subject: [PATCH 037/199] drm/i915: Deny wrapping an userptr into a framebuffer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Pinning a userptr onto the hardware raises interesting questions about
the lifetime of such a surface as the framebuffer extends that life
beyond the client's address space. That is the hardware will need to
keep scanning out from the backing storage even after the client wants
to remap its address space. As the hardware pins the backing storage,
the userptr becomes invalid and this raises a WARN when the clients
tries to unmap its address space. The situation can be even more
complicated when the buffer is passed between processes, between a
client and display server, where the lifetime and hardware access is
even more confusing. Deny it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 5 ++++-
 drivers/gpu/drm/i915/intel_display.c    | 5 +++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 8fd431bcdfd3a..a96b9006a51e5 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -804,7 +804,10 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
  * Also note, that the object created here is not currently a "first class"
  * object, in that several ioctls are banned. These are the CPU access
  * ioctls: mmap(), pwrite and pread. In practice, you are expected to use
- * direct access via your pointer rather than use those ioctls.
+ * direct access via your pointer rather than use those ioctls. Another
+ * restriction is that we do not allow userptr surfaces to be pinned to the
+ * hardware and so we reject any attempt to create a framebuffer out of a
+ * userptr.
  *
  * If you think this is a good interface to use to pass GPU memory between
  * drivers, please use dma-buf instead. In fact, wherever possible use
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 96e6c41783df4..2bf248b045428 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14116,6 +14116,11 @@ static int intel_user_framebuffer_create_handle(struct drm_framebuffer *fb,
 	struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 
+	if (obj->userptr.mm) {
+		DRM_DEBUG("attempting to use a userptr for a framebuffer, denied\n");
+		return -EINVAL;
+	}
+
 	return drm_gem_handle_create(file, &obj->base, handle);
 }
 

From ba2374fd2bf379f933773811fdb06cb6a5445f41 Mon Sep 17 00:00:00 2001
From: Christian Zander <christian@nervanasys.com>
Date: Wed, 10 Jun 2015 09:41:45 -0700
Subject: [PATCH 038/199] iommu/vt-d: fix range computation when making room
 for large pages

In preparation for the installation of a large page, any small page
tables that may still exist in the target IOV address range are
removed.  However, if a scatter/gather list entry is large enough to
fit more than one large page, the address space for any subsequent
large pages is not cleared of conflicting small page tables.

This can cause legitimate mapping requests to fail with errors of the
form below, potentially followed by a series of IOMMU faults:

ERROR: DMA PTE for vPFN 0xfde00 already set (to 7f83a4003 not 7e9e00083)

In this example, a 4MiB scatter/gather list entry resulted in the
successful installation of a large page @ vPFN 0xfdc00, followed by
a failed attempt to install another large page @ vPFN 0xfde00, due to
the presence of a pointer to a small page table @ 0x7f83a4000.

To address this problem, compute the number of large pages that fit
into a given scatter/gather list entry, and use it to derive the
last vPFN covered by the large page(s).

Cc: stable@vger.kernel.org
Signed-off-by: Christian Zander <christian@nervanasys.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/iommu/intel-iommu.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 041bc1810a861..df53855a5f3eb 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2115,15 +2115,19 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 				return -ENOMEM;
 			/* It is large page*/
 			if (largepage_lvl > 1) {
+				unsigned long nr_superpages, end_pfn;
+
 				pteval |= DMA_PTE_LARGE_PAGE;
 				lvl_pages = lvl_to_nr_pages(largepage_lvl);
+
+				nr_superpages = sg_res / lvl_pages;
+				end_pfn = iov_pfn + nr_superpages * lvl_pages - 1;
+
 				/*
 				 * Ensure that old small page tables are
-				 * removed to make room for superpage,
-				 * if they exist.
+				 * removed to make room for superpage(s).
 				 */
-				dma_pte_free_pagetable(domain, iov_pfn,
-						       iov_pfn + lvl_pages - 1);
+				dma_pte_free_pagetable(domain, iov_pfn, end_pfn);
 			} else {
 				pteval &= ~(uint64_t)DMA_PTE_LARGE_PAGE;
 			}

From 2e2edebefceef201624dcc323a1f7761e0040cf5 Mon Sep 17 00:00:00 2001
From: Jani Nikula <jani.nikula@intel.com>
Date: Wed, 14 Oct 2015 11:29:01 +0300
Subject: [PATCH 039/199] Revert "drm/i915: Add primary plane to mask if it's
 visible"

This reverts commit 721a09f7393de6c28a07516dccd654c6e995944a.

There is nothing wrong with the commit per se. We had two versions of
the commit, one in -next headed for v4.4 and this one for v4.3. Turns
out we'll need to backport more fixes from -next, and they conflict with
the v4.3 version. It gets messy. It will be easiest to revert this one,
and backport all the relevant commits from -next without modifications;
they apply cleanly after this revert.

Requested-by: Joseph Yasi <joe.yasi@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=91910#c4
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2bf248b045428..7704315e067f6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15101,12 +15101,9 @@ static void readout_plane_state(struct intel_crtc *crtc,
 
 		plane_state = to_intel_plane_state(p->base.state);
 
-		if (p->base.type == DRM_PLANE_TYPE_PRIMARY) {
+		if (p->base.type == DRM_PLANE_TYPE_PRIMARY)
 			plane_state->visible = primary_get_hw_state(crtc);
-			if (plane_state->visible)
-				crtc->base.state->plane_mask |=
-					1 << drm_plane_index(&p->base);
-		} else {
+		else {
 			if (active)
 				p->disable_plane(&p->base, &crtc->base);
 

From c4816c7389d8dbcad036be7e5a34584289d9f590 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com>
Date: Thu, 10 Sep 2015 18:59:07 +0300
Subject: [PATCH 040/199] drm/i915: Assign hwmode after encoder state readout
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The dotclock is often calculated in encoder .get_config(), so we
shouldn't copy the adjusted_mode to hwmode until we have read out the
dotclock.

Gets rid of some warnings like these:
[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 21: Can't calculate constants, dotclock = 0!
[drm:i915_get_vblank_timestamp] crtc 0 is disabled

v2: Steal Maarten's idea to move crtc->mode etc. assignment too

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91428
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: cherry-picked from -next to v4.3]
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 57 +++++++++++++++-------------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 7704315e067f6..ed87a7e4c32ac 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15132,33 +15132,6 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
 		crtc->base.state->active = crtc->active;
 		crtc->base.enabled = crtc->active;
 
-		memset(&crtc->base.mode, 0, sizeof(crtc->base.mode));
-		if (crtc->base.state->active) {
-			intel_mode_from_pipe_config(&crtc->base.mode, crtc->config);
-			intel_mode_from_pipe_config(&crtc->base.state->adjusted_mode, crtc->config);
-			WARN_ON(drm_atomic_set_mode_for_crtc(crtc->base.state, &crtc->base.mode));
-
-			/*
-			 * The initial mode needs to be set in order to keep
-			 * the atomic core happy. It wants a valid mode if the
-			 * crtc's enabled, so we do the above call.
-			 *
-			 * At this point some state updated by the connectors
-			 * in their ->detect() callback has not run yet, so
-			 * no recalculation can be done yet.
-			 *
-			 * Even if we could do a recalculation and modeset
-			 * right now it would cause a double modeset if
-			 * fbdev or userspace chooses a different initial mode.
-			 *
-			 * If that happens, someone indicated they wanted a
-			 * mode change, which means it's safe to do a full
-			 * recalculation.
-			 */
-			crtc->base.state->mode.private_flags = I915_MODE_FLAG_INHERITED;
-		}
-
-		crtc->base.hwmode = crtc->config->base.adjusted_mode;
 		readout_plane_state(crtc, to_intel_crtc_state(crtc->base.state));
 
 		DRM_DEBUG_KMS("[CRTC:%d] hw state readout: %s\n",
@@ -15218,6 +15191,36 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
 			      connector->base.name,
 			      connector->base.encoder ? "enabled" : "disabled");
 	}
+
+	for_each_intel_crtc(dev, crtc) {
+		crtc->base.hwmode = crtc->config->base.adjusted_mode;
+
+		memset(&crtc->base.mode, 0, sizeof(crtc->base.mode));
+		if (crtc->base.state->active) {
+			intel_mode_from_pipe_config(&crtc->base.mode, crtc->config);
+			intel_mode_from_pipe_config(&crtc->base.state->adjusted_mode, crtc->config);
+			WARN_ON(drm_atomic_set_mode_for_crtc(crtc->base.state, &crtc->base.mode));
+
+			/*
+			 * The initial mode needs to be set in order to keep
+			 * the atomic core happy. It wants a valid mode if the
+			 * crtc's enabled, so we do the above call.
+			 *
+			 * At this point some state updated by the connectors
+			 * in their ->detect() callback has not run yet, so
+			 * no recalculation can be done yet.
+			 *
+			 * Even if we could do a recalculation and modeset
+			 * right now it would cause a double modeset if
+			 * fbdev or userspace chooses a different initial mode.
+			 *
+			 * If that happens, someone indicated they wanted a
+			 * mode change, which means it's safe to do a full
+			 * recalculation.
+			 */
+			crtc->base.state->mode.private_flags = I915_MODE_FLAG_INHERITED;
+		}
+	}
 }
 
 /* Scan out the current hw modeset state,

From 0836e6d8c47416d6eb60bb68a5d7213c0c2d0d29 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com>
Date: Thu, 10 Sep 2015 18:59:08 +0300
Subject: [PATCH 041/199] drm/i915: Move sprite/cursor plane disable to
 intel_sanitize_crtc()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Move the sprite/cursor plane disabling to occur in intel_sanitize_crtc()
where it belongs instead of doing it in intel_modeset_readout_hw_state().

The plane disabling was first added in
4cf0ebbd4fafbdf8e6431dbb315e5511c3efdc3b drm/i915: Rework plane readout.

I got the idea from some patches from Partik and/or Maarten but those
moved also the plane state readout to intel_sanitize_crtc() which isn't
quite right in my opinion.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=91910
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: cherry-picked from -next to v4.3]
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 44 +++++++++++++---------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ed87a7e4c32ac..ac407e3e1eddf 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14911,9 +14911,19 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc)
 	/* restore vblank interrupts to correct state */
 	drm_crtc_vblank_reset(&crtc->base);
 	if (crtc->active) {
+		struct intel_plane *plane;
+
 		drm_calc_timestamping_constants(&crtc->base, &crtc->base.hwmode);
 		update_scanline_offset(crtc);
 		drm_crtc_vblank_on(&crtc->base);
+
+		/* Disable everything but the primary plane */
+		for_each_intel_plane_on_crtc(dev, crtc, plane) {
+			if (plane->base.type == DRM_PLANE_TYPE_PRIMARY)
+				continue;
+
+			plane->disable_plane(&plane->base, &crtc->base);
+		}
 	}
 
 	/* We need to sanitize the plane -> pipe mapping first because this will
@@ -15081,35 +15091,21 @@ void i915_redisable_vga(struct drm_device *dev)
 	i915_redisable_vga_power_on(dev);
 }
 
-static bool primary_get_hw_state(struct intel_crtc *crtc)
+static bool primary_get_hw_state(struct intel_plane *plane)
 {
-	struct drm_i915_private *dev_priv = crtc->base.dev->dev_private;
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 
-	return !!(I915_READ(DSPCNTR(crtc->plane)) & DISPLAY_PLANE_ENABLE);
+	return I915_READ(DSPCNTR(plane->plane)) & DISPLAY_PLANE_ENABLE;
 }
 
-static void readout_plane_state(struct intel_crtc *crtc,
-				struct intel_crtc_state *crtc_state)
+/* FIXME read out full plane state for all planes */
+static void readout_plane_state(struct intel_crtc *crtc)
 {
-	struct intel_plane *p;
-	struct intel_plane_state *plane_state;
-	bool active = crtc_state->base.active;
+	struct intel_plane_state *plane_state =
+		to_intel_plane_state(crtc->base.primary->state);
 
-	for_each_intel_plane(crtc->base.dev, p) {
-		if (crtc->pipe != p->pipe)
-			continue;
-
-		plane_state = to_intel_plane_state(p->base.state);
-
-		if (p->base.type == DRM_PLANE_TYPE_PRIMARY)
-			plane_state->visible = primary_get_hw_state(crtc);
-		else {
-			if (active)
-				p->disable_plane(&p->base, &crtc->base);
-
-			plane_state->visible = false;
-		}
-	}
+	plane_state->visible =
+		primary_get_hw_state(to_intel_plane(crtc->base.primary));
 }
 
 static void intel_modeset_readout_hw_state(struct drm_device *dev)
@@ -15132,7 +15128,7 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
 		crtc->base.state->active = crtc->active;
 		crtc->base.enabled = crtc->active;
 
-		readout_plane_state(crtc, to_intel_crtc_state(crtc->base.state));
+		readout_plane_state(crtc);
 
 		DRM_DEBUG_KMS("[CRTC:%d] hw state readout: %s\n",
 			      crtc->base.base.id,

From 18e9345b0db9fe7bd18c3c43967789fe0a2fdb52 Mon Sep 17 00:00:00 2001
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date: Wed, 23 Sep 2015 16:11:41 +0200
Subject: [PATCH 042/199] drm/i915: Add primary plane to mask if it's visible

This fixes the warnings like

"plane A assertion failure, should be disabled but not"

that on the initial modeset during boot. This can happen if
the primary plane is enabled by the firmware, but inheriting
it fails because the DMAR is active or for other reasons.

Most likely caused by

commit 36750f284b3a4f19b304fda1bb7d6e9e1275ea8d
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Mon Jun 1 12:49:54 2015 +0200

    drm/i915: update plane state during init

This is a new version of

commit 721a09f7393de6c28a07516dccd654c6e995944a
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Sep 15 14:28:54 2015 +0200

    drm/i915: Add primary plane to mask if it's visible

That was reverted in order to facilitate easier backporting of some
commits from -next to v4.3.

Reported-by: Andreas Reis <andreas.reis@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91429
Reported-and-tested-by: Emil Renner Berthing <kernel@esmil.dk>
Tested-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: cherry-picked from -next to v4.3]
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ac407e3e1eddf..b2270d576979b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15101,11 +15101,15 @@ static bool primary_get_hw_state(struct intel_plane *plane)
 /* FIXME read out full plane state for all planes */
 static void readout_plane_state(struct intel_crtc *crtc)
 {
+	struct drm_plane *primary = crtc->base.primary;
 	struct intel_plane_state *plane_state =
-		to_intel_plane_state(crtc->base.primary->state);
+		to_intel_plane_state(primary->state);
 
 	plane_state->visible =
-		primary_get_hw_state(to_intel_plane(crtc->base.primary));
+		primary_get_hw_state(to_intel_plane(primary));
+
+	if (plane_state->visible)
+		crtc->base.state->plane_mask |= 1 << drm_plane_index(primary);
 }
 
 static void intel_modeset_readout_hw_state(struct drm_device *dev)

From 8a53554e12e98d1759205afd7b8e9e2ea0936f48 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C5=91v=C3=A1g=C3=B3=2C=20Zolt=C3=A1n?=
 <dirty.ice.hu@gmail.com>
Date: Mon, 12 Oct 2015 15:13:56 +0100
Subject: [PATCH 043/199] x86/efi: Fix multiple GOP device support
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When multiple GOP devices exists, but none of them implements
ConOut, the code should just choose the first GOP (according to
the comments). But currently 'fb_base' will refer to the last GOP,
while other parameters to the first GOP, which will likely
result in a garbled display.

I can reliably reproduce this bug using my ASRock Z87M Extreme4
motherboard with CSM and integrated GPU disabled, and two PCIe
video cards (NVidia GT640 and GTX980), booting from efi-stub
(booting from grub works fine).  On the primary display the
ASRock logo remains and on the secondary screen it is garbled
up completely.

Signed-off-by: Kővágó, Zoltán <DirtY.iCE.hu@gmail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444659236-24837-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/eboot.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index ee1b6d346b983..db51c1f274461 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -667,6 +667,7 @@ setup_gop32(struct screen_info *si, efi_guid_t *proto,
 		bool conout_found = false;
 		void *dummy = NULL;
 		u32 h = handles[i];
+		u32 current_fb_base;
 
 		status = efi_call_early(handle_protocol, h,
 					proto, (void **)&gop32);
@@ -678,7 +679,7 @@ setup_gop32(struct screen_info *si, efi_guid_t *proto,
 		if (status == EFI_SUCCESS)
 			conout_found = true;
 
-		status = __gop_query32(gop32, &info, &size, &fb_base);
+		status = __gop_query32(gop32, &info, &size, &current_fb_base);
 		if (status == EFI_SUCCESS && (!first_gop || conout_found)) {
 			/*
 			 * Systems that use the UEFI Console Splitter may
@@ -692,6 +693,7 @@ setup_gop32(struct screen_info *si, efi_guid_t *proto,
 			pixel_format = info->pixel_format;
 			pixel_info = info->pixel_information;
 			pixels_per_scan_line = info->pixels_per_scan_line;
+			fb_base = current_fb_base;
 
 			/*
 			 * Once we've found a GOP supporting ConOut,
@@ -770,6 +772,7 @@ setup_gop64(struct screen_info *si, efi_guid_t *proto,
 		bool conout_found = false;
 		void *dummy = NULL;
 		u64 h = handles[i];
+		u32 current_fb_base;
 
 		status = efi_call_early(handle_protocol, h,
 					proto, (void **)&gop64);
@@ -781,7 +784,7 @@ setup_gop64(struct screen_info *si, efi_guid_t *proto,
 		if (status == EFI_SUCCESS)
 			conout_found = true;
 
-		status = __gop_query64(gop64, &info, &size, &fb_base);
+		status = __gop_query64(gop64, &info, &size, &current_fb_base);
 		if (status == EFI_SUCCESS && (!first_gop || conout_found)) {
 			/*
 			 * Systems that use the UEFI Console Splitter may
@@ -795,6 +798,7 @@ setup_gop64(struct screen_info *si, efi_guid_t *proto,
 			pixel_format = info->pixel_format;
 			pixel_info = info->pixel_information;
 			pixels_per_scan_line = info->pixels_per_scan_line;
+			fb_base = current_fb_base;
 
 			/*
 			 * Once we've found a GOP supporting ConOut,

From 6391074598442b8a8d33e2cfdf277d5568b57f2d Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Mon, 12 Oct 2015 15:44:49 +0200
Subject: [PATCH 044/199] ARM: pxa: fix pxa3xx DFI lockup hack

Some recently added code to avoid a bug introduced a build error
when CONFIG_PM is disabled and a macro is hidden:

arch/arm/mach-pxa/pxa3xx.c: In function 'pxa3xx_init':
arch/arm/mach-pxa/pxa3xx.c:439:3: error: 'NDCR' undeclared (first use in this function)
   NDCR = (NDCR & ~NDCR_ND_ARB_EN) | NDCR_ND_ARB_CNTL;
   ^

This moves the macro outside of the #ifdef so it can be
referenced correctly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: adf3442cc890 ("ARM: pxa: fix DFI bus lockups on startup")
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 arch/arm/mach-pxa/pxa3xx.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mach-pxa/pxa3xx.c b/arch/arm/mach-pxa/pxa3xx.c
index 06005d3f2ba33..20ce2d386f172 100644
--- a/arch/arm/mach-pxa/pxa3xx.c
+++ b/arch/arm/mach-pxa/pxa3xx.c
@@ -42,10 +42,6 @@
 #define PECR_IS(n)	((1 << ((n) * 2)) << 29)
 
 extern void __init pxa_dt_irq_init(int (*fn)(struct irq_data *, unsigned int));
-#ifdef CONFIG_PM
-
-#define ISRAM_START	0x5c000000
-#define ISRAM_SIZE	SZ_256K
 
 /*
  * NAND NFC: DFI bus arbitration subset
@@ -54,6 +50,11 @@ extern void __init pxa_dt_irq_init(int (*fn)(struct irq_data *, unsigned int));
 #define NDCR_ND_ARB_EN		(1 << 12)
 #define NDCR_ND_ARB_CNTL	(1 << 19)
 
+#ifdef CONFIG_PM
+
+#define ISRAM_START	0x5c000000
+#define ISRAM_SIZE	SZ_256K
+
 static void __iomem *sram;
 static unsigned long wakeup_src;
 

From 83bf6b13834d9c926905e45cdfda23fe218fc598 Mon Sep 17 00:00:00 2001
From: Linus Walleij <linus.walleij@linaro.org>
Date: Tue, 13 Oct 2015 19:46:54 +0200
Subject: [PATCH 045/199] ARM: ux500: modify initial levelshifter status

commit 1d8aca9df612f5751892fb2642d72536f2f48fd0
"ARM: ux500: fix MMC/SD card regression"
fixed broken the level shifter: it should be default ON
but became default OFF.

Fixes: 1d8aca9df612 "ARM: ux500: fix MMC/SD card regression"
Reported-and-tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/boot/dts/ste-hrefv60plus.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/ste-hrefv60plus.dtsi b/arch/arm/boot/dts/ste-hrefv60plus.dtsi
index 810cda743b6d5..9c2387b34d0c7 100644
--- a/arch/arm/boot/dts/ste-hrefv60plus.dtsi
+++ b/arch/arm/boot/dts/ste-hrefv60plus.dtsi
@@ -56,7 +56,7 @@
 					/* VMMCI level-shifter enable */
 					default_hrefv60_cfg2 {
 						pins = "GPIO169_D22";
-						ste,config = <&gpio_out_lo>;
+						ste,config = <&gpio_out_hi>;
 					};
 					/* VMMCI level-shifter voltage select */
 					default_hrefv60_cfg3 {

From 5c6dcd7f3b26736a88593586fbeec28b6a1ea78d Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime.ripard@free-electrons.com>
Date: Wed, 7 Oct 2015 18:39:40 +0100
Subject: [PATCH 046/199] MAINTAINERS: Update Allwinner entry and add new
 maintainer

Add Chen-Yu Tsai as a co-maintainer to the ARM sunxi support.

While we are doing so, also update the entry for new SoCs.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 MAINTAINERS | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f467845ef725..23b72245d829f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -894,11 +894,12 @@ M:	Lennert Buytenhek <kernel@wantstofly.org>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
 
-ARM/Allwinner A1X SoC support
+ARM/Allwinner sunXi SoC support
 M:	Maxime Ripard <maxime.ripard@free-electrons.com>
+M:	Chen-Yu Tsai <wens@csie.org>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
-N:	sun[x4567]i
+N:	sun[x456789]i
 
 ARM/Allwinner SoC Clock Support
 M:	Emilio López <emilio@elopez.com.ar>

From f9e5ca86eeaae430e70b05ec312372dca1055d8d Mon Sep 17 00:00:00 2001
From: Carlo Caione <carlo@endlessm.com>
Date: Thu, 1 Oct 2015 12:52:40 +0200
Subject: [PATCH 047/199] ARM: meson6: DTS: Fix wrong reg mapping and IRQ
 numbers

The DTS erronously uses the wrong reg mapping and IRQ numbers for some
UART, WDT and timer nodes. Fix this.

Reported-by: John Wehle <john@feith.com>
Signed-off-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/boot/dts/meson.dtsi | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/arm/boot/dts/meson.dtsi b/arch/arm/boot/dts/meson.dtsi
index 548441384d2a3..8c77c87660cdf 100644
--- a/arch/arm/boot/dts/meson.dtsi
+++ b/arch/arm/boot/dts/meson.dtsi
@@ -67,7 +67,7 @@
 
 	timer@c1109940 {
 		compatible = "amlogic,meson6-timer";
-		reg = <0xc1109940 0x14>;
+		reg = <0xc1109940 0x18>;
 		interrupts = <0 10 1>;
 	};
 
@@ -80,36 +80,37 @@
 		wdt: watchdog@c1109900 {
 			compatible = "amlogic,meson6-wdt";
 			reg = <0xc1109900 0x8>;
+			interrupts = <0 0 1>;
 		};
 
 		uart_AO: serial@c81004c0 {
 			compatible = "amlogic,meson-uart";
-			reg = <0xc81004c0 0x14>;
+			reg = <0xc81004c0 0x18>;
 			interrupts = <0 90 1>;
 			clocks = <&clk81>;
 			status = "disabled";
 		};
 
-		uart_A: serial@c81084c0 {
+		uart_A: serial@c11084c0 {
 			compatible = "amlogic,meson-uart";
-			reg = <0xc81084c0 0x14>;
-			interrupts = <0 90 1>;
+			reg = <0xc11084c0 0x18>;
+			interrupts = <0 26 1>;
 			clocks = <&clk81>;
 			status = "disabled";
 		};
 
-		uart_B: serial@c81084dc {
+		uart_B: serial@c11084dc {
 			compatible = "amlogic,meson-uart";
-			reg = <0xc81084dc 0x14>;
-			interrupts = <0 90 1>;
+			reg = <0xc11084dc 0x18>;
+			interrupts = <0 75 1>;
 			clocks = <&clk81>;
 			status = "disabled";
 		};
 
-		uart_C: serial@c8108700 {
+		uart_C: serial@c1108700 {
 			compatible = "amlogic,meson-uart";
-			reg = <0xc8108700 0x14>;
-			interrupts = <0 90 1>;
+			reg = <0xc1108700 0x18>;
+			interrupts = <0 93 1>;
 			clocks = <&clk81>;
 			status = "disabled";
 		};

From db347f1a5304d68c68c52f19971924b1e5842f3c Mon Sep 17 00:00:00 2001
From: Marcin Wojtas <mw@semihalf.com>
Date: Thu, 15 Oct 2015 03:17:08 +0200
Subject: [PATCH 048/199] ARM: mvebu: correct a385-db-ap compatible string

This commit enables standby support on Armada 385 DB-AP board, because
the PM initalization routine requires "marvell,armada380" compatible
string for all Armada 38x-based platforms.

Beside the compatible "marvell,armada38x" was wrong and should be fixed
in the stable kernels too.

[gregory.clement@free-electrons.com: add information, about the fixes]
Fixes: e5ee12817e9ea ("ARM: mvebu: Add Armada 385 Access Point
Development Board support")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: <stable@vger.kernel.org>
---
 arch/arm/boot/dts/armada-385-db-ap.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/armada-385-db-ap.dts b/arch/arm/boot/dts/armada-385-db-ap.dts
index 89f5a95954ed9..4047621b137e6 100644
--- a/arch/arm/boot/dts/armada-385-db-ap.dts
+++ b/arch/arm/boot/dts/armada-385-db-ap.dts
@@ -46,7 +46,7 @@
 
 / {
 	model = "Marvell Armada 385 Access Point Development Board";
-	compatible = "marvell,a385-db-ap", "marvell,armada385", "marvell,armada38x";
+	compatible = "marvell,a385-db-ap", "marvell,armada385", "marvell,armada380";
 
 	chosen {
 		stdout-path = "serial1:115200n8";

From d14f6fced5f9360edca5a1325ddb7077aab1203b Mon Sep 17 00:00:00 2001
From: Jay Cornwall <jay@jcornwall.me>
Date: Wed, 16 Sep 2015 14:10:03 -0500
Subject: [PATCH 049/199] iommu/amd: Fix BUG when faulting a PROT_NONE VMA

handle_mm_fault indirectly triggers a BUG in do_numa_page
when given a VMA without read/write/execute access. Check
this condition in do_fault.

do_fault -> handle_mm_fault -> handle_pte_fault -> do_numa_page

  mm/memory.c
  3147  static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
  ....
  3159  /* A PROT_NONE fault should not end up here */
  3160  BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));

Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu_v2.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 1131664b918b0..d21d4edf7236a 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -516,6 +516,13 @@ static void do_fault(struct work_struct *work)
 		goto out;
 	}
 
+	if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))) {
+		/* handle_mm_fault would BUG_ON() */
+		up_read(&mm->mmap_sem);
+		handle_fault_error(fault);
+		goto out;
+	}
+
 	ret = handle_mm_fault(mm, vma, address, write);
 	if (ret & VM_FAULT_ERROR) {
 		/* failed to service fault */

From f42d79ab67322e51b92dd7aa965e310c71352a64 Mon Sep 17 00:00:00 2001
From: Junichi Nomura <j-nomura@ce.jp.nec.com>
Date: Wed, 14 Oct 2015 05:02:15 +0000
Subject: [PATCH 050/199] blk-mq: fix use-after-free in blk_mq_free_tag_set()

tags is freed in blk_mq_free_rq_map() and should not be used after that.
The problem doesn't manifest if CONFIG_CPUMASK_OFFSTACK is false because
free_cpumask_var() is nop.

tags->cpumask is allocated in blk_mq_init_tags() so it's natural to
free cpumask in its counter part, blk_mq_free_tags().

Fixes: f26cdc8536ad ("blk-mq: Shared tag enhancements")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/blk-mq-tag.c | 1 +
 block/blk-mq.c     | 4 +---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index ed96474d75cb6..ec2d11915142a 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -641,6 +641,7 @@ void blk_mq_free_tags(struct blk_mq_tags *tags)
 {
 	bt_free(&tags->bitmap_tags);
 	bt_free(&tags->breserved_tags);
+	free_cpumask_var(tags->cpumask);
 	kfree(tags);
 }
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7785ae96267a1..85f014327342e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2296,10 +2296,8 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
 	int i;
 
 	for (i = 0; i < set->nr_hw_queues; i++) {
-		if (set->tags[i]) {
+		if (set->tags[i])
 			blk_mq_free_rq_map(set, set->tags[i], i);
-			free_cpumask_var(set->tags[i]->cpumask);
-		}
 	}
 
 	kfree(set->tags);

From b20519fd5007908c6a816cab1b9db911915e9fbd Mon Sep 17 00:00:00 2001
From: Pawel Moll <pawel.moll@arm.com>
Date: Thu, 15 Oct 2015 14:32:45 +0100
Subject: [PATCH 051/199] bus: arm-ccn: Handle correctly no-more-cpus case

When migrating events the driver picks another cpu using
cpumask_any_but() function, which returns value >= nr_cpu_ids
when there is none available, not a negative value as the code
assumed. Fixed now.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/bus/arm-ccn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/arm-ccn.c b/drivers/bus/arm-ccn.c
index 7d9879e166cf4..cc322fbde9af6 100644
--- a/drivers/bus/arm-ccn.c
+++ b/drivers/bus/arm-ccn.c
@@ -1184,7 +1184,7 @@ static int arm_ccn_pmu_cpu_notifier(struct notifier_block *nb,
 		if (!cpumask_test_and_clear_cpu(cpu, &dt->cpu))
 			break;
 		target = cpumask_any_but(cpu_online_mask, cpu);
-		if (target < 0)
+		if (target >= nr_cpu_ids)
 			break;
 		perf_pmu_migrate_context(&dt->pmu, cpu, target);
 		cpumask_set_cpu(target, &dt->cpu);

From a0bcbe969f564d1ec08658170dda72a1b7e9053a Mon Sep 17 00:00:00 2001
From: Pawel Moll <pawel.moll@arm.com>
Date: Thu, 15 Oct 2015 14:32:46 +0100
Subject: [PATCH 052/199] bus: arm-ccn: Fix irq affinity setting on CPU
 migration

When PMU context is migrating between CPUs, interrupt affinity is set as
well. Only this should not happen when the CCN interrupt is not being
used at all (the driver is using a hrtimer tick instead).

Fixed now.

Cc: <stable@vger.kernel.org> # 4.2+
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/bus/arm-ccn.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/arm-ccn.c b/drivers/bus/arm-ccn.c
index cc322fbde9af6..7082c72688456 100644
--- a/drivers/bus/arm-ccn.c
+++ b/drivers/bus/arm-ccn.c
@@ -1188,7 +1188,8 @@ static int arm_ccn_pmu_cpu_notifier(struct notifier_block *nb,
 			break;
 		perf_pmu_migrate_context(&dt->pmu, cpu, target);
 		cpumask_set_cpu(target, &dt->cpu);
-		WARN_ON(irq_set_affinity(ccn->irq, &dt->cpu) != 0);
+		if (ccn->irq)
+			WARN_ON(irq_set_affinity(ccn->irq, &dt->cpu) != 0);
 	default:
 		break;
 	}

From fb659882cc6482bd2e32ec0ab8ab7afeda649413 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Mon, 12 Oct 2015 14:48:39 +0100
Subject: [PATCH 053/199] drivers/perf: arm_pmu: avoid CPU device_node
 reference leak

of_cpu_device_node_get increments the reference count on the CPU
device_node, so we must take care to of_node_put once we've finished
with it.

This patch fixes the perf IRQ probing code to avoid the leak.

Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/perf/arm_pmu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2365a32a595e4..be3755c973e96 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -823,9 +823,15 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)
 		}
 
 		/* Now look up the logical CPU number */
-		for_each_possible_cpu(cpu)
-			if (dn == of_cpu_device_node_get(cpu))
+		for_each_possible_cpu(cpu) {
+			struct device_node *cpu_dn;
+
+			cpu_dn = of_cpu_device_node_get(cpu);
+			of_node_put(cpu_dn);
+
+			if (dn == cpu_dn)
 				break;
+		}
 
 		if (cpu >= nr_cpu_ids) {
 			pr_warn("Failed to find logical CPU for %s\n",

From 2e4e5da55afaf9315f2398e85424fd3824459220 Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro@socionext.com>
Date: Thu, 15 Oct 2015 20:32:05 +0900
Subject: [PATCH 054/199] ARM: dts: uniphier: fix IRQ number for devices on
 PH1-LD6b ref board

The IRQ signal from external devices on this board is connected to
the XIRQ4 pin of the SoC.  The IRQ number should be 52, not 50.

Fixes: a5e921b4771f ("ARM: dts: uniphier: add ProXstream2 and PH1-LD6b SoC/board support")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/boot/dts/uniphier-ph1-ld6b-ref.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/uniphier-ph1-ld6b-ref.dts b/arch/arm/boot/dts/uniphier-ph1-ld6b-ref.dts
index 33963acd7e8f9..f80f772d99fb5 100644
--- a/arch/arm/boot/dts/uniphier-ph1-ld6b-ref.dts
+++ b/arch/arm/boot/dts/uniphier-ph1-ld6b-ref.dts
@@ -85,7 +85,7 @@
 };
 
 &ethsc {
-	interrupts = <0 50 4>;
+	interrupts = <0 52 4>;
 };
 
 &serial0 {

From 81c04b943872e0332872df18cec1dec89b178b4d Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Mon, 12 Oct 2015 21:23:39 +0200
Subject: [PATCH 055/199] nvme: use an integer value to Linux errno values

Use a separate integer variable to hold the signed Linux errno
values we pass back to the block layer.  Note that for pass through
commands those might still be NVMe values, but those fit into the
int as well.

Fixes: f4829a9b7a61: ("blk-mq: fix racy updates of rq->errors")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/nvme-core.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index fce353ba5f66c..84e4a80883867 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -603,8 +603,8 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 	struct nvme_iod *iod = ctx;
 	struct request *req = iod_get_private(iod);
 	struct nvme_cmd_info *cmd_rq = blk_mq_rq_to_pdu(req);
-
 	u16 status = le16_to_cpup(&cqe->status) >> 1;
+	int error = 0;
 
 	if (unlikely(status)) {
 		if (!(status & NVME_SC_DNR || blk_noretry_request(req))
@@ -621,9 +621,11 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 
 		if (req->cmd_type == REQ_TYPE_DRV_PRIV) {
 			if (cmd_rq->ctx == CMD_CTX_CANCELLED)
-				status = -EINTR;
+				error = -EINTR;
+			else
+				error = status;
 		} else {
-			status = nvme_error_status(status);
+			error = nvme_error_status(status);
 		}
 	}
 
@@ -635,7 +637,7 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 	if (cmd_rq->aborted)
 		dev_warn(nvmeq->dev->dev,
 			"completing aborted command with status:%04x\n",
-			status);
+			error);
 
 	if (iod->nents) {
 		dma_unmap_sg(nvmeq->dev->dev, iod->sg, iod->nents,
@@ -649,7 +651,7 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 	}
 	nvme_free_iod(nvmeq->dev, iod);
 
-	blk_mq_complete_request(req, status);
+	blk_mq_complete_request(req, error);
 }
 
 /* length is in bytes.  gfp flags indicates whether we may sleep. */

From b02176f30cd30acccd3b633ab7d9aed8b5da52ff Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 8 Sep 2015 12:20:22 -0400
Subject: [PATCH 056/199] block: don't release bdi while request_queue has live
 references

bdi's are initialized in two steps, bdi_init() and bdi_register(), but
destroyed in a single step by bdi_destroy() which, for a bdi embedded
in a request_queue, is called during blk_cleanup_queue() which makes
the queue invisible and starts the draining of remaining usages.

A request_queue's user can access the congestion state of the embedded
bdi as long as it holds a reference to the queue.  As such, it may
access the congested state of a queue which finished
blk_cleanup_queue() but hasn't reached blk_release_queue() yet.
Because the congested state was embedded in backing_dev_info which in
turn is embedded in request_queue, accessing the congested state after
bdi_destroy() was called was fine.  The bdi was destroyed but the
memory region for the congested state remained accessible till the
queue got released.

a13f35e87140 ("writeback: don't embed root bdi_writeback_congested in
bdi_writeback") changed the situation.  Now, the root congested state
which is expected to be pinned while request_queue remains accessible
is separately reference counted and the base ref is put during
bdi_destroy().  This means that the root congested state may go away
prematurely while the queue is between bdi_dstroy() and
blk_cleanup_queue(), which was detected by Andrey's KASAN tests.

The root cause of this problem is that bdi doesn't distinguish the two
steps of destruction, unregistration and release, and now the root
congested state actually requires a separate release step.  To fix the
issue, this patch separates out bdi_unregister() and bdi_exit() from
bdi_destroy().  bdi_unregister() is called from blk_cleanup_queue()
and bdi_exit() from blk_release_queue().  bdi_destroy() is now just a
simple wrapper calling the two steps back-to-back.

While at it, the prototype of bdi_destroy() is moved right below
bdi_setup_and_register() so that the counterpart operations are
located together.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: a13f35e87140 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback")
Cc: stable@vger.kernel.org # v4.2+
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
Link: http://lkml.kernel.org/g/CAAeHK+zUJ74Zn17=rOyxacHU18SgCfC6bsYW=6kCY5GXJBwGfQ@mail.gmail.com
Reviewed-by: Jan Kara <jack@suse.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/blk-core.c            |  2 +-
 block/blk-sysfs.c           |  1 +
 include/linux/backing-dev.h |  6 +++++-
 mm/backing-dev.c            | 12 +++++++++++-
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eb722d48773c..18e92a6645e24 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -576,7 +576,7 @@ void blk_cleanup_queue(struct request_queue *q)
 		q->queue_lock = &q->__queue_lock;
 	spin_unlock_irq(lock);
 
-	bdi_destroy(&q->backing_dev_info);
+	bdi_unregister(&q->backing_dev_info);
 
 	/* @q is and will stay empty, shutdown and put */
 	blk_put_queue(q);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 3e44a9da2a135..07b42f5ad797b 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -540,6 +540,7 @@ static void blk_release_queue(struct kobject *kobj)
 	struct request_queue *q =
 		container_of(kobj, struct request_queue, kobj);
 
+	bdi_exit(&q->backing_dev_info);
 	blkcg_exit_queue(q);
 
 	if (q->elevator) {
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 78677e5a65bf1..c85f74946a8ba 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -19,13 +19,17 @@
 #include <linux/slab.h>
 
 int __must_check bdi_init(struct backing_dev_info *bdi);
-void bdi_destroy(struct backing_dev_info *bdi);
+void bdi_exit(struct backing_dev_info *bdi);
 
 __printf(3, 4)
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
 		const char *fmt, ...);
 int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
+void bdi_unregister(struct backing_dev_info *bdi);
+
 int __must_check bdi_setup_and_register(struct backing_dev_info *, char *);
+void bdi_destroy(struct backing_dev_info *bdi);
+
 void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			bool range_cyclic, enum wb_reason reason);
 void wb_start_background_writeback(struct bdi_writeback *wb);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index e92d77937fd36..9e841399041a4 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -835,7 +835,7 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
 	synchronize_rcu_expedited();
 }
 
-void bdi_destroy(struct backing_dev_info *bdi)
+void bdi_unregister(struct backing_dev_info *bdi)
 {
 	/* make sure nobody finds us on the bdi_list anymore */
 	bdi_remove_from_list(bdi);
@@ -847,9 +847,19 @@ void bdi_destroy(struct backing_dev_info *bdi)
 		device_unregister(bdi->dev);
 		bdi->dev = NULL;
 	}
+}
 
+void bdi_exit(struct backing_dev_info *bdi)
+{
+	WARN_ON_ONCE(bdi->dev);
 	wb_exit(&bdi->wb);
 }
+
+void bdi_destroy(struct backing_dev_info *bdi)
+{
+	bdi_unregister(bdi);
+	bdi_exit(bdi);
+}
 EXPORT_SYMBOL(bdi_destroy);
 
 /*

From 4f1d841475e1f6e9e32496dda11215db56f4ea73 Mon Sep 17 00:00:00 2001
From: Thierry Reding <treding@nvidia.com>
Date: Fri, 9 Oct 2015 17:51:47 +0200
Subject: [PATCH 057/199] ARM: tegra: Comment out gpio-ranges properties

While the addition of these properties is technically correct it unveils
a bug with deferred probe. The problem is that the presence of the gpio-
range property causes the gpio-tegra driver to defer probe (it needs the
pinctrl driver to be ready). That's technically correct, but it causes a
couple of issues:

  - The keyboard on Chromebooks stops working. The reason for that is
    that the gpio-tegra device has not registered an IRQ domain by the
    time the EC SPI device is registered, hence the interrupt number
    resolves to 0. This is technically a bug in the SPI core, since it
    should really resolve the interrupt at probe time and defer if the
    IRQ domain isn't available yet. This is similar to what's done for
    I2C and platform device already.

  - The gpio-tegra device deferring probe means that it is moved to the
    end of the dpm_list. This list defines the suspend/resume order for
    devices. However the core lacks a way to move all users of the
    gpio-tegra device to the end of the dpm_list at the same time. This
    in turn results in a subtle bug on Jetson TK1, where the gpio-keys
    device is used to expose the power key as input. The power key is a
    convenient way to wake the system from suspend. Interestingly, the
    gpio-keys device ends up getting probed at a point after gpio-tegra
    has been probed successfully from having been deferred earlier. As
    such the driver doesn't need to defer the probe itself, and hence
    the device isn't moved to the end of the dpm_list. This causes the
    gpio-tegra device to be suspended before gpio-keys, which in turn
    leaves gpio-keys unable to wake the system from suspend.

There are patches in the works to fix both of the above issues, but they
are too involved to make it into v4.3, so in the meantime let's fix the
regressions by commenting out the gpio-ranges properties until the fixes
have landed.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/boot/dts/tegra114.dtsi | 2 ++
 arch/arm/boot/dts/tegra124.dtsi | 2 ++
 arch/arm/boot/dts/tegra20.dtsi  | 2 ++
 arch/arm/boot/dts/tegra30.dtsi  | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/tegra114.dtsi b/arch/arm/boot/dts/tegra114.dtsi
index 9d4f86e9c50ad..d845bd1448b54 100644
--- a/arch/arm/boot/dts/tegra114.dtsi
+++ b/arch/arm/boot/dts/tegra114.dtsi
@@ -234,7 +234,9 @@
 		gpio-controller;
 		#interrupt-cells = <2>;
 		interrupt-controller;
+		/*
 		gpio-ranges = <&pinmux 0 0 246>;
+		*/
 	};
 
 	apbmisc@70000800 {
diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 1e204a6de12c3..819e2ae2cabe2 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -258,7 +258,9 @@
 		gpio-controller;
 		#interrupt-cells = <2>;
 		interrupt-controller;
+		/*
 		gpio-ranges = <&pinmux 0 0 251>;
+		*/
 	};
 
 	apbdma: dma@0,60020000 {
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index e058709e6d98c..969b828505ae4 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -244,7 +244,9 @@
 		gpio-controller;
 		#interrupt-cells = <2>;
 		interrupt-controller;
+		/*
 		gpio-ranges = <&pinmux 0 0 224>;
+		*/
 	};
 
 	apbmisc@70000800 {
diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
index fe04fb5e155f4..c6938ad1b543f 100644
--- a/arch/arm/boot/dts/tegra30.dtsi
+++ b/arch/arm/boot/dts/tegra30.dtsi
@@ -349,7 +349,9 @@
 		gpio-controller;
 		#interrupt-cells = <2>;
 		interrupt-controller;
+		/*
 		gpio-ranges = <&pinmux 0 0 248>;
+		*/
 	};
 
 	apbmisc@70000800 {

From f05819df10d7b09f6d1eb6f8534a8f68e5a4fe61 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells@redhat.com>
Date: Thu, 15 Oct 2015 17:21:37 +0100
Subject: [PATCH 058/199] KEYS: Fix crash when attempt to garbage collect an
 uninstantiated keyring

The following sequence of commands:

    i=`keyctl add user a a @s`
    keyctl request2 keyring foo bar @t
    keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set.  However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code.  When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:

	BUG: unable to handle kernel paging request at 00000000ffffff8a
	IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
	...
	Workqueue: events key_garbage_collector
	...
	RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
	RSP: 0018:ffff88003e2f3d30  EFLAGS: 00010203
	RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
	RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
	RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
	R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
	R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
	...
	CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
	...
	Call Trace:
	 [<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
	 [<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
	 [<ffffffff8105ec9b>] process_one_work+0x28e/0x547
	 [<ffffffff8105fd17>] worker_thread+0x26e/0x361
	 [<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
	 [<ffffffff810648ad>] kthread+0xf3/0xfb
	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
	 [<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX.  This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully
instantiated.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
---
 security/keys/gc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 39eac1fd5706c..addf060399e09 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -134,8 +134,10 @@ static noinline void key_gc_unused_keys(struct list_head *keys)
 		kdebug("- %u", key->serial);
 		key_check(key);
 
-		/* Throw away the key data */
-		if (key->type->destroy)
+		/* Throw away the key data if the key is instantiated */
+		if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags) &&
+		    !test_bit(KEY_FLAG_NEGATIVE, &key->flags) &&
+		    key->type->destroy)
 			key->type->destroy(key);
 
 		security_key_free(key);

From 17b38fb89055bf5df402980c9546a8b046552f2b Mon Sep 17 00:00:00 2001
From: Doron Tsur <doront@mellanox.com>
Date: Thu, 15 Oct 2015 15:01:02 +0300
Subject: [PATCH 059/199] IB/core: Fix memory corruption in
 ib_cache_gid_set_default_gid

When ib_cache_gid_set_default_gid is called from several threads,
updating the table could make find_gid fail, therefore a negative
index will be retruned and an invalid table entry will be used.
Locking find_gid as well fixes this problem.

Fixes: 03db3a2d81e6 ('IB/core: Add RoCE GID table management')
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 8f66c67ff0df0..87471ef371986 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -508,12 +508,12 @@ void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
 	memset(&gid_attr, 0, sizeof(gid_attr));
 	gid_attr.ndev = ndev;
 
+	mutex_lock(&table->lock);
 	ix = find_gid(table, NULL, NULL, true, GID_ATTR_FIND_MASK_DEFAULT);
 
 	/* Coudn't find default GID location */
 	WARN_ON(ix < 0);
 
-	mutex_lock(&table->lock);
 	if (!__ib_cache_gid_get(ib_dev, port, ix,
 				&current_gid, &current_gid_attr) &&
 	    mode == IB_CACHE_GID_DEFAULT_MODE_SET &&

From 0dfc70c33409afc232ef0b9ec210535dfbf9bc61 Mon Sep 17 00:00:00 2001
From: Keith Busch <keith.busch@intel.com>
Date: Thu, 15 Oct 2015 13:38:48 -0600
Subject: [PATCH 060/199] NVMe: Fix memory leak on retried commands

Resources are reallocated for requeued commands, so unmap and release
the iod for the failed command.

It's a pretty bad memory leak and causes a kernel hang if you remove a
drive because of a busy dma pool. You'll get messages spewing like this:

  nvme 0000:xx:xx.x: dma_pool_destroy prp list 256, ffff880420dec000 busy

and lock up pci and the driver since removal never completes while
holding a lock.

Cc: stable@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.0.x-
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/nvme-core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 84e4a80883867..ccc0c1f93daa4 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -604,6 +604,7 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 	struct request *req = iod_get_private(iod);
 	struct nvme_cmd_info *cmd_rq = blk_mq_rq_to_pdu(req);
 	u16 status = le16_to_cpup(&cqe->status) >> 1;
+	bool requeue = false;
 	int error = 0;
 
 	if (unlikely(status)) {
@@ -611,12 +612,13 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 		    && (jiffies - req->start_time) < req->timeout) {
 			unsigned long flags;
 
+			requeue = true;
 			blk_mq_requeue_request(req);
 			spin_lock_irqsave(req->q->queue_lock, flags);
 			if (!blk_queue_stopped(req->q))
 				blk_mq_kick_requeue_list(req->q);
 			spin_unlock_irqrestore(req->q->queue_lock, flags);
-			return;
+			goto release_iod;
 		}
 
 		if (req->cmd_type == REQ_TYPE_DRV_PRIV) {
@@ -639,6 +641,7 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 			"completing aborted command with status:%04x\n",
 			error);
 
+release_iod:
 	if (iod->nents) {
 		dma_unmap_sg(nvmeq->dev->dev, iod->sg, iod->nents,
 			rq_data_dir(req) ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
@@ -651,7 +654,8 @@ static void req_completion(struct nvme_queue *nvmeq, void *ctx,
 	}
 	nvme_free_iod(nvmeq->dev, iod);
 
-	blk_mq_complete_request(req, error);
+	if (likely(!requeue))
+		blk_mq_complete_request(req, error);
 }
 
 /* length is in bytes.  gfp flags indicates whether we may sleep. */

From f5f3497cad8c8416a74b9aaceb127908755d020a Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 14 Oct 2015 13:30:45 +0200
Subject: [PATCH 061/199] x86/setup: Extend low identity map to cover whole
 kernel range

On 32-bit systems, the initial_page_table is reused by
efi_call_phys_prolog as an identity map to call
SetVirtualAddressMap.  efi_call_phys_prolog takes care of
converting the current CPU's GDT to a physical address too.

For PAE kernels the identity mapping is achieved by aliasing the
first PDPE for the kernel memory mapping into the first PDPE
of initial_page_table.  This makes the EFI stub's trick "just work".

However, for non-PAE kernels there is no guarantee that the identity
mapping in the initial_page_table extends as far as the GDT; in this
case, accesses to the GDT will cause a page fault (which quickly becomes
a triple fault).  Fix this by copying the kernel mappings from
swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
identity mapping.

For some reason, this is only reproducible with QEMU's dynamic translation
mode, and not for example with KVM.  However, even under KVM one can clearly
see that the page table is bogus:

    $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize
    $ gdb
    (gdb) target remote localhost:1234
    (gdb) hb *0x02858f6f
    Hardware assisted breakpoint 1 at 0x2858f6f
    (gdb) c
    Continuing.

    Breakpoint 1, 0x02858f6f in ?? ()
    (gdb) monitor info registers
    ...
    GDT=     0724e000 000000ff
    IDT=     fffbb000 000007ff
    CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690
    ...

The page directory is sane:

    (gdb) x/4wx 0x32b7000
    0x32b7000:	0x03398063	0x03399063	0x0339a063	0x0339b063
    (gdb) x/4wx 0x3398000
    0x3398000:	0x00000163	0x00001163	0x00002163	0x00003163
    (gdb) x/4wx 0x3399000
    0x3399000:	0x00400003	0x00401003	0x00402003	0x00403003

but our particular page directory entry is empty:

    (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4
    0x32b7070:	0x00000000

[ It appears that you can skate past this issue if you don't receive
  any interrupts while the bogus GDT pointer is loaded, or if you avoid
  reloading the segment registers in general.

  Andy Lutomirski provides some additional insight:

   "AFAICT it's entirely permissible for the GDTR and/or LDT
    descriptor to point to unmapped memory.  Any attempt to use them
    (segment loads, interrupts, IRET, etc) will try to access that memory
    as if the access came from CPL 0 and, if the access fails, will
    generate a valid page fault with CR2 pointing into the GDT or
    LDT."

  Up until commit 23a0d4e8fa6d ("efi: Disable interrupts around EFI
  calls, not in the epilog/prolog calls") interrupts were disabled
  around the prolog and epilog calls, and the functional GDT was
  re-installed before interrupts were re-enabled.

  Which explains why no one has hit this issue until now. ]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ Updated changelog. ]
---
 arch/x86/kernel/setup.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fdb7f2a2d3286..a3cccbfc5f777 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1173,6 +1173,14 @@ void __init setup_arch(char **cmdline_p)
 	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
 			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
 			KERNEL_PGD_PTRS);
+
+	/*
+	 * sync back low identity map too.  It is used for example
+	 * in the 32-bit EFI stub.
+	 */
+	clone_pgd_range(initial_page_table,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			KERNEL_PGD_PTRS);
 #endif
 
 	tboot_probe();

From 7ba6e4ef76c7e43101bd5e0f8987c11a8ed0d325 Mon Sep 17 00:00:00 2001
From: Bard Liao <bardliao@realtek.com>
Date: Fri, 16 Oct 2015 15:21:32 +0800
Subject: [PATCH 062/199] ASoC: rt298: correct index default value

Some of the default value on rt298_index_def are incorrect. Change
them to the correct value.

Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/codecs/rt298.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/sound/soc/codecs/rt298.c b/sound/soc/codecs/rt298.c
index 3c2f0f8d6266e..d3e30a645ae30 100644
--- a/sound/soc/codecs/rt298.c
+++ b/sound/soc/codecs/rt298.c
@@ -50,24 +50,24 @@ struct rt298_priv {
 };
 
 static struct reg_default rt298_index_def[] = {
-	{ 0x01, 0xaaaa },
-	{ 0x02, 0x8aaa },
+	{ 0x01, 0xa5a8 },
+	{ 0x02, 0x8e95 },
 	{ 0x03, 0x0002 },
-	{ 0x04, 0xaf01 },
-	{ 0x08, 0x000d },
-	{ 0x09, 0xd810 },
-	{ 0x0a, 0x0120 },
+	{ 0x04, 0xaf67 },
+	{ 0x08, 0x200f },
+	{ 0x09, 0xd010 },
+	{ 0x0a, 0x0100 },
 	{ 0x0b, 0x0000 },
 	{ 0x0d, 0x2800 },
-	{ 0x0f, 0x0000 },
-	{ 0x19, 0x0a17 },
+	{ 0x0f, 0x0022 },
+	{ 0x19, 0x0217 },
 	{ 0x20, 0x0020 },
 	{ 0x33, 0x0208 },
 	{ 0x46, 0x0300 },
-	{ 0x49, 0x0004 },
-	{ 0x4f, 0x50e9 },
-	{ 0x50, 0x2000 },
-	{ 0x63, 0x2902 },
+	{ 0x49, 0x4004 },
+	{ 0x4f, 0x50c9 },
+	{ 0x50, 0x3000 },
+	{ 0x63, 0x1b02 },
 	{ 0x67, 0x1111 },
 	{ 0x68, 0x1016 },
 	{ 0x69, 0x273f },

From c0ff971ef9acacd4d2caa508e444edad958dead9 Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Thu, 15 Oct 2015 19:42:23 +0200
Subject: [PATCH 063/199] x86/ioapic: Disable interrupts when re-routing legacy
 IRQs

A sporadic hang with consequent crash is observed when booting Hyper-V Gen1
guests:

 Call Trace:
  <IRQ>
  [<ffffffff810ab68d>] ? trace_hardirqs_off+0xd/0x10
  [<ffffffff8107b616>] queue_work_on+0x46/0x90
  [<ffffffff81365696>] ? add_interrupt_randomness+0x176/0x1d0
  ...
  <EOI>
  [<ffffffff81471ddb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60
  [<ffffffff810c295e>] __irq_put_desc_unlock+0x1e/0x40
  [<ffffffff810c5c35>] irq_modify_status+0xb5/0xd0
  [<ffffffff8104adbb>] mp_register_handler+0x4b/0x70
  [<ffffffff8104c55a>] mp_irqdomain_alloc+0x1ea/0x2a0
  [<ffffffff810c7f10>] irq_domain_alloc_irqs_recursive+0x40/0xa0
  [<ffffffff810c860c>] __irq_domain_alloc_irqs+0x13c/0x2b0
  [<ffffffff8104b070>] alloc_isa_irq_from_domain.isra.1+0xc0/0xe0
  [<ffffffff8104bfa5>] mp_map_pin_to_irq+0x165/0x2d0
  [<ffffffff8104c157>] pin_2_irq+0x47/0x80
  [<ffffffff81744253>] setup_IO_APIC+0xfe/0x802
  ...
  [<ffffffff814631c0>] ? rest_init+0x140/0x140

The issue is easily reproducible with a simple instrumentation: if
mdelay(10) is put between mp_setup_entry() and mp_register_handler() calls
in mp_irqdomain_alloc() Hyper-V guest always fails to boot when re-routing
IRQ0. The issue seems to be caused by the fact that we don't disable
interrupts while doing IOPIC programming for legacy IRQs and IRQ0 actually
happens.

Protect the setup sequence against concurrent interrupts.

[ tglx: Make the protection unconditional and not only for legacy
  	interrupts ]

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1444930943-19336-1-git-send-email-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/io_apic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 5c60bb1626220..bb6bfc01cb82d 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2907,6 +2907,7 @@ int mp_irqdomain_alloc(struct irq_domain *domain, unsigned int virq,
 	struct irq_data *irq_data;
 	struct mp_chip_data *data;
 	struct irq_alloc_info *info = arg;
+	unsigned long flags;
 
 	if (!info || nr_irqs > 1)
 		return -EINVAL;
@@ -2939,11 +2940,14 @@ int mp_irqdomain_alloc(struct irq_domain *domain, unsigned int virq,
 
 	cfg = irqd_cfg(irq_data);
 	add_pin_to_irq_node(data, ioapic_alloc_attr_node(info), ioapic, pin);
+
+	local_irq_save(flags);
 	if (info->ioapic_entry)
 		mp_setup_entry(cfg, data, info->ioapic_entry);
 	mp_register_handler(virq, data->trigger);
 	if (virq < nr_legacy_irqs())
 		legacy_pic->mask(virq);
+	local_irq_restore(flags);
 
 	apic_printk(APIC_VERBOSE, KERN_DEBUG
 		    "IOAPIC[%d]: Set routing entry (%d-%d -> 0x%x -> IRQ %d Mode:%i Active:%i Dest:%d)\n",

From 34198710f55b5f359f43e67d9a08fe5aadfbca1b Mon Sep 17 00:00:00 2001
From: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Date: Wed, 14 Oct 2015 13:31:24 +0100
Subject: [PATCH 064/199] ASoC: Add info callback for SX_TLV controls

SX_TLV controls are intended for situations where the register behind
the control has some non-zero value indicating the minimum gain
and then gains increasing from there and eventually overflowing through
zero.

Currently every CODEC implementing these controls specifies the minimum
as the non-zero value for the minimum and the maximum as the number of
gain settings available.

This means when the info callback subtracts the minimum value from the
maximum value to calculate the number of gain levels available it is
actually under reporting the available levels. This patch fixes this
issue by adding a new snd_soc_info_volsw_sx callback that does not
subtract the minimum value.

Fixes: 1d99f2436d0d ("ASoC: core: Rework SOC_DOUBLE_R_SX_TLV add SOC_SINGLE_SX_TLV")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Brian Austin <brian.austin@cirrus.com>
Tested-by: Brian Austin <brian.austin@cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
 include/sound/soc.h |  6 ++++--
 sound/soc/soc-ops.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/include/sound/soc.h b/include/sound/soc.h
index 884e728b09d9a..26ede14597dab 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -86,7 +86,7 @@
 	.access = SNDRV_CTL_ELEM_ACCESS_TLV_READ | \
 	SNDRV_CTL_ELEM_ACCESS_READWRITE, \
 	.tlv.p  = (tlv_array),\
-	.info = snd_soc_info_volsw, \
+	.info = snd_soc_info_volsw_sx, \
 	.get = snd_soc_get_volsw_sx,\
 	.put = snd_soc_put_volsw_sx, \
 	.private_value = (unsigned long)&(struct soc_mixer_control) \
@@ -156,7 +156,7 @@
 	.access = SNDRV_CTL_ELEM_ACCESS_TLV_READ | \
 	SNDRV_CTL_ELEM_ACCESS_READWRITE, \
 	.tlv.p  = (tlv_array), \
-	.info = snd_soc_info_volsw, \
+	.info = snd_soc_info_volsw_sx, \
 	.get = snd_soc_get_volsw_sx, \
 	.put = snd_soc_put_volsw_sx, \
 	.private_value = (unsigned long)&(struct soc_mixer_control) \
@@ -574,6 +574,8 @@ int snd_soc_put_enum_double(struct snd_kcontrol *kcontrol,
 	struct snd_ctl_elem_value *ucontrol);
 int snd_soc_info_volsw(struct snd_kcontrol *kcontrol,
 	struct snd_ctl_elem_info *uinfo);
+int snd_soc_info_volsw_sx(struct snd_kcontrol *kcontrol,
+			  struct snd_ctl_elem_info *uinfo);
 #define snd_soc_info_bool_ext		snd_ctl_boolean_mono_info
 int snd_soc_get_volsw(struct snd_kcontrol *kcontrol,
 	struct snd_ctl_elem_value *ucontrol);
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 100d92b5b77ef..05977ae1ff2a3 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -206,6 +206,34 @@ int snd_soc_info_volsw(struct snd_kcontrol *kcontrol,
 }
 EXPORT_SYMBOL_GPL(snd_soc_info_volsw);
 
+/**
+ * snd_soc_info_volsw_sx - Mixer info callback for SX TLV controls
+ * @kcontrol: mixer control
+ * @uinfo: control element information
+ *
+ * Callback to provide information about a single mixer control, or a double
+ * mixer control that spans 2 registers of the SX TLV type. SX TLV controls
+ * have a range that represents both positive and negative values either side
+ * of zero but without a sign bit.
+ *
+ * Returns 0 for success.
+ */
+int snd_soc_info_volsw_sx(struct snd_kcontrol *kcontrol,
+			  struct snd_ctl_elem_info *uinfo)
+{
+	struct soc_mixer_control *mc =
+		(struct soc_mixer_control *)kcontrol->private_value;
+
+	snd_soc_info_volsw(kcontrol, uinfo);
+	/* Max represents the number of levels in an SX control not the
+	 * maximum value, so add the minimum value back on
+	 */
+	uinfo->value.integer.max += mc->min;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snd_soc_info_volsw_sx);
+
 /**
  * snd_soc_get_volsw - single mixer get callback
  * @kcontrol: mixer control

From 6a3b764b8dc781c36f0f94287df5b2ec23b8fdd7 Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony@atomide.com>
Date: Fri, 16 Oct 2015 12:16:21 -0700
Subject: [PATCH 065/199] ARM: OMAP2+: Fix oops with LPAE and more than 2GB of
 memory

On boards with more than 2GB of RAM booting goes wrong with things not
working and we're getting lots of l3 warnings:

WARNING: CPU: 0 PID: 1 at drivers/bus/omap_l3_noc.c:147
l3_interrupt_handler+0x260/0x384()
44000000.ocp:L3 Custom Error: MASTER MMC6 TARGET DMM1 (Idle):
Data Access in User mode during Functional access
...
[<c044e158>] (scsi_add_host_with_dma) from [<c04705c8>]
(ata_scsi_add_hosts+0x5c/0x18c)
[<c04705c8>] (ata_scsi_add_hosts) from [<c046b13c>]
(ata_host_register+0x150/0x2cc)
[<c046b13c>] (ata_host_register) from [<c046b38c>]
(ata_host_activate+0xd4/0x124)
[<c046b38c>] (ata_host_activate) from [<c047f42c>]
(ahci_host_activate+0x5c/0x194)
[<c047f42c>] (ahci_host_activate) from [<c0480854>]
(ahci_platform_init_host+0x1f0/0x3f0)
[<c0480854>] (ahci_platform_init_host) from [<c047c9dc>]
(ahci_probe+0x70/0x98)
[<c047c9dc>] (ahci_probe) from [<c04220cc>]
(platform_drv_probe+0x54/0xb4)

Let's fix the issue by enabling ZONE_DMA for LPAE. Note that we need to
limit dma_zone_size to 2GB as the rest of the RAM is beyond the 4GB limit.

Let's also fix things for dra7 as done in similar patches in the TI tree
by Lokesh Vutla <lokeshvutla@ti.com>.

Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 arch/arm/mach-omap2/Kconfig         | 2 ++
 arch/arm/mach-omap2/board-generic.c | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index b3a0dff67e3fc..33d1460a56391 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -49,6 +49,7 @@ config SOC_OMAP5
 	select OMAP_INTERCONNECT
 	select OMAP_INTERCONNECT_BARRIER
 	select PM_OPP if PM
+	select ZONE_DMA if ARM_LPAE
 
 config SOC_AM33XX
 	bool "TI AM33XX"
@@ -78,6 +79,7 @@ config SOC_DRA7XX
 	select OMAP_INTERCONNECT
 	select OMAP_INTERCONNECT_BARRIER
 	select PM_OPP if PM
+	select ZONE_DMA if ARM_LPAE
 
 config ARCH_OMAP2PLUS
 	bool
diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
index 6133eaac685df..6b59230cd6be7 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -243,6 +243,9 @@ static const char *const omap5_boards_compat[] __initconst = {
 };
 
 DT_MACHINE_START(OMAP5_DT, "Generic OMAP5 (Flattened Device Tree)")
+#if defined(CONFIG_ZONE_DMA) && defined(CONFIG_ARM_LPAE)
+	.dma_zone_size	= SZ_2G,
+#endif
 	.reserve	= omap_reserve,
 	.smp		= smp_ops(omap4_smp_ops),
 	.map_io		= omap5_map_io,
@@ -288,6 +291,9 @@ static const char *const dra74x_boards_compat[] __initconst = {
 };
 
 DT_MACHINE_START(DRA74X_DT, "Generic DRA74X (Flattened Device Tree)")
+#if defined(CONFIG_ZONE_DMA) && defined(CONFIG_ARM_LPAE)
+	.dma_zone_size	= SZ_2G,
+#endif
 	.reserve	= omap_reserve,
 	.smp		= smp_ops(omap4_smp_ops),
 	.map_io		= dra7xx_map_io,
@@ -308,6 +314,9 @@ static const char *const dra72x_boards_compat[] __initconst = {
 };
 
 DT_MACHINE_START(DRA72X_DT, "Generic DRA72X (Flattened Device Tree)")
+#if defined(CONFIG_ZONE_DMA) && defined(CONFIG_ARM_LPAE)
+	.dma_zone_size	= SZ_2G,
+#endif
 	.reserve	= omap_reserve,
 	.map_io		= dra7xx_map_io,
 	.init_early	= dra7xx_init_early,

From b28fec1324bf8f5010d2c3c5d57db4115bda66d4 Mon Sep 17 00:00:00 2001
From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Date: Sat, 17 Oct 2015 08:08:56 +0900
Subject: [PATCH 066/199] thermal: exynos: Fix register read in TMU

The value of emul_con was getting overwritten if the selected soc is
SOC_ARCH_EXYNOS5260. And so as a result we were reading from the wrong
register in the case of SOC_ARCH_EXYNOS5260.

Fixes: 488c7455d74c ("thermal: exynos: Add the support for Exynos5433 TMU")
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
---
 drivers/thermal/samsung/exynos_tmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/samsung/exynos_tmu.c b/drivers/thermal/samsung/exynos_tmu.c
index 0bae8cc6c23a0..ca920b0ecf8f8 100644
--- a/drivers/thermal/samsung/exynos_tmu.c
+++ b/drivers/thermal/samsung/exynos_tmu.c
@@ -932,7 +932,7 @@ static void exynos4412_tmu_set_emulation(struct exynos_tmu_data *data,
 
 	if (data->soc == SOC_ARCH_EXYNOS5260)
 		emul_con = EXYNOS5260_EMUL_CON;
-	if (data->soc == SOC_ARCH_EXYNOS5433)
+	else if (data->soc == SOC_ARCH_EXYNOS5433)
 		emul_con = EXYNOS5433_TMU_EMUL_CON;
 	else if (data->soc == SOC_ARCH_EXYNOS7)
 		emul_con = EXYNOS7_TMU_REG_EMUL_CON;

From e210c422b6fdd2dc123bedc588f399aefd8bf9de Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Mon, 12 Oct 2015 11:30:11 +0300
Subject: [PATCH 067/199] xhci: don't finish a TD if we get a short transfer
 event mid TD

If the difference is big enough between the bytes asked and received
in a bulk transfer we can get a short transfer event pointing to a TRB in
the middle of the TD. We don't want to handle the TD yet as we will anyway
receive a new event for the last TRB in the TD.

Hold off from finishing the TD and removing it from the list until we
receive an event for the last TRB in the TD

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/host/xhci-ring.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 43291f93afeb5..79e89e6aef73b 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2191,6 +2191,10 @@ static int process_bulk_intr_td(struct xhci_hcd *xhci, struct xhci_td *td,
 		}
 	/* Fast path - was this the last TRB in the TD for this URB? */
 	} else if (event_trb == td->last_trb) {
+		if (td->urb_length_set && trb_comp_code == COMP_SHORT_TX)
+			return finish_td(xhci, td, event_trb, event, ep,
+					 status, false);
+
 		if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
 			td->urb->actual_length =
 				td->urb->transfer_buffer_length -
@@ -2242,6 +2246,12 @@ static int process_bulk_intr_td(struct xhci_hcd *xhci, struct xhci_td *td,
 			td->urb->actual_length +=
 				TRB_LEN(le32_to_cpu(cur_trb->generic.field[2])) -
 				EVENT_TRB_LEN(le32_to_cpu(event->transfer_len));
+
+		if (trb_comp_code == COMP_SHORT_TX) {
+			xhci_dbg(xhci, "mid bulk/intr SP, wait for last TRB event\n");
+			td->urb_length_set = true;
+			return 0;
+		}
 	}
 
 	return finish_td(xhci, td, event_trb, event, ep, status, false);

From 3b4739b8951d650becbcd855d7d6f18ac98a9a85 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Mon, 12 Oct 2015 11:30:12 +0300
Subject: [PATCH 068/199] xhci: handle no ping response error properly

If a host fails to wake up a isochronous SuperSpeed device from U1/U2
in time for a isoch transfer it will generate a "No ping response error"
Host will then move to the next transfer descriptor.

Handle this case in the same way as missed service errors, tag the
current TD as skipped and handle it on the next transfer event.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/host/xhci-ring.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 79e89e6aef73b..97ffe39972735 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2284,6 +2284,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 	u32 trb_comp_code;
 	int ret = 0;
 	int td_num = 0;
+	bool handling_skipped_tds = false;
 
 	slot_id = TRB_TO_SLOT_ID(le32_to_cpu(event->flags));
 	xdev = xhci->devs[slot_id];
@@ -2420,6 +2421,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 		ep->skip = true;
 		xhci_dbg(xhci, "Miss service interval error, set skip flag\n");
 		goto cleanup;
+	case COMP_PING_ERR:
+		ep->skip = true;
+		xhci_dbg(xhci, "No Ping response error, Skip one Isoc TD\n");
+		goto cleanup;
 	default:
 		if (xhci_is_vendor_info_code(xhci, trb_comp_code)) {
 			status = 0;
@@ -2556,13 +2561,18 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 						 ep, &status);
 
 cleanup:
+
+
+		handling_skipped_tds = ep->skip &&
+			trb_comp_code != COMP_MISSED_INT &&
+			trb_comp_code != COMP_PING_ERR;
+
 		/*
-		 * Do not update event ring dequeue pointer if ep->skip is set.
-		 * Will roll back to continue process missed tds.
+		 * Do not update event ring dequeue pointer if we're in a loop
+		 * processing missed tds.
 		 */
-		if (trb_comp_code == COMP_MISSED_INT || !ep->skip) {
+		if (!handling_skipped_tds)
 			inc_deq(xhci, xhci->event_ring);
-		}
 
 		if (ret) {
 			urb = td->urb;
@@ -2597,7 +2607,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 	 * Process them as short transfer until reach the td pointed by
 	 * the event.
 	 */
-	} while (ep->skip && trb_comp_code != COMP_MISSED_INT);
+	} while (handling_skipped_tds);
 
 	return 0;
 }

From fd7cd061adcf5f7503515ba52b6a724642a839c8 Mon Sep 17 00:00:00 2001
From: Laura Abbott <labbott@fedoraproject.org>
Date: Mon, 12 Oct 2015 11:30:13 +0300
Subject: [PATCH 069/199] xhci: Add spurious wakeup quirk for LynxPoint-LP
 controllers

We received several reports of systems rebooting and powering on
after an attempted shutdown. Testing showed that setting
XHCI_SPURIOUS_WAKEUP quirk in addition to the XHCI_SPURIOUS_REBOOT
quirk allowed the system to shutdown as expected for LynxPoint-LP
xHCI controllers. Set the quirk back.

Note that the quirk was originally introduced for LynxPoint and
LynxPoint-LP just for this same reason. See:

commit 638298dc66ea ("xhci: Fix spurious wakeups after S5 on Haswell")

It was later limited to only concern HP machines as it caused
regression on some machines, see both bug and commit:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
commit 6962d914f317 ("xhci: Limit the spurious wakeup fix only to HP machines")

Later it was discovered that the powering on after shutdown
was limited to LynxPoint-LP (Haswell-ULT) and that some non-LP HP
machine suffered from spontaneous resume from S3 (which should
not be related to the SPURIOUS_WAKEUP quirk at all). An attempt
to fix this then removed the SPURIOUS_WAKEUP flag usage completely.

commit b45abacde3d5 ("xhci: no switching back on non-ULT Haswell")

Current understanding is that LynxPoint-LP (Haswell ULT) machines
need the SPURIOUS_WAKEUP quirk, otherwise they will restart, and
plain Lynxpoint (Haswell) machines may _not_ have the quirk
set otherwise they again will restart.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Oliver Neukum <oneukum@suse.com>
[Added more history to commit message -Mathias]
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/host/xhci-pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index c79d33676672d..c47d3e4805865 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -147,6 +147,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
 		pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI) {
 		xhci->quirks |= XHCI_SPURIOUS_REBOOT;
+		xhci->quirks |= XHCI_SPURIOUS_WAKEUP;
 	}
 	if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
 		(pdev->device == PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_XHCI ||

From 324700604b04954510ddd4c6841a88a06938a28c Mon Sep 17 00:00:00 2001
From: Vladimir Zapolskiy <vz@mleia.com>
Date: Sat, 17 Oct 2015 11:30:15 -0700
Subject: [PATCH 070/199] Input: lpc32xx_ts - fix warnings caused by enabling
 unprepared clock

If common clock framework is configured, the driver generates a warning,
which is fixed by this change:

    root@devkit3250:~# cat /dev/input/touchscreen0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 720 at drivers/clk/clk.c:727 clk_core_enable+0x2c/0xa4()
    Modules linked in: sc16is7xx snd_soc_uda1380
    CPU: 0 PID: 720 Comm: cat Not tainted 4.3.0-rc2+ #199
    Hardware name: LPC32XX SoC (Flattened Device Tree)
    Backtrace:
    [<>] (dump_backtrace) from [<>] (show_stack+0x18/0x1c)
    [<>] (show_stack) from [<>] (dump_stack+0x20/0x28)
    [<>] (dump_stack) from [<>] (warn_slowpath_common+0x90/0xb8)
    [<>] (warn_slowpath_common) from [<>] (warn_slowpath_null+0x24/0x2c)
    [<>] (warn_slowpath_null) from [<>] (clk_core_enable+0x2c/0xa4)
    [<>] (clk_core_enable) from [<>] (clk_enable+0x24/0x38)
    [<>] (clk_enable) from [<>] (lpc32xx_setup_tsc+0x18/0xa0)
    [<>] (lpc32xx_setup_tsc) from [<>] (lpc32xx_ts_open+0x14/0x1c)
    [<>] (lpc32xx_ts_open) from [<>] (input_open_device+0x74/0xb0)
    [<>] (input_open_device) from [<>] (evdev_open+0x110/0x16c)
    [<>] (evdev_open) from [<>] (chrdev_open+0x1b4/0x1dc)
    [<>] (chrdev_open) from [<>] (do_dentry_open+0x1dc/0x2f4)
    [<>] (do_dentry_open) from [<>] (vfs_open+0x6c/0x70)
    [<>] (vfs_open) from [<>] (path_openat+0xb4c/0xddc)
    [<>] (path_openat) from [<>] (do_filp_open+0x40/0x8c)
    [<>] (do_filp_open) from [<>] (do_sys_open+0x124/0x1c4)
    [<>] (do_sys_open) from [<>] (SyS_open+0x2c/0x30)
    [<>] (SyS_open) from [<>] (ret_fast_syscall+0x0/0x38)

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---
 drivers/input/touchscreen/lpc32xx_ts.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/touchscreen/lpc32xx_ts.c b/drivers/input/touchscreen/lpc32xx_ts.c
index 24d704cd9f882..7fbb3b0c85715 100644
--- a/drivers/input/touchscreen/lpc32xx_ts.c
+++ b/drivers/input/touchscreen/lpc32xx_ts.c
@@ -139,14 +139,14 @@ static void lpc32xx_stop_tsc(struct lpc32xx_tsc *tsc)
 		   tsc_readl(tsc, LPC32XX_TSC_CON) &
 			     ~LPC32XX_TSC_ADCCON_AUTO_EN);
 
-	clk_disable(tsc->clk);
+	clk_disable_unprepare(tsc->clk);
 }
 
 static void lpc32xx_setup_tsc(struct lpc32xx_tsc *tsc)
 {
 	u32 tmp;
 
-	clk_enable(tsc->clk);
+	clk_prepare_enable(tsc->clk);
 
 	tmp = tsc_readl(tsc, LPC32XX_TSC_CON) & ~LPC32XX_TSC_ADCCON_POWER_UP;
 

From f967fc8f165fadb72166f2bd4785094b3ca21307 Mon Sep 17 00:00:00 2001
From: Frederic Danis <frederic.danis@linux.intel.com>
Date: Fri, 9 Oct 2015 17:14:56 +0200
Subject: [PATCH 071/199] Revert "serial: 8250_dma: don't bother DMA with small
 transfers"

This reverts commit 9119fba0cfeda6d415c9f068df66838a104b87cb.

This commit prevents from sending "big" file using Bluetooth.
When sending a lot of data quickly through the Bluetooth interface, and
after a variable amount of data sent, transfer fails with error:
    kernel: [  415.247453] Bluetooth: hci0 hardware error 0x00

Found on T100TA.

After reverting this commit, send works fine for any file size.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Fixes: 9119fba0cfed (serial: 8250_dma: don't bother DMA with small transfers)
Cc: stable@vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/serial/8250/8250_dma.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_dma.c b/drivers/tty/serial/8250/8250_dma.c
index 21d01a491405a..e508939daea3f 100644
--- a/drivers/tty/serial/8250/8250_dma.c
+++ b/drivers/tty/serial/8250/8250_dma.c
@@ -80,10 +80,6 @@ int serial8250_tx_dma(struct uart_8250_port *p)
 		return 0;
 
 	dma->tx_size = CIRC_CNT_TO_END(xmit->head, xmit->tail, UART_XMIT_SIZE);
-	if (dma->tx_size < p->port.fifosize) {
-		ret = -EINVAL;
-		goto err;
-	}
 
 	desc = dmaengine_prep_slave_single(dma->txchan,
 					   dma->tx_addr + xmit->tail,

From f235f664a8afabccf863a5dee4777d2d7b676fda Mon Sep 17 00:00:00 2001
From: Scot Doyle <lkml14@scotdoyle.com>
Date: Fri, 9 Oct 2015 15:08:10 +0000
Subject: [PATCH 072/199] fbcon: initialize blink interval before calling
 fb_set_par

Since commit 27a4c827c34ac4256a190cc9d24607f953c1c459
    fbcon: use the cursor blink interval provided by vt

a PPC64LE kernel fails to boot when fbcon_add_cursor_timer uses an
uninitialized ops->cur_blink_jiffies. Prevent by initializing
in fbcon_init before the call to info->fbops->fb_set_par.

Reported-and-tested-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Cc: <stable@vger.kernel.org> [v4.2]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/video/console/fbcon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 1aaf89300621a..92f394927f241 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1093,6 +1093,7 @@ static void fbcon_init(struct vc_data *vc, int init)
 		con_copy_unimap(vc, svc);
 
 	ops = info->fbcon_par;
+	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
 	p->con_rotate = initial_rotation;
 	set_blitting_type(vc, info);
 

From c8a1978e07c412f8d79b8612f45aaafe7238ca62 Mon Sep 17 00:00:00 2001
From: Randy Dunlap <rdunlap@infradead.org>
Date: Sun, 18 Oct 2015 16:25:53 -0700
Subject: [PATCH 073/199] Input: sur40 - add dependency on VIDEO_V4L2

Fix build errors due to missing Kconfig dependency.

drivers/built-in.o: In function `sur40_disconnect':
sur40.c:(.text+0x22be6e): undefined reference to `video_unregister_device'
sur40.c:(.text+0x22be77): undefined reference to `v4l2_device_unregister'
drivers/built-in.o: In function `sur40_process_video':
sur40.c:(.text+0x22c1d4): undefined reference to `v4l2_get_timestamp'
drivers/built-in.o: In function `sur40_probe':
sur40.c:(.text+0x22ca82): undefined reference to `v4l2_device_register'
sur40.c:(.text+0x22cb1a): undefined reference to `v4l2_device_unregister'
sur40.c:(.text+0x22cbf7): undefined reference to `video_device_release_empty'
sur40.c:(.text+0x22cc53): undefined reference to `__video_register_device'
sur40.c:(.text+0x22cc90): undefined reference to `video_unregister_device'
drivers/built-in.o: In function `sur40_vidioc_querycap':
sur40.c:(.text+0x22ccb0): undefined reference to `video_devdata'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---
 drivers/input/touchscreen/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/input/touchscreen/Kconfig b/drivers/input/touchscreen/Kconfig
index 600dcceff5426..deb14c12ae8b1 100644
--- a/drivers/input/touchscreen/Kconfig
+++ b/drivers/input/touchscreen/Kconfig
@@ -1006,6 +1006,7 @@ config TOUCHSCREEN_SUN4I
 config TOUCHSCREEN_SUR40
 	tristate "Samsung SUR40 (Surface 2.0/PixelSense) touchscreen"
 	depends on USB && MEDIA_USB_SUPPORT && HAS_DMA
+	depends on VIDEO_V4L2
 	select INPUT_POLLDEV
 	select VIDEOBUF2_DMA_SG
 	help

From f1ccd249319efca4ee4faf1d904f5a362cac7c81 Mon Sep 17 00:00:00 2001
From: Len Brown <len.brown@intel.com>
Date: Fri, 16 Oct 2015 00:14:28 -0400
Subject: [PATCH 074/199] x86/smpboot: Fix cpu_init_udelay=10000 corner case
 boot parameter misbehavior

For legacy machines cpu_init_udelay defaults to 10,000.
For modern machines it is set to 0.

The user should be able to set cpu_init_udelay to
any value on the cmdline, including 10,000.

Before this patch, that was seen as "unchanged from default"
and thus on a modern machine, the user request was ignored
and the delay was set to 0.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dparsons@brightdsl.net
Cc: shrybman@teksavvy.com
Link: http://lkml.kernel.org/r/de363cdbbcfcca1d22569683f7eb9873e0177251.1444968087.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/smpboot.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e0c198e5f9202..32267ccac3d70 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -509,7 +509,7 @@ void __inquire_remote_apic(int apicid)
  */
 #define UDELAY_10MS_DEFAULT 10000
 
-static unsigned int init_udelay = UDELAY_10MS_DEFAULT;
+static unsigned int init_udelay = INT_MAX;
 
 static int __init cpu_init_udelay(char *str)
 {
@@ -522,13 +522,16 @@ early_param("cpu_init_udelay", cpu_init_udelay);
 static void __init smp_quirk_init_udelay(void)
 {
 	/* if cmdline changed it from default, leave it alone */
-	if (init_udelay != UDELAY_10MS_DEFAULT)
+	if (init_udelay != INT_MAX)
 		return;
 
 	/* if modern processor, use no delay */
 	if (((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && (boot_cpu_data.x86 == 6)) ||
 	    ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && (boot_cpu_data.x86 >= 0xF)))
 		init_udelay = 0;
+
+	/* else, use legacy delay */
+	init_udelay = UDELAY_10MS_DEFAULT;
 }
 
 /*

From fcafddec4e78a7776db4b6685db6b2902d4300fc Mon Sep 17 00:00:00 2001
From: Len Brown <len.brown@intel.com>
Date: Fri, 16 Oct 2015 00:14:29 -0400
Subject: [PATCH 075/199] x86/smpboot: Fix CPU #1 boot timeout

The following commit:

  a9bcaa02a5104ac ("x86/smpboot: Remove SIPI delays from cpu_up()")

Caused some Intel Core2 processors to time-out when bringing up CPU #1,
resulting in the missing of that CPU after bootup.

That patch reduced the SIPI delays from udelay() 300, 200 to udelay() 0,
0 on modern processors.

Several Intel(R) Core(TM)2 systems failed to bring up CPU #1 10/10 times
after that change.

Increasing either of the SIPI delays to udelay(1) results in
success. So here we increase both to udelay(10).  While this may
be 20x slower than the absolute minimum, it is still 20x to 30x
faster than the original code.

Tested-by: Donald Parsons <dparsons@brightdsl.net>
Tested-by: Shane <shrybman@teksavvy.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dparsons@brightdsl.net
Cc: shrybman@teksavvy.com
Link: http://lkml.kernel.org/r/6dd554ee8945984d85aafb2ad35793174d068af0.1444968087.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/smpboot.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 32267ccac3d70..892ee2e5ecbce 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -660,7 +660,9 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
 		/*
 		 * Give the other CPU some time to accept the IPI.
 		 */
-		if (init_udelay)
+		if (init_udelay == 0)
+			udelay(10);
+		else
 			udelay(300);
 
 		pr_debug("Startup point 1\n");
@@ -671,7 +673,9 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
 		/*
 		 * Give the other CPU some time to accept the IPI.
 		 */
-		if (init_udelay)
+		if (init_udelay == 0)
+			udelay(10);
+		else
 			udelay(200);
 
 		if (maxlvt > 3)		/* Due to the Pentium erratum 3AP.  */

From a75ca545e8d57473da47ece828ad98a10727ec6f Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Fri, 16 Oct 2015 14:28:53 +0300
Subject: [PATCH 076/199] x86, kasan: Fix build failure on KASAN=y &&
 KMEMCHECK=y kernels
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Declaration of memcpy() is hidden under #ifndef CONFIG_KMEMCHECK.
In asm/efi.h under #ifdef CONFIG_KASAN we #undef memcpy(), due to
which the following happens:

  In file included from arch/x86/kernel/setup.c:96:0:
  ./arch/x86/include/asm/desc.h: In function ‘native_write_idt_entry’:
  ./arch/x86/include/asm/desc.h:122:2: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]   memcpy(&idt[entry], gate, sizeof(*gate));
    ^
    cc1: some warnings being treated as errors
    make[2]: *** [arch/x86/kernel/setup.o] Error 1

We will get rid of that #undef in asm/efi.h eventually.
But in the meanwhile move memcpy() declaration out of #ifdefs
to fix the build.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444994933-28328-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/string_64.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index e4661196994e8..ff8b9a17dc4b2 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -27,12 +27,11 @@ static __always_inline void *__inline_memcpy(void *to, const void *from, size_t
    function. */
 
 #define __HAVE_ARCH_MEMCPY 1
+extern void *memcpy(void *to, const void *from, size_t len);
 extern void *__memcpy(void *to, const void *from, size_t len);
 
 #ifndef CONFIG_KMEMCHECK
-#if (__GNUC__ == 4 && __GNUC_MINOR__ >= 3) || __GNUC__ > 4
-extern void *memcpy(void *to, const void *from, size_t len);
-#else
+#if (__GNUC__ == 4 && __GNUC_MINOR__ < 3) || __GNUC__ < 4
 #define memcpy(dst, src, len)					\
 ({								\
 	size_t __len = (len);					\

From 911b79cde95c7da0ec02f48105358a36636b7a71 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells@redhat.com>
Date: Mon, 19 Oct 2015 11:20:28 +0100
Subject: [PATCH 077/199] KEYS: Don't permit request_key() to construct a new
 keyring

If request_key() is used to find a keyring, only do the search part - don't
do the construction part if the keyring was not found by the search.  We
don't really want keyrings in the negative instantiated state since the
rejected/negative instantiation error value in the payload is unioned with
keyring metadata.

Now the kernel gives an error:

	request_key("keyring", "#selinux,bdekeyring", "keyring", KEY_SPEC_USER_SESSION_KEYRING) = -1 EPERM (Operation not permitted)

Signed-off-by: David Howells <dhowells@redhat.com>
---
 security/keys/request_key.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 486ef6fa393b2..0d62531242783 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -440,6 +440,9 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx,
 
 	kenter("");
 
+	if (ctx->index_key.type == &key_type_keyring)
+		return ERR_PTR(-EPERM);
+	
 	user = key_user_lookup(current_fsuid());
 	if (!user)
 		return ERR_PTR(-ENOMEM);

From 57df5380853460bc66b59a46273ce113c923d39c Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony@atomide.com>
Date: Fri, 16 Oct 2015 12:23:33 -0700
Subject: [PATCH 078/199] ARM: OMAP2+: Fix imprecise external abort caused by
 bogus SRAM init

Some omaps are producing imprecise external aborts because we are
wrongly trying to init SRAM for device tree based booting. Only
omap3 is still using the legacy SRAM code, so we need to make it
omap3 specific. Otherwise we can get errors like this on at least
dm814x:

Unhandled fault: imprecise external abort (0xc06) at 0xc08b156c
...
(omap_rev) from [<c08b12e0>] (omap_sram_init+0xf8/0x3e0)
(omap_sram_init) from [<c08aca0c>] (omap_sdrc_init+0x10/0xb0)
(omap_sdrc_init) from [<c08b581c>] (pdata_quirks_init+0x18/0x44)
(pdata_quirks_init) from [<c08b5478>] (omap_generic_init+0x10/0x1c)
(omap_generic_init) from [<c08a57e0>] (customize_machine+0x1c/0x40)
(customize_machine) from [<c00098a4>] (do_one_initcall+0x80/0x1dc)
(do_one_initcall) from [<c08a2ec4>] (kernel_init_freeable+0x218/0x2e8)
(kernel_init_freeable) from [<c063a554>] (kernel_init+0x8/0xec)
(kernel_init) from [<c000f890>] (ret_from_fork+0x14/0x24)

Let's fix the issue by making sure omap_sdrc_init only gets called for
omap3. To do that, we need to have compatible "ti,omap3" in the dts
files. And let's also use "ti,omap3630" instead of "ti,omap36xx" like
we're supposed to.

Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts | 2 +-
 arch/arm/boot/dts/omap3-evm-37xx.dts              | 2 +-
 arch/arm/mach-omap2/board-generic.c               | 1 +
 arch/arm/mach-omap2/pdata-quirks.c                | 9 ++++++++-
 4 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts b/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts
index 91146c318798f..5b0430041ec6d 100644
--- a/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts
+++ b/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts
@@ -12,7 +12,7 @@
 
 / {
 	model = "LogicPD Zoom DM3730 Torpedo Development Kit";
-	compatible = "logicpd,dm3730-torpedo-devkit", "ti,omap36xx";
+	compatible = "logicpd,dm3730-torpedo-devkit", "ti,omap3630", "ti,omap3";
 
 	gpio_keys {
 		compatible = "gpio-keys";
diff --git a/arch/arm/boot/dts/omap3-evm-37xx.dts b/arch/arm/boot/dts/omap3-evm-37xx.dts
index 16e8ce350ddaa..bb339d1648e07 100644
--- a/arch/arm/boot/dts/omap3-evm-37xx.dts
+++ b/arch/arm/boot/dts/omap3-evm-37xx.dts
@@ -13,7 +13,7 @@
 
 / {
 	model = "TI OMAP37XX EVM (TMDSEVM3730)";
-	compatible = "ti,omap3-evm-37xx", "ti,omap36xx";
+	compatible = "ti,omap3-evm-37xx", "ti,omap3630", "ti,omap3";
 
 	memory {
 		device_type = "memory";
diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
index 6b59230cd6be7..fb219a30c10c6 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -106,6 +106,7 @@ DT_MACHINE_START(OMAP3_DT, "Generic OMAP3 (Flattened Device Tree)")
 MACHINE_END
 
 static const char *const omap36xx_boards_compat[] __initconst = {
+	"ti,omap3630",
 	"ti,omap36xx",
 	NULL,
 };
diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
index ea56397599c21..1dfe34654c43a 100644
--- a/arch/arm/mach-omap2/pdata-quirks.c
+++ b/arch/arm/mach-omap2/pdata-quirks.c
@@ -559,7 +559,14 @@ static void pdata_quirks_check(struct pdata_init *quirks)
 
 void __init pdata_quirks_init(const struct of_device_id *omap_dt_match_table)
 {
-	omap_sdrc_init(NULL, NULL);
+	/*
+	 * We still need this for omap2420 and omap3 PM to work, others are
+	 * using drivers/misc/sram.c already.
+	 */
+	if (of_machine_is_compatible("ti,omap2420") ||
+	    of_machine_is_compatible("ti,omap3"))
+		omap_sdrc_init(NULL, NULL);
+
 	pdata_quirks_check(auxdata_quirks);
 	of_platform_populate(NULL, omap_dt_match_table,
 			     omap_auxdata_lookup, NULL);

From 8a603f91cc4848ab1a0458bc065aa9f64322e123 Mon Sep 17 00:00:00 2001
From: "H. Nikolaus Schaller" <hns@goldelico.com>
Date: Fri, 16 Oct 2015 22:19:06 +0100
Subject: [PATCH 079/199] ARM: 8445/1: fix vdsomunge not to depend on glibc
 specific byteswap.h

If the host toolchain is not glibc based then the arm kernel build
fails with

  HOSTCC  arch/arm/vdso/vdsomunge
  arch/arm/vdso/vdsomunge.c:48:22: fatal error: byteswap.h: No such file or directory

Observed: with omap2plus_defconfig and compile on Mac OS X with arm ELF
cross-compiler.

Reason: byteswap.h is a glibc only header.

Solution: replace by private byte-swapping macros (taken from
arch/mips/boot/elf2ecoff.c and kindly improved by Russell King)

Tested to compile on Mac OS X 10.9.5 host.

Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/vdso/vdsomunge.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm/vdso/vdsomunge.c b/arch/arm/vdso/vdsomunge.c
index aedec81d11988..0cebd98cd88c3 100644
--- a/arch/arm/vdso/vdsomunge.c
+++ b/arch/arm/vdso/vdsomunge.c
@@ -45,7 +45,6 @@
  * it does.
  */
 
-#include <byteswap.h>
 #include <elf.h>
 #include <errno.h>
 #include <fcntl.h>
@@ -59,6 +58,16 @@
 #include <sys/types.h>
 #include <unistd.h>
 
+#define swab16(x) \
+	((((x) & 0x00ff) << 8) | \
+	 (((x) & 0xff00) >> 8))
+
+#define swab32(x) \
+	((((x) & 0x000000ff) << 24) | \
+	 (((x) & 0x0000ff00) <<  8) | \
+	 (((x) & 0x00ff0000) >>  8) | \
+	 (((x) & 0xff000000) << 24))
+
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
 #define HOST_ORDER ELFDATA2LSB
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
@@ -104,17 +113,17 @@ static void cleanup(void)
 
 static Elf32_Word read_elf_word(Elf32_Word word, bool swap)
 {
-	return swap ? bswap_32(word) : word;
+	return swap ? swab32(word) : word;
 }
 
 static Elf32_Half read_elf_half(Elf32_Half half, bool swap)
 {
-	return swap ? bswap_16(half) : half;
+	return swap ? swab16(half) : half;
 }
 
 static void write_elf_word(Elf32_Word val, Elf32_Word *dst, bool swap)
 {
-	*dst = swap ? bswap_32(val) : val;
+	*dst = swap ? swab32(val) : val;
 }
 
 int main(int argc, char **argv)

From 2a7d44f47f53fa1be677f44c73d78b1bcf9c05d9 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 19 Oct 2015 09:30:42 -0400
Subject: [PATCH 080/199] drm/radeon/dpm: don't add pwm attributes if DPM is
 disabled

PWM fan control is only available with DPM.  If DPM disabled,
don't expose the PWM fan controls to avoid a crash.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=92524

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_pm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c
index 44489cce74584..6a0a176e26ec2 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -717,10 +717,14 @@ static umode_t hwmon_attributes_visible(struct kobject *kobj,
 	struct radeon_device *rdev = dev_get_drvdata(dev);
 	umode_t effective_mode = attr->mode;
 
-	/* Skip limit attributes if DPM is not enabled */
+	/* Skip attributes if DPM is not enabled */
 	if (rdev->pm.pm_method != PM_METHOD_DPM &&
 	    (attr == &sensor_dev_attr_temp1_crit.dev_attr.attr ||
-	     attr == &sensor_dev_attr_temp1_crit_hyst.dev_attr.attr))
+	     attr == &sensor_dev_attr_temp1_crit_hyst.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_enable.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_max.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_min.dev_attr.attr))
 		return 0;
 
 	/* Skip fan attributes if fan is not present */

From 27100735adbcb872854674bed1d000825f9954ac Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 19 Oct 2015 15:49:11 -0400
Subject: [PATCH 081/199] drm/amdgpu/dpm: don't add pwm attributes if DPM is
 disabled

PWM fan control is only available with DPM.  There is no non-DPM
support on amdgpu, so we should never get a crash here because
the sysfs nodes would never be created in the first place. Add the
check just in case to be on the safe side.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index efed11509f4a2..ed2bbe5b10afe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -294,10 +294,14 @@ static umode_t hwmon_attributes_visible(struct kobject *kobj,
 	struct amdgpu_device *adev = dev_get_drvdata(dev);
 	umode_t effective_mode = attr->mode;
 
-	/* Skip limit attributes if DPM is not enabled */
+	/* Skip attributes if DPM is not enabled */
 	if (!adev->pm.dpm_enabled &&
 	    (attr == &sensor_dev_attr_temp1_crit.dev_attr.attr ||
-	     attr == &sensor_dev_attr_temp1_crit_hyst.dev_attr.attr))
+	     attr == &sensor_dev_attr_temp1_crit_hyst.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_enable.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_max.dev_attr.attr ||
+	     attr == &sensor_dev_attr_pwm1_min.dev_attr.attr))
 		return 0;
 
 	/* Skip fan attributes if fan is not present */

From 677c884ff6370add1360e2b9558285355ebe2b36 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 19 Oct 2015 15:54:21 -0400
Subject: [PATCH 082/199] drm/amdgpu: add missing dpm check for KV dpm late
 init

Skip dpm late init if dpm is disabled.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/amd/amdgpu/kv_dpm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
index 9745ed3a9aef8..7e9154c7f1dbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
@@ -2997,6 +2997,9 @@ static int kv_dpm_late_init(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	int ret;
 
+	if (!amdgpu_dpm)
+		return 0;
+
 	/* init the sysfs and debugfs files late */
 	ret = amdgpu_pm_sysfs_init(adev);
 	if (ret)

From 0b5aedfe0e6654ec54f35109e1929a1cf7fc4cdd Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard@nod.at>
Date: Sun, 28 Jun 2015 22:55:26 +0200
Subject: [PATCH 083/199] um: Fix out-of-tree build

Commit 30b11ee9a (um: Remove copy&paste code from init.h)
uncovered an issue wrt. out-of-tree builds.
For out-of-tree builds, we must not rely on relative paths.
Before 30b11ee9a it worked by chance as no host code included
generated header files.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 arch/um/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/um/Makefile b/arch/um/Makefile
index 098ab3333e7cd..e3abe6f3156d3 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -70,8 +70,8 @@ KBUILD_AFLAGS += $(ARCH_INCLUDE)
 
 USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
 		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
-		-D_FILE_OFFSET_BITS=64 -idirafter include \
-		-D__KERNEL__ -D__UM_HOST__
+		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
+		-idirafter $(obj)/include -D__KERNEL__ -D__UM_HOST__
 
 #This will adjust *FLAGS accordingly to the platform.
 include $(ARCH_DIR)/Makefile-os-$(OS)

From 37e81a016cc847c03ea71570fea29f12ca390bee Mon Sep 17 00:00:00 2001
From: Hans-Werner Hilse <hwhilse@gmail.com>
Date: Mon, 29 Jun 2015 11:50:32 +0200
Subject: [PATCH 084/199] um: Do not rely on libc to provide modify_ldt()

modify_ldt() was declared as an external symbol. Despite the man
page for this syscall telling that there is no wrapper in glibc,
since version 2.1 there actually is, so linking to the glibc
works.

Since modify_ldt() is not a POSIX interface, other libc
implementations do not always provide a wrapper function.
Even glibc headers do not provide a corresponding declaration.

So go the recommended way to call this using syscall().

Signed-off-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 arch/x86/um/ldt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/um/ldt.c b/arch/x86/um/ldt.c
index 9701a4fd7bf2e..836a1eb5df436 100644
--- a/arch/x86/um/ldt.c
+++ b/arch/x86/um/ldt.c
@@ -12,7 +12,10 @@
 #include <skas.h>
 #include <sysdep/tls.h>
 
-extern int modify_ldt(int func, void *ptr, unsigned long bytecount);
+static inline int modify_ldt (int func, void *ptr, unsigned long bytecount)
+{
+	return syscall(__NR_modify_ldt, func, ptr, bytecount);
+}
 
 static long write_ldt_entry(struct mm_id *mm_idp, int func,
 		     struct user_desc *desc, void **addr, int done)

From 6b1873371cea13036171d03a7c1e3e59158b4505 Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard@nod.at>
Date: Sun, 9 Aug 2015 21:49:07 +0200
Subject: [PATCH 085/199] um: Fix waitpid() usage in helper code

If UML is executing a helper program it is using
waitpid() with the __WCLONE flag to wait for the program
as the helper is executed from a clone()'ed thread.
While using __WCLONE is perfectly fine for clone()'ed
childs it won't detect terminated childs if the helper
has issued an execve().

We have to use __WALL to wait for both clone()'ed and
regular childs to detect the termination before and
after an execve().

Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 arch/um/os-Linux/helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/um/os-Linux/helper.c b/arch/um/os-Linux/helper.c
index e3ee4a51ef63a..3f02d42328127 100644
--- a/arch/um/os-Linux/helper.c
+++ b/arch/um/os-Linux/helper.c
@@ -96,7 +96,7 @@ int run_helper(void (*pre_exec)(void *), void *pre_data, char **argv)
 			       "ret = %d\n", -n);
 			ret = n;
 		}
-		CATCH_EINTR(waitpid(pid, NULL, __WCLONE));
+		CATCH_EINTR(waitpid(pid, NULL, __WALL));
 	}
 
 out_free2:
@@ -129,7 +129,7 @@ int run_helper_thread(int (*proc)(void *), void *arg, unsigned int flags,
 		return err;
 	}
 	if (stack_out == NULL) {
-		CATCH_EINTR(pid = waitpid(pid, &status, __WCLONE));
+		CATCH_EINTR(pid = waitpid(pid, &status, __WALL));
 		if (pid < 0) {
 			err = -errno;
 			printk(UM_KERN_ERR "run_helper_thread - wait failed, "
@@ -148,7 +148,7 @@ int run_helper_thread(int (*proc)(void *), void *arg, unsigned int flags,
 int helper_wait(int pid)
 {
 	int ret, status;
-	int wflags = __WCLONE;
+	int wflags = __WALL;
 
 	CATCH_EINTR(ret = waitpid(pid, &status, wflags));
 	if (ret < 0) {

From 56b88a3bf97a39d3f4f010509917b76a865a6dc8 Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard@nod.at>
Date: Sun, 9 Aug 2015 22:26:33 +0200
Subject: [PATCH 086/199] um: Fix kernel mode fault condition

We have to exclude memory locations <= PAGE_SIZE from
the condition and let the kernel mode fault path catch it.
Otherwise a kernel NULL pointer exception will be reported
as a kernel user space access.

Fixes: d2313084e2c (um: Catch unprotected user memory access)
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 arch/um/kernel/trap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index d8a9fce6ee2e5..98783dd0fa2ea 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -220,7 +220,7 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
 		show_regs(container_of(regs, struct pt_regs, regs));
 		panic("Segfault with no mm");
 	}
-	else if (!is_user && address < TASK_SIZE) {
+	else if (!is_user && address > PAGE_SIZE && address < TASK_SIZE) {
 		show_regs(container_of(regs, struct pt_regs, regs));
 		panic("Kernel tried to access user memory at addr 0x%lx, ip 0x%lx",
 		       address, ip);

From fde7d22e01aa0d252fc5c95fa11f0dac35a4dd59 Mon Sep 17 00:00:00 2001
From: Yuyang Du <yuyang.du@intel.com>
Date: Tue, 13 Oct 2015 09:18:22 +0800
Subject: [PATCH 087/199] sched/fair: Fix overly small weight for interactive
 group entities

Commit:

  9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")

led to an overly small weight for interactive group entities. The bad case
can be easily reproduced when a number of CPU hogs compete for the CPUs
at the same time (thanks to Mike). This is largly because the task group's
load average tracking cross CPUs lags behind the real changes.

To fix this we accelerate the group share distribution process by using
the load.weight of the cfs_rq. This may increase the entire group's
share, but we have to do so to protect the (fragile) interactive
tasks, especially from CPU hogs.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444699103-20272-1-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6e2e3483b1ecf..bc62c5096e547 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2363,7 +2363,7 @@ static inline long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
 	 */
 	tg_weight = atomic_long_read(&tg->load_avg);
 	tg_weight -= cfs_rq->tg_load_avg_contrib;
-	tg_weight += cfs_rq_load_avg(cfs_rq);
+	tg_weight += cfs_rq->load.weight;
 
 	return tg_weight;
 }
@@ -2373,7 +2373,7 @@ static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
 	long tg_weight, load, shares;
 
 	tg_weight = calc_tg_weight(tg, cfs_rq);
-	load = cfs_rq_load_avg(cfs_rq);
+	load = cfs_rq->load.weight;
 
 	shares = (tg->shares * load);
 	if (tg_weight)

From 3e386d56bafbb6d2540b49367444997fc671ea69 Mon Sep 17 00:00:00 2001
From: Yuyang Du <yuyang.du@intel.com>
Date: Tue, 13 Oct 2015 09:18:23 +0800
Subject: [PATCH 088/199] sched/fair: Update task group's load_avg after task
 migration

When cfs_rq has cfs_rq->removed_load_avg set (when a task migrates from
this cfs_rq), we need to update its contribution to the group's load_avg.

This should not increase tg's update too much, because in most cases, the
cfs_rq has already decayed its load_avg.

Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444699103-20272-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bc62c5096e547..9a5e60fe721a5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2664,13 +2664,14 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
 /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
 static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 {
-	int decayed;
 	struct sched_avg *sa = &cfs_rq->avg;
+	int decayed, removed = 0;
 
 	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
 		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
 		sa->load_avg = max_t(long, sa->load_avg - r, 0);
 		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
+		removed = 1;
 	}
 
 	if (atomic_long_read(&cfs_rq->removed_util_avg)) {
@@ -2688,7 +2689,7 @@ static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 	cfs_rq->load_last_update_time_copy = sa->last_update_time;
 #endif
 
-	return decayed;
+	return decayed || removed;
 }
 
 /* Update task and its cfs_rq load average */

From 0baabb385eb4bce699ddab0db015112be6cf1e6a Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 12 Oct 2015 17:21:23 +0200
Subject: [PATCH 089/199] nohz: Revert "nohz: Set isolcpus when nohz_full is
 set"

This reverts:

  8cb9764fc88b ("nohz: Set isolcpus when nohz_full is set")

We assumed that full-nohz users always want scheduler isolation on full
dynticks CPUs, therefore we included full-nohz CPUs on cpu_isolated_map.

This means that tasks run by default on CPUs outside the nohz_full range
unless their affinity is explicity overwritten.

This suits pure isolation workloads but when the machine is needed to
run common workloads, the available sets of CPUs to run common tasks
becomes reduced.

We reach an extreme case when CONFIG_NO_HZ_FULL_ALL is enabled as it
leaves only CPU 0 for non-isolation tasks, which makes people think that
their supercomputer regressed to 90's UP - which is true in a sense.

Some full-nohz users appear to be interested in running normal workloads
either before or after an isolation workload. Full-nohz isn't optimized
toward normal workloads but it's still better than UP performance.

We are reaching a limitation in kernel presets here. Lets revert this
cpu_isolated_map inclusion and let userspace do its own scheduler
isolation using cpusets or explicit affinity settings.

Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/1444663283-30068-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 10a8faa1b0d4a..5bd7d60658d37 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7238,9 +7238,6 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
-	/* nohz_full won't take effect without isolating the cpus. */
-	tick_nohz_full_add_cpus_to(cpu_isolated_map);
-
 	sched_init_numa();
 
 	/*

From 5aa5050787f449e7eaef2c5ec93c7b357aa7dcdc Mon Sep 17 00:00:00 2001
From: Luca Abeni <luca.abeni@unitn.it>
Date: Fri, 16 Oct 2015 10:06:21 +0200
Subject: [PATCH 090/199] sched/deadline: Fix migration of SCHED_DEADLINE tasks

Commit:

  9d5142624256 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")

broke select_task_rq_dl() and find_lock_later_rq(), because it introduced
a comparison between the local task's deadline and dl.earliest_dl.curr of
the remote queue.

However, if the remote runqueue does not contain any SCHED_DEADLINE
task its earliest_dl.curr is 0 (always smaller than the deadline of
the local task) and the remote runqueue is not selected for pushing.

As a result, if an application creates multiple SCHED_DEADLINE
threads, they will never be pushed to runqueues that do not already
contain SCHED_DEADLINE tasks.

This patch fixes the issue by checking if dl.dl_nr_running == 0.

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 9d5142624256 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")
Link: http://lkml.kernel.org/r/1444982781-15608-1-git-send-email-luca.abeni@unitn.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/deadline.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fc8f01083527a..142df2668e5d5 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1066,8 +1066,9 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
 		int target = find_later_rq(p);
 
 		if (target != -1 &&
-				dl_time_before(p->dl.deadline,
-					cpu_rq(target)->dl.earliest_dl.curr))
+				(dl_time_before(p->dl.deadline,
+					cpu_rq(target)->dl.earliest_dl.curr) ||
+				(cpu_rq(target)->dl.dl_nr_running == 0)))
 			cpu = target;
 	}
 	rcu_read_unlock();
@@ -1417,7 +1418,8 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 
 		later_rq = cpu_rq(cpu);
 
-		if (!dl_time_before(task->dl.deadline,
+		if (later_rq->dl.dl_nr_running &&
+		    !dl_time_before(task->dl.deadline,
 					later_rq->dl.earliest_dl.curr)) {
 			/*
 			 * Target rq has tasks of equal or earlier deadline,

From d976441f44bc5d48635d081d277aa76556ffbf8b Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Mon, 19 Oct 2015 11:37:17 +0300
Subject: [PATCH 091/199] compiler, atomics, kasan: Provide READ_ONCE_NOCHECK()

Some code may perform racy by design memory reads. This could be
harmless, yet such code may produce KASAN warnings.

To hide such accesses from KASAN this patch introduces
READ_ONCE_NOCHECK() macro. KASAN will not check the memory
accessed by READ_ONCE_NOCHECK(). The KernelThreadSanitizer
(KTSAN) is going to ignore it as well.

This patch creates __read_once_size_nocheck() a clone of
__read_once_size(). The only difference between them is
'no_sanitized_address' attribute appended to '*_nocheck'
function. This attribute tells the compiler that instrumentation
of memory accesses should not be applied to that function. We
declare it as static '__maybe_unsed' because GCC is not capable
to inline such function:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368

With KASAN=n READ_ONCE_NOCHECK() is just a clone of READ_ONCE().

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Link: http://lkml.kernel.org/r/1445243838-17763-2-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/compiler-gcc.h | 13 +++++++
 include/linux/compiler.h     | 66 +++++++++++++++++++++++++++++-------
 2 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index dfaa7b3e9ae90..8efb40e61d6e4 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -237,12 +237,25 @@
 #define KASAN_ABI_VERSION 3
 #endif
 
+#if GCC_VERSION >= 40902
+/*
+ * Tell the compiler that address safety instrumentation (KASAN)
+ * should not be applied to that function.
+ * Conflicts with inlining: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ */
+#define __no_sanitize_address __attribute__((no_sanitize_address))
+#endif
+
 #endif	/* gcc version >= 40000 specific checks */
 
 #if !defined(__noclone)
 #define __noclone	/* not needed */
 #endif
 
+#if !defined(__no_sanitize_address)
+#define __no_sanitize_address
+#endif
+
 /*
  * A trick to suppress uninitialized variable warning without generating any
  * code
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index c836eb2dc44d5..3d7810341b579 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -198,19 +198,45 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 
 #include <uapi/linux/types.h>
 
-static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
+#define __READ_ONCE_SIZE						\
+({									\
+	switch (size) {							\
+	case 1: *(__u8 *)res = *(volatile __u8 *)p; break;		\
+	case 2: *(__u16 *)res = *(volatile __u16 *)p; break;		\
+	case 4: *(__u32 *)res = *(volatile __u32 *)p; break;		\
+	case 8: *(__u64 *)res = *(volatile __u64 *)p; break;		\
+	default:							\
+		barrier();						\
+		__builtin_memcpy((void *)res, (const void *)p, size);	\
+		barrier();						\
+	}								\
+})
+
+static __always_inline
+void __read_once_size(const volatile void *p, void *res, int size)
 {
-	switch (size) {
-	case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
-	case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
-	case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
-	case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
-	default:
-		barrier();
-		__builtin_memcpy((void *)res, (const void *)p, size);
-		barrier();
-	}
+	__READ_ONCE_SIZE;
+}
+
+#ifdef CONFIG_KASAN
+/*
+ * This function is not 'inline' because __no_sanitize_address confilcts
+ * with inlining. Attempt to inline it may cause a build failure.
+ * 	https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
+ */
+static __no_sanitize_address __maybe_unused
+void __read_once_size_nocheck(const volatile void *p, void *res, int size)
+{
+	__READ_ONCE_SIZE;
+}
+#else
+static __always_inline
+void __read_once_size_nocheck(const volatile void *p, void *res, int size)
+{
+	__READ_ONCE_SIZE;
 }
+#endif
 
 static __always_inline void __write_once_size(volatile void *p, void *res, int size)
 {
@@ -248,8 +274,22 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
  * required ordering.
  */
 
-#define READ_ONCE(x) \
-	({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
+#define __READ_ONCE(x, check)						\
+({									\
+	union { typeof(x) __val; char __c[1]; } __u;			\
+	if (check)							\
+		__read_once_size(&(x), __u.__c, sizeof(x));		\
+	else								\
+		__read_once_size_nocheck(&(x), __u.__c, sizeof(x));	\
+	__u.__val;							\
+})
+#define READ_ONCE(x) __READ_ONCE(x, 1)
+
+/*
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
+ * to hide memory access from KASAN.
+ */
+#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
 
 #define WRITE_ONCE(x, val) \
 ({							\

From f7d27c35ddff7c100d7a98db499ac0040149ac05 Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Mon, 19 Oct 2015 11:37:18 +0300
Subject: [PATCH 092/199] x86/mm, kasan: Silence KASAN warnings in get_wchan()

get_wchan() is racy by design, it may access volatile stack
of running task, thus it may access redzone in a stack frame
and cause KASAN to warn about this.

Use READ_ONCE_NOCHECK() to silence these warnings.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Link: http://lkml.kernel.org/r/1445243838-17763-3-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/process.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 39e585a554b71..e28db181e4fcb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -550,14 +550,14 @@ unsigned long get_wchan(struct task_struct *p)
 	if (sp < bottom || sp > top)
 		return 0;
 
-	fp = READ_ONCE(*(unsigned long *)sp);
+	fp = READ_ONCE_NOCHECK(*(unsigned long *)sp);
 	do {
 		if (fp < bottom || fp > top)
 			return 0;
-		ip = READ_ONCE(*(unsigned long *)(fp + sizeof(unsigned long)));
+		ip = READ_ONCE_NOCHECK(*(unsigned long *)(fp + sizeof(unsigned long)));
 		if (!in_sched_functions(ip))
 			return ip;
-		fp = READ_ONCE(*(unsigned long *)fp);
+		fp = READ_ONCE_NOCHECK(*(unsigned long *)fp);
 	} while (count++ < 16 && p->state != TASK_RUNNING);
 	return 0;
 }

From 3fc89adb9fa4beff31374a4bf50b3d099d88ae83 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 19 Oct 2015 18:23:57 +0800
Subject: [PATCH 093/199] crypto: api - Only abort operations on fatal signal

Currently a number of Crypto API operations may fail when a signal
occurs.  This causes nasty problems as the caller of those operations
are often not in a good position to restart the operation.

In fact there is currently no need for those operations to be
interrupted by user signals at all.  All we need is for them to
be killable.

This patch replaces the relevant calls of signal_pending with
fatal_signal_pending, and wait_for_completion_interruptible with
wait_for_completion_killable, respectively.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
 crypto/ablkcipher.c  | 2 +-
 crypto/algapi.c      | 2 +-
 crypto/api.c         | 6 +++---
 crypto/crypto_user.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/crypto/ablkcipher.c b/crypto/ablkcipher.c
index b788f169cc988..b4ffc5be1a93c 100644
--- a/crypto/ablkcipher.c
+++ b/crypto/ablkcipher.c
@@ -706,7 +706,7 @@ struct crypto_ablkcipher *crypto_alloc_ablkcipher(const char *alg_name,
 err:
 		if (err != -EAGAIN)
 			break;
-		if (signal_pending(current)) {
+		if (fatal_signal_pending(current)) {
 			err = -EINTR;
 			break;
 		}
diff --git a/crypto/algapi.c b/crypto/algapi.c
index d130b41dbaea2..59bf491fe3d86 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -345,7 +345,7 @@ static void crypto_wait_for_test(struct crypto_larval *larval)
 		crypto_alg_tested(larval->alg.cra_driver_name, 0);
 	}
 
-	err = wait_for_completion_interruptible(&larval->completion);
+	err = wait_for_completion_killable(&larval->completion);
 	WARN_ON(err);
 
 out:
diff --git a/crypto/api.c b/crypto/api.c
index afe4610afc4b9..bbc147cb5dec8 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -172,7 +172,7 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
 	struct crypto_larval *larval = (void *)alg;
 	long timeout;
 
-	timeout = wait_for_completion_interruptible_timeout(
+	timeout = wait_for_completion_killable_timeout(
 		&larval->completion, 60 * HZ);
 
 	alg = larval->adult;
@@ -445,7 +445,7 @@ struct crypto_tfm *crypto_alloc_base(const char *alg_name, u32 type, u32 mask)
 err:
 		if (err != -EAGAIN)
 			break;
-		if (signal_pending(current)) {
+		if (fatal_signal_pending(current)) {
 			err = -EINTR;
 			break;
 		}
@@ -562,7 +562,7 @@ void *crypto_alloc_tfm(const char *alg_name,
 err:
 		if (err != -EAGAIN)
 			break;
-		if (signal_pending(current)) {
+		if (fatal_signal_pending(current)) {
 			err = -EINTR;
 			break;
 		}
diff --git a/crypto/crypto_user.c b/crypto/crypto_user.c
index d94d99ffe8b9b..237f3795cfaaa 100644
--- a/crypto/crypto_user.c
+++ b/crypto/crypto_user.c
@@ -375,7 +375,7 @@ static struct crypto_alg *crypto_user_skcipher_alg(const char *name, u32 type,
 		err = PTR_ERR(alg);
 		if (err != -EAGAIN)
 			break;
-		if (signal_pending(current)) {
+		if (fatal_signal_pending(current)) {
 			err = -EINTR;
 			break;
 		}

From d289619a219dd01e255d7b5e30f9171b25efea48 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai@suse.de>
Date: Tue, 20 Oct 2015 16:23:55 +0200
Subject: [PATCH 094/199] ALSA: hda - Fix deadlock at error in building PCM

The HDA codec driver issues snd_hda_codec_reset() at the error path of
PCM build.  This was needed in the earlier code base, but the recent
rewrite to use the standard bus binding made this a deadlock:
 modprobe        D 0000000000000005     0   720    716 0x00000080
 Call Trace:
  [<ffffffff816a5dbe>] schedule+0x3e/0x90
  [<ffffffff816a61a5>] schedule_preempt_disabled+0x15/0x20
  [<ffffffff816a7ae5>] __mutex_lock_slowpath+0xb5/0x120
  [<ffffffff816a7b6b>] mutex_lock+0x1b/0x30
  [<ffffffff8148656b>] device_release_driver+0x1b/0x30
  [<ffffffff81485c15>] bus_remove_device+0x105/0x180
  [<ffffffff814822b9>] device_del+0x139/0x260
  [<ffffffffa05e0ec5>] snd_hdac_device_unregister+0x25/0x30 [snd_hda_core]
  [<ffffffffa074fa6a>] snd_hda_codec_reset+0x2a/0x70 [snd_hda_codec]
  [<ffffffffa075007b>] snd_hda_codec_build_pcms+0x18b/0x1b0 [snd_hda_codec]
  [<ffffffffa074a44e>] hda_codec_driver_probe+0xbe/0x140 [snd_hda_codec]
  [<ffffffff81486ac4>] driver_probe_device+0x1f4/0x460
  [<ffffffff81486dc0>] __driver_attach+0x90/0xa0
  [<ffffffff81484844>] bus_for_each_dev+0x64/0xa0
  [<ffffffff814862de>] driver_attach+0x1e/0x20
  [<ffffffff81485e7b>] bus_add_driver+0x1eb/0x280
  [<ffffffff81487680>] driver_register+0x60/0xe0
  [<ffffffffa074a0da>] __hda_codec_driver_register+0x5a/0x60 [snd_hda_codec]
  [<ffffffffa070a01e>] realtek_driver_init+0x1e/0x1000 [snd_hda_codec_realtek]
  [<ffffffff810002f3>] do_one_initcall+0xb3/0x200
  [<ffffffff816a1fc5>] do_init_module+0x60/0x1f8
  [<ffffffff810ee5c3>] load_module+0x1653/0x1bd0
  [<ffffffff810eed48>] SYSC_finit_module+0x98/0xc0
  [<ffffffff810eed8e>] SyS_finit_module+0xe/0x10
  [<ffffffff816aa032>] entry_SYSCALL_64_fastpath+0x16/0x75

The simple fix is just to remove this call, since we don't need to
think about unbinding at there any longer.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=948758
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/pci/hda/hda_codec.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index 37f43a1b34ef1..a249d5486889d 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -3367,10 +3367,8 @@ int snd_hda_codec_build_pcms(struct hda_codec *codec)
 	int dev, err;
 
 	err = snd_hda_codec_parse_pcms(codec);
-	if (err < 0) {
-		snd_hda_codec_reset(codec);
+	if (err < 0)
 		return err;
-	}
 
 	/* attach a new PCM streams */
 	list_for_each_entry(cpcm, &codec->pcm_list_head, list) {

From 97aff2c03a1e4d343266adadb52313613efb027f Mon Sep 17 00:00:00 2001
From: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Date: Tue, 20 Oct 2015 10:25:58 +0100
Subject: [PATCH 095/199] ASoC: wm8904: Correct number of EQ registers

There are 24 EQ registers not 25, I suspect this bug came about because
the registers start at EQ1 not zero. The bug is relatively harmless as
the extra register written is an unused one.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
 include/sound/wm8904.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/sound/wm8904.h b/include/sound/wm8904.h
index 898be3a8db9ae..6d8f8fba33414 100644
--- a/include/sound/wm8904.h
+++ b/include/sound/wm8904.h
@@ -119,7 +119,7 @@
 #define WM8904_MIC_REGS  2
 #define WM8904_GPIO_REGS 4
 #define WM8904_DRC_REGS  4
-#define WM8904_EQ_REGS   25
+#define WM8904_EQ_REGS   24
 
 /**
  * DRC configurations are specified with a label and a set of register

From a2d7629048322ae62bff57f34f5f995e25ed234c Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Date: Tue, 20 Oct 2015 11:38:08 -0400
Subject: [PATCH 096/199] tracing: Have stack tracer force RCU to be watching

The stack tracer was triggering the WARN_ON() in module.c:

 static void module_assert_mutex_or_preempt(void)
 {
 #ifdef CONFIG_LOCKDEP
	if (unlikely(!debug_locks))
		return;

	WARN_ON(!rcu_read_lock_sched_held() &&
		!lockdep_is_held(&module_mutex));
 #endif
 }

The reason is that the stack tracer traces all function calls, and some of
those calls happen while exiting or entering user space and idle. Some of
these functions are called after RCU had already stopped watching, as RCU
does not watch userspace or idle CPUs.

If a max stack is hit, then the save_stack_trace() is called, which will
check module addresses and call module_assert_mutex_or_preempt(), and then
trigger the warning. Sad part is, the warning itself will also do a stack
trace and tigger the same warning. That probably should be fixed.

The warning was added by 0be964be0d45 "module: Sanitize RCU usage and
locking" but this bug has probably been around longer. But it's unlikely to
cause much harm, but the new warning causes the system to lock up.

Cc: stable@vger.kernel.org # 4.2+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc:"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index b746399ab59c0..5f29402bff0f8 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -88,6 +88,12 @@ check_stack(unsigned long ip, unsigned long *stack)
 	local_irq_save(flags);
 	arch_spin_lock(&max_stack_lock);
 
+	/*
+	 * RCU may not be watching, make it see us.
+	 * The stack trace code uses rcu_sched.
+	 */
+	rcu_irq_enter();
+
 	/* In case another CPU set the tracer_frame on us */
 	if (unlikely(!frame_size))
 		this_size -= tracer_frame;
@@ -169,6 +175,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 	}
 
  out:
+	rcu_irq_exit();
 	arch_spin_unlock(&max_stack_lock);
 	local_irq_restore(flags);
 }

From 437f9963bc4fd75889c1fe9289a92dea9124a439 Mon Sep 17 00:00:00 2001
From: Pavel Fedin <p.fedin@samsung.com>
Date: Fri, 25 Sep 2015 17:00:29 +0300
Subject: [PATCH 097/199] KVM: arm/arm64: Do not inject spurious interrupts

When lowering a level-triggered line from userspace, we forgot to lower
the pending bit on the emulated CPU interface and we also did not
re-compute the pending_on_cpu bitmap for the CPU affected by the change.

Update vgic_update_irq_pending() to fix the two issues above and also
raise a warning in vgic_quue_irq_to_lr if we encounter an interrupt
pending on a CPU which is neither marked active nor pending.

  [ Commit text reworked completely - Christoffer ]

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6bd1c9bf7ae71..596455a394af9 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1132,7 +1132,8 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
 		vgic_irq_clear_active(vcpu, irq);
 		vgic_update_state(vcpu->kvm);
-	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+	} else {
+		WARN_ON(!vgic_dist_irq_is_pending(vcpu, irq));
 		vlr.state |= LR_STATE_PENDING;
 		kvm_debug("Set pending: 0x%x\n", vlr.state);
 	}
@@ -1607,8 +1608,12 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	} else {
 		if (level_triggered) {
 			vgic_dist_irq_clear_level(vcpu, irq_num);
-			if (!vgic_dist_irq_soft_pend(vcpu, irq_num))
+			if (!vgic_dist_irq_soft_pend(vcpu, irq_num)) {
 				vgic_dist_irq_clear_pending(vcpu, irq_num);
+				vgic_cpu_irq_clear(vcpu, irq_num);
+				if (!compute_pending_for_cpu(vcpu))
+					clear_bit(cpuid, dist->irq_pending_on_cpu);
+			}
 		}
 
 		ret = false;

From 399ea0f6bcd318af94ec8e4ffe96703ed674f22e Mon Sep 17 00:00:00 2001
From: Pavel Fedin <p.fedin@samsung.com>
Date: Tue, 6 Oct 2015 11:14:35 +0300
Subject: [PATCH 098/199] KVM: arm/arm64: Fix memory leak if timer
 initialization fails

Jump to correct label and free kvm_host_cpu_state

Reviewed-by: Wei Huang <wei@redhat.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dc017adfddc8b..78b2869945771 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1080,7 +1080,7 @@ static int init_hyp_mode(void)
 	 */
 	err = kvm_timer_hyp_init();
 	if (err)
-		goto out_free_mappings;
+		goto out_free_context;
 
 #ifndef CONFIG_HOTPLUG_CPU
 	free_boot_hyp_pgd();

From 4a5d69b73948d0e03cd38d77dc11edb2e707165f Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Mon, 12 Oct 2015 15:22:31 +0200
Subject: [PATCH 099/199] KVM: arm: use GIC support unconditionally

The vgic code on ARM is built for all configurations that enable KVM,
but the parent_data field that it references is only present when
CONFIG_IRQ_DOMAIN_HIERARCHY is set:

virt/kvm/arm/vgic.c: In function 'kvm_vgic_map_phys_irq':
virt/kvm/arm/vgic.c:1781:13: error: 'struct irq_data' has no member named 'parent_data'

This flag is implied by the GIC driver, and indeed the VGIC code only
makes sense if a GIC is present. This changes the CONFIG_KVM symbol
to always select GIC, which avoids the issue.

Fixes: 662d9715840 ("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 210eccadb69a9..356970f3b25e3 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
 	depends on MMU && OF
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
+	select ARM_GIC
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO

From cff9211eb1a1f58ce7f5a2d596b617928fd4be0e Mon Sep 17 00:00:00 2001
From: Christoffer Dall <christoffer.dall@linaro.org>
Date: Fri, 16 Oct 2015 12:41:21 +0200
Subject: [PATCH 100/199] arm/arm64: KVM: Fix arch timer behavior for disabled
 interrupts

We have an interesting issue when the guest disables the timer interrupt
on the VGIC, which happens when turning VCPUs off using PSCI, for
example.

The problem is that because the guest disables the virtual interrupt at
the VGIC level, we never inject interrupts to the guest and therefore
never mark the interrupt as active on the physical distributor.  The
host also never takes the timer interrupt (we only use the timer device
to trigger a guest exit and everything else is done in software), so the
interrupt does not become active through normal means.

The result is that we keep entering the guest with a programmed timer
that will always fire as soon as we context switch the hardware timer
state and run the guest, preventing forward progress for the VCPU.

Since the active state on the physical distributor is really part of the
timer logic, it is the job of our virtual arch timer driver to manage
this state.

The timer->map->active boolean field indicates whether we have signalled
this interrupt to the vgic and if that interrupt is still pending or
active.  As long as that is the case, the hardware doesn't have to
generate physical interrupts and therefore we mark the interrupt as
active on the physical distributor.

We also have to restore the pending state of an interrupt that was
queued to an LR but was retired from the LR for some reason, while
remaining pending in the LR.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/arch_timer.c | 19 +++++++++++++++++
 virt/kvm/arm/vgic.c       | 43 ++++++++++-----------------------------
 2 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 48c6e1ac6827f..b9d3a32cbc048 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -137,6 +137,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	bool phys_active;
+	int ret;
 
 	/*
 	 * We're about to run this vcpu again, so there is no need to
@@ -151,6 +153,23 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 	 */
 	if (kvm_timer_should_fire(vcpu))
 		kvm_timer_inject_irq(vcpu);
+
+	/*
+	 * We keep track of whether the edge-triggered interrupt has been
+	 * signalled to the vgic/guest, and if so, we mask the interrupt and
+	 * the physical distributor to prevent the timer from raising a
+	 * physical interrupt whenever we run a guest, preventing forward
+	 * VCPU progress.
+	 */
+	if (kvm_vgic_get_phys_irq_active(timer->map))
+		phys_active = true;
+	else
+		phys_active = false;
+
+	ret = irq_set_irqchip_state(timer->map->irq,
+				    IRQCHIP_STATE_ACTIVE,
+				    phys_active);
+	WARN_ON(ret);
 }
 
 /**
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 596455a394af9..ea21bc2735425 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1092,6 +1092,15 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
+	/*
+	 * We must transfer the pending state back to the distributor before
+	 * retiring the LR, otherwise we may loose edge-triggered interrupts.
+	 */
+	if (vlr.state & LR_STATE_PENDING) {
+		vgic_dist_irq_set_pending(vcpu, irq);
+		vlr.hwirq = 0;
+	}
+
 	vlr.state = 0;
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	clear_bit(lr_nr, vgic_cpu->lr_used);
@@ -1241,7 +1250,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	unsigned long *pa_percpu, *pa_shared;
-	int i, vcpu_id, lr, ret;
+	int i, vcpu_id;
 	int overflow = 0;
 	int nr_shared = vgic_nr_shared_irqs(dist);
 
@@ -1296,31 +1305,6 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 		 */
 		clear_bit(vcpu_id, dist->irq_pending_on_cpu);
 	}
-
-	for (lr = 0; lr < vgic->nr_lr; lr++) {
-		struct vgic_lr vlr;
-
-		if (!test_bit(lr, vgic_cpu->lr_used))
-			continue;
-
-		vlr = vgic_get_lr(vcpu, lr);
-
-		/*
-		 * If we have a mapping, and the virtual interrupt is
-		 * presented to the guest (as pending or active), then we must
-		 * set the state to active in the physical world. See
-		 * Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt.
-		 */
-		if (vlr.state & LR_HW) {
-			struct irq_phys_map *map;
-			map = vgic_irq_map_search(vcpu, vlr.irq);
-
-			ret = irq_set_irqchip_state(map->irq,
-						    IRQCHIP_STATE_ACTIVE,
-						    true);
-			WARN_ON(ret);
-		}
-	}
 }
 
 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
@@ -1430,13 +1414,8 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 
 	WARN_ON(ret);
 
-	if (map->active) {
-		ret = irq_set_irqchip_state(map->irq,
-					    IRQCHIP_STATE_ACTIVE,
-					    false);
-		WARN_ON(ret);
+	if (map->active)
 		return 0;
-	}
 
 	return 1;
 }

From 544c572e03174438b6656ed24a4516b9a9d5f14a Mon Sep 17 00:00:00 2001
From: Christoffer Dall <christoffer.dall@linaro.org>
Date: Sat, 17 Oct 2015 17:55:12 +0200
Subject: [PATCH 101/199] arm/arm64: KVM: Clear map->active on pend/active
 clear

When a guest reboots or offlines/onlines CPUs, it is not uncommon for it
to clear the pending and active states of an interrupt through the
emulated VGIC distributor.  However, since the architected timers are
defined by the architecture to be level triggered and the guest
rightfully expects them to be that, but we emulate them as
edge-triggered, we have to mimic level-triggered behavior for an
edge-triggered virtual implementation.

We currently do not signal the VGIC when the map->active field is true,
because it indicates that the guest has already been signalled of the
interrupt as required.  Normally this field is set to false when the
guest deactivates the virtual interrupt through the sync path.

We also need to catch the case where the guest deactivates the interrupt
through the emulated distributor, again allowing guests to boot even if
the original virtual timer signal hit before the guest's GIC
initialization sequence is run.

Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index ea21bc2735425..58b1256767856 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -531,6 +531,34 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm,
 	return false;
 }
 
+/*
+ * If a mapped interrupt's state has been modified by the guest such that it
+ * is no longer active or pending, without it have gone through the sync path,
+ * then the map->active field must be cleared so the interrupt can be taken
+ * again.
+ */
+static void vgic_handle_clear_mapped_irq(struct kvm_vcpu *vcpu)
+{
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	struct list_head *root;
+	struct irq_phys_map_entry *entry;
+	struct irq_phys_map *map;
+
+	rcu_read_lock();
+
+	/* Check for PPIs */
+	root = &vgic_cpu->irq_phys_map_list;
+	list_for_each_entry_rcu(entry, root, entry) {
+		map = &entry->map;
+
+		if (!vgic_dist_irq_is_pending(vcpu, map->virt_irq) &&
+		    !vgic_irq_is_active(vcpu, map->virt_irq))
+			map->active = false;
+	}
+
+	rcu_read_unlock();
+}
+
 bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 				   struct kvm_exit_mmio *mmio,
 				   phys_addr_t offset, int vcpu_id)
@@ -561,6 +589,7 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 					  vcpu_id, offset);
 		vgic_reg_access(mmio, reg, offset, mode);
 
+		vgic_handle_clear_mapped_irq(kvm_get_vcpu(kvm, vcpu_id));
 		vgic_update_state(kvm);
 		return true;
 	}
@@ -598,6 +627,7 @@ bool vgic_handle_clear_active_reg(struct kvm *kvm,
 			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
 
 	if (mmio->is_write) {
+		vgic_handle_clear_mapped_irq(kvm_get_vcpu(kvm, vcpu_id));
 		vgic_update_state(kvm);
 		return true;
 	}
@@ -1406,7 +1436,7 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 		return 0;
 
 	map = vgic_irq_map_search(vcpu, vlr.irq);
-	BUG_ON(!map || !map->active);
+	BUG_ON(!map);
 
 	ret = irq_get_irqchip_state(map->irq,
 				    IRQCHIP_STATE_ACTIVE,

From 0d997491f814c87310a6ad7be30a9049c7150489 Mon Sep 17 00:00:00 2001
From: Christoffer Dall <christoffer.dall@linaro.org>
Date: Sat, 17 Oct 2015 19:05:27 +0200
Subject: [PATCH 102/199] arm/arm64: KVM: Fix disabled distributor operation

We currently do a single update of the vgic state when the distributor
enable/disable control register is accessed and then bypass updating the
state for as long as the distributor remains disabled.

This is incorrect, because updating the state does not consider the
distributor enable bit, and this you can end up in a situation where an
interrupt is marked as pending on the CPU interface, but not pending on
the distributor, which is an impossible state to be in, and triggers a
warning.  Consider for example the following sequence of events:

1. An interrupt is marked as pending on the distributor
   - the interrupt is also forwarded to the CPU interface
2. The guest turns off the distributor (it's about to do a reboot)
   - we stop updating the CPU interface state from now on
3. The guest disables the pending interrupt
   - we remove the pending state from the distributor, but don't touch
     the CPU interface, see point 2.

Since the distributor disable bit really means that no interrupts should
be forwarded to the CPU interface, we modify the code to keep updating
the internal VGIC state, but always set the CPU interface pending bits
to zero when the distributor is disabled.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 58b1256767856..66c66165e712d 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1012,6 +1012,12 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 	pend_percpu = vcpu->arch.vgic_cpu.pending_percpu;
 	pend_shared = vcpu->arch.vgic_cpu.pending_shared;
 
+	if (!dist->enabled) {
+		bitmap_zero(pend_percpu, VGIC_NR_PRIVATE_IRQS);
+		bitmap_zero(pend_shared, nr_shared);
+		return 0;
+	}
+
 	pending = vgic_bitmap_get_cpu_map(&dist->irq_pending, vcpu_id);
 	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
 	bitmap_and(pend_percpu, pending, enabled, VGIC_NR_PRIVATE_IRQS);
@@ -1039,11 +1045,6 @@ void vgic_update_state(struct kvm *kvm)
 	struct kvm_vcpu *vcpu;
 	int c;
 
-	if (!dist->enabled) {
-		set_bit(0, dist->irq_pending_on_cpu);
-		return;
-	}
-
 	kvm_for_each_vcpu(c, vcpu, kvm) {
 		if (compute_pending_for_cpu(vcpu))
 			set_bit(c, dist->irq_pending_on_cpu);

From 625faa6a720d26fc0db9e20b48dc0dfe4c8d8ddf Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel@arm.linux.org.uk>
Date: Tue, 20 Oct 2015 11:49:44 +0100
Subject: [PATCH 103/199] clkdev: fix clk_add_alias() with a NULL alias device
 name

clk_add_alias() was not correctly handling the case where alias_dev_name
was NULL: rather than producing an entry with a NULL dev_id pointer,
it would produce a device name of (null).  Fix this.

Cc: <stable@vger.kernel.org>
Fixes: 2568999835d7 ("clkdev: add clkdev_create() helper")
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/clk/clkdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/clkdev.c b/drivers/clk/clkdev.c
index c0eaf0973bd2b..779b6ff0c7ad4 100644
--- a/drivers/clk/clkdev.c
+++ b/drivers/clk/clkdev.c
@@ -333,7 +333,8 @@ int clk_add_alias(const char *alias, const char *alias_dev_name,
 	if (IS_ERR(r))
 		return PTR_ERR(r);
 
-	l = clkdev_create(r, alias, "%s", alias_dev_name);
+	l = clkdev_create(r, alias, alias_dev_name ? "%s" : NULL,
+			  alias_dev_name);
 	clk_put(r);
 
 	return l ? 0 : -ENODEV;

From 3909642034ffd7a8906ff3f2b2a71455bf39e7f6 Mon Sep 17 00:00:00 2001
From: Matan Barak <matanb@mellanox.com>
Date: Thu, 15 Oct 2015 15:01:03 +0300
Subject: [PATCH 104/199] IB/core: Fix use after free of ifa

When using ifup/ifdown while executing enum_netdev_ipv4_ips,
ifa could become invalid and cause use after free error.
Fixing it by protecting with RCU lock.

Fixes: 03db3a2d81e6 ('IB/core: Add RoCE GID table management')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/roce_gid_mgmt.c | 35 +++++++++++++++++++------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/roce_gid_mgmt.c b/drivers/infiniband/core/roce_gid_mgmt.c
index 6b24cba1e474d..178f98482e13e 100644
--- a/drivers/infiniband/core/roce_gid_mgmt.c
+++ b/drivers/infiniband/core/roce_gid_mgmt.c
@@ -250,25 +250,44 @@ static void enum_netdev_ipv4_ips(struct ib_device *ib_dev,
 				 u8 port, struct net_device *ndev)
 {
 	struct in_device *in_dev;
+	struct sin_list {
+		struct list_head	list;
+		struct sockaddr_in	ip;
+	};
+	struct sin_list *sin_iter;
+	struct sin_list *sin_temp;
 
+	LIST_HEAD(sin_list);
 	if (ndev->reg_state >= NETREG_UNREGISTERING)
 		return;
 
-	in_dev = in_dev_get(ndev);
-	if (!in_dev)
+	rcu_read_lock();
+	in_dev = __in_dev_get_rcu(ndev);
+	if (!in_dev) {
+		rcu_read_unlock();
 		return;
+	}
 
 	for_ifa(in_dev) {
-		struct sockaddr_in ip;
+		struct sin_list *entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
 
-		ip.sin_family = AF_INET;
-		ip.sin_addr.s_addr = ifa->ifa_address;
-		update_gid_ip(GID_ADD, ib_dev, port, ndev,
-			      (struct sockaddr *)&ip);
+		if (!entry) {
+			pr_warn("roce_gid_mgmt: couldn't allocate entry for IPv4 update\n");
+			continue;
+		}
+		entry->ip.sin_family = AF_INET;
+		entry->ip.sin_addr.s_addr = ifa->ifa_address;
+		list_add_tail(&entry->list, &sin_list);
 	}
 	endfor_ifa(in_dev);
+	rcu_read_unlock();
 
-	in_dev_put(in_dev);
+	list_for_each_entry_safe(sin_iter, sin_temp, &sin_list, list) {
+		update_gid_ip(GID_ADD, ib_dev, port, ndev,
+			      (struct sockaddr *)&sin_iter->ip);
+		list_del(&sin_iter->list);
+		kfree(sin_iter);
+	}
 }
 
 static void enum_netdev_ipv6_ips(struct ib_device *ib_dev,

From b3b51f9f6f5d91cd16afaed0c22df2c56ed5f92e Mon Sep 17 00:00:00 2001
From: Haggai Eran <haggaie@mellanox.com>
Date: Mon, 21 Sep 2015 16:02:02 +0300
Subject: [PATCH 105/199] IB/cma: Potential NULL dereference in
 cma_id_from_event

If the lookup of a listening ID failed for an AF_IB request, the code
would try to call dev_put() on a NULL net_dev.

Fixes: be688195bd08 ("IB/cma: Fix net_dev reference leak with failed
requests")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/cma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 59a2dafc8c574..f163ac680841b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1324,7 +1324,7 @@ static struct rdma_id_private *cma_id_from_event(struct ib_cm_id *cm_id,
 	bind_list = cma_ps_find(rdma_ps_from_service_id(req.service_id),
 				cma_port_from_service_id(req.service_id));
 	id_priv = cma_find_listener(bind_list, cm_id, ib_event, &req, *net_dev);
-	if (IS_ERR(id_priv)) {
+	if (IS_ERR(id_priv) && *net_dev) {
 		dev_put(*net_dev);
 		*net_dev = NULL;
 	}

From 0174b381caf89443d92c6fe75f725f2bfeba96b6 Mon Sep 17 00:00:00 2001
From: Sasha Levin <sasha.levin@oracle.com>
Date: Thu, 17 Sep 2015 16:04:19 -0400
Subject: [PATCH 106/199] IB/ucma: check workqueue allocation before usage

Allocating a workqueue might fail, which wasn't checked so far and would
lead to NULL ptr derefs when an attempt to use it was made.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/ucma.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index a53fc9b01c699..30467d10df91b 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1624,11 +1624,16 @@ static int ucma_open(struct inode *inode, struct file *filp)
 	if (!file)
 		return -ENOMEM;
 
+	file->close_wq = create_singlethread_workqueue("ucma_close_id");
+	if (!file->close_wq) {
+		kfree(file);
+		return -ENOMEM;
+	}
+
 	INIT_LIST_HEAD(&file->event_list);
 	INIT_LIST_HEAD(&file->ctx_list);
 	init_waitqueue_head(&file->poll_wait);
 	mutex_init(&file->mut);
-	file->close_wq = create_singlethread_workqueue("ucma_close_id");
 
 	filp->private_data = file;
 	file->filp = filp;

From ab3964ad2acfbb0dc5414d4c86fa6d8d690f27a1 Mon Sep 17 00:00:00 2001
From: Haggai Eran <haggaie@mellanox.com>
Date: Tue, 20 Oct 2015 09:53:01 +0300
Subject: [PATCH 107/199] IB/cma: Use inner P_Key to determine netdev

When discussing the patches to demux ids in rdma_cm instead of ib_cm, it
was decided that it is best to use the P_Key value in the packet headers.
However, the mlx5 and ipath drivers are currently unable to send correct
P_Key values in GMP headers. They always send using a single P_Key that is
set during the GSI QP initialization.

Change the rdma_cm code to look at the P_Key value that is part of the
packet payload as a workaround. Once the drivers are fixed this patch can
be reverted.

Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to
RDMA CM")
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/cma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f163ac680841b..36b12d560e17e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1067,14 +1067,14 @@ static int cma_save_req_info(const struct ib_cm_event *ib_event,
 		       sizeof(req->local_gid));
 		req->has_gid	= true;
 		req->service_id	= req_param->primary_path->service_id;
-		req->pkey	= req_param->bth_pkey;
+		req->pkey	= be16_to_cpu(req_param->primary_path->pkey);
 		break;
 	case IB_CM_SIDR_REQ_RECEIVED:
 		req->device	= sidr_param->listen_id->device;
 		req->port	= sidr_param->port;
 		req->has_gid	= false;
 		req->service_id	= sidr_param->service_id;
-		req->pkey	= sidr_param->bth_pkey;
+		req->pkey	= sidr_param->pkey;
 		break;
 	default:
 		return -EINVAL;

From 203d27b0226a05202438ddb39ef0ef1acb14a759 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Date: Tue, 20 Oct 2015 12:09:12 -0400
Subject: [PATCH 108/199] md/raid1: submit_bio_wait() returns 0 on success

This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.

Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Cc: stable@vger.kernel.org (v3.10)
Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ddd8a5f572aa1..cfca6edf78139 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2195,7 +2195,7 @@ static int narrow_write_error(struct r1bio *r1_bio, int i)
 		bio_trim(wbio, sector - r1_bio->sector, sectors);
 		wbio->bi_iter.bi_sector += rdev->data_offset;
 		wbio->bi_bdev = rdev->bdev;
-		if (submit_bio_wait(WRITE, wbio) == 0)
+		if (submit_bio_wait(WRITE, wbio) < 0)
 			/* failure! */
 			ok = rdev_set_badblocks(rdev, sector,
 						sectors, 0)

From 681ab4696062f5aa939c9e04d058732306a97176 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Date: Tue, 20 Oct 2015 12:09:13 -0400
Subject: [PATCH 109/199] md/raid10: submit_bio_wait() returns 0 on success

This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.

Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Cc: stable@vger.kernel.org (v3.10)
Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid10.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 9f69dc526f8cb..a9ecec4e9a135 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2467,7 +2467,7 @@ static int narrow_write_error(struct r10bio *r10_bio, int i)
 				   choose_data_offset(r10_bio, rdev) +
 				   (sector - r10_bio->sector));
 		wbio->bi_bdev = rdev->bdev;
-		if (submit_bio_wait(WRITE, wbio) == 0)
+		if (submit_bio_wait(WRITE, wbio) < 0)
 			/* Failure! */
 			ok = rdev_set_badblocks(rdev, sector,
 						sectors, 0)

From 1904be1b6bb92058c8e00063dd59df2df294e258 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Date: Tue, 20 Oct 2015 21:48:02 -0400
Subject: [PATCH 110/199] tracing: Do not allow stack_tracer to record stack in
 NMI

The code in stack tracer should not be executed within an NMI as it grabs
spinlocks and stack tracing an NMI gives the possibility of causing a
deadlock. Although this is safe on x86_64, because it does not perform stack
traces when the task struct stack is not in use (interrupts and NMIs), it
may be an issue for NMIs on i386 and other archs that use the same stack as
the NMI.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 5f29402bff0f8..8abf1ba180857 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -85,6 +85,10 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (!object_is_on_stack(stack))
 		return;
 
+	/* Can't do this from NMI context (can cause deadlocks) */
+	if (in_nmi())
+		return;
+
 	local_irq_save(flags);
 	arch_spin_lock(&max_stack_lock);
 

From 0f6925fa2907df58496cabc33fa4677c635e2223 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: Wed, 14 Oct 2015 15:26:13 +0800
Subject: [PATCH 111/199] btrfs: Avoid truncate tailing page if fallocate range
 doesn't exceed inode size

Current code will always truncate tailing page if its alloc_start is
smaller than inode size.

For example, the file extent layout is like:
0	4K	8K	16K	32K
|<-----Extent A---------------->|
|<--Inode size: 18K---------->|

But if calling fallocate even for range [0,4K), it will cause btrfs to
re-truncate the range [16,32K), causing COW and a new extent.

0	4K	8K	16K	32K
|///////|	<- Fallocate call range
|<-----Extent A-------->|<--B-->|

The cause is quite easy, just a careless btrfs_truncate_inode() in a
else branch without extra judgment.
Fix it by add judgment on whether the fallocate range is beyond isize.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
---
 fs/btrfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b823fac91c928..8c6f247ba81d4 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2584,7 +2584,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 					alloc_start);
 		if (ret)
 			goto out;
-	} else {
+	} else if (offset + len > inode->i_size) {
 		/*
 		 * If we are fallocating from the end of the file onward we
 		 * need to zero out the end of the page if i_size lands in the

From 08b137d90eec51b0e90c42e123ca8ceb118d233f Mon Sep 17 00:00:00 2001
From: Chaotian Jing <chaotian.jing@mediatek.com>
Date: Mon, 12 Oct 2015 17:22:23 +0800
Subject: [PATCH 112/199] mmc: core: Fix init_card in 52Mhz

Suppose that we got a data crc error, and it triggers the mmc_reset.
mmc_reset will call mmc_send_status to see if HW reset was supported.
before issue CMD13, it will do retune, and if EMMC was in HS400 mode,
it will reduce frequency to 52Mhz firstly, then results in card init
was doing at 52Mhz.
The mmc_send_status was originally only done for mmc_test, should drop
it. And, rename the "eMMC hardware reset" to "Reset test", as we would
also be able to use the test for SD-cards.

Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com>
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: bd11e8bd03ca ("mmc: core: Flag re-tuning is needed on CRC errors")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/mmc/card/mmc_test.c | 9 +++------
 drivers/mmc/core/mmc.c      | 7 -------
 2 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index b78cf5d403a33..7fc9174d46191 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -2263,15 +2263,12 @@ static int mmc_test_profile_sglen_r_nonblock_perf(struct mmc_test_card *test)
 /*
  * eMMC hardware reset.
  */
-static int mmc_test_hw_reset(struct mmc_test_card *test)
+static int mmc_test_reset(struct mmc_test_card *test)
 {
 	struct mmc_card *card = test->card;
 	struct mmc_host *host = card->host;
 	int err;
 
-	if (!mmc_card_mmc(card) || !mmc_can_reset(card))
-		return RESULT_UNSUP_CARD;
-
 	err = mmc_hw_reset(host);
 	if (!err)
 		return RESULT_OK;
@@ -2605,8 +2602,8 @@ static const struct mmc_test_case mmc_test_cases[] = {
 	},
 
 	{
-		.name = "eMMC hardware reset",
-		.run = mmc_test_hw_reset,
+		.name = "Reset test",
+		.run = mmc_test_reset,
 	},
 };
 
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index e726903170a82..f6cd995dbe920 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1924,7 +1924,6 @@ EXPORT_SYMBOL(mmc_can_reset);
 static int mmc_reset(struct mmc_host *host)
 {
 	struct mmc_card *card = host->card;
-	u32 status;
 
 	if (!(host->caps & MMC_CAP_HW_RESET) || !host->ops->hw_reset)
 		return -EOPNOTSUPP;
@@ -1937,12 +1936,6 @@ static int mmc_reset(struct mmc_host *host)
 
 	host->ops->hw_reset(host);
 
-	/* If the reset has happened, then a status command will fail */
-	if (!mmc_send_status(card, &status)) {
-		mmc_host_clk_release(host);
-		return -ENOSYS;
-	}
-
 	/* Set initial state and call mmc_set_ios */
 	mmc_set_initial_state(host);
 	mmc_host_clk_release(host);

From cbf3ccd09d683abf1cacd36e3640872ee912d99b Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@suse.de>
Date: Tue, 20 Oct 2015 14:59:36 +0200
Subject: [PATCH 113/199] iommu/amd: Don't clear DTE flags when modifying it

During device assignment/deassignment the flags in the DTE
get lost, which might cause spurious faults, for example
when the device tries to access the system management range.
Fix this by not clearing the flags with the rest of the DTE.

Reported-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Tested-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c       | 4 ++--
 drivers/iommu/amd_iommu_types.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 08d2775887f7a..532e2a211fe1c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1974,8 +1974,8 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 static void clear_dte_entry(u16 devid)
 {
 	/* remove entry from the device table seen by the hardware */
-	amd_iommu_dev_table[devid].data[0] = IOMMU_PTE_P | IOMMU_PTE_TV;
-	amd_iommu_dev_table[devid].data[1] = 0;
+	amd_iommu_dev_table[devid].data[0]  = IOMMU_PTE_P | IOMMU_PTE_TV;
+	amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
 	amd_iommu_apply_erratum_63(devid);
 }
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index f65908841be01..c9b64722f6230 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -295,6 +295,7 @@
 #define IOMMU_PTE_IR (1ULL << 61)
 #define IOMMU_PTE_IW (1ULL << 62)
 
+#define DTE_FLAG_MASK	(0x3ffULL << 32)
 #define DTE_FLAG_IOTLB	(0x01UL << 32)
 #define DTE_FLAG_GV	(0x01ULL << 55)
 #define DTE_GLX_SHIFT	(56)

From 23316316c1af0677a041c81f3ad6efb9dc470b33 Mon Sep 17 00:00:00 2001
From: Paul Mackerras <paulus@samba.org>
Date: Wed, 21 Oct 2015 16:03:14 +1100
Subject: [PATCH 114/199] powerpc: Revert "Use the POWER8 Micro Partition
 Prefetch Engine in KVM HV on POWER8"

This reverts commit 9678cdaae939 ("Use the POWER8 Micro Partition
Prefetch Engine in KVM HV on POWER8") because the original commit had
multiple, partly self-cancelling bugs, that could cause occasional
memory corruption.

In fact the logmpp instruction was incorrectly using register r0 as the
source of the buffer address and operation code, and depending on what
was in r0, it would either do nothing or corrupt the 64k page pointed to
by r0.

The logmpp instruction encoding and the operation code definitions could
be corrected, but then there is the problem that there is no clearly
defined way to know when the hardware has finished writing to the
buffer.

The original commit attempted to work around this by aborting the
write-out before starting the prefetch, but this is ineffective in the
case where the virtual core is now executing on a different physical
core from the one where the write-out was initiated.

These problems plus advice from the hardware designers not to use the
function (since the measured performance improvement from using the
feature was actually mostly negative), mean that reverting the code is
the best option.

Fixes: 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8")
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/cache.h      |  7 ----
 arch/powerpc/include/asm/kvm_host.h   |  2 -
 arch/powerpc/include/asm/ppc-opcode.h | 17 ---------
 arch/powerpc/include/asm/reg.h        |  1 -
 arch/powerpc/kvm/book3s_hv.c          | 55 +--------------------------
 5 files changed, 1 insertion(+), 81 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 0dc42c5082b74..5f8229e24fe65 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -3,7 +3,6 @@
 
 #ifdef __KERNEL__
 
-#include <asm/reg.h>
 
 /* bytes per L1 cache line */
 #if defined(CONFIG_8xx) || defined(CONFIG_403GCX)
@@ -40,12 +39,6 @@ struct ppc64_caches {
 };
 
 extern struct ppc64_caches ppc64_caches;
-
-static inline void logmpp(u64 x)
-{
-	asm volatile(PPC_LOGMPP(R1) : : "r" (x));
-}
-
 #endif /* __powerpc64__ && ! __ASSEMBLY__ */
 
 #if defined(__ASSEMBLY__)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 827a38d7a9dbe..887c259556dff 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -297,8 +297,6 @@ struct kvmppc_vcore {
 	u32 arch_compat;
 	ulong pcr;
 	ulong dpdes;		/* doorbell state (POWER8) */
-	void *mpp_buffer; /* Micro Partition Prefetch buffer */
-	bool mpp_buffer_is_valid;
 	ulong conferring_threads;
 };
 
diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 790f5d1d9a462..7ab04fc59e246 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -141,7 +141,6 @@
 #define PPC_INST_ISEL			0x7c00001e
 #define PPC_INST_ISEL_MASK		0xfc00003e
 #define PPC_INST_LDARX			0x7c0000a8
-#define PPC_INST_LOGMPP			0x7c0007e4
 #define PPC_INST_LSWI			0x7c0004aa
 #define PPC_INST_LSWX			0x7c00042a
 #define PPC_INST_LWARX			0x7c000028
@@ -285,20 +284,6 @@
 #define __PPC_EH(eh)	0
 #endif
 
-/* POWER8 Micro Partition Prefetch (MPP) parameters */
-/* Address mask is common for LOGMPP instruction and MPPR SPR */
-#define PPC_MPPE_ADDRESS_MASK 0xffffffffc000ULL
-
-/* Bits 60 and 61 of MPP SPR should be set to one of the following */
-/* Aborting the fetch is indeed setting 00 in the table size bits */
-#define PPC_MPPR_FETCH_ABORT (0x0ULL << 60)
-#define PPC_MPPR_FETCH_WHOLE_TABLE (0x2ULL << 60)
-
-/* Bits 54 and 55 of register for LOGMPP instruction should be set to: */
-#define PPC_LOGMPP_LOG_L2 (0x02ULL << 54)
-#define PPC_LOGMPP_LOG_L2L3 (0x01ULL << 54)
-#define PPC_LOGMPP_LOG_ABORT (0x03ULL << 54)
-
 /* Deal with instructions that older assemblers aren't aware of */
 #define	PPC_DCBAL(a, b)		stringify_in_c(.long PPC_INST_DCBAL | \
 					__PPC_RA(a) | __PPC_RB(b))
@@ -307,8 +292,6 @@
 #define PPC_LDARX(t, a, b, eh)	stringify_in_c(.long PPC_INST_LDARX | \
 					___PPC_RT(t) | ___PPC_RA(a) | \
 					___PPC_RB(b) | __PPC_EH(eh))
-#define PPC_LOGMPP(b)		stringify_in_c(.long PPC_INST_LOGMPP | \
-					__PPC_RB(b))
 #define PPC_LWARX(t, a, b, eh)	stringify_in_c(.long PPC_INST_LWARX | \
 					___PPC_RT(t) | ___PPC_RA(a) | \
 					___PPC_RB(b) | __PPC_EH(eh))
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index aa1cc5f015eee..a908ada8e0a53 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -226,7 +226,6 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DAWR	0xB4
-#define SPRN_MPPR	0xB8	/* Micro Partition Prefetch Register */
 #define SPRN_RPR	0xBA	/* Relative Priority Register */
 #define SPRN_CIABR	0xBB
 #define   CIABR_PRIV		0x3
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2280497868886..9c26c5a96ea2b 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -36,7 +36,6 @@
 
 #include <asm/reg.h>
 #include <asm/cputable.h>
-#include <asm/cache.h>
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 #include <asm/uaccess.h>
@@ -75,12 +74,6 @@
 
 static DECLARE_BITMAP(default_enabled_hcalls, MAX_HCALL_OPCODE/4 + 1);
 
-#if defined(CONFIG_PPC_64K_PAGES)
-#define MPP_BUFFER_ORDER	0
-#elif defined(CONFIG_PPC_4K_PAGES)
-#define MPP_BUFFER_ORDER	3
-#endif
-
 static int dynamic_mt_modes = 6;
 module_param(dynamic_mt_modes, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(dynamic_mt_modes, "Set of allowed dynamic micro-threading modes: 0 (= none), 2, 4, or 6 (= 2 or 4)");
@@ -1455,13 +1448,6 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
 	vcore->kvm = kvm;
 	INIT_LIST_HEAD(&vcore->preempt_list);
 
-	vcore->mpp_buffer_is_valid = false;
-
-	if (cpu_has_feature(CPU_FTR_ARCH_207S))
-		vcore->mpp_buffer = (void *)__get_free_pages(
-			GFP_KERNEL|__GFP_ZERO,
-			MPP_BUFFER_ORDER);
-
 	return vcore;
 }
 
@@ -1894,33 +1880,6 @@ static int on_primary_thread(void)
 	return 1;
 }
 
-static void kvmppc_start_saving_l2_cache(struct kvmppc_vcore *vc)
-{
-	phys_addr_t phy_addr, mpp_addr;
-
-	phy_addr = (phys_addr_t)virt_to_phys(vc->mpp_buffer);
-	mpp_addr = phy_addr & PPC_MPPE_ADDRESS_MASK;
-
-	mtspr(SPRN_MPPR, mpp_addr | PPC_MPPR_FETCH_ABORT);
-	logmpp(mpp_addr | PPC_LOGMPP_LOG_L2);
-
-	vc->mpp_buffer_is_valid = true;
-}
-
-static void kvmppc_start_restoring_l2_cache(const struct kvmppc_vcore *vc)
-{
-	phys_addr_t phy_addr, mpp_addr;
-
-	phy_addr = virt_to_phys(vc->mpp_buffer);
-	mpp_addr = phy_addr & PPC_MPPE_ADDRESS_MASK;
-
-	/* We must abort any in-progress save operations to ensure
-	 * the table is valid so that prefetch engine knows when to
-	 * stop prefetching. */
-	logmpp(mpp_addr | PPC_LOGMPP_LOG_ABORT);
-	mtspr(SPRN_MPPR, mpp_addr | PPC_MPPR_FETCH_WHOLE_TABLE);
-}
-
 /*
  * A list of virtual cores for each physical CPU.
  * These are vcores that could run but their runner VCPU tasks are
@@ -2471,14 +2430,8 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 
 	srcu_idx = srcu_read_lock(&vc->kvm->srcu);
 
-	if (vc->mpp_buffer_is_valid)
-		kvmppc_start_restoring_l2_cache(vc);
-
 	__kvmppc_vcore_entry();
 
-	if (vc->mpp_buffer)
-		kvmppc_start_saving_l2_cache(vc);
-
 	srcu_read_unlock(&vc->kvm->srcu, srcu_idx);
 
 	spin_lock(&vc->lock);
@@ -3073,14 +3026,8 @@ static void kvmppc_free_vcores(struct kvm *kvm)
 {
 	long int i;
 
-	for (i = 0; i < KVM_MAX_VCORES; ++i) {
-		if (kvm->arch.vcores[i] && kvm->arch.vcores[i]->mpp_buffer) {
-			struct kvmppc_vcore *vc = kvm->arch.vcores[i];
-			free_pages((unsigned long)vc->mpp_buffer,
-				   MPP_BUFFER_ORDER);
-		}
+	for (i = 0; i < KVM_MAX_VCORES; ++i)
 		kfree(kvm->arch.vcores[i]);
-	}
 	kvm->arch.online_vcores = 0;
 }
 

From 53c656c4138511c2ba54df413dc29976cfa9f084 Mon Sep 17 00:00:00 2001
From: Paul Mackerras <paulus@samba.org>
Date: Wed, 21 Oct 2015 16:06:24 +1100
Subject: [PATCH 115/199] powerpc/powernv: Handle irq_happened flag correctly
 in off-line loop

This fixes a bug where it is possible for an off-line CPU to fail to go
into a low-power state (nap/sleep/winkle), and to become unresponsive to
requests from the KVM subsystem to wake up and run a VCPU. What can
happen is that a maskable interrupt of some kind (external, decrementer,
hypervisor doorbell, or HMI) after we have called local_irq_disable() at
the beginning of pnv_smp_cpu_kill_self() and before interrupts are
hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
pending event is marked in the irq_happened flag in the PACA. This
pending event prevents power7_nap/sleep/winkle from going to the
requested low-power state; instead they return immediately. We don't
deal with any of these pending event flags in the off-line loop in
pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
so we will have srr1 == 0, and none of the processing to clear
interrupts or doorbells will be done.

Usually, the most obvious symptom of this is that a KVM guest will fail
with a console message saying "KVM: couldn't grab cpu N".

This fixes the problem by making sure we handle the irq_happened flags
properly. First, we hard-disable before the off-line loop. Once we have
hard-disabled, the irq_happened flags can't change underneath us. We
unconditionally clear the DEC and HMI flags: there is no processing of
timer interrupts while off-line, and the necessary HMI processing is all
done in lower-level code. We leave the EE and DBELL flags alone for the
first iteration of the loop, so that we won't fail to respond to a
split-core request that came in just before hard-disabling. Within the
loop, we handle external interrupts if the EE bit is set in irq_happened
as well as if the low-power state was interrupted by an external
interrupt. (We don't need to do the msgclr for a pending doorbell in
irq_happened, because doorbells are edge-triggered and don't remain
pending in hardware.) Then we clear both the EE and DBELL flags, and
once clear, they cannot be set again (until this CPU comes online again,
that is).

This also fixes the debug check to not be done when we just ran a KVM
guest or when the sleep didn't happen because of a pending event in
irq_happened.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/powernv/smp.c | 29 +++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 8f70ba681a78b..ca264833ee64d 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -171,7 +171,26 @@ static void pnv_smp_cpu_kill_self(void)
 	 * so clear LPCR:PECE1. We keep PECE2 enabled.
 	 */
 	mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
+
+	/*
+	 * Hard-disable interrupts, and then clear irq_happened flags
+	 * that we can safely ignore while off-line, since they
+	 * are for things for which we do no processing when off-line
+	 * (or in the case of HMI, all the processing we need to do
+	 * is done in lower-level real-mode code).
+	 */
+	hard_irq_disable();
+	local_paca->irq_happened &= ~(PACA_IRQ_DEC | PACA_IRQ_HMI);
+
 	while (!generic_check_cpu_restart(cpu)) {
+		/*
+		 * Clear IPI flag, since we don't handle IPIs while
+		 * offline, except for those when changing micro-threading
+		 * mode, which are handled explicitly below, and those
+		 * for coming online, which are handled via
+		 * generic_check_cpu_restart() calls.
+		 */
+		kvmppc_set_host_ipi(cpu, 0);
 
 		ppc64_runlatch_off();
 
@@ -196,20 +215,20 @@ static void pnv_smp_cpu_kill_self(void)
 		 * having finished executing in a KVM guest, then srr1
 		 * contains 0.
 		 */
-		if ((srr1 & wmask) == SRR1_WAKEEE) {
+		if (((srr1 & wmask) == SRR1_WAKEEE) ||
+		    (local_paca->irq_happened & PACA_IRQ_EE)) {
 			icp_native_flush_interrupt();
-			local_paca->irq_happened &= PACA_IRQ_HARD_DIS;
-			smp_mb();
 		} else if ((srr1 & wmask) == SRR1_WAKEHDBELL) {
 			unsigned long msg = PPC_DBELL_TYPE(PPC_DBELL_SERVER);
 			asm volatile(PPC_MSGCLR(%0) : : "r" (msg));
-			kvmppc_set_host_ipi(cpu, 0);
 		}
+		local_paca->irq_happened &= ~(PACA_IRQ_EE | PACA_IRQ_DBELL);
+		smp_mb();
 
 		if (cpu_core_split_required())
 			continue;
 
-		if (!generic_check_cpu_restart(cpu))
+		if (srr1 && !generic_check_cpu_restart(cpu))
 			DBG("CPU%d Unexpected exit while offline !\n", cpu);
 	}
 	mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_PECE1);

From f8f2dc4a7127725383c93b501fcc4e47871b0a9d Mon Sep 17 00:00:00 2001
From: Bard Liao <bardliao@realtek.com>
Date: Wed, 21 Oct 2015 16:18:18 +0800
Subject: [PATCH 116/199] ASoC: rt298: fix wrong setting of gpio2_en

The register value to enable gpio2 was incorrect. So fix it.

Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/codecs/rt298.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/rt298.c b/sound/soc/codecs/rt298.c
index d3e30a645ae30..f823eb502367d 100644
--- a/sound/soc/codecs/rt298.c
+++ b/sound/soc/codecs/rt298.c
@@ -1214,7 +1214,7 @@ static int rt298_i2c_probe(struct i2c_client *i2c,
 	mdelay(10);
 
 	if (!rt298->pdata.gpio2_en)
-		regmap_write(rt298->regmap, RT298_SET_DMIC2_DEFAULT, 0x4000);
+		regmap_write(rt298->regmap, RT298_SET_DMIC2_DEFAULT, 0x40);
 	else
 		regmap_write(rt298->regmap, RT298_SET_DMIC2_DEFAULT, 0);
 

From e27c5b9d23168cc2cb8fec147ae7ed1f7a2005c3 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 13 Oct 2015 18:14:19 -0400
Subject: [PATCH 117/199] writeback: remove broken
 rbtree_postorder_for_each_entry_safe() usage in cgwb_bdi_destroy()

a20135ffbc44 ("writeback: don't drain bdi_writeback_congested on bdi
destruction") added rbtree_postorder_for_each_entry_safe() which is
used to remove all entries; however, according to Cody, the iterator
isn't safe against operations which may rebalance the tree.  Fix it by
switching to repeatedly removing rb_first() until empty.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Cody P Schafer <dev@codyps.com>
Fixes: a20135ffbc44 ("writeback: don't drain bdi_writeback_congested on bdi destruction")
Link: http://lkml.kernel.org/g/1443997973-1700-1-git-send-email-dev@codyps.com
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 mm/backing-dev.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 9e841399041a4..619984fc07ec3 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -681,7 +681,7 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
 static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
 {
 	struct radix_tree_iter iter;
-	struct bdi_writeback_congested *congested, *congested_n;
+	struct rb_node *rbn;
 	void **slot;
 
 	WARN_ON(test_bit(WB_registered, &bdi->wb.state));
@@ -691,9 +691,11 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
 	radix_tree_for_each_slot(slot, &bdi->cgwb_tree, &iter, 0)
 		cgwb_kill(*slot);
 
-	rbtree_postorder_for_each_entry_safe(congested, congested_n,
-					&bdi->cgwb_congested_tree, rb_node) {
-		rb_erase(&congested->rb_node, &bdi->cgwb_congested_tree);
+	while ((rbn = rb_first(&bdi->cgwb_congested_tree))) {
+		struct bdi_writeback_congested *congested =
+			rb_entry(rbn, struct bdi_writeback_congested, rb_node);
+
+		rb_erase(rbn, &bdi->cgwb_congested_tree);
 		congested->bdi = NULL;	/* mark @congested unlinked */
 	}
 

From 09dc1387c9c06cdaf55bc99b35238bd2ec0aed4b Mon Sep 17 00:00:00 2001
From: Thomas Hellstrom <thellstrom@vmware.com>
Date: Wed, 21 Oct 2015 21:31:49 +0200
Subject: [PATCH 118/199] drm/vmwgfx: Stabilize the command buffer submission
 code

This commit addresses some stability problems with the command buffer
submission code recently introduced:

1) Make the vmw_cmdbuf_man_process() function handle reruns internally to
avoid losing interrupts if the caller forgets to rerun on -EAGAIN.
2) Handle default command buffer allocations using inline command buffers.
This avoids rare allocation deadlocks.
3) In case of command buffer errors we might lose fence submissions.
Therefore send a new fence after each command buffer error. This will help
avoid lengthy fence waits.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 34 +++++++++++++++-----------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
index 8a76821177a6c..6377e8151000f 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
@@ -415,16 +415,16 @@ static void vmw_cmdbuf_ctx_process(struct vmw_cmdbuf_man *man,
  *
  * Calls vmw_cmdbuf_ctx_process() on all contexts. If any context has
  * command buffers left that are not submitted to hardware, Make sure
- * IRQ handling is turned on. Otherwise, make sure it's turned off. This
- * function may return -EAGAIN to indicate it should be rerun due to
- * possibly missed IRQs if IRQs has just been turned on.
+ * IRQ handling is turned on. Otherwise, make sure it's turned off.
  */
-static int vmw_cmdbuf_man_process(struct vmw_cmdbuf_man *man)
+static void vmw_cmdbuf_man_process(struct vmw_cmdbuf_man *man)
 {
-	int notempty = 0;
+	int notempty;
 	struct vmw_cmdbuf_context *ctx;
 	int i;
 
+retry:
+	notempty = 0;
 	for_each_cmdbuf_ctx(man, i, ctx)
 		vmw_cmdbuf_ctx_process(man, ctx, &notempty);
 
@@ -440,10 +440,8 @@ static int vmw_cmdbuf_man_process(struct vmw_cmdbuf_man *man)
 		man->irq_on = true;
 
 		/* Rerun in case we just missed an irq. */
-		return -EAGAIN;
+		goto retry;
 	}
-
-	return 0;
 }
 
 /**
@@ -468,8 +466,7 @@ static void vmw_cmdbuf_ctx_add(struct vmw_cmdbuf_man *man,
 	header->cb_context = cb_context;
 	list_add_tail(&header->list, &man->ctx[cb_context].submitted);
 
-	if (vmw_cmdbuf_man_process(man) == -EAGAIN)
-		vmw_cmdbuf_man_process(man);
+	vmw_cmdbuf_man_process(man);
 }
 
 /**
@@ -488,8 +485,7 @@ static void vmw_cmdbuf_man_tasklet(unsigned long data)
 	struct vmw_cmdbuf_man *man = (struct vmw_cmdbuf_man *) data;
 
 	spin_lock(&man->lock);
-	if (vmw_cmdbuf_man_process(man) == -EAGAIN)
-		(void) vmw_cmdbuf_man_process(man);
+	vmw_cmdbuf_man_process(man);
 	spin_unlock(&man->lock);
 }
 
@@ -507,6 +503,7 @@ static void vmw_cmdbuf_work_func(struct work_struct *work)
 	struct vmw_cmdbuf_man *man =
 		container_of(work, struct vmw_cmdbuf_man, work);
 	struct vmw_cmdbuf_header *entry, *next;
+	uint32_t dummy;
 	bool restart = false;
 
 	spin_lock_bh(&man->lock);
@@ -523,6 +520,8 @@ static void vmw_cmdbuf_work_func(struct work_struct *work)
 	if (restart && vmw_cmdbuf_startstop(man, true))
 		DRM_ERROR("Failed restarting command buffer context 0.\n");
 
+	/* Send a new fence in case one was removed */
+	vmw_fifo_send_fence(man->dev_priv, &dummy);
 }
 
 /**
@@ -682,7 +681,7 @@ static bool vmw_cmdbuf_try_alloc(struct vmw_cmdbuf_man *man,
 					 DRM_MM_SEARCH_DEFAULT,
 					 DRM_MM_CREATE_DEFAULT);
 	if (ret) {
-		(void) vmw_cmdbuf_man_process(man);
+		vmw_cmdbuf_man_process(man);
 		ret = drm_mm_insert_node_generic(&man->mm, info->node,
 						 info->page_size, 0, 0,
 						 DRM_MM_SEARCH_DEFAULT,
@@ -1168,7 +1167,14 @@ int vmw_cmdbuf_set_pool_size(struct vmw_cmdbuf_man *man,
 	drm_mm_init(&man->mm, 0, size >> PAGE_SHIFT);
 
 	man->has_pool = true;
-	man->default_size = default_size;
+
+	/*
+	 * For now, set the default size to VMW_CMDBUF_INLINE_SIZE to
+	 * prevent deadlocks from happening when vmw_cmdbuf_space_pool()
+	 * needs to wait for space and we block on further command
+	 * submissions to be able to free up space.
+	 */
+	man->default_size = VMW_CMDBUF_INLINE_SIZE;
 	DRM_INFO("Using command buffers with %s pool.\n",
 		 (man->using_mob) ? "MOB" : "DMA");
 

From 0ca81a2840f77855bbad1b9f172c545c4dc9e6a4 Mon Sep 17 00:00:00 2001
From: Doron Tsur <doront@mellanox.com>
Date: Sun, 11 Oct 2015 15:58:17 +0300
Subject: [PATCH 119/199] IB/cm: Fix rb-tree duplicate free and use-after-free

ib_send_cm_sidr_rep could sometimes erase the node from the sidr
(depending on errors in the process). Since ib_send_cm_sidr_rep is
called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
could be either erased from the rb_tree twice or not erased at all.
Fixing that by making sure it's erased only once before freeing
cm_id_priv.

Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/cm.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index ea4db9c1d44fb..4f918b929eca9 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -835,6 +835,11 @@ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
 	case IB_CM_SIDR_REQ_RCVD:
 		spin_unlock_irq(&cm_id_priv->lock);
 		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
+		spin_lock_irq(&cm.lock);
+		if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
+			rb_erase(&cm_id_priv->sidr_id_node,
+				 &cm.remote_sidr_table);
+		spin_unlock_irq(&cm.lock);
 		break;
 	case IB_CM_REQ_SENT:
 	case IB_CM_MRA_REQ_RCVD:
@@ -3172,7 +3177,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
 	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
 
 	spin_lock_irqsave(&cm.lock, flags);
-	rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
+	if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
+		rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
+		RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
+	}
 	spin_unlock_irqrestore(&cm.lock, flags);
 	return 0;
 

From 30730c7f5943b3beace1e29f7f1476e05de3da14 Mon Sep 17 00:00:00 2001
From: Adam Richter <adamrichter4@gmail.com>
Date: Fri, 16 Oct 2015 03:33:02 -0700
Subject: [PATCH 120/199] drm: fix mutex leak in drm_dp_get_mst_branch_device

In Linux 4.3-rc5, there is an error case in drm_dp_get_branch_device
that returns without releasing mgr->lock, resulting a spew of kernel
messages about a kernel work function possibly having leaked a mutex
and presumably more serious adverse consequences later.  This patch
changes the error to "goto out" to unlock the mutex before returning.

[airlied: grabbed from drm-next as it fixes something we've seen]

Signed-off-by: Adam J. Richter <adam_richter2004@yahoo.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 5bca390d9ae26..809959d56d782 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1194,17 +1194,18 @@ static struct drm_dp_mst_branch *drm_dp_get_mst_branch_device(struct drm_dp_mst_
 
 		list_for_each_entry(port, &mstb->ports, next) {
 			if (port->port_num == port_num) {
-				if (!port->mstb) {
+				mstb = port->mstb;
+				if (!mstb) {
 					DRM_ERROR("failed to lookup MSTB with lct %d, rad %02x\n", lct, rad[0]);
-					return NULL;
+					goto out;
 				}
 
-				mstb = port->mstb;
 				break;
 			}
 		}
 	}
 	kref_get(&mstb->kref);
+out:
 	mutex_unlock(&mgr->lock);
 	return mstb;
 }

From 2a6c521bb41ce862e43db46f52e7681d33e8d771 Mon Sep 17 00:00:00 2001
From: Ilia Mirkin <imirkin@alum.mit.edu>
Date: Tue, 20 Oct 2015 01:15:39 -0400
Subject: [PATCH 121/199] drm/nouveau/gem: return only valid domain when
 there's only one

On nv50+, we restrict the valid domains to just the one where the buffer
was originally created. However after the buffer is evicted to system
memory, we might move it back to a different domain that was not
originally valid. When sharing the buffer and retrieving its GEM_INFO
data, we still want the domain that will be valid for this buffer in a
pushbuf, not the one where it currently happens to be.

This resolves fdo#92504 and several others. These are due to suspend
evicting all buffers, making it more likely that they temporarily end up
in the wrong place.

Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92504
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_gem.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 2c9981512d27b..41be584147b93 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -227,11 +227,12 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem,
 	struct nouveau_bo *nvbo = nouveau_gem_object(gem);
 	struct nvkm_vma *vma;
 
-	if (nvbo->bo.mem.mem_type == TTM_PL_TT)
+	if (is_power_of_2(nvbo->valid_domains))
+		rep->domain = nvbo->valid_domains;
+	else if (nvbo->bo.mem.mem_type == TTM_PL_TT)
 		rep->domain = NOUVEAU_GEM_DOMAIN_GART;
 	else
 		rep->domain = NOUVEAU_GEM_DOMAIN_VRAM;
-
 	rep->offset = nvbo->bo.offset;
 	if (cli->vm) {
 		vma = nouveau_bo_vma_find(nvbo, cli->vm);

From 8832317f662c06f5c06e638f57bfe89a71c9b266 Mon Sep 17 00:00:00 2001
From: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Date: Fri, 16 Oct 2015 15:53:29 +0530
Subject: [PATCH 122/199] powerpc/rtas: Validate rtas.entry before calling
 enter_rtas()

Currently we do not validate rtas.entry before calling enter_rtas(). This
leads to a kernel oops when user space calls rtas system call on a powernv
platform (see below). This patch adds code to validate rtas.entry before
making enter_rtas() call.

  Oops: Exception in kernel mode, sig: 4 [#1]
  SMP NR_CPUS=1024 NUMA PowerNV
  task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000
  NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140
  REGS: c0000007e1a7b920 TRAP: 0e40   Not tainted  (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le)
  MSR: 1000000000081000 <HV,ME>  CR: 00000000  XER: 00000000
  CFAR: c000000000009c0c SOFTE: 0
  NIP [0000000000000000]           (null)
  LR [0000000000009c14] 0x9c14
  Call Trace:
  [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable)
  [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0
  [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98

Cc: stable@vger.kernel.org # v3.2+
Fixes: 55190f88789a ("powerpc: Add skeleton PowerNV platform")
Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[mpe: Reword change log, trim oops, and add stable + fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/rtas.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 84bf934cf7487..5a753fae8265a 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1043,6 +1043,9 @@ asmlinkage int ppc_rtas(struct rtas_args __user *uargs)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (!rtas.entry)
+		return -EINVAL;
+
 	if (copy_from_user(&args, uargs, 3 * sizeof(u32)) != 0)
 		return -EFAULT;
 

From 0f89abf56abbd0e1c6e3cef9813e6d9f05383c1e Mon Sep 17 00:00:00 2001
From: Christian Engelmayer <cengelma@gmx.at>
Date: Wed, 21 Oct 2015 00:50:06 +0200
Subject: [PATCH 123/199] btrfs: fix possible leak in btrfs_ioctl_balance()

Commit 8eb934591f8b ("btrfs: check unsupported filters in balance
arguments") adds a jump to exit label out_bargs in case the argument
check fails. At this point in addition to the bargs memory, the
memory for struct btrfs_balance_control has already been allocated.
Ownership of bctl is passed to btrfs_balance() in the good case,
thus the memory is not freed due to the introduced jump. Make sure
that the memory gets freed in any case as necessary. Detected by
Coverity CID 1328378.

Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
---
 fs/btrfs/ioctl.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3e3e6130637fa..8d20f3b1cab0a 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4641,7 +4641,7 @@ static long btrfs_ioctl_balance(struct file *file, void __user *arg)
 
 	if (bctl->flags & ~(BTRFS_BALANCE_ARGS_MASK | BTRFS_BALANCE_TYPE_MASK)) {
 		ret = -EINVAL;
-		goto out_bargs;
+		goto out_bctl;
 	}
 
 do_balance:
@@ -4655,12 +4655,15 @@ static long btrfs_ioctl_balance(struct file *file, void __user *arg)
 	need_unlock = false;
 
 	ret = btrfs_balance(bctl, bargs);
+	bctl = NULL;
 
 	if (arg) {
 		if (copy_to_user(arg, bargs, sizeof(*bargs)))
 			ret = -EFAULT;
 	}
 
+out_bctl:
+	kfree(bctl);
 out_bargs:
 	kfree(bargs);
 out_unlock:

From 4eb0f7abcefad2d4c127aa7502d3122635eddab0 Mon Sep 17 00:00:00 2001
From: Jiada Wang <jiada_wang@mentor.com>
Date: Tue, 20 Oct 2015 11:47:11 +0900
Subject: [PATCH 124/199] ASoC: wm8962: mark cache_dirty flag after software
 reset in pm_resume

By doing software reset of wm8962 in pm_resume, all registers which
have already been set will be reset to default value without regmap
interface be involved, thus driver need to mark cache_dirty flag,
to let regcache can be updated by regcache_sync().

Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/codecs/wm8962.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/codecs/wm8962.c b/sound/soc/codecs/wm8962.c
index b4eb975da981a..45f06d5f865fa 100644
--- a/sound/soc/codecs/wm8962.c
+++ b/sound/soc/codecs/wm8962.c
@@ -3804,6 +3804,8 @@ static int wm8962_runtime_resume(struct device *dev)
 
 	wm8962_reset(wm8962);
 
+	regcache_mark_dirty(wm8962->regmap);
+
 	/* SYSCLK defaults to on; make sure it is off so we can safely
 	 * write to registers if the device is declocked.
 	 */

From 0729a04977d497cf66234fd7f900ddcec3ef1c52 Mon Sep 17 00:00:00 2001
From: Hezi Shahmoon <hezi@marvell.com>
Date: Tue, 20 Oct 2015 16:32:24 +0200
Subject: [PATCH 125/199] i2c: mv64xxx: really allow I2C offloading

Commit 00d8689b85a7 ("i2c: mv64xxx: rework offload support to fix
several problems") completely reworked the offload support, but left a
debugging-related "return false" at the beginning of the
mv64xxx_i2c_can_offload() function. This has the unfortunate consequence
that offloading is in fact never used, which wasn't really the
intention.

This commit fixes that problem by removing the bogus "return false".

Fixes: 00d8689b85a7 ("i2c: mv64xxx: rework offload support to fix several problems")
Signed-off-by: Hezi Shahmoon <hezi@marvell.com>
[Thomas: reworked commit log and title.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: <stable@vger.kernel.org>
---
 drivers/i2c/busses/i2c-mv64xxx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
index 30059c1df2a3b..5801227b97ab0 100644
--- a/drivers/i2c/busses/i2c-mv64xxx.c
+++ b/drivers/i2c/busses/i2c-mv64xxx.c
@@ -669,8 +669,6 @@ mv64xxx_i2c_can_offload(struct mv64xxx_i2c_data *drv_data)
 	struct i2c_msg *msgs = drv_data->msgs;
 	int num = drv_data->num_msgs;
 
-	return false;
-
 	if (!drv_data->offload_enabled)
 		return false;
 

From ebdd4b7e6a0dd86736eeb6b9e60b361ef64ccc30 Mon Sep 17 00:00:00 2001
From: Javier Martinez Canillas <javier@osg.samsung.com>
Date: Sun, 13 Sep 2015 19:39:22 -0300
Subject: [PATCH 126/199] [media] horus3a: Fix horus3a_attach() function
 parameters

If CONFIG_DVB_HORUS3A is disabled a stub static inline function is
defined that just prints a warning about the driver being disabled
but the function parameters were wrong which caused a build error.

Fixes: a5d32b358254f ("[media] horus3a: Sony Horus3A DVB-S/S2 tuner driver")

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
---
 drivers/media/dvb-frontends/horus3a.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/dvb-frontends/horus3a.h b/drivers/media/dvb-frontends/horus3a.h
index b055319d532ed..c1e2d1834b782 100644
--- a/drivers/media/dvb-frontends/horus3a.h
+++ b/drivers/media/dvb-frontends/horus3a.h
@@ -46,8 +46,8 @@ extern struct dvb_frontend *horus3a_attach(struct dvb_frontend *fe,
 					const struct horus3a_config *config,
 					struct i2c_adapter *i2c);
 #else
-static inline struct dvb_frontend *horus3a_attach(
-					const struct cxd2820r_config *config,
+static inline struct dvb_frontend *horus3a_attach(struct dvb_frontend *fe,
+					const struct horus3a_config *config,
 					struct i2c_adapter *i2c)
 {
 	printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);

From a9c4e5cfebc44e6caa6b9299af5603f5c2da0c33 Mon Sep 17 00:00:00 2001
From: Javier Martinez Canillas <javier@osg.samsung.com>
Date: Sun, 13 Sep 2015 19:45:21 -0300
Subject: [PATCH 127/199] [media] lnbh25: Fix lnbh25_attach() function return
 type

If CONFIG_DVB_LNBH25 is disabled, a stub static inline function is
defined that just prints a warning about the driver being disabled
but the function return type was wrong which caused a build error.

Fixes: e025273b86fb ("[media] lnbh25: LNBH25 SEC controller driver")

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
---
 drivers/media/dvb-frontends/lnbh25.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/lnbh25.h b/drivers/media/dvb-frontends/lnbh25.h
index 69f30e21f6b31..1f329ef05acce 100644
--- a/drivers/media/dvb-frontends/lnbh25.h
+++ b/drivers/media/dvb-frontends/lnbh25.h
@@ -43,7 +43,7 @@ struct dvb_frontend *lnbh25_attach(
 	struct lnbh25_config *cfg,
 	struct i2c_adapter *i2c);
 #else
-static inline dvb_frontend *lnbh25_attach(
+static inline struct dvb_frontend *lnbh25_attach(
 	struct dvb_frontend *fe,
 	struct lnbh25_config *cfg,
 	struct i2c_adapter *i2c)

From bf447221a8791d0f5dd28b19336e31e48f05f04a Mon Sep 17 00:00:00 2001
From: Colin Ian King <colin.king@canonical.com>
Date: Tue, 15 Sep 2015 08:42:27 -0300
Subject: [PATCH 128/199] [media] c8sectpfe: fix ininitialized error return on
 firmware load failure

static analysis with cppcheck detected the following error:

[drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c:1210]:
  (error) Uninitialized variable: ret

ret is never initialised, so garbage is being returned. Instead
return the error return from the call of request_firmware_nowait

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
index 486aef50d99b2..16aa494f22be3 100644
--- a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
+++ b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
@@ -1192,7 +1192,6 @@ static void load_c8sectpfe_fw_cb(const struct firmware *fw, void *context)
 
 static int load_c8sectpfe_fw_step1(struct c8sectpfei *fei)
 {
-	int ret;
 	int err;
 
 	dev_info(fei->dev, "Loading firmware: %s\n", FIRMWARE_MEMDMA);
@@ -1207,7 +1206,7 @@ static int load_c8sectpfe_fw_step1(struct c8sectpfei *fei)
 	if (err) {
 		dev_err(fei->dev, "request_firmware_nowait err: %d.\n", err);
 		complete_all(&fei->fw_ack);
-		return ret;
+		return err;
 	}
 
 	return 0;

From 51a3ac5f4dc45120c78fad51096d989914801457 Mon Sep 17 00:00:00 2001
From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Date: Thu, 17 Sep 2015 07:12:54 -0300
Subject: [PATCH 129/199] [media] c8sectpfe: fix return of garbage

The variable err was never initialized, that means we had been checking
a garbage value in the for loop. Moreover if the segment is not outside
the firmware file then also we have been returning the garbage.
Initialize it to 0 so that on success we return the value and no need to
check in the for loop also as it is initially 0 and whenever that value
changes we have done a break from the loop.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
index 16aa494f22be3..f922f2e827bcb 100644
--- a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
+++ b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
@@ -1097,7 +1097,7 @@ static int load_slim_core_fw(const struct firmware *fw, void *context)
 	Elf32_Ehdr *ehdr;
 	Elf32_Phdr *phdr;
 	u8 __iomem *dst;
-	int err, i;
+	int err = 0, i;
 
 	if (!fw || !context)
 		return -EINVAL;
@@ -1106,7 +1106,7 @@ static int load_slim_core_fw(const struct firmware *fw, void *context)
 	phdr = (Elf32_Phdr *)(fw->data + ehdr->e_phoff);
 
 	/* go through the available ELF segments */
-	for (i = 0; i < ehdr->e_phnum && !err; i++, phdr++) {
+	for (i = 0; i < ehdr->e_phnum; i++, phdr++) {
 
 		/* Only consider LOAD segments */
 		if (phdr->p_type != PT_LOAD)

From 54bec3970cb5351d08866af1ea8b0787edd7ede3 Mon Sep 17 00:00:00 2001
From: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 21 Sep 2015 12:47:11 -0300
Subject: [PATCH 130/199] [media] ir-hix5hd2: drop the use of IRQF_NO_SUSPEND

This driver doesn't claim the IR transmitter to be wakeup source. It
even disables the clock and the IR during suspend-resume cycle.

This patch removes yet another misuse of IRQF_NO_SUSPEND.

Cc: Patrice Chotard <patrice.chotard@st.com>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Guoxiong Yan <yanguoxiong@huawei.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/rc/ir-hix5hd2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/rc/ir-hix5hd2.c b/drivers/media/rc/ir-hix5hd2.c
index 1c087cb76815a..d0549fba711c4 100644
--- a/drivers/media/rc/ir-hix5hd2.c
+++ b/drivers/media/rc/ir-hix5hd2.c
@@ -257,7 +257,7 @@ static int hix5hd2_ir_probe(struct platform_device *pdev)
 		goto clkerr;
 
 	if (devm_request_irq(dev, priv->irq, hix5hd2_ir_rx_interrupt,
-			     IRQF_NO_SUSPEND, pdev->name, priv) < 0) {
+			     0, pdev->name, priv) < 0) {
 		dev_err(dev, "IRQ %d register failed\n", priv->irq);
 		ret = -EINVAL;
 		goto regerr;

From a828d72df216c36e9c40b6c24dc4b17b6f7b5a76 Mon Sep 17 00:00:00 2001
From: Laura Abbott <labbott@fedoraproject.org>
Date: Tue, 29 Sep 2015 21:10:10 -0300
Subject: [PATCH 131/199] [media] si2157: Bounds check firmware

When reading the firmware and sending commands, the length
must be bounds checked to avoid overrunning the size of the command
buffer and smashing the stack if the firmware is not in the
expected format. Add the proper check.

Cc: stable@kernel.org
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/tuners/si2157.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/tuners/si2157.c b/drivers/media/tuners/si2157.c
index 507382160e5e7..ce157edd45fa1 100644
--- a/drivers/media/tuners/si2157.c
+++ b/drivers/media/tuners/si2157.c
@@ -166,6 +166,10 @@ static int si2157_init(struct dvb_frontend *fe)
 
 	for (remaining = fw->size; remaining > 0; remaining -= 17) {
 		len = fw->data[fw->size - remaining];
+		if (len > SI2157_ARGLEN) {
+			dev_err(&client->dev, "Bad firmware length\n");
+			goto err_release_firmware;
+		}
 		memcpy(cmd.args, &fw->data[(fw->size - remaining) + 1], len);
 		cmd.wlen = len;
 		cmd.rlen = 1;

From 47810b4341ac9d2f558894bc5995e6fa2a1298f9 Mon Sep 17 00:00:00 2001
From: Laura Abbott <labbott@fedoraproject.org>
Date: Tue, 29 Sep 2015 21:10:09 -0300
Subject: [PATCH 132/199] [media] si2168: Bounds check firmware

When reading the firmware and sending commands, the length must
be bounds checked to avoid overrunning the size of the command
buffer and smashing the stack if the firmware is not in the expected
format:

si2168 11-0064: found a 'Silicon Labs Si2168-B40'
si2168 11-0064: downloading firmware from file 'dvb-demod-si2168-b40-01.fw'
si2168 11-0064: firmware download failed -95
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa085708f

Add the proper check.

Cc: stable@kernel.org
Reported-by: Stuart Auchterlonie <sauchter@redhat.com>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/dvb-frontends/si2168.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/dvb-frontends/si2168.c b/drivers/media/dvb-frontends/si2168.c
index 81788c5a44d83..821a8f481507a 100644
--- a/drivers/media/dvb-frontends/si2168.c
+++ b/drivers/media/dvb-frontends/si2168.c
@@ -502,6 +502,10 @@ static int si2168_init(struct dvb_frontend *fe)
 		/* firmware is in the new format */
 		for (remaining = fw->size; remaining > 0; remaining -= 17) {
 			len = fw->data[fw->size - remaining];
+			if (len > SI2168_ARGLEN) {
+				ret = -EINVAL;
+				break;
+			}
 			memcpy(cmd.args, &fw->data[(fw->size - remaining) + 1], len);
 			cmd.wlen = len;
 			cmd.rlen = 1;

From 9d2b064c0ae42ad93b2a0c7da05daef312c96bcc Mon Sep 17 00:00:00 2001
From: Abylay Ospan <aospan@netup.ru>
Date: Fri, 25 Sep 2015 04:56:21 -0300
Subject: [PATCH 133/199] [media] netup_unidvb: fix potential crash when spi is
 NULL

Signed-off-by: Abylay Ospan <aospan@netup.ru>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/pci/netup_unidvb/netup_unidvb_spi.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/media/pci/netup_unidvb/netup_unidvb_spi.c b/drivers/media/pci/netup_unidvb/netup_unidvb_spi.c
index f55b3276f28de..56773f3893d40 100644
--- a/drivers/media/pci/netup_unidvb/netup_unidvb_spi.c
+++ b/drivers/media/pci/netup_unidvb/netup_unidvb_spi.c
@@ -80,11 +80,9 @@ irqreturn_t netup_spi_interrupt(struct netup_spi *spi)
 	u16 reg;
 	unsigned long flags;
 
-	if (!spi) {
-		dev_dbg(&spi->master->dev,
-			"%s(): SPI not initialized\n", __func__);
+	if (!spi)
 		return IRQ_NONE;
-	}
+
 	spin_lock_irqsave(&spi->lock, flags);
 	reg = readw(&spi->regs->control_stat);
 	if (!(reg & NETUP_SPI_CTRL_IRQ)) {
@@ -234,11 +232,9 @@ void netup_spi_release(struct netup_unidvb_dev *ndev)
 	unsigned long flags;
 	struct netup_spi *spi = ndev->spi;
 
-	if (!spi) {
-		dev_dbg(&spi->master->dev,
-			"%s(): SPI not initialized\n", __func__);
+	if (!spi)
 		return;
-	}
+
 	spin_lock_irqsave(&spi->lock, flags);
 	reg = readw(&spi->regs->control_stat);
 	writew(reg | NETUP_SPI_CTRL_IRQ, &spi->regs->control_stat);

From 17f38822038ba5d4dba79b72fd111bbf64173063 Mon Sep 17 00:00:00 2001
From: Jacek Anaszewski <j.anaszewski@samsung.com>
Date: Fri, 2 Oct 2015 06:19:15 -0300
Subject: [PATCH 134/199] [media] v4l2-flash-led-class: Add missing VIDEO_V4L2
 Kconfig dependency

Fixes the following randconfig problem:

drivers/built-in.o: In function `v4l2_flash_release':
(.text+0x12204f): undefined reference to `v4l2_async_unregister_subdev'
drivers/built-in.o: In function `v4l2_flash_release':
(.text+0x122057): undefined reference to `v4l2_ctrl_handler_free'
drivers/built-in.o: In function `v4l2_flash_close':
v4l2-flash-led-class.c:(.text+0x12208f): undefined reference to `v4l2_fh_is_singular'
v4l2-flash-led-class.c:(.text+0x1220c8): undefined reference to `__v4l2_ctrl_s_ctrl'
drivers/built-in.o: In function `v4l2_flash_open':
v4l2-flash-led-class.c:(.text+0x12227f): undefined reference to `v4l2_fh_is_singular'
drivers/built-in.o: In function `v4l2_flash_init_controls':
v4l2-flash-led-class.c:(.text+0x12274e): undefined reference to `v4l2_ctrl_handler_init_class'
v4l2-flash-led-class.c:(.text+0x122797): undefined reference to `v4l2_ctrl_new_std_menu'
v4l2-flash-led-class.c:(.text+0x1227e0): undefined reference to `v4l2_ctrl_new_std'
v4l2-flash-led-class.c:(.text+0x122826): undefined reference to `v4l2_ctrl_handler_setup'
v4l2-flash-led-class.c:(.text+0x122839): undefined reference to `v4l2_ctrl_handler_free'
drivers/built-in.o: In function `v4l2_flash_init':
(.text+0x1228e2): undefined reference to `v4l2_subdev_init'
drivers/built-in.o: In function `v4l2_flash_init':
(.text+0x12293b): undefined reference to `v4l2_async_register_subdev'
drivers/built-in.o: In function `v4l2_flash_init':
(.text+0x122949): undefined reference to `v4l2_ctrl_handler_free'
drivers/built-in.o:(.rodata+0x20ef8): undefined reference to `v4l2_subdev_queryctrl'
drivers/built-in.o:(.rodata+0x20f10): undefined reference to `v4l2_subdev_querymenu'

Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Sakari Ailus <sakari.ailus@iki.fi>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
index 82876a67f1449..9beece00869bf 100644
--- a/drivers/media/v4l2-core/Kconfig
+++ b/drivers/media/v4l2-core/Kconfig
@@ -47,7 +47,7 @@ config V4L2_MEM2MEM_DEV
 # Used by LED subsystem flash drivers
 config V4L2_FLASH_LED_CLASS
 	tristate "V4L2 flash API for LED flash class devices"
-	depends on VIDEO_V4L2_SUBDEV_API
+	depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API
 	depends on LEDS_CLASS_FLASH
 	---help---
 	  Say Y here to enable V4L2 flash API support for LED flash

From d18ca5b7ceca0e9674cb4bb2ed476b0fcbb23ba2 Mon Sep 17 00:00:00 2001
From: Antti Palosaari <crope@iki.fi>
Date: Tue, 6 Oct 2015 00:22:23 -0300
Subject: [PATCH 135/199] [media] rtl28xxu: fix control message flaws

Add lock to prevent concurrent access for control message as control
message function uses shared buffer. Without the lock there may be
remote control polling which messes the buffer causing IO errors.
Increase buffer size and add check for maximum supported message
length.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=103391
Fixes: c56222a6b25c ("[media] rtl28xxu: move usb buffers to state")

Cc: <stable@vger.kernel.org> # 4.0+
Signed-off-by: Antti Palosaari <crope@iki.fi>
---
 drivers/media/usb/dvb-usb-v2/rtl28xxu.c | 15 +++++++++++++--
 drivers/media/usb/dvb-usb-v2/rtl28xxu.h |  2 +-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
index c3cac4c12fb3c..197a4f2e54d2a 100644
--- a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
+++ b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
@@ -34,6 +34,14 @@ static int rtl28xxu_ctrl_msg(struct dvb_usb_device *d, struct rtl28xxu_req *req)
 	unsigned int pipe;
 	u8 requesttype;
 
+	mutex_lock(&d->usb_mutex);
+
+	if (req->size > sizeof(dev->buf)) {
+		dev_err(&d->intf->dev, "too large message %u\n", req->size);
+		ret = -EINVAL;
+		goto err_mutex_unlock;
+	}
+
 	if (req->index & CMD_WR_FLAG) {
 		/* write */
 		memcpy(dev->buf, req->data, req->size);
@@ -50,14 +58,17 @@ static int rtl28xxu_ctrl_msg(struct dvb_usb_device *d, struct rtl28xxu_req *req)
 	dvb_usb_dbg_usb_control_msg(d->udev, 0, requesttype, req->value,
 			req->index, dev->buf, req->size);
 	if (ret < 0)
-		goto err;
+		goto err_mutex_unlock;
 
 	/* read request, copy returned data to return buf */
 	if (requesttype == (USB_TYPE_VENDOR | USB_DIR_IN))
 		memcpy(req->data, dev->buf, req->size);
 
+	mutex_unlock(&d->usb_mutex);
+
 	return 0;
-err:
+err_mutex_unlock:
+	mutex_unlock(&d->usb_mutex);
 	dev_dbg(&d->intf->dev, "failed=%d\n", ret);
 	return ret;
 }
diff --git a/drivers/media/usb/dvb-usb-v2/rtl28xxu.h b/drivers/media/usb/dvb-usb-v2/rtl28xxu.h
index 9f6115a2ee016..138062960a736 100644
--- a/drivers/media/usb/dvb-usb-v2/rtl28xxu.h
+++ b/drivers/media/usb/dvb-usb-v2/rtl28xxu.h
@@ -71,7 +71,7 @@
 
 
 struct rtl28xxu_dev {
-	u8 buf[28];
+	u8 buf[128];
 	u8 chip_id;
 	u8 tuner;
 	char *tuner_name;

From 56ea37da3b93dfe46cb5c3ee0ee4cc44229ece47 Mon Sep 17 00:00:00 2001
From: Antti Palosaari <crope@iki.fi>
Date: Sat, 3 Oct 2015 18:35:14 -0300
Subject: [PATCH 136/199] [media] m88ds3103: use own reg update_bits()
 implementation

Device stopped to tuning some channels after regmap conversion.
Reason is that regmap_update_bits() works a bit differently for
partially volatile registers than old homemade routine. Return
back to old routine in order to fix issue.

Fixes: 478932b16052f5ded74685d096ae920cd17d6424

Cc: <stable@kernel.org> # 4.2+
Reported-by: Mark Clarkstone <hello@markclarkstone.co.uk>
Tested-by: Mark Clarkstone <hello@markclarkstone.co.uk>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 drivers/media/dvb-frontends/m88ds3103.c | 73 ++++++++++++++++---------
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/drivers/media/dvb-frontends/m88ds3103.c b/drivers/media/dvb-frontends/m88ds3103.c
index ff31e7a01ca9a..feeeb70d841ed 100644
--- a/drivers/media/dvb-frontends/m88ds3103.c
+++ b/drivers/media/dvb-frontends/m88ds3103.c
@@ -18,6 +18,27 @@
 
 static struct dvb_frontend_ops m88ds3103_ops;
 
+/* write single register with mask */
+static int m88ds3103_update_bits(struct m88ds3103_dev *dev,
+				u8 reg, u8 mask, u8 val)
+{
+	int ret;
+	u8 tmp;
+
+	/* no need for read if whole reg is written */
+	if (mask != 0xff) {
+		ret = regmap_bulk_read(dev->regmap, reg, &tmp, 1);
+		if (ret)
+			return ret;
+
+		val &= mask;
+		tmp &= ~mask;
+		val |= tmp;
+	}
+
+	return regmap_bulk_write(dev->regmap, reg, &val, 1);
+}
+
 /* write reg val table using reg addr auto increment */
 static int m88ds3103_wr_reg_val_tab(struct m88ds3103_dev *dev,
 		const struct m88ds3103_reg_val *tab, int tab_len)
@@ -394,10 +415,10 @@ static int m88ds3103_set_frontend(struct dvb_frontend *fe)
 			u8tmp2 = 0x00; /* 0b00 */
 			break;
 		}
-		ret = regmap_update_bits(dev->regmap, 0x22, 0xc0, u8tmp1 << 6);
+		ret = m88ds3103_update_bits(dev, 0x22, 0xc0, u8tmp1 << 6);
 		if (ret)
 			goto err;
-		ret = regmap_update_bits(dev->regmap, 0x24, 0xc0, u8tmp2 << 6);
+		ret = m88ds3103_update_bits(dev, 0x24, 0xc0, u8tmp2 << 6);
 		if (ret)
 			goto err;
 	}
@@ -455,13 +476,13 @@ static int m88ds3103_set_frontend(struct dvb_frontend *fe)
 			if (ret)
 				goto err;
 		}
-		ret = regmap_update_bits(dev->regmap, 0x9d, 0x08, 0x08);
+		ret = m88ds3103_update_bits(dev, 0x9d, 0x08, 0x08);
 		if (ret)
 			goto err;
 		ret = regmap_write(dev->regmap, 0xf1, 0x01);
 		if (ret)
 			goto err;
-		ret = regmap_update_bits(dev->regmap, 0x30, 0x80, 0x80);
+		ret = m88ds3103_update_bits(dev, 0x30, 0x80, 0x80);
 		if (ret)
 			goto err;
 	}
@@ -498,7 +519,7 @@ static int m88ds3103_set_frontend(struct dvb_frontend *fe)
 	switch (dev->cfg->ts_mode) {
 	case M88DS3103_TS_SERIAL:
 	case M88DS3103_TS_SERIAL_D7:
-		ret = regmap_update_bits(dev->regmap, 0x29, 0x20, u8tmp1);
+		ret = m88ds3103_update_bits(dev, 0x29, 0x20, u8tmp1);
 		if (ret)
 			goto err;
 		u8tmp1 = 0;
@@ -567,11 +588,11 @@ static int m88ds3103_set_frontend(struct dvb_frontend *fe)
 	if (ret)
 		goto err;
 
-	ret = regmap_update_bits(dev->regmap, 0x4d, 0x02, dev->cfg->spec_inv << 1);
+	ret = m88ds3103_update_bits(dev, 0x4d, 0x02, dev->cfg->spec_inv << 1);
 	if (ret)
 		goto err;
 
-	ret = regmap_update_bits(dev->regmap, 0x30, 0x10, dev->cfg->agc_inv << 4);
+	ret = m88ds3103_update_bits(dev, 0x30, 0x10, dev->cfg->agc_inv << 4);
 	if (ret)
 		goto err;
 
@@ -625,13 +646,13 @@ static int m88ds3103_init(struct dvb_frontend *fe)
 	dev->warm = false;
 
 	/* wake up device from sleep */
-	ret = regmap_update_bits(dev->regmap, 0x08, 0x01, 0x01);
+	ret = m88ds3103_update_bits(dev, 0x08, 0x01, 0x01);
 	if (ret)
 		goto err;
-	ret = regmap_update_bits(dev->regmap, 0x04, 0x01, 0x00);
+	ret = m88ds3103_update_bits(dev, 0x04, 0x01, 0x00);
 	if (ret)
 		goto err;
-	ret = regmap_update_bits(dev->regmap, 0x23, 0x10, 0x00);
+	ret = m88ds3103_update_bits(dev, 0x23, 0x10, 0x00);
 	if (ret)
 		goto err;
 
@@ -749,18 +770,18 @@ static int m88ds3103_sleep(struct dvb_frontend *fe)
 		utmp = 0x29;
 	else
 		utmp = 0x27;
-	ret = regmap_update_bits(dev->regmap, utmp, 0x01, 0x00);
+	ret = m88ds3103_update_bits(dev, utmp, 0x01, 0x00);
 	if (ret)
 		goto err;
 
 	/* sleep */
-	ret = regmap_update_bits(dev->regmap, 0x08, 0x01, 0x00);
+	ret = m88ds3103_update_bits(dev, 0x08, 0x01, 0x00);
 	if (ret)
 		goto err;
-	ret = regmap_update_bits(dev->regmap, 0x04, 0x01, 0x01);
+	ret = m88ds3103_update_bits(dev, 0x04, 0x01, 0x01);
 	if (ret)
 		goto err;
-	ret = regmap_update_bits(dev->regmap, 0x23, 0x10, 0x10);
+	ret = m88ds3103_update_bits(dev, 0x23, 0x10, 0x10);
 	if (ret)
 		goto err;
 
@@ -992,12 +1013,12 @@ static int m88ds3103_set_tone(struct dvb_frontend *fe,
 	}
 
 	utmp = tone << 7 | dev->cfg->envelope_mode << 5;
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0xe0, utmp);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0xe0, utmp);
 	if (ret)
 		goto err;
 
 	utmp = 1 << 2;
-	ret = regmap_update_bits(dev->regmap, 0xa1, reg_a1_mask, utmp);
+	ret = m88ds3103_update_bits(dev, 0xa1, reg_a1_mask, utmp);
 	if (ret)
 		goto err;
 
@@ -1047,7 +1068,7 @@ static int m88ds3103_set_voltage(struct dvb_frontend *fe,
 	voltage_dis ^= dev->cfg->lnb_en_pol;
 
 	utmp = voltage_dis << 1 | voltage_sel << 0;
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0x03, utmp);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0x03, utmp);
 	if (ret)
 		goto err;
 
@@ -1080,7 +1101,7 @@ static int m88ds3103_diseqc_send_master_cmd(struct dvb_frontend *fe,
 	}
 
 	utmp = dev->cfg->envelope_mode << 5;
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0xe0, utmp);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0xe0, utmp);
 	if (ret)
 		goto err;
 
@@ -1115,12 +1136,12 @@ static int m88ds3103_diseqc_send_master_cmd(struct dvb_frontend *fe,
 	} else {
 		dev_dbg(&client->dev, "diseqc tx timeout\n");
 
-		ret = regmap_update_bits(dev->regmap, 0xa1, 0xc0, 0x40);
+		ret = m88ds3103_update_bits(dev, 0xa1, 0xc0, 0x40);
 		if (ret)
 			goto err;
 	}
 
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0xc0, 0x80);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0xc0, 0x80);
 	if (ret)
 		goto err;
 
@@ -1152,7 +1173,7 @@ static int m88ds3103_diseqc_send_burst(struct dvb_frontend *fe,
 	}
 
 	utmp = dev->cfg->envelope_mode << 5;
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0xe0, utmp);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0xe0, utmp);
 	if (ret)
 		goto err;
 
@@ -1194,12 +1215,12 @@ static int m88ds3103_diseqc_send_burst(struct dvb_frontend *fe,
 	} else {
 		dev_dbg(&client->dev, "diseqc tx timeout\n");
 
-		ret = regmap_update_bits(dev->regmap, 0xa1, 0xc0, 0x40);
+		ret = m88ds3103_update_bits(dev, 0xa1, 0xc0, 0x40);
 		if (ret)
 			goto err;
 	}
 
-	ret = regmap_update_bits(dev->regmap, 0xa2, 0xc0, 0x80);
+	ret = m88ds3103_update_bits(dev, 0xa2, 0xc0, 0x80);
 	if (ret)
 		goto err;
 
@@ -1435,13 +1456,13 @@ static int m88ds3103_probe(struct i2c_client *client,
 		goto err_kfree;
 
 	/* sleep */
-	ret = regmap_update_bits(dev->regmap, 0x08, 0x01, 0x00);
+	ret = m88ds3103_update_bits(dev, 0x08, 0x01, 0x00);
 	if (ret)
 		goto err_kfree;
-	ret = regmap_update_bits(dev->regmap, 0x04, 0x01, 0x01);
+	ret = m88ds3103_update_bits(dev, 0x04, 0x01, 0x01);
 	if (ret)
 		goto err_kfree;
-	ret = regmap_update_bits(dev->regmap, 0x23, 0x10, 0x10);
+	ret = m88ds3103_update_bits(dev, 0x23, 0x10, 0x10);
 	if (ret)
 		goto err_kfree;
 

From 5211613978cb7353a3237e4372958c0e7514683f Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Thu, 22 Oct 2015 13:32:08 -0700
Subject: [PATCH 137/199] kmod: don't run async usermode helper as a child of
 kworker thread

call_usermodehelper_exec_sync() does fork() + wait() with "unignored"
SIGCHLD.  What we have missed is that this worker thread can have other
children previously forked by call_usermodehelper_exec_work() without
UMH_WAIT_PROC.  If such a child exits in between it becomes a zombie
because auto-reaping only works if SIGCHLD is ignored, and nobody can
reap it (unless/until this worker thread exits too).

Change the !UMH_WAIT_PROC case to use CLONE_PARENT.

Note: this is only first step.  All PF_KTHREAD tasks, even created by
kernel_thread() should have ->parent == kthreadd by default.

Fixes: bb304a5c6fc63d8506c ("kmod: handle UMH_WAIT_PROC from system unbound workqueue")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/kmod.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index da98d0593de24..0277d1216f80a 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -327,9 +327,13 @@ static void call_usermodehelper_exec_work(struct work_struct *work)
 		call_usermodehelper_exec_sync(sub_info);
 	} else {
 		pid_t pid;
-
+		/*
+		 * Use CLONE_PARENT to reparent it to kthreadd; we do not
+		 * want to pollute current->children, and we need a parent
+		 * that always ignores SIGCHLD to ensure auto-reaping.
+		 */
 		pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
-				    SIGCHLD);
+				    CLONE_PARENT | SIGCHLD);
 		if (pid < 0) {
 			sub_info->retval = pid;
 			umh_complete(sub_info);

From 67a2e213e7e937c41c52ab5bc46bf3f4de469f6e Mon Sep 17 00:00:00 2001
From: Rohit Vaswani <rvaswani@codeaurora.org>
Date: Thu, 22 Oct 2015 13:32:11 -0700
Subject: [PATCH 138/199] mm: cma: fix incorrect type conversion for size
 during dma allocation

This was found during userspace fuzzing test when a large size dma cma
allocation is made by driver(like ion) through userspace.

  show_stack+0x10/0x1c
  dump_stack+0x74/0xc8
  kasan_report_error+0x2b0/0x408
  kasan_report+0x34/0x40
  __asan_storeN+0x15c/0x168
  memset+0x20/0x44
  __dma_alloc_coherent+0x114/0x18c

Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 drivers/base/dma-contiguous.c  | 2 +-
 include/linux/cma.h            | 2 +-
 include/linux/dma-contiguous.h | 4 ++--
 mm/cma.c                       | 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index 950fff9ce4539..a12ff9863d7e1 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -187,7 +187,7 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  * global one. Requires architecture specific dev_get_cma_area() helper
  * function.
  */
-struct page *dma_alloc_from_contiguous(struct device *dev, int count,
+struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 				       unsigned int align)
 {
 	if (align > CONFIG_CMA_ALIGNMENT)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index f7ef093ec49a2..29f9e774ab76e 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -26,6 +26,6 @@ extern int __init cma_declare_contiguous(phys_addr_t base,
 extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					struct cma **res_cma);
-extern struct page *cma_alloc(struct cma *cma, unsigned int count, unsigned int align);
+extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 #endif
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index 569bbd039896f..fec734df15247 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -111,7 +111,7 @@ static inline int dma_declare_contiguous(struct device *dev, phys_addr_t size,
 	return ret;
 }
 
-struct page *dma_alloc_from_contiguous(struct device *dev, int count,
+struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 				       unsigned int order);
 bool dma_release_from_contiguous(struct device *dev, struct page *pages,
 				 int count);
@@ -144,7 +144,7 @@ int dma_declare_contiguous(struct device *dev, phys_addr_t size,
 }
 
 static inline
-struct page *dma_alloc_from_contiguous(struct device *dev, int count,
+struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 				       unsigned int order)
 {
 	return NULL;
diff --git a/mm/cma.c b/mm/cma.c
index e7d1db5330254..4eb56badf37e6 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -361,7 +361,7 @@ int __init cma_declare_contiguous(phys_addr_t base,
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
-struct page *cma_alloc(struct cma *cma, unsigned int count, unsigned int align)
+struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align)
 {
 	unsigned long mask, offset, pfn, start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
@@ -371,7 +371,7 @@ struct page *cma_alloc(struct cma *cma, unsigned int count, unsigned int align)
 	if (!cma || !cma->count)
 		return NULL;
 
-	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
+	pr_debug("%s(cma %p, count %zu, align %d)\n", __func__, (void *)cma,
 		 count, align);
 
 	if (!count)

From 41192a2d6a7f4cd6af9fc2f8edbbf24b2694f2f6 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date: Thu, 22 Oct 2015 13:32:13 -0700
Subject: [PATCH 139/199] MAINTAINERS: add Sergey as zsmalloc reviewer

Nominate myself as a zsmalloc reviewer.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb7d2e4af2003..d1116d9ea6ae1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11674,6 +11674,7 @@ F:	drivers/tty/serial/zs.*
 ZSMALLOC COMPRESSED SLAB MEMORY ALLOCATOR
 M:	Minchan Kim <minchan@kernel.org>
 M:	Nitin Gupta <ngupta@vflare.org>
+R:	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 F:	mm/zsmalloc.c

From b8fa0efa01109e294e9be610465c324f771cb5ba Mon Sep 17 00:00:00 2001
From: Javier Martinez Canillas <javier@osg.samsung.com>
Date: Thu, 22 Oct 2015 13:32:16 -0700
Subject: [PATCH 140/199] mailmap: update Javier Martinez Canillas' email

The get_maintainer script still reports my old Collabora email based on
old commits but that address no longer exist so update mailmap to report
my current email and avoid people sending to the old address.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 4b31af54ccd58..b1e9a97653dc6 100644
--- a/.mailmap
+++ b/.mailmap
@@ -59,6 +59,7 @@ James Bottomley <jejb@mulgrave.(none)>
 James Bottomley <jejb@titanic.il.steeleye.com>
 James E Wilson <wilson@specifix.com>
 James Ketrenos <jketreno@io.(none)>
+<javier@osg.samsung.com> <javier.martinez@collabora.co.uk>
 Jean Tourrilhes <jt@hpl.hp.com>
 Jeff Garzik <jgarzik@pretzel.yyz.us>
 Jens Axboe <axboe@suse.de>

From 47aee4d8e314384807e98b67ade07f6da476aa75 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Thu, 22 Oct 2015 13:32:19 -0700
Subject: [PATCH 141/199] thp: use is_zero_pfn() only after pte_present() check

Use is_zero_pfn() on pteval only after pte_present() check on pteval
(It might be better idea to introduce is_zero_pte() which checks
pte_present() first).

Otherwise when working on a swap or migration entry and if pte_pfn's
result is equal to zero_pfn by chance, we lose user's data in
__collapse_huge_page_copy().  So if you're unlucky, the application
segfaults and finally you could see below message on exit:

BUG: Bad rss-counter state mm:ffff88007f099300 idx:2 val:3

Fixes: ca0984caa823 ("mm: incorporate zero pages into transparent huge pages")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>	[4.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/huge_memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4b06b8db9df23..bbac913f96bc1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2206,7 +2206,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	for (_pte = pte; _pte < pte+HPAGE_PMD_NR;
 	     _pte++, address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none(pteval) || (pte_present(pteval) &&
+				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
 			    ++none_or_zero <= khugepaged_max_ptes_none)
 				continue;

From 296291cdd1629c308114504b850dc343eabc2782 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.com>
Date: Thu, 22 Oct 2015 13:32:21 -0700
Subject: [PATCH 142/199] mm: make sendfile(2) killable

Currently a simple program below issues a sendfile(2) system call which
takes about 62 days to complete in my test KVM instance.

        int fd;
        off_t off = 0;

        fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
        ftruncate(fd, 2);
        lseek(fd, 0, SEEK_END);
        sendfile(fd, fd, &off, 0xfffffff);

Now you should not ask kernel to do a stupid stuff like copying 256MB in
2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
should have a way to stop you.

We actually do have a check for fatal_signal_pending() in
generic_perform_write() which triggers in this path however because we
always succeed in writing something before the check is done, we return
value > 0 from generic_perform_write() and thus the information about
signal gets lost.

Fix the problem by doing the signal check before writing anything.  That
way generic_perform_write() returns -EINTR, the error gets propagated up
and the sendfile loop terminates early.

Signed-off-by: Jan Kara <jack@suse.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/filemap.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 1cc5467cf36ce..327910c2400c6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2488,6 +2488,11 @@ ssize_t generic_perform_write(struct file *file,
 			break;
 		}
 
+		if (fatal_signal_pending(current)) {
+			status = -EINTR;
+			break;
+		}
+
 		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
 						&page, &fsdata);
 		if (unlikely(status < 0))
@@ -2525,10 +2530,6 @@ ssize_t generic_perform_write(struct file *file,
 		written += copied;
 
 		balance_dirty_pages_ratelimited(mapping);
-		if (fatal_signal_pending(current)) {
-			status = -EINTR;
-			break;
-		}
 	} while (iov_iter_count(i));
 
 	return written ? written : status;

From 3f181b4d8652f7bcd7e9932c7307b8ecd4d87cf6 Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Thu, 22 Oct 2015 13:32:24 -0700
Subject: [PATCH 143/199] lib/Kconfig.debug: disable -Wframe-larger-than
 warnings with KASAN=y

When the kernel compiled with KASAN=y, GCC adds redzones for each
variable on stack.  This enlarges function's stack frame and causes:

	'warning: the frame size of X bytes is larger than Y bytes'

The worst case I've seen for now is following:

   ../net/wireless/nl80211.c: In function `nl80211_send_wiphy':
   ../net/wireless/nl80211.c:1731:1: warning: the frame size of 5448 bytes is larger than 2048 bytes [-Wframe-larger-than=]

That kind of warning becomes useless with KASAN=y.  It doesn't
necessarily indicate that there is some problem in the code, thus we
should turn it off.

(The KASAN=y stack size in increased from 16k to 32k for this reason)

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Abylay Ospan <aospan@netup.ru>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Kozlov Sergey <serjk@netup.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 lib/Kconfig.debug | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ab76b99adc857..1d1521c26302a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -197,6 +197,7 @@ config ENABLE_MUST_CHECK
 config FRAME_WARN
 	int "Warn for stack frames larger than (needs gcc 4.4)"
 	range 0 8192
+	default 0 if KASAN
 	default 1024 if !64BIT
 	default 2048 if 64BIT
 	help

From bb387002693ed28b2bb0408c5dec65521b71e5f1 Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw@strlen.de>
Date: Thu, 22 Oct 2015 13:32:27 -0700
Subject: [PATCH 144/199] fault-inject: fix inverted interval/probability
 values in printk

interval displays the probability and vice versa.

Fixes: 6adc4a22f20bb ("fault-inject: add ratelimit option")
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 lib/fault-inject.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index f1cdeb024d172..6a823a53e357b 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -44,7 +44,7 @@ static void fail_dump(struct fault_attr *attr)
 		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
 		       "name %pd, interval %lu, probability %lu, "
 		       "space %d, times %d\n", attr->dname,
-		       attr->probability, attr->interval,
+		       attr->interval, attr->probability,
 		       atomic_read(&attr->space),
 		       atomic_read(&attr->times));
 		if (attr->verbose > 1)

From b67de018b37a97548645a879c627d4188518e907 Mon Sep 17 00:00:00 2001
From: Joseph Qi <joseph.qi@huawei.com>
Date: Thu, 22 Oct 2015 13:32:29 -0700
Subject: [PATCH 145/199] ocfs2/dlm: unlock lockres spinlock before
 dlm_lockres_put

dlm_lockres_put will call dlm_lockres_release if it is the last
reference, and then it may call dlm_print_one_lock_resource and
take lockres spinlock.

So unlock lockres spinlock before dlm_lockres_put to avoid deadlock.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 fs/ocfs2/dlm/dlmmaster.c   | 3 ++-
 fs/ocfs2/dlm/dlmrecovery.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index ee5aa4daaea0d..ce38b4ccc9ab6 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -1658,12 +1658,13 @@ int dlm_master_request_handler(struct o2net_msg *msg, u32 len, void *data,
 		if (ret < 0) {
 			mlog(ML_ERROR, "failed to dispatch assert master work\n");
 			response = DLM_MASTER_RESP_ERROR;
+			spin_unlock(&res->spinlock);
 			dlm_lockres_put(res);
 		} else {
 			dispatched = 1;
 			__dlm_lockres_grab_inflight_worker(dlm, res);
+			spin_unlock(&res->spinlock);
 		}
-		spin_unlock(&res->spinlock);
 	} else {
 		if (res)
 			dlm_lockres_put(res);
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 3d90ad7ff91fe..58eaa5c0d3870 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1723,8 +1723,8 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
 			} else {
 				dispatched = 1;
 				__dlm_lockres_grab_inflight_worker(dlm, res);
+				spin_unlock(&res->spinlock);
 			}
-			spin_unlock(&res->spinlock);
 		} else {
 			/* put.. incase we are not the master */
 			spin_unlock(&res->spinlock);

From 0aaafaabfcba8aa991913cd3280a5dbf7f111a2a Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri, 23 Oct 2015 11:50:08 +0200
Subject: [PATCH 146/199] sched/core: Add missing lockdep_unpin() annotations

Luca and Wanpeng reported two missing annotations that led to
false lockdep complaints. Add the missing annotations.

Reported-by: Luca Abeni <luca.abeni@unitn.it>
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cbce1a686700 ("sched,lockdep: Employ lock pinning")
Link: http://lkml.kernel.org/r/20151023095008.GY17308@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c     | 9 ++++++++-
 kernel/sched/deadline.c | 9 ++++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5bd7d60658d37..bcd214e4b4d63 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2366,8 +2366,15 @@ void wake_up_new_task(struct task_struct *p)
 	trace_sched_wakeup_new(p);
 	check_preempt_curr(rq, p, WF_FORK);
 #ifdef CONFIG_SMP
-	if (p->sched_class->task_woken)
+	if (p->sched_class->task_woken) {
+		/*
+		 * Nothing relies on rq->lock after this, so its fine to
+		 * drop it.
+		 */
+		lockdep_unpin_lock(&rq->lock);
 		p->sched_class->task_woken(rq, p);
+		lockdep_pin_lock(&rq->lock);
+	}
 #endif
 	task_rq_unlock(rq, p, &flags);
 }
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 142df2668e5d5..8b0a15e285f91 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -668,8 +668,15 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 	 * Queueing this task back might have overloaded rq, check if we need
 	 * to kick someone away.
 	 */
-	if (has_pushable_dl_tasks(rq))
+	if (has_pushable_dl_tasks(rq)) {
+		/*
+		 * Nothing relies on rq->lock after this, so its safe to drop
+		 * rq->lock.
+		 */
+		lockdep_unpin_lock(&rq->lock);
 		push_dl_task(rq);
+		lockdep_pin_lock(&rq->lock);
+	}
 #endif
 
 unlock:

From 5c92d87d30b23844e6998d8318e4c19ee3a907ac Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Wed, 21 Oct 2015 21:58:28 +0200
Subject: [PATCH 147/199] drm/amdgpu: stop leaking page flip fence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

reservation_object_get_fences_rcu already takes the references.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index dc29ed8145c25..6c9e0902a4143 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -184,10 +184,6 @@ int amdgpu_crtc_page_flip(struct drm_crtc *crtc,
 		goto cleanup;
 	}
 
-	fence_get(work->excl);
-	for (i = 0; i < work->shared_count; ++i)
-		fence_get(work->shared[i]);
-
 	amdgpu_bo_get_tiling_flags(new_rbo, &tiling_flags);
 	amdgpu_bo_unreserve(new_rbo);
 

From 49abb26651167c892393cd9f2ad23df429645ed9 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Fri, 23 Oct 2015 10:38:52 -0400
Subject: [PATCH 148/199] drm/radeon: don't try to recreate sysfs entries on
 resume

Fixes a harmless error message caused by:
51a4726b04e880fdd9b4e0e58b13f70b0a68a7f5

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon.h    |  1 +
 drivers/gpu/drm/radeon/radeon_pm.c | 35 ++++++++++++++++++------------
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index f03b7eb152336..b6cbd816537e7 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1658,6 +1658,7 @@ struct radeon_pm {
 	u8                      fan_max_rpm;
 	/* dpm */
 	bool                    dpm_enabled;
+	bool                    sysfs_initialized;
 	struct radeon_dpm       dpm;
 };
 
diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c
index 6a0a176e26ec2..5feee3b4c5574 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -1528,19 +1528,23 @@ int radeon_pm_late_init(struct radeon_device *rdev)
 
 	if (rdev->pm.pm_method == PM_METHOD_DPM) {
 		if (rdev->pm.dpm_enabled) {
-			ret = device_create_file(rdev->dev, &dev_attr_power_dpm_state);
-			if (ret)
-				DRM_ERROR("failed to create device file for dpm state\n");
-			ret = device_create_file(rdev->dev, &dev_attr_power_dpm_force_performance_level);
-			if (ret)
-				DRM_ERROR("failed to create device file for dpm state\n");
-			/* XXX: these are noops for dpm but are here for backwards compat */
-			ret = device_create_file(rdev->dev, &dev_attr_power_profile);
-			if (ret)
-				DRM_ERROR("failed to create device file for power profile\n");
-			ret = device_create_file(rdev->dev, &dev_attr_power_method);
-			if (ret)
-				DRM_ERROR("failed to create device file for power method\n");
+			if (!rdev->pm.sysfs_initialized) {
+				ret = device_create_file(rdev->dev, &dev_attr_power_dpm_state);
+				if (ret)
+					DRM_ERROR("failed to create device file for dpm state\n");
+				ret = device_create_file(rdev->dev, &dev_attr_power_dpm_force_performance_level);
+				if (ret)
+					DRM_ERROR("failed to create device file for dpm state\n");
+				/* XXX: these are noops for dpm but are here for backwards compat */
+				ret = device_create_file(rdev->dev, &dev_attr_power_profile);
+				if (ret)
+					DRM_ERROR("failed to create device file for power profile\n");
+				ret = device_create_file(rdev->dev, &dev_attr_power_method);
+				if (ret)
+					DRM_ERROR("failed to create device file for power method\n");
+				if (!ret)
+					rdev->pm.sysfs_initialized = true;
+			}
 
 			mutex_lock(&rdev->pm.mutex);
 			ret = radeon_dpm_late_enable(rdev);
@@ -1556,7 +1560,8 @@ int radeon_pm_late_init(struct radeon_device *rdev)
 			}
 		}
 	} else {
-		if (rdev->pm.num_power_states > 1) {
+		if ((rdev->pm.num_power_states > 1) &&
+		    (!rdev->pm.sysfs_initialized)) {
 			/* where's the best place to put these? */
 			ret = device_create_file(rdev->dev, &dev_attr_power_profile);
 			if (ret)
@@ -1564,6 +1569,8 @@ int radeon_pm_late_init(struct radeon_device *rdev)
 			ret = device_create_file(rdev->dev, &dev_attr_power_method);
 			if (ret)
 				DRM_ERROR("failed to create device file for power method\n");
+			if (!ret)
+				rdev->pm.sysfs_initialized = true;
 		}
 	}
 	return ret;

From c86f5ebfbd147d1a228ab89ee1658e18939bd7ad Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Fri, 23 Oct 2015 10:45:14 -0400
Subject: [PATCH 149/199] drm/amdgpu: don't try to recreate sysfs entries on
 resume

Fixes an error on resume caused by:
fa022a9b65d2886486a022fd66b20c823cd76ad9

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h    | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6647fb26ef25c..0d13e6368b96d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1654,6 +1654,7 @@ struct amdgpu_pm {
 	u8                      fan_max_rpm;
 	/* dpm */
 	bool                    dpm_enabled;
+	bool                    sysfs_initialized;
 	struct amdgpu_dpm       dpm;
 	const struct firmware	*fw;	/* SMC firmware */
 	uint32_t                fw_version;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index ed2bbe5b10afe..22a8c7d3a3ab0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -695,6 +695,9 @@ int amdgpu_pm_sysfs_init(struct amdgpu_device *adev)
 {
 	int ret;
 
+	if (adev->pm.sysfs_initialized)
+		return 0;
+
 	if (adev->pm.funcs->get_temperature == NULL)
 		return 0;
 	adev->pm.int_hwmon_dev = hwmon_device_register_with_groups(adev->dev,
@@ -723,6 +726,8 @@ int amdgpu_pm_sysfs_init(struct amdgpu_device *adev)
 		return ret;
 	}
 
+	adev->pm.sysfs_initialized = true;
+
 	return 0;
 }
 

From 1f2c6651f69c14d0d3a9cfbda44ea101b02160ba Mon Sep 17 00:00:00 2001
From: Ilya Dryomov <idryomov@gmail.com>
Date: Sun, 11 Oct 2015 19:38:00 +0200
Subject: [PATCH 150/199] rbd: don't leak parent_spec in rbd_dev_probe_parent()

Currently we leak parent_spec and trigger a "parent reference
underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails.
The problem is we take the !parent out_err branch and that only drops
refcounts; parent_spec that would've been freed had we called
rbd_dev_unparent() remains and triggers rbd_warn() in
rbd_dev_parent_put() - at that point we have parent_spec != NULL and
parent_ref == 0, so counter ends up being -1 after the decrement.

Redo rbd_dev_probe_parent() to fix this.

Cc: stable@vger.kernel.org # 3.10+, needs backporting for < 4.2
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
---
 drivers/block/rbd.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index f5e49b639818b..028db28cb8a06 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -5134,41 +5134,37 @@ static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev)
 static int rbd_dev_probe_parent(struct rbd_device *rbd_dev)
 {
 	struct rbd_device *parent = NULL;
-	struct rbd_spec *parent_spec;
-	struct rbd_client *rbdc;
 	int ret;
 
 	if (!rbd_dev->parent_spec)
 		return 0;
-	/*
-	 * We need to pass a reference to the client and the parent
-	 * spec when creating the parent rbd_dev.  Images related by
-	 * parent/child relationships always share both.
-	 */
-	parent_spec = rbd_spec_get(rbd_dev->parent_spec);
-	rbdc = __rbd_get_client(rbd_dev->rbd_client);
 
-	ret = -ENOMEM;
-	parent = rbd_dev_create(rbdc, parent_spec, NULL);
-	if (!parent)
+	parent = rbd_dev_create(rbd_dev->rbd_client, rbd_dev->parent_spec,
+				NULL);
+	if (!parent) {
+		ret = -ENOMEM;
 		goto out_err;
+	}
+
+	/*
+	 * Images related by parent/child relationships always share
+	 * rbd_client and spec/parent_spec, so bump their refcounts.
+	 */
+	__rbd_get_client(rbd_dev->rbd_client);
+	rbd_spec_get(rbd_dev->parent_spec);
 
 	ret = rbd_dev_image_probe(parent, false);
 	if (ret < 0)
 		goto out_err;
+
 	rbd_dev->parent = parent;
 	atomic_set(&rbd_dev->parent_ref, 1);
-
 	return 0;
+
 out_err:
-	if (parent) {
-		rbd_dev_unparent(rbd_dev);
+	rbd_dev_unparent(rbd_dev);
+	if (parent)
 		rbd_dev_destroy(parent);
-	} else {
-		rbd_put_client(rbdc);
-		rbd_spec_put(parent_spec);
-	}
-
 	return ret;
 }
 

From 6d69bb536bac0d403d83db1ca841444981b280cd Mon Sep 17 00:00:00 2001
From: Ilya Dryomov <idryomov@gmail.com>
Date: Sun, 11 Oct 2015 19:38:00 +0200
Subject: [PATCH 151/199] rbd: prevent kernel stack blow up on rbd map

Mapping an image with a long parent chain (e.g. image foo, whose parent
is bar, whose parent is baz, etc) currently leads to a kernel stack
overflow, due to the following recursion in the reply path:

  rbd_osd_req_callback()
    rbd_obj_request_complete()
      rbd_img_obj_callback()
        rbd_img_parent_read_callback()
          rbd_obj_request_complete()
            ...

Limit the parent chain to 16 images, which is ~5K worth of stack.  When
the above recursion is eliminated, this limit can be lifted.

Fixes: http://tracker.ceph.com/issues/12538

Cc: stable@vger.kernel.org # 3.10+, needs backporting for < 4.2
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
---
 drivers/block/rbd.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 028db28cb8a06..6f26cf38c6f9c 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -96,6 +96,8 @@ static int atomic_dec_return_safe(atomic_t *v)
 #define RBD_MINORS_PER_MAJOR		256
 #define RBD_SINGLE_MAJOR_PART_SHIFT	4
 
+#define RBD_MAX_PARENT_CHAIN_LEN	16
+
 #define RBD_SNAP_DEV_NAME_PREFIX	"snap_"
 #define RBD_MAX_SNAP_NAME_LEN	\
 			(NAME_MAX - (sizeof (RBD_SNAP_DEV_NAME_PREFIX) - 1))
@@ -426,7 +428,7 @@ static ssize_t rbd_add_single_major(struct bus_type *bus, const char *buf,
 				    size_t count);
 static ssize_t rbd_remove_single_major(struct bus_type *bus, const char *buf,
 				       size_t count);
-static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping);
+static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth);
 static void rbd_spec_put(struct rbd_spec *spec);
 
 static int rbd_dev_id_to_minor(int dev_id)
@@ -5131,7 +5133,12 @@ static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev)
 	return ret;
 }
 
-static int rbd_dev_probe_parent(struct rbd_device *rbd_dev)
+/*
+ * @depth is rbd_dev_image_probe() -> rbd_dev_probe_parent() ->
+ * rbd_dev_image_probe() recursion depth, which means it's also the
+ * length of the already discovered part of the parent chain.
+ */
+static int rbd_dev_probe_parent(struct rbd_device *rbd_dev, int depth)
 {
 	struct rbd_device *parent = NULL;
 	int ret;
@@ -5139,6 +5146,12 @@ static int rbd_dev_probe_parent(struct rbd_device *rbd_dev)
 	if (!rbd_dev->parent_spec)
 		return 0;
 
+	if (++depth > RBD_MAX_PARENT_CHAIN_LEN) {
+		pr_info("parent chain is too long (%d)\n", depth);
+		ret = -EINVAL;
+		goto out_err;
+	}
+
 	parent = rbd_dev_create(rbd_dev->rbd_client, rbd_dev->parent_spec,
 				NULL);
 	if (!parent) {
@@ -5153,7 +5166,7 @@ static int rbd_dev_probe_parent(struct rbd_device *rbd_dev)
 	__rbd_get_client(rbd_dev->rbd_client);
 	rbd_spec_get(rbd_dev->parent_spec);
 
-	ret = rbd_dev_image_probe(parent, false);
+	ret = rbd_dev_image_probe(parent, depth);
 	if (ret < 0)
 		goto out_err;
 
@@ -5282,7 +5295,7 @@ static void rbd_dev_image_release(struct rbd_device *rbd_dev)
  * parent), initiate a watch on its header object before using that
  * object to get detailed information about the rbd image.
  */
-static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping)
+static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth)
 {
 	int ret;
 
@@ -5300,7 +5313,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping)
 	if (ret)
 		goto err_out_format;
 
-	if (mapping) {
+	if (!depth) {
 		ret = rbd_dev_header_watch_sync(rbd_dev);
 		if (ret) {
 			if (ret == -ENOENT)
@@ -5321,7 +5334,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping)
 	 * Otherwise this is a parent image, identified by pool, image
 	 * and snap ids - need to fill in names for those ids.
 	 */
-	if (mapping)
+	if (!depth)
 		ret = rbd_spec_fill_snap_id(rbd_dev);
 	else
 		ret = rbd_spec_fill_names(rbd_dev);
@@ -5343,12 +5356,12 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping)
 		 * Need to warn users if this image is the one being
 		 * mapped and has a parent.
 		 */
-		if (mapping && rbd_dev->parent_spec)
+		if (!depth && rbd_dev->parent_spec)
 			rbd_warn(rbd_dev,
 				 "WARNING: kernel layering is EXPERIMENTAL!");
 	}
 
-	ret = rbd_dev_probe_parent(rbd_dev);
+	ret = rbd_dev_probe_parent(rbd_dev, depth);
 	if (ret)
 		goto err_out_probe;
 
@@ -5359,7 +5372,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping)
 err_out_probe:
 	rbd_dev_unprobe(rbd_dev);
 err_out_watch:
-	if (mapping)
+	if (!depth)
 		rbd_dev_header_unwatch_sync(rbd_dev);
 out_header_name:
 	kfree(rbd_dev->header_name);
@@ -5422,7 +5435,7 @@ static ssize_t do_rbd_add(struct bus_type *bus,
 	spec = NULL;		/* rbd_dev now owns this */
 	rbd_opts = NULL;	/* rbd_dev now owns this */
 
-	rc = rbd_dev_image_probe(rbd_dev, true);
+	rc = rbd_dev_image_probe(rbd_dev, 0);
 	if (rc < 0)
 		goto err_out_rbd_dev;
 

From 2871c69e025e8bc507651d5a9cf81a8a7da9d24b Mon Sep 17 00:00:00 2001
From: Joe Thornber <ejt@redhat.com>
Date: Wed, 21 Oct 2015 18:36:49 +0100
Subject: [PATCH 152/199] dm btree remove: fix a bug when rebalancing nodes
 after removal

Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't
a complete fix for redistribute3().

The redistribute3 function takes 3 btree nodes and shares out the entries
evenly between them.  If the three nodes in total contained
(MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
the center.

Fix this issue by being more careful about calculating the target number
of entries for the left and right nodes.

Unit tested in userspace using this program:
https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/md/persistent-data/dm-btree-remove.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/md/persistent-data/dm-btree-remove.c b/drivers/md/persistent-data/dm-btree-remove.c
index 421a36c593e3e..2e4c4cb79e4d9 100644
--- a/drivers/md/persistent-data/dm-btree-remove.c
+++ b/drivers/md/persistent-data/dm-btree-remove.c
@@ -301,11 +301,16 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
 {
 	int s;
 	uint32_t max_entries = le32_to_cpu(left->header.max_entries);
-	unsigned target = (nr_left + nr_center + nr_right) / 3;
-	BUG_ON(target > max_entries);
+	unsigned total = nr_left + nr_center + nr_right;
+	unsigned target_right = total / 3;
+	unsigned remainder = (target_right * 3) != total;
+	unsigned target_left = target_right + remainder;
+
+	BUG_ON(target_left > max_entries);
+	BUG_ON(target_right > max_entries);
 
 	if (nr_left < nr_right) {
-		s = nr_left - target;
+		s = nr_left - target_left;
 
 		if (s < 0 && nr_center < -s) {
 			/* not enough in central node */
@@ -316,10 +321,10 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
 		} else
 			shift(left, center, s);
 
-		shift(center, right, target - nr_right);
+		shift(center, right, target_right - nr_right);
 
 	} else {
-		s = target - nr_right;
+		s = target_right - nr_right;
 		if (s > 0 && nr_center < s) {
 			/* not enough in central node */
 			shift(center, right, nr_center);
@@ -329,7 +334,7 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
 		} else
 			shift(center, right, s);
 
-		shift(left, center, nr_left - target);
+		shift(left, center, nr_left - target_left);
 	}
 
 	*key_ptr(parent, c->index) = center->keys[0];

From 4dcb8b57df3593dcb20481d9d6cf79d1dc1534be Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@redhat.com>
Date: Thu, 22 Oct 2015 10:56:40 -0400
Subject: [PATCH 153/199] dm btree: fix leak of bufio-backed block in
 btree_split_beneath error path

btree_split_beneath()'s error path had an outstanding FIXME that speaks
directly to the potential for _not_ cleaning up a previously allocated
bufio-backed block.

Fix this by releasing the previously allocated bufio block using
unlock_block().

Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/md/persistent-data/dm-btree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index b6cec258cc213..0e09aef43998a 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -523,7 +523,7 @@ static int btree_split_beneath(struct shadow_spine *s, uint64_t key)
 
 	r = new_block(s->info, &right);
 	if (r < 0) {
-		/* FIXME: put left */
+		unlock_block(s->info, left);
 		return r;
 	}
 

From 3201ac452e84a8a368197d648c9b7011e061804a Mon Sep 17 00:00:00 2001
From: Joe Thornber <ejt@redhat.com>
Date: Thu, 22 Oct 2015 18:10:55 +0100
Subject: [PATCH 154/199] dm cache: the CLEAN_SHUTDOWN flag was not being set

If the CLEAN_SHUTDOWN flag is not set when a cache is loaded then all cache
blocks are marked as dirty and a full writeback occurs.

__commit_transaction() is responsible for setting/clearing
CLEAN_SHUTDOWN (based the flags_mutator that is passed in).

Fix this issue, of the cache's on-disk flags being wrong, by making sure
__commit_transaction() does not reset the flags after the mutator has
altered the flags in preparation for them being serialized to disk.

before:

sb_flags = mutator(le32_to_cpu(disk_super->flags));
disk_super->flags = cpu_to_le32(sb_flags);
disk_super->flags = cpu_to_le32(cmd->flags);

after:

disk_super->flags = cpu_to_le32(cmd->flags);
sb_flags = mutator(le32_to_cpu(disk_super->flags));
disk_super->flags = cpu_to_le32(sb_flags);

Reported-by: Bogdan Vasiliev <bogdan.vasiliev@gmail.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/md/dm-cache-metadata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c
index 20cc36b01b778..0a17d1b91a811 100644
--- a/drivers/md/dm-cache-metadata.c
+++ b/drivers/md/dm-cache-metadata.c
@@ -634,10 +634,10 @@ static int __commit_transaction(struct dm_cache_metadata *cmd,
 
 	disk_super = dm_block_data(sblock);
 
+	disk_super->flags = cpu_to_le32(cmd->flags);
 	if (mutator)
 		update_flags(disk_super, mutator);
 
-	disk_super->flags = cpu_to_le32(cmd->flags);
 	disk_super->mapping_root = cpu_to_le64(cmd->root);
 	disk_super->hint_root = cpu_to_le64(cmd->hint_root);
 	disk_super->discard_root = cpu_to_le64(cmd->discard_root);

From 5dd32eae604ee503e5a84a4f18d1381e4cc356cb Mon Sep 17 00:00:00 2001
From: Vladimir Zapolskiy <vz@mleia.com>
Date: Sat, 17 Oct 2015 21:52:27 +0300
Subject: [PATCH 155/199] i2c: pnx: fix runtime warnings caused by enabling
 unprepared clock

The driver can not be used on a platform with common clock framework
until clk_prepare/clk_unprepare calls are added, otherwise clk_enable
calls will fail and a WARN is generated.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
---
 drivers/i2c/busses/i2c-pnx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-pnx.c b/drivers/i2c/busses/i2c-pnx.c
index e814a36d9b78f..6f8b446be5b0e 100644
--- a/drivers/i2c/busses/i2c-pnx.c
+++ b/drivers/i2c/busses/i2c-pnx.c
@@ -600,7 +600,7 @@ static int i2c_pnx_controller_suspend(struct device *dev)
 {
 	struct i2c_pnx_algo_data *alg_data = dev_get_drvdata(dev);
 
-	clk_disable(alg_data->clk);
+	clk_disable_unprepare(alg_data->clk);
 
 	return 0;
 }
@@ -609,7 +609,7 @@ static int i2c_pnx_controller_resume(struct device *dev)
 {
 	struct i2c_pnx_algo_data *alg_data = dev_get_drvdata(dev);
 
-	return clk_enable(alg_data->clk);
+	return clk_prepare_enable(alg_data->clk);
 }
 
 static SIMPLE_DEV_PM_OPS(i2c_pnx_pm,
@@ -672,7 +672,7 @@ static int i2c_pnx_probe(struct platform_device *pdev)
 	if (IS_ERR(alg_data->ioaddr))
 		return PTR_ERR(alg_data->ioaddr);
 
-	ret = clk_enable(alg_data->clk);
+	ret = clk_prepare_enable(alg_data->clk);
 	if (ret)
 		return ret;
 
@@ -726,7 +726,7 @@ static int i2c_pnx_probe(struct platform_device *pdev)
 	return 0;
 
 out_clock:
-	clk_disable(alg_data->clk);
+	clk_disable_unprepare(alg_data->clk);
 	return ret;
 }
 
@@ -735,7 +735,7 @@ static int i2c_pnx_remove(struct platform_device *pdev)
 	struct i2c_pnx_algo_data *alg_data = platform_get_drvdata(pdev);
 
 	i2c_del_adapter(&alg_data->adapter);
-	clk_disable(alg_data->clk);
+	clk_disable_unprepare(alg_data->clk);
 
 	return 0;
 }

From bd8688a199b864944bf62eebed0ca13b46249453 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Sat, 24 Oct 2015 16:02:16 +1100
Subject: [PATCH 156/199] md/raid1: don't clear bitmap bit when bad-block-list
 write fails.

When a write fails and a bad-block-list is present, we can
update the bad-block-list instead of writing the data.  If
this succeeds then it is OK clear the relevant bitmap-bit as
no further 'sync' of the block is needed.

However if writing the bad-block-list fails then we need to
treat the write as failed and particularly must not clear
the bitmap bit.  Otherwise the device can be re-added (after
any hardware connection issues are resolved) and because the
relevant bit in the bitmap is clear, that block will not be
resynced.  This leads to data corruption.

We already delay the final bio_endio() on the write until
the bad-block-list is written so that when the write
returns: either that data is safe, the bad-block record is
safe, or the fact that the device is faulty is safe.
However we *don't* delay the clearing of the bitmap, so the
bitmap bit can be recorded as cleared before we know if the
bad-block-list was written safely.

So: delay that until the write really is safe.
i.e. move the call to close_write() until just before
calling bio_endio(), and recheck the 'is array degraded'
status before making that call.

This bug goes back to v3.1 when bad-block-lists were
introduced, though it only affects arrays created with
mdadm-3.3 or later as only those have bad-block lists.

Backports will require at least
Commit: 55ce74d4bfe1 ("md/raid1: ensure device failure recorded before write request returns.")
as well.  I'll send that to 'stable' separately.

Note that of the two tests of R1BIO_WriteError that this
patch adds, the first is certain to fail and the second is
certain to succeed.  However doing it this way makes the
patch more obviously correct.  I will tidy the code up in a
future merge window.

Reported-and-tested-by: Nate Dailey <nate.dailey@stratus.com>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Fixes: cd5ff9a16f08 ("md/raid1:  Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid1.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index cfca6edf78139..d9d031ede4bf5 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2258,15 +2258,16 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 			rdev_dec_pending(conf->mirrors[m].rdev,
 					 conf->mddev);
 		}
-	if (test_bit(R1BIO_WriteError, &r1_bio->state))
-		close_write(r1_bio);
 	if (fail) {
 		spin_lock_irq(&conf->device_lock);
 		list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
 		spin_unlock_irq(&conf->device_lock);
 		md_wakeup_thread(conf->mddev->thread);
-	} else
+	} else {
+		if (test_bit(R1BIO_WriteError, &r1_bio->state))
+			close_write(r1_bio);
 		raid_end_bio_io(r1_bio);
+	}
 }
 
 static void handle_read_error(struct r1conf *conf, struct r1bio *r1_bio)
@@ -2385,6 +2386,10 @@ static void raid1d(struct md_thread *thread)
 			r1_bio = list_first_entry(&tmp, struct r1bio,
 						  retry_list);
 			list_del(&r1_bio->retry_list);
+			if (mddev->degraded)
+				set_bit(R1BIO_Degraded, &r1_bio->state);
+			if (test_bit(R1BIO_WriteError, &r1_bio->state))
+				close_write(r1_bio);
 			raid_end_bio_io(r1_bio);
 		}
 	}

From c340702ca26a628832fade4f133d8160a55c29cc Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Sat, 24 Oct 2015 16:23:48 +1100
Subject: [PATCH 157/199] md/raid10: don't clear bitmap bit when bad-block-list
 write fails.

When a write fails and a bad-block-list is present, we can
update the bad-block-list instead of writing the data.  If
this succeeds then it is OK clear the relevant bitmap-bit as
no further 'sync' of the block is needed.

However if writing the bad-block-list fails then we need to
treat the write as failed and particularly must not clear
the bitmap bit.  Otherwise the device can be re-added (after
any hardware connection issues are resolved) and because the
relevant bit in the bitmap is clear, that block will not be
resynced.  This leads to data corruption.

We already delay the final bio_endio() on the write until
the bad-block-list is written so that when the write
returns: either that data is safe, the bad-block record is
safe, or the fact that the device is faulty is safe.
However we *don't* delay the clearing of the bitmap, so the
bitmap bit can be recorded as cleared before we know if the
bad-block-list was written safely.

So: delay that until the write really is safe.
i.e. move the call to close_write() until just before
calling bio_endio(), and recheck the 'is array degraded'
status before making that call.

This bug goes back to v3.1 when bad-block-lists were
introduced, though it only affects arrays created with
mdadm-3.3 or later as only those have bad-block lists.

Backports will require at least
Commit: 95af587e95aa ("md/raid10: ensure device failure recorded before write request returns.")
as well.  I'll send that to 'stable' separately.

Note that of the two tests of R10BIO_WriteError that this
patch adds, the first is certain to fail and the second is
certain to succeed.  However doing it this way makes the
patch more obviously correct.  I will tidy the code up in a
future merge window.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fixes: bd870a16c594 ("md/raid10:  Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid10.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index a9ecec4e9a135..23de2144ee13e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2654,16 +2654,17 @@ static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
 				rdev_dec_pending(rdev, conf->mddev);
 			}
 		}
-		if (test_bit(R10BIO_WriteError,
-			     &r10_bio->state))
-			close_write(r10_bio);
 		if (fail) {
 			spin_lock_irq(&conf->device_lock);
 			list_add(&r10_bio->retry_list, &conf->bio_end_io_list);
 			spin_unlock_irq(&conf->device_lock);
 			md_wakeup_thread(conf->mddev->thread);
-		} else
+		} else {
+			if (test_bit(R10BIO_WriteError,
+				     &r10_bio->state))
+				close_write(r10_bio);
 			raid_end_bio_io(r10_bio);
+		}
 	}
 }
 
@@ -2691,6 +2692,12 @@ static void raid10d(struct md_thread *thread)
 			r10_bio = list_first_entry(&tmp, struct r10bio,
 						   retry_list);
 			list_del(&r10_bio->retry_list);
+			if (mddev->degraded)
+				set_bit(R10BIO_Degraded, &r10_bio->state);
+
+			if (test_bit(R10BIO_WriteError,
+				     &r10_bio->state))
+				close_write(r10_bio);
 			raid_end_bio_io(r10_bio);
 		}
 	}

From 8bce6d35b308d73cdb2ee273c95d711a55be688c Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Thu, 22 Oct 2015 13:20:15 +1100
Subject: [PATCH 158/199] md/raid10: fix the 'new' raid10 layout to work
 correctly.

In Linux 3.9 we introduce a new 'far' layout for RAID10 which was
supposed to rotate the replicas differently and so provide better
resilience.  In particular it could survive more combinations of 2
drive failures.

Unfortunately. due to a coding error, this some did what was wanted,
sometimes improved less than we hoped, and sometimes - in very
unlikely circumstances - put multiple replicas on the same device so
the redundancy was harmed.

No public user-space tool has created arrays using this layout so it
is very unlikely that zero-redundancy arrays actually exist.  Probably
no arrays using any form of the new layout exist.  But we cannot be
certain.

So use another bit in the 'layout' number and introduce a bug-fixed
version of the layout.
Also when assembling an array, if it has a zero-redundancy layout,
give a warning.

Reported-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid10.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 23de2144ee13e..96f3659683069 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -39,6 +39,7 @@
  *    far_copies (stored in second byte of layout)
  *    far_offset (stored in bit 16 of layout )
  *    use_far_sets (stored in bit 17 of layout )
+ *    use_far_sets_bugfixed (stored in bit 18 of layout )
  *
  * The data to be stored is divided into chunks using chunksize.  Each device
  * is divided into far_copies sections.   In each section, chunks are laid out
@@ -1497,6 +1498,8 @@ static void status(struct seq_file *seq, struct mddev *mddev)
 			seq_printf(seq, " %d offset-copies", conf->geo.far_copies);
 		else
 			seq_printf(seq, " %d far-copies", conf->geo.far_copies);
+		if (conf->geo.far_set_size != conf->geo.raid_disks)
+			seq_printf(seq, " %d devices per set", conf->geo.far_set_size);
 	}
 	seq_printf(seq, " [%d/%d] [", conf->geo.raid_disks,
 					conf->geo.raid_disks - mddev->degraded);
@@ -3394,7 +3397,7 @@ static int setup_geo(struct geom *geo, struct mddev *mddev, enum geo_type new)
 		disks = mddev->raid_disks + mddev->delta_disks;
 		break;
 	}
-	if (layout >> 18)
+	if (layout >> 19)
 		return -1;
 	if (chunk < (PAGE_SIZE >> 9) ||
 	    !is_power_of_2(chunk))
@@ -3406,7 +3409,22 @@ static int setup_geo(struct geom *geo, struct mddev *mddev, enum geo_type new)
 	geo->near_copies = nc;
 	geo->far_copies = fc;
 	geo->far_offset = fo;
-	geo->far_set_size = (layout & (1<<17)) ? disks / fc : disks;
+	switch (layout >> 17) {
+	case 0:	/* original layout.  simple but not always optimal */
+		geo->far_set_size = disks;
+		break;
+	case 1: /* "improved" layout which was buggy.  Hopefully no-one is
+		 * actually using this, but leave code here just in case.*/
+		geo->far_set_size = disks/fc;
+		WARN(geo->far_set_size < fc,
+		     "This RAID10 layout does not provide data safety - please backup and create new array\n");
+		break;
+	case 2: /* "improved" layout fixed to match documentation */
+		geo->far_set_size = fc * nc;
+		break;
+	default: /* Not a valid layout */
+		return -1;
+	}
 	geo->chunk_mask = chunk - 1;
 	geo->chunk_shift = ffz(~chunk);
 	return nc*fc;

From 32b88194f71d6ae7768a29f87fbba454728273ee Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun, 25 Oct 2015 10:39:47 +0900
Subject: [PATCH 159/199] Linux 4.3-rc7

---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index d33ab74bffce0..431067a41fcfe 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 VERSION = 4
 PATCHLEVEL = 3
 SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc7
 NAME = Blurry Fish Butt
 
 # *DOCUMENTATION*

From f9bf45e08ef36b6726a5744f0029325e81b3248a Mon Sep 17 00:00:00 2001
From: Sunil Goutham <sgoutham@cavium.com>
Date: Fri, 23 Oct 2015 17:14:07 -0700
Subject: [PATCH 160/199] net: thunderx: Remove PF soft reset.

In some silicon revisions, the soft reset clobbers PCI config space,
so quit doing the reset.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/cavium/thunder/nic_main.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index b3a5947a2cc03..d6e3219ddeb3f 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -305,9 +305,6 @@ static void nic_init_hw(struct nicpf *nic)
 {
 	int i;
 
-	/* Reset NIC, in case the driver is repeatedly inserted and removed */
-	nic_reg_write(nic, NIC_PF_SOFT_RESET, 1);
-
 	/* Enable NIC HW block */
 	nic_reg_write(nic, NIC_PF_CFG, 0x3);
 

From 4e85777ff071b51f500b130b6d036922af32be25 Mon Sep 17 00:00:00 2001
From: Sunil Goutham <sgoutham@cavium.com>
Date: Fri, 23 Oct 2015 17:14:08 -0700
Subject: [PATCH 161/199] net: thunderx: Fix incorrect subsystem devid of VF on
 pass2 silicon

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index b63e579aeb12d..a9377727c11c3 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -29,7 +29,7 @@
 static const struct pci_device_id nicvf_id_table[] = {
 	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_CAVIUM,
 			 PCI_DEVICE_ID_THUNDER_NIC_VF,
-			 PCI_VENDOR_ID_CAVIUM, 0xA11E) },
+			 PCI_VENDOR_ID_CAVIUM, 0xA134) },
 	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_CAVIUM,
 			 PCI_DEVICE_ID_THUNDER_PASS1_NIC_VF,
 			 PCI_VENDOR_ID_CAVIUM, 0xA11E) },

From 88ed237720bd618240439714a57fb69ea96428e7 Mon Sep 17 00:00:00 2001
From: David Daney <david.daney@cavium.com>
Date: Fri, 23 Oct 2015 17:14:09 -0700
Subject: [PATCH 162/199] net: thunderx: Rewrite silicon revision tests.

The test for pass-1 silicon was incorrect, it should be for all
revisions less than 8.  Also the revision is already present in the
pci_dev, so there is no need to read and keep a private copy.

Remove rev_id and code to read it from struct nicpf.  Create new
static inline function pass1_silicon() to be used to testing the
silicon version.  Use pass1_silicon() for revision checks, this will
be more widely used in follow on patches.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/cavium/thunder/nic_main.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index d6e3219ddeb3f..52e1acb69562c 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -22,7 +22,6 @@
 
 struct nicpf {
 	struct pci_dev		*pdev;
-	u8			rev_id;
 	u8			node;
 	unsigned int		flags;
 	u8			num_vf_en;      /* No of VF enabled */
@@ -54,6 +53,11 @@ struct nicpf {
 	bool			irq_allocated[NIC_PF_MSIX_VECTORS];
 };
 
+static inline bool pass1_silicon(struct nicpf *nic)
+{
+	return nic->pdev->revision < 8;
+}
+
 /* Supported devices */
 static const struct pci_device_id nic_id_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDER_NIC_PF) },
@@ -117,7 +121,7 @@ static void nic_send_msg_to_vf(struct nicpf *nic, int vf, union nic_mbx *mbx)
 	 * when PF writes to MBOX(1), in next revisions when
 	 * PF writes to MBOX(0)
 	 */
-	if (nic->rev_id == 0) {
+	if (pass1_silicon(nic)) {
 		/* see the comment for nic_reg_write()/nic_reg_read()
 		 * functions above
 		 */
@@ -998,8 +1002,6 @@ static int nic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_release_regions;
 	}
 
-	pci_read_config_byte(pdev, PCI_REVISION_ID, &nic->rev_id);
-
 	nic->node = nic_get_node_id(pdev);
 
 	nic_set_lmac_vf_mapping(nic);

From 34411b68b132e403ddf395419e986475a9993d9b Mon Sep 17 00:00:00 2001
From: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com>
Date: Fri, 23 Oct 2015 17:14:10 -0700
Subject: [PATCH 163/199] net: thunderx: Incorporate pass2 silicon CPI index
 configuration changes

Add support for ThunderX pass2 CPI and MPI configuration changes.
MPI_ALG is not enabled i.e MCAM parsing is disabled.

Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../net/ethernet/cavium/thunder/nic_main.c    | 29 +++++++++++++++----
 drivers/net/ethernet/cavium/thunder/nic_reg.h |  4 +++
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 52e1acb69562c..c561fdcb79a73 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -43,6 +43,7 @@ struct nicpf {
 	u8			duplex[MAX_LMAC];
 	u32			speed[MAX_LMAC];
 	u16			cpi_base[MAX_NUM_VFS_SUPPORTED];
+	u16			rssi_base[MAX_NUM_VFS_SUPPORTED];
 	u16			rss_ind_tbl_size;
 	bool			mbx_lock[MAX_NUM_VFS_SUPPORTED];
 
@@ -396,8 +397,18 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
 			padd = cpi % 8; /* 3 bits CS out of 6bits DSCP */
 
 		/* Leave RSS_SIZE as '0' to disable RSS */
-		nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
-			      (vnic << 24) | (padd << 16) | (rssi_base + rssi));
+		if (pass1_silicon(nic)) {
+			nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
+				      (vnic << 24) | (padd << 16) |
+				      (rssi_base + rssi));
+		} else {
+			/* Set MPI_ALG to '0' to disable MCAM parsing */
+			nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
+				      (padd << 16));
+			/* MPI index is same as CPI if MPI_ALG is not enabled */
+			nic_reg_write(nic, NIC_PF_MPI_0_2047_CFG | (cpi << 3),
+				      (vnic << 24) | (rssi_base + rssi));
+		}
 
 		if ((rssi + 1) >= cfg->rq_cnt)
 			continue;
@@ -410,6 +421,7 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
 			rssi = ((cpi - cpi_base) & 0x38) >> 3;
 	}
 	nic->cpi_base[cfg->vf_id] = cpi_base;
+	nic->rssi_base[cfg->vf_id] = rssi_base;
 }
 
 /* Responsds to VF with its RSS indirection table size */
@@ -435,10 +447,9 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
 {
 	u8  qset, idx = 0;
 	u64 cpi_cfg, cpi_base, rssi_base, rssi;
+	u64 idx_addr;
 
-	cpi_base = nic->cpi_base[cfg->vf_id];
-	cpi_cfg = nic_reg_read(nic, NIC_PF_CPI_0_2047_CFG | (cpi_base << 3));
-	rssi_base = (cpi_cfg & 0x0FFF) + cfg->tbl_offset;
+	rssi_base = nic->rssi_base[cfg->vf_id] + cfg->tbl_offset;
 
 	rssi = rssi_base;
 	qset = cfg->vf_id;
@@ -455,9 +466,15 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
 		idx++;
 	}
 
+	cpi_base = nic->cpi_base[cfg->vf_id];
+	if (pass1_silicon(nic))
+		idx_addr = NIC_PF_CPI_0_2047_CFG;
+	else
+		idx_addr = NIC_PF_MPI_0_2047_CFG;
+	cpi_cfg = nic_reg_read(nic, idx_addr | (cpi_base << 3));
 	cpi_cfg &= ~(0xFULL << 20);
 	cpi_cfg |= (cfg->hash_bits << 20);
-	nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi_base << 3), cpi_cfg);
+	nic_reg_write(nic, idx_addr | (cpi_base << 3), cpi_cfg);
 }
 
 /* 4 level transmit side scheduler configutation
diff --git a/drivers/net/ethernet/cavium/thunder/nic_reg.h b/drivers/net/ethernet/cavium/thunder/nic_reg.h
index 58197bb2f8052..dd536be201931 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_reg.h
+++ b/drivers/net/ethernet/cavium/thunder/nic_reg.h
@@ -85,7 +85,11 @@
 #define   NIC_PF_ECC3_DBE_INT_W1S		(0x2708)
 #define   NIC_PF_ECC3_DBE_ENA_W1C		(0x2710)
 #define   NIC_PF_ECC3_DBE_ENA_W1S		(0x2718)
+#define   NIC_PF_MCAM_0_191_ENA			(0x100000)
+#define   NIC_PF_MCAM_0_191_M_0_5_DATA		(0x110000)
+#define   NIC_PF_MCAM_CTRL			(0x120000)
 #define   NIC_PF_CPI_0_2047_CFG			(0x200000)
+#define   NIC_PF_MPI_0_2047_CFG			(0x210000)
 #define   NIC_PF_RSSI_0_4097_RQ			(0x220000)
 #define   NIC_PF_LMAC_0_7_CFG			(0x240000)
 #define   NIC_PF_LMAC_0_7_SW_XOFF		(0x242000)

From 5188f7e5a7175975f8b943a4b25e499c98a7b9d6 Mon Sep 17 00:00:00 2001
From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Fri, 23 Oct 2015 11:41:58 +0300
Subject: [PATCH 164/199] gianfar: Remove duplicated argument to bitwise OR

RQFCR_AND is duplicated.
Add missing space as well.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/freescale/gianfar_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
index 6bdc89179b72d..a33e4a8296015 100644
--- a/drivers/net/ethernet/freescale/gianfar_ethtool.c
+++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c
@@ -676,14 +676,14 @@ static void ethflow_to_filer_rules (struct gfar_private *priv, u64 ethflow)
 	u32 fcr = 0x0, fpr = FPR_FILER_MASK;
 
 	if (ethflow & RXH_L2DA) {
-		fcr = RQFCR_PID_DAH |RQFCR_CMP_NOMATCH |
+		fcr = RQFCR_PID_DAH | RQFCR_CMP_NOMATCH |
 		      RQFCR_HASH | RQFCR_AND | RQFCR_HASHTBL_0;
 		priv->ftp_rqfpr[priv->cur_filer_idx] = fpr;
 		priv->ftp_rqfcr[priv->cur_filer_idx] = fcr;
 		gfar_write_filer(priv, priv->cur_filer_idx, fcr, fpr);
 		priv->cur_filer_idx = priv->cur_filer_idx - 1;
 
-		fcr = RQFCR_PID_DAL | RQFCR_AND | RQFCR_CMP_NOMATCH |
+		fcr = RQFCR_PID_DAL | RQFCR_CMP_NOMATCH |
 		      RQFCR_HASH | RQFCR_AND | RQFCR_HASHTBL_0;
 		priv->ftp_rqfpr[priv->cur_filer_idx] = fpr;
 		priv->ftp_rqfcr[priv->cur_filer_idx] = fcr;

From 15bf176db1fb00333af7050c0c699fc7b4e4a960 Mon Sep 17 00:00:00 2001
From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Fri, 23 Oct 2015 11:41:59 +0300
Subject: [PATCH 165/199] gianfar: Don't enable the Filer w/o the Parser

Under one unusual circumstance it's possible to wrongly set
FILREN without enabling PRSDEP as well in the RCTRL register,
against the hardware specifications.  With the default config
this does not happen because the default Rx offloads (Rx csum
and Rx VLAN) properly enable PRSDEP.  But if anyone disables
all these offloads (via ethtool), we get a wrong configuration
were the Rx flow classification and hashing, and other Filer
based features (e.g. wake-on-filer interrupt) won't work.
This patch fixes the issue.
Also, account for Rx FCB insertion which happens every time
PRSDEP is set.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/freescale/gianfar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 710715fcb23de..939ed8fe6318e 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -341,7 +341,7 @@ static void gfar_rx_offload_en(struct gfar_private *priv)
 	if (priv->ndev->features & (NETIF_F_RXCSUM | NETIF_F_HW_VLAN_CTAG_RX))
 		priv->uses_rxfcb = 1;
 
-	if (priv->hwts_rx_en)
+	if (priv->hwts_rx_en || priv->rx_filer_enable)
 		priv->uses_rxfcb = 1;
 }
 
@@ -351,7 +351,7 @@ static void gfar_mac_rx_config(struct gfar_private *priv)
 	u32 rctrl = 0;
 
 	if (priv->rx_filer_enable) {
-		rctrl |= RCTRL_FILREN;
+		rctrl |= RCTRL_FILREN | RCTRL_PRSDEP_INIT;
 		/* Program the RIR0 reg with the required distribution */
 		if (priv->poll_mode == GFAR_SQ_POLLING)
 			gfar_write(&regs->rir0, DEFAULT_2RXQ_RIR0);

From 1de65a5ea32de7b335ab505366d45cefadbbdf71 Mon Sep 17 00:00:00 2001
From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Fri, 23 Oct 2015 11:42:00 +0300
Subject: [PATCH 166/199] gianfar: Fix Rx BSY error handling

The Rx BSY error interrupt indicates that a frame was
received and discarded due to lack of buffers, so it's
a rx ring overflow condition and has nothing to do with
with bad rx packets.  Use the right counter.

BSY conditions happen when the SoC is under performance
stress.  Doing *more* work in stress situations by trying
to schedule NAPI is not a good idea as the stressed system
becomes still more stressed.  The Rx interrupt is already
at work making sure the NAPI is scheduled.
So calling gfar_receive() here does not help.  This issue
was present since day 1.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/freescale/gianfar.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 939ed8fe6318e..ce38d266f931c 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -3462,11 +3462,9 @@ static irqreturn_t gfar_error(int irq, void *grp_id)
 		netif_dbg(priv, tx_err, dev, "Transmit Error\n");
 	}
 	if (events & IEVENT_BSY) {
-		dev->stats.rx_errors++;
+		dev->stats.rx_over_errors++;
 		atomic64_inc(&priv->extra_stats.rx_bsy);
 
-		gfar_receive(irq, grp_id);
-
 		netif_dbg(priv, rx_err, dev, "busy error (rstat: %x)\n",
 			  gfar_read(&regs->rstat));
 	}

From abb1ed7b793fcb10cadb378fe0eeee589b61a9e1 Mon Sep 17 00:00:00 2001
From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Fri, 23 Oct 2015 11:42:01 +0300
Subject: [PATCH 167/199] MAINTAINERS: Add entry for gianfar ethernet driver

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb7d2e4af2003..9b43ef23b9850 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4427,6 +4427,14 @@ L:	linuxppc-dev@lists.ozlabs.org
 S:	Maintained
 F:	drivers/net/ethernet/freescale/ucc_geth*
 
+FREESCALE eTSEC ETHERNET DRIVER (GIANFAR)
+M:	Claudiu Manoil <claudiu.manoil@freescale.com>
+L:	netdev@vger.kernel.org
+S:	Maintained
+F:	drivers/net/ethernet/freescale/gianfar*
+X:	drivers/net/ethernet/freescale/gianfar_ptp.c
+F:	Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
+
 FREESCALE QUICC ENGINE UCC UART DRIVER
 M:	Timur Tabi <timur@tabi.org>
 L:	linuxppc-dev@lists.ozlabs.org

From 195562194aad3a0a3915941077f283bcc6347b9b Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@redhat.com>
Date: Mon, 26 Oct 2015 01:50:28 -0700
Subject: [PATCH 168/199] Input: alps - only the Dell Latitude D420/430/620/630
 have separate stick button bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit 92bac83dd79e ("Input: alps - non interleaved V2 dualpoint has
separate stick button bits") assumes that all alps v2 non-interleaved
dual point setups have the separate stick button bits.

Later we limited this to Dell laptops only because of reports that this
broke things on non Dell laptops. Now it turns out that this breaks things
on the Dell Latitude D600 too. So it seems that only the Dell Latitude
D420/430/620/630, which all share the same touchpad / stick combo,
have these separate bits.

This patch limits the checking of the separate bits to only these models
fixing regressions with other models.

Reported-and-tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: stable@vger.kernel.org
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-By: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---
 drivers/input/mouse/alps.c | 48 +++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index 4d246861d692b..41e6cb501e6a1 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -100,7 +100,7 @@ static const struct alps_nibble_commands alps_v6_nibble_commands[] = {
 #define ALPS_FOUR_BUTTONS	0x40	/* 4 direction button present */
 #define ALPS_PS2_INTERLEAVED	0x80	/* 3-byte PS/2 packet interleaved with
 					   6-byte ALPS packet */
-#define ALPS_DELL		0x100	/* device is a Dell laptop */
+#define ALPS_STICK_BITS		0x100	/* separate stick button bits */
 #define ALPS_BUTTONPAD		0x200	/* device is a clickpad */
 
 static const struct alps_model_info alps_model_data[] = {
@@ -159,6 +159,43 @@ static const struct alps_protocol_info alps_v8_protocol_data = {
 	ALPS_PROTO_V8, 0x18, 0x18, 0
 };
 
+/*
+ * Some v2 models report the stick buttons in separate bits
+ */
+static const struct dmi_system_id alps_dmi_has_separate_stick_buttons[] = {
+#if defined(CONFIG_DMI) && defined(CONFIG_X86)
+	{
+		/* Extrapolated from other entries */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude D420"),
+		},
+	},
+	{
+		/* Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl> */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude D430"),
+		},
+	},
+	{
+		/* Reported-by: Hans de Goede <hdegoede@redhat.com> */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude D620"),
+		},
+	},
+	{
+		/* Extrapolated from other entries */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude D630"),
+		},
+	},
+#endif
+	{ }
+};
+
 static void alps_set_abs_params_st(struct alps_data *priv,
 				   struct input_dev *dev1);
 static void alps_set_abs_params_semi_mt(struct alps_data *priv,
@@ -253,9 +290,8 @@ static void alps_process_packet_v1_v2(struct psmouse *psmouse)
 		return;
 	}
 
-	/* Dell non interleaved V2 dualpoint has separate stick button bits */
-	if (priv->proto_version == ALPS_PROTO_V2 &&
-	    priv->flags == (ALPS_DELL | ALPS_PASS | ALPS_DUALPOINT)) {
+	/* Some models have separate stick button bits */
+	if (priv->flags & ALPS_STICK_BITS) {
 		left |= packet[0] & 1;
 		right |= packet[0] & 2;
 		middle |= packet[0] & 4;
@@ -2552,8 +2588,6 @@ static int alps_set_protocol(struct psmouse *psmouse,
 	priv->byte0 = protocol->byte0;
 	priv->mask0 = protocol->mask0;
 	priv->flags = protocol->flags;
-	if (dmi_name_in_vendors("Dell"))
-		priv->flags |= ALPS_DELL;
 
 	priv->x_max = 2000;
 	priv->y_max = 1400;
@@ -2568,6 +2602,8 @@ static int alps_set_protocol(struct psmouse *psmouse,
 		priv->set_abs_params = alps_set_abs_params_st;
 		priv->x_max = 1023;
 		priv->y_max = 767;
+		if (dmi_check_system(alps_dmi_has_separate_stick_buttons))
+			priv->flags |= ALPS_STICK_BITS;
 		break;
 
 	case ALPS_PROTO_V3:

From ab8579169b79c062935dade949287113c7c1ba73 Mon Sep 17 00:00:00 2001
From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Date: Sat, 24 Oct 2015 00:46:03 +0300
Subject: [PATCH 169/199] sh_eth: fix RX buffer size alignment

Both  Renesas R-Car and RZ/A1 manuals state that RX buffer  length must be
a multiple of 32 bytes, while the driver  only uses 16 byte granularity...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/renesas/sh_eth.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 257ea713b4c15..d8334d8a53b38 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1148,8 +1148,8 @@ static void sh_eth_ring_format(struct net_device *ndev)
 
 		/* RX descriptor */
 		rxdesc = &mdp->rx_ring[i];
-		/* The size of the buffer is a multiple of 16 bytes. */
-		rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16);
+		/* The size of the buffer is a multiple of 32 bytes. */
+		rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 32);
 		dma_addr = dma_map_single(&ndev->dev, skb->data,
 					  rxdesc->buffer_length,
 					  DMA_FROM_DEVICE);
@@ -1506,7 +1506,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
 			if (mdp->cd->rpadir)
 				skb_reserve(skb, NET_IP_ALIGN);
 			dma_unmap_single(&ndev->dev, rxdesc->addr,
-					 ALIGN(mdp->rx_buf_sz, 16),
+					 ALIGN(mdp->rx_buf_sz, 32),
 					 DMA_FROM_DEVICE);
 			skb_put(skb, pkt_len);
 			skb->protocol = eth_type_trans(skb, ndev);
@@ -1524,8 +1524,8 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
 	for (; mdp->cur_rx - mdp->dirty_rx > 0; mdp->dirty_rx++) {
 		entry = mdp->dirty_rx % mdp->num_rx_ring;
 		rxdesc = &mdp->rx_ring[entry];
-		/* The size of the buffer is 16 byte boundary. */
-		rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16);
+		/* The size of the buffer is 32 byte boundary. */
+		rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 32);
 
 		if (mdp->rx_skbuff[entry] == NULL) {
 			skb = netdev_alloc_skb(ndev, skbuff_size);

From cb3685958dd4c46d7646d244063ea3ec8adf3618 Mon Sep 17 00:00:00 2001
From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Date: Sat, 24 Oct 2015 00:46:40 +0300
Subject: [PATCH 170/199] sh_eth: fix RX buffer size calculation

The RX buffer size calulation failed to account for the length granularity
(which is now 32 bytes)...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/renesas/sh_eth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index d8334d8a53b38..a484d8beb8557 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1127,7 +1127,7 @@ static void sh_eth_ring_format(struct net_device *ndev)
 	struct sh_eth_txdesc *txdesc = NULL;
 	int rx_ringsize = sizeof(*rxdesc) * mdp->num_rx_ring;
 	int tx_ringsize = sizeof(*txdesc) * mdp->num_tx_ring;
-	int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN - 1;
+	int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN + 32 - 1;
 	dma_addr_t dma_addr;
 
 	mdp->cur_rx = 0;
@@ -1450,7 +1450,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
 	struct sk_buff *skb;
 	u16 pkt_len = 0;
 	u32 desc_status;
-	int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN - 1;
+	int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN + 32 - 1;
 	dma_addr_t dma_addr;
 
 	boguscnt = min(boguscnt, *quota);

From 7e3b6e7423d5f994257c1de88e06b509673fdbcf Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet@google.com>
Date: Sat, 24 Oct 2015 05:47:44 -0700
Subject: [PATCH 171/199] ipv6: gre: support SIT encapsulation

gre_gso_segment() chokes if SIT frames were aggregated by GRO engine.

Fixes: 61c1db7fae21e ("ipv6: sit: add GSO/TSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/gre_offload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index 5aa46d4b44efb..5a8ee32825508 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -36,7 +36,8 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 				  SKB_GSO_TCP_ECN |
 				  SKB_GSO_GRE |
 				  SKB_GSO_GRE_CSUM |
-				  SKB_GSO_IPIP)))
+				  SKB_GSO_IPIP |
+				  SKB_GSO_SIT)))
 		goto out;
 
 	if (!skb->encapsulation)

From 8c387ebbaff8652943a1cbcab496aecadc6a8875 Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:00 +0100
Subject: [PATCH 172/199] net: thunderx: add missing of_node_put

for_each_child_of_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
local idexpression r.n;
expression r,e;
@@

 for_each_child_of_node(r,n) {
   ...
(
   of_node_put(n);
|
   e = n
|
+  of_node_put(n);
?  break;
)
   ...
 }
... when != n
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 574c492789009..180aa9fabf482 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -977,8 +977,10 @@ static int bgx_init_of_phy(struct bgx *bgx)
 		SET_NETDEV_DEV(&bgx->lmac[lmac].netdev, &bgx->pdev->dev);
 		bgx->lmac[lmac].lmacid = lmac;
 		lmac++;
-		if (lmac == MAX_LMAC_PER_BGX)
+		if (lmac == MAX_LMAC_PER_BGX) {
+			of_node_put(np_child);
 			break;
+		}
 	}
 	return 0;
 }

From bd252796852193277a07da505601a2f407c70e0b Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:01 +0100
Subject: [PATCH 173/199] net: netcp: add missing of_node_put

for_each_child_of_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
local idexpression r.n;
expression r,e;
@@

 for_each_child_of_node(r,n) {
   ...
(
   of_node_put(n);
|
   e = n
|
+  of_node_put(n);
?  break;
)
   ...
 }
... when != n
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/ti/netcp_ethss.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index 6bff8d82ceab7..4e70e7586a091 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -2637,8 +2637,10 @@ static void init_secondary_ports(struct gbe_priv *gbe_dev,
 			mac_phy_link = true;
 
 		slave->open = true;
-		if (gbe_dev->num_slaves >= gbe_dev->max_num_slaves)
+		if (gbe_dev->num_slaves >= gbe_dev->max_num_slaves) {
+			of_node_put(port);
 			break;
+		}
 	}
 
 	/* of_phy_connect() is needed only for MAC-PHY interface */
@@ -3137,8 +3139,10 @@ static int gbe_probe(struct netcp_device *netcp_device, struct device *dev,
 			continue;
 		}
 		gbe_dev->num_slaves++;
-		if (gbe_dev->num_slaves >= gbe_dev->max_num_slaves)
+		if (gbe_dev->num_slaves >= gbe_dev->max_num_slaves) {
+			of_node_put(interface);
 			break;
+		}
 	}
 	of_node_put(interfaces);
 

From 447ed7360037b6e38c0206ddcbd04a256ec94099 Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:02 +0100
Subject: [PATCH 174/199] netdev/phy: add missing of_node_put

for_each_available_child_of_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
local idexpression r.n;
expression r,e;
@@

 for_each_available_child_of_node(r,n) {
   ...
(
   of_node_put(n);
|
   e = n
|
+  of_node_put(n);
?  break;
)
   ...
 }
... when != n
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/phy/mdio-mux.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c
index 280c7c311f724..908e8d4863429 100644
--- a/drivers/net/phy/mdio-mux.c
+++ b/drivers/net/phy/mdio-mux.c
@@ -144,6 +144,7 @@ int mdio_mux_init(struct device *dev,
 			dev_err(dev,
 				"Error: Failed to allocate memory for child\n");
 			ret_val = -ENOMEM;
+			of_node_put(child_bus_node);
 			break;
 		}
 		cb->bus_number = v;

From 028623418766ea64f4256035b06ac6cbc0a67892 Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:03 +0100
Subject: [PATCH 175/199] net: phy: mdio: add missing of_node_put

for_each_available_child_of_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
expression root,e;
local idexpression child;
@@

 for_each_available_child_of_node(root, child) {
   ... when != of_node_put(child)
       when != e = child
(
   return child;
|
+  of_node_put(child);
?  return ...;
)
   ...
 }
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/phy/mdio-mux-mmioreg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/phy/mdio-mux-mmioreg.c b/drivers/net/phy/mdio-mux-mmioreg.c
index 2377c1341172f..7fde454fbc4f1 100644
--- a/drivers/net/phy/mdio-mux-mmioreg.c
+++ b/drivers/net/phy/mdio-mux-mmioreg.c
@@ -113,12 +113,14 @@ static int mdio_mux_mmioreg_probe(struct platform_device *pdev)
 		if (!iprop || len != sizeof(uint32_t)) {
 			dev_err(&pdev->dev, "mdio-mux child node %s is "
 				"missing a 'reg' property\n", np2->full_name);
+			of_node_put(np2);
 			return -ENODEV;
 		}
 		if (be32_to_cpup(iprop) & ~s->mask) {
 			dev_err(&pdev->dev, "mdio-mux child node %s has "
 				"a 'reg' value with unmasked bits\n",
 				np2->full_name);
+			of_node_put(np2);
 			return -ENODEV;
 		}
 	}

From 81a577034b000964ca791281a975f0ba9a9d7eed Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:06 +0100
Subject: [PATCH 176/199] ath6kl: add missing of_node_put

for_each_compatible_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
expression e;
local idexpression n;
@@

 for_each_compatible_node(n,...) {
   ... when != of_node_put(n)
       when != e = n
(
   return n;
|
+  of_node_put(n);
?  return ...;
)
   ...
 }
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/wireless/ath/ath6kl/init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath6kl/init.c b/drivers/net/wireless/ath/ath6kl/init.c
index 6e473fa4b13ca..12241b1c57cd2 100644
--- a/drivers/net/wireless/ath/ath6kl/init.c
+++ b/drivers/net/wireless/ath/ath6kl/init.c
@@ -715,6 +715,7 @@ static bool check_device_tree(struct ath6kl *ar)
 				   board_filename, ret);
 			continue;
 		}
+		of_node_put(node);
 		return true;
 	}
 	return false;

From 26b7974d9ad7f93891ee8c39ee63bd2515da7744 Mon Sep 17 00:00:00 2001
From: Julia Lawall <julia.lawall@lip6.fr>
Date: Sun, 25 Oct 2015 14:57:07 +0100
Subject: [PATCH 177/199] net: mv643xx_eth: add missing of_node_put

for_each_available_child_of_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
expression root,e;
local idexpression child;
@@

 for_each_available_child_of_node(root, child) {
   ... when != of_node_put(child)
       when != e = child
(
   return child;
|
+  of_node_put(child);
?  return ...;
)
   ...
 }
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/marvell/mv643xx_eth.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index e893a35143c52..dfb6d5f79a100 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -2817,8 +2817,10 @@ static int mv643xx_eth_shared_of_probe(struct platform_device *pdev)
 
 	for_each_available_child_of_node(np, pnp) {
 		ret = mv643xx_eth_shared_of_add_port(pdev, pnp);
-		if (ret)
+		if (ret) {
+			of_node_put(pnp);
 			return ret;
+		}
 	}
 	return 0;
 }

From 174fd8d369613c4e06660f3704caaba48dac8554 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 22 Oct 2015 09:27:12 +0900
Subject: [PATCH 178/199] blkcg: fix incorrect read/write sync/async stat
 accounting

While unifying how blkcg stats are collected, 77ea733884eb ("blkcg:
move io_service_bytes and io_serviced stats into blkcg_gq")
incorrectly used bio->flags instead of bio->rw to tell the IO type.
This made IOs to be accounted as the wrong type.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 77ea733884eb ("blkcg: move io_service_bytes and io_serviced stats into blkcg_gq")
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 include/linux/blk-cgroup.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 0a5cc7a1109b9..c02e669945e92 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -713,9 +713,9 @@ static inline bool blkcg_bio_issue_check(struct request_queue *q,
 
 	if (!throtl) {
 		blkg = blkg ?: q->root_blkg;
-		blkg_rwstat_add(&blkg->stat_bytes, bio->bi_flags,
+		blkg_rwstat_add(&blkg->stat_bytes, bio->bi_rw,
 				bio->bi_iter.bi_size);
-		blkg_rwstat_add(&blkg->stat_ios, bio->bi_flags, 1);
+		blkg_rwstat_add(&blkg->stat_ios, bio->bi_rw, 1);
 	}
 
 	rcu_read_unlock();

From a22c4d7e34402ccdf3414f64c50365436eba7b93 Mon Sep 17 00:00:00 2001
From: Ming Lin <ming.l@ssi.samsung.com>
Date: Thu, 22 Oct 2015 09:59:42 -0700
Subject: [PATCH 179/199] block: re-add discard_granularity and alignment
 checks

In commit b49a087("block: remove split code in
blkdev_issue_{discard,write_same}"), discard_granularity and alignment
checks were removed. Ideally, with bio late splitting, the upper layers
shouldn't need to depend on device's limits.

Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
device when mkfs.xfs. We have not found the root cause yet.

This patch re-adds discard_granularity and alignment checks by reverting
the related changes in commit b49a087. The good thing is now we can
remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
overflow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/blk-lib.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index bd40292e50096..9ebf65379556a 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -26,13 +26,6 @@ static void bio_batch_end_io(struct bio *bio)
 	bio_put(bio);
 }
 
-/*
- * Ensure that max discard sectors doesn't overflow bi_size and hopefully
- * it is of the proper granularity as long as the granularity is a power
- * of two.
- */
-#define MAX_BIO_SECTORS ((1U << 31) >> 9)
-
 /**
  * blkdev_issue_discard - queue a discard
  * @bdev:	blockdev to issue discard for
@@ -50,6 +43,8 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	DECLARE_COMPLETION_ONSTACK(wait);
 	struct request_queue *q = bdev_get_queue(bdev);
 	int type = REQ_WRITE | REQ_DISCARD;
+	unsigned int granularity;
+	int alignment;
 	struct bio_batch bb;
 	struct bio *bio;
 	int ret = 0;
@@ -61,6 +56,10 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	if (!blk_queue_discard(q))
 		return -EOPNOTSUPP;
 
+	/* Zero-sector (unknown) and one-sector granularities are the same.  */
+	granularity = max(q->limits.discard_granularity >> 9, 1U);
+	alignment = (bdev_discard_alignment(bdev) >> 9) % granularity;
+
 	if (flags & BLKDEV_DISCARD_SECURE) {
 		if (!blk_queue_secdiscard(q))
 			return -EOPNOTSUPP;
@@ -74,7 +73,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	blk_start_plug(&plug);
 	while (nr_sects) {
 		unsigned int req_sects;
-		sector_t end_sect;
+		sector_t end_sect, tmp;
 
 		bio = bio_alloc(gfp_mask, 1);
 		if (!bio) {
@@ -82,8 +81,22 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 			break;
 		}
 
-		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
+		/* Make sure bi_size doesn't overflow */
+		req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
+
+		/*
+		 * If splitting a request, and the next starting sector would be
+		 * misaligned, stop the discard at the previous aligned sector.
+		 */
 		end_sect = sector + req_sects;
+		tmp = end_sect;
+		if (req_sects < nr_sects &&
+		    sector_div(tmp, granularity) != alignment) {
+			end_sect = end_sect - alignment;
+			sector_div(end_sect, granularity);
+			end_sect = end_sect * granularity + alignment;
+			req_sects = end_sect - sector;
+		}
 
 		bio->bi_iter.bi_sector = sector;
 		bio->bi_end_io = bio_batch_end_io;

From c2229fe1430d4e1c70e36520229dd64a87802b20 Mon Sep 17 00:00:00 2001
From: Alexander Duyck <aduyck@mirantis.com>
Date: Tue, 27 Oct 2015 15:06:45 -0700
Subject: [PATCH 180/199] fib_trie: leaf_walk_rcu should not compute key if key
 is less than pn->key

We were computing the child index in cases where the key value we were
looking for was actually less than the base key of the tnode.  As a result
we were getting incorrect index values that would cause us to skip over
some children.

To fix this I have added a test that will force us to use child index 0 if
the key we are looking for is less than the key of the current tnode.

Fixes: 8be33e955cb9 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Brian Rak <brak@gameservers.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/fib_trie.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 6c2af797f2f92..744e5936c10d7 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1569,7 +1569,7 @@ static struct key_vector *leaf_walk_rcu(struct key_vector **tn, t_key key)
 	do {
 		/* record parent and next child index */
 		pn = n;
-		cindex = key ? get_index(key, pn) : 0;
+		cindex = (key > pn->key) ? get_index(key, pn) : 0;
 
 		if (cindex >> pn->bits)
 			break;

From 74c16618137f1505b0a32dea3ec73a2ef6f8f842 Mon Sep 17 00:00:00 2001
From: Joe Stringer <joestringer@nicira.com>
Date: Sun, 25 Oct 2015 20:21:48 -0700
Subject: [PATCH 181/199] openvswitch: Fix double-free on ip_defrag() errors

If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
freed. When handle_fragments() passes this back up to
do_execute_actions(), it will be freed again. Prevent this double free
by never freeing the skb in do_execute_actions() for errors returned by
ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/openvswitch/actions.c   |  4 ++--
 net/openvswitch/conntrack.c | 17 +++++++++++++----
 net/openvswitch/conntrack.h |  1 +
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 0bf0f406de523..dba635d086b2f 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1109,8 +1109,8 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 					     nla_data(a));
 
 			/* Hide stolen IP fragments from user space. */
-			if (err == -EINPROGRESS)
-				return 0;
+			if (err)
+				return err == -EINPROGRESS ? 0 : err;
 			break;
 		}
 
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index a5ec34f8502f0..b5dcc0abde660 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -293,6 +293,9 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
 	return helper->help(skb, protoff, ct, ctinfo);
 }
 
+/* Returns 0 on success, -EINPROGRESS if 'skb' is stolen, or other nonzero
+ * value if 'skb' is freed.
+ */
 static int handle_fragments(struct net *net, struct sw_flow_key *key,
 			    u16 zone, struct sk_buff *skb)
 {
@@ -308,8 +311,8 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key,
 			return err;
 
 		ovs_cb.mru = IPCB(skb)->frag_max_size;
-	} else if (key->eth.type == htons(ETH_P_IPV6)) {
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
+	} else if (key->eth.type == htons(ETH_P_IPV6)) {
 		enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;
 		struct sk_buff *reasm;
 
@@ -318,17 +321,18 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key,
 		if (!reasm)
 			return -EINPROGRESS;
 
-		if (skb == reasm)
+		if (skb == reasm) {
+			kfree_skb(skb);
 			return -EINVAL;
+		}
 
 		key->ip.proto = ipv6_hdr(reasm)->nexthdr;
 		skb_morph(skb, reasm);
 		consume_skb(reasm);
 		ovs_cb.mru = IP6CB(skb)->frag_max_size;
-#else
-		return -EPFNOSUPPORT;
 #endif
 	} else {
+		kfree_skb(skb);
 		return -EPFNOSUPPORT;
 	}
 
@@ -473,6 +477,9 @@ static bool labels_nonzero(const struct ovs_key_ct_labels *labels)
 	return false;
 }
 
+/* Returns 0 on success, -EINPROGRESS if 'skb' is stolen, or other nonzero
+ * value if 'skb' is freed.
+ */
 int ovs_ct_execute(struct net *net, struct sk_buff *skb,
 		   struct sw_flow_key *key,
 		   const struct ovs_conntrack_info *info)
@@ -508,6 +515,8 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
 					&info->labels.mask);
 err:
 	skb_push(skb, nh_ofs);
+	if (err)
+		kfree_skb(skb);
 	return err;
 }
 
diff --git a/net/openvswitch/conntrack.h b/net/openvswitch/conntrack.h
index 82e0dfc660280..a7544f405c162 100644
--- a/net/openvswitch/conntrack.h
+++ b/net/openvswitch/conntrack.h
@@ -67,6 +67,7 @@ static inline int ovs_ct_execute(struct net *net, struct sk_buff *skb,
 				 struct sw_flow_key *key,
 				 const struct ovs_conntrack_info *info)
 {
+	kfree_skb(skb);
 	return -ENOTSUPP;
 }
 

From 190b8ffbb700a9aa47acc559779bc79c0cb14766 Mon Sep 17 00:00:00 2001
From: Joe Stringer <joestringer@nicira.com>
Date: Sun, 25 Oct 2015 20:21:49 -0700
Subject: [PATCH 182/199] ipv6: Export nf_ct_frag6_consume_orig()

This is needed in openvswitch to fix an skb leak in the next patch.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 701cd2bae0a92..c7196ad1d69f8 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -646,6 +646,7 @@ void nf_ct_frag6_consume_orig(struct sk_buff *skb)
 		s = s2;
 	}
 }
+EXPORT_SYMBOL_GPL(nf_ct_frag6_consume_orig);
 
 static int nf_ct_net_init(struct net *net)
 {

From 6f5cadee44d83395dcd78d557b577e1021e192e4 Mon Sep 17 00:00:00 2001
From: Joe Stringer <joestringer@nicira.com>
Date: Sun, 25 Oct 2015 20:21:50 -0700
Subject: [PATCH 183/199] openvswitch: Fix skb leak using IPv6 defrag

nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
reassembly is successful, expects the caller to free all of the original
skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
meaning that the original fragments were never freed (with the exception
of the last fragment to arrive).

Fix this by ensuring that all original fragments except for the last
fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
will be morphed into the head, so it must not be freed yet. Furthermore,
retain the ->next pointer for the head after skb_morph().

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/openvswitch/conntrack.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index b5dcc0abde660..50095820edb7b 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -326,8 +326,15 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key,
 			return -EINVAL;
 		}
 
+		/* Don't free 'skb' even though it is one of the original
+		 * fragments, as we're going to morph it into the head.
+		 */
+		skb_get(skb);
+		nf_ct_frag6_consume_orig(reasm);
+
 		key->ip.proto = ipv6_hdr(reasm)->nexthdr;
 		skb_morph(skb, reasm);
+		skb->next = reasm->next;
 		consume_skb(reasm);
 		ovs_cb.mru = IP6CB(skb)->frag_max_size;
 #endif

From 0b7c874348ea14ec3c358fe95e56d6f830540248 Mon Sep 17 00:00:00 2001
From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 26 Oct 2015 12:24:22 -0400
Subject: [PATCH 184/199] forcedeth: fix unilateral interrupt disabling in
 netpoll path

Forcedeth currently uses disable_irq_lockdep and enable_irq_lockdep, which in
some configurations simply calls local_irq_disable.  This causes errant warnings
in the netpoll path as in netpoll_send_skb_on_dev, where we disable irqs using
local_irq_save, leading to the following warning:

WARNING: at net/core/netpoll.c:352 netpoll_send_skb_on_dev+0x243/0x250() (Not
tainted)
Hardware name:
netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll
(nv_start_xmit_optimized+0x0/0x860 [forcedeth])
Modules linked in: netconsole(+) configfs ipv6 iptable_filter ip_tables ppdev
parport_pc parport sg microcode serio_raw edac_core edac_mce_amd k8temp
snd_hda_codec_realtek snd_hda_codec_generic forcedeth snd_hda_intel
snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc i2c_nforce2 i2c_core shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod
crc_t10dif pata_amd ata_generic pata_acpi sata_nv dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 1940, comm: modprobe Not tainted 2.6.32-573.7.1.el6.x86_64.debug #1
Call Trace:
 [<ffffffff8107bbc1>] ? warn_slowpath_common+0x91/0xe0
 [<ffffffff8107bcc6>] ? warn_slowpath_fmt+0x46/0x60
 [<ffffffffa00fe5b0>] ? nv_start_xmit_optimized+0x0/0x860 [forcedeth]
 [<ffffffff814b3593>] ? netpoll_send_skb_on_dev+0x243/0x250
 [<ffffffff814b37c9>] ? netpoll_send_udp+0x229/0x270
 [<ffffffffa02e3299>] ? write_msg+0x39/0x110 [netconsole]
 [<ffffffffa02e331b>] ? write_msg+0xbb/0x110 [netconsole]
 [<ffffffff8107bd55>] ? __call_console_drivers+0x75/0x90
 [<ffffffff8107bdba>] ? _call_console_drivers+0x4a/0x80
 [<ffffffff8107c445>] ? release_console_sem+0xe5/0x250
 [<ffffffff8107d200>] ? register_console+0x190/0x3e0
 [<ffffffffa02e71a6>] ? init_netconsole+0x1a6/0x216 [netconsole]
 [<ffffffffa02e7000>] ? init_netconsole+0x0/0x216 [netconsole]
 [<ffffffff810020d0>] ? do_one_initcall+0xc0/0x280
 [<ffffffff810d4933>] ? sys_init_module+0xe3/0x260
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
---[ end trace f349c7af88e6a6d5 ]---
console [netcon0] enabled
netconsole: network logging started

Fix it by modifying the forcedeth code to use
disable_irq_nosync_lockdep_irqsavedisable_irq_nosync_lockdep_irqsave instead,
which saves and restores irq state properly.  This also saves us a little code
in the process

Tested by the reporter, with successful restuls

Patch applies to the head of the net tree

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/nvidia/forcedeth.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index a41bb5e6b954f..75e88f4c15315 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -4076,6 +4076,8 @@ static void nv_do_nic_poll(unsigned long data)
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
 	u32 mask = 0;
+	unsigned long flags;
+	unsigned int irq = 0;
 
 	/*
 	 * First disable irq(s) and then
@@ -4085,25 +4087,27 @@ static void nv_do_nic_poll(unsigned long data)
 
 	if (!using_multi_irqs(dev)) {
 		if (np->msi_flags & NV_MSI_X_ENABLED)
-			disable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector);
+			irq = np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector;
 		else
-			disable_irq_lockdep(np->pci_dev->irq);
+			irq = np->pci_dev->irq;
 		mask = np->irqmask;
 	} else {
 		if (np->nic_poll_irq & NVREG_IRQ_RX_ALL) {
-			disable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_RX].vector);
+			irq = np->msi_x_entry[NV_MSI_X_VECTOR_RX].vector;
 			mask |= NVREG_IRQ_RX_ALL;
 		}
 		if (np->nic_poll_irq & NVREG_IRQ_TX_ALL) {
-			disable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_TX].vector);
+			irq = np->msi_x_entry[NV_MSI_X_VECTOR_TX].vector;
 			mask |= NVREG_IRQ_TX_ALL;
 		}
 		if (np->nic_poll_irq & NVREG_IRQ_OTHER) {
-			disable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_OTHER].vector);
+			irq = np->msi_x_entry[NV_MSI_X_VECTOR_OTHER].vector;
 			mask |= NVREG_IRQ_OTHER;
 		}
 	}
-	/* disable_irq() contains synchronize_irq, thus no irq handler can run now */
+
+	disable_irq_nosync_lockdep_irqsave(irq, &flags);
+	synchronize_irq(irq);
 
 	if (np->recover_error) {
 		np->recover_error = 0;
@@ -4156,28 +4160,22 @@ static void nv_do_nic_poll(unsigned long data)
 			nv_nic_irq_optimized(0, dev);
 		else
 			nv_nic_irq(0, dev);
-		if (np->msi_flags & NV_MSI_X_ENABLED)
-			enable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector);
-		else
-			enable_irq_lockdep(np->pci_dev->irq);
 	} else {
 		if (np->nic_poll_irq & NVREG_IRQ_RX_ALL) {
 			np->nic_poll_irq &= ~NVREG_IRQ_RX_ALL;
 			nv_nic_irq_rx(0, dev);
-			enable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_RX].vector);
 		}
 		if (np->nic_poll_irq & NVREG_IRQ_TX_ALL) {
 			np->nic_poll_irq &= ~NVREG_IRQ_TX_ALL;
 			nv_nic_irq_tx(0, dev);
-			enable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_TX].vector);
 		}
 		if (np->nic_poll_irq & NVREG_IRQ_OTHER) {
 			np->nic_poll_irq &= ~NVREG_IRQ_OTHER;
 			nv_nic_irq_other(0, dev);
-			enable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_OTHER].vector);
 		}
 	}
 
+	enable_irq_lockdep_irqrestore(irq, &flags);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER

From 8ce675ff39b9958d1c10f86cf58e357efaafc856 Mon Sep 17 00:00:00 2001
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Mon, 26 Oct 2015 12:46:37 -0400
Subject: [PATCH 185/199] RDS-TCP: Recover correctly from
 pskb_pull()/pksb_trim() failure in rds_tcp_data_recv

Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
If rds_tcp_data_recv() ignores such failures, the application will
receive corrupted data because the skb has not been correctly
carved to the RDS datagram size.

Avoid this by handling pskb_pull/pskb_trim failure in the same
manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
retry via the deferred call to rds_send_worker() that gets set up on
ENOMEM from rds_tcp_read_sock()

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/rds/tcp_recv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/rds/tcp_recv.c b/net/rds/tcp_recv.c
index fbc5ef88bc0e6..27a992154804c 100644
--- a/net/rds/tcp_recv.c
+++ b/net/rds/tcp_recv.c
@@ -214,8 +214,15 @@ static int rds_tcp_data_recv(read_descriptor_t *desc, struct sk_buff *skb,
 			}
 
 			to_copy = min(tc->t_tinc_data_rem, left);
-			pskb_pull(clone, offset);
-			pskb_trim(clone, to_copy);
+			if (!pskb_pull(clone, offset) ||
+			    pskb_trim(clone, to_copy)) {
+				pr_warn("rds_tcp_data_recv: pull/trim failed "
+					"left %zu data_rem %zu skb_len %d\n",
+					left, tc->t_tinc_data_rem, skb->len);
+				kfree_skb(clone);
+				desc->error = -ENOMEM;
+				goto out;
+			}
 			skb_queue_tail(&tinc->ti_skb_list, clone);
 
 			rdsdebug("skb %p data %p len %d off %u to_copy %zu -> "

From 20986ed826cbb36bb8f2d77f872e3c52d8d30647 Mon Sep 17 00:00:00 2001
From: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>
Date: Mon, 26 Oct 2015 17:13:54 -0500
Subject: [PATCH 186/199] amd-xgbe: Fix race between access of desc and desc
 index

During Tx cleanup it's still possible for the descriptor data to be
read ahead of the descriptor index. A memory barrier is required between
the read of the descriptor index and the start of the Tx cleanup loop.
This allows a change to a lighter-weight barrier in the Tx transmit
routine just before updating the current descriptor index.

Since the memory barrier does result in extra overhead on arm64, keep
the previous change to not chase the current descriptor value. This
prevents the execution of the barrier for each loop performed.

Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c | 2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
index e9ab8b9f3b9cb..f672dba345f7f 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel *channel)
 				  packet->rdesc_count, 1);
 
 	/* Make sure ownership is written to the descriptor */
-	wmb();
+	smp_wmb();
 
 	ring->cur = cur_index + 1;
 	if (!packet->skb->xmit_more ||
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index d2b77d985441f..dde0486667e0c 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1816,6 +1816,10 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
 		return 0;
 
 	cur = ring->cur;
+
+	/* Be sure we get ring->cur before accessing descriptor data */
+	smp_rmb();
+
 	txq = netdev_get_tx_queue(netdev, channel->queue_index);
 
 	while ((processed < XGBE_TX_DESC_MAX_PROC) &&

From 85ff8a43f39fa6d2f970b5e1e5c03df87abde242 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang.shi@linaro.org>
Date: Mon, 26 Oct 2015 17:02:19 -0700
Subject: [PATCH 187/199] bpf: sample: define aarch64 specific registers

Define aarch64 specific registers for building bpf samples correctly.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 samples/bpf/bpf_helpers.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 3a44d3a272af4..af44e564d6ddc 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -86,5 +86,17 @@ static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flag
 #define PT_REGS_RC(x) ((x)->gprs[2])
 #define PT_REGS_SP(x) ((x)->gprs[15])
 
+#elif defined(__aarch64__)
+
+#define PT_REGS_PARM1(x) ((x)->regs[0])
+#define PT_REGS_PARM2(x) ((x)->regs[1])
+#define PT_REGS_PARM3(x) ((x)->regs[2])
+#define PT_REGS_PARM4(x) ((x)->regs[3])
+#define PT_REGS_PARM5(x) ((x)->regs[4])
+#define PT_REGS_RET(x) ((x)->regs[30])
+#define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_RC(x) ((x)->regs[0])
+#define PT_REGS_SP(x) ((x)->sp)
+
 #endif
 #endif

From e407f39afdc0741dcf20aed100b8e738ccab7cb1 Mon Sep 17 00:00:00 2001
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 27 Oct 2015 11:37:39 +0200
Subject: [PATCH 188/199] vhost: fix performance on LE hosts

commit 2751c9882b947292fcfb084c4f604e01724af804 ("vhost: cross-endian
support for legacy devices") introduced a minor regression: even with
cross-endian disabled, and even on LE host, vhost_is_little_endian is
checking is_le flag so there's always a branch.

To fix, simply check virtio_legacy_is_little_endian first.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/vhost/vhost.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4772862b71a74..d3f767448a72c 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -183,10 +183,17 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
 	return vq->acked_features & (1ULL << bit);
 }
 
+#ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
 static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq)
 {
 	return vq->is_le;
 }
+#else
+static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq)
+{
+	return virtio_legacy_is_little_endian() || vq->is_le;
+}
+#endif
 
 /* Memory accessors */
 static inline u16 vhost16_to_cpu(struct vhost_virtqueue *vq, __virtio16 val)

From 092bf0fc80f5fb7928244ad63d8a2a8df8a72a3e Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Date: Tue, 27 Oct 2015 17:36:19 +0200
Subject: [PATCH 189/199] net/mlx4_en: Explicitly set no vlan tags in WQE ctrl
 segment when no vlan is present

We do not set the ins_vlan field to zero when no vlan id is present in the packet.

Since WQEs in the TX ring are not zeroed out between uses, this oversight
could result in having vlan flags present in the WQE ctrl segment when no
vlan is preset.

Fixes: e38af4faf01d ('net/mlx4_en: Add support for hardware accelerated 802.1ad vlan')
Reported-by: Gideon Naim <gideonn@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 494e7762fdb19..4421bf5463f67 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -964,6 +964,8 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 			tx_desc->ctrl.ins_vlan = MLX4_WQE_CTRL_INS_SVLAN;
 		else if (vlan_proto == ETH_P_8021Q)
 			tx_desc->ctrl.ins_vlan = MLX4_WQE_CTRL_INS_CVLAN;
+		else
+			tx_desc->ctrl.ins_vlan = 0;
 
 		tx_desc->ctrl.fence_size = real_size;
 

From c02b05011fadf8e409e41910217ca689f2fc9d91 Mon Sep 17 00:00:00 2001
From: Carol L Soto <clsoto@linux.vnet.ibm.com>
Date: Tue, 27 Oct 2015 17:36:20 +0200
Subject: [PATCH 190/199] net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes

When doing memcpy/memset of EQEs, we should use sizeof struct
mlx4_eqe as the base size and not caps.eqe_size which could be bigger.

If caps.eqe_size is bigger than the struct mlx4_eqe then we corrupt
data in the master context.

When using a 64 byte stride, the memcpy copied over 63 bytes to the
slave_eq structure.  This resulted in copying over the entire eqe of
interest, including its ownership bit -- and also 31 bytes of garbage
into the next WQE in the slave EQ -- which did NOT include the ownership
bit (and therefore had no impact).

However, once the stride is increased to 128, we are overwriting the
ownership bits of *three* eqes in the slave_eq struct.  This results
in an incorrect ownership bit for those eqes, which causes the eq to
seem to be full. The issue therefore surfaced only once 128-byte EQEs
started being used in SRIOV and (overarchitectures that have 128/256
byte cache-lines such as PPC) - e.g after commit 77507aa249ae
"net/mlx4_core: Enable CQE/EQE stride support".

Fixes: 08ff32352d6f ('mlx4: 64-byte CQE/EQE support')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c | 2 +-
 drivers/net/ethernet/mellanox/mlx4/eq.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 0a3202047569c..2177e56ed0be7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2398,7 +2398,7 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
 			}
 		}
 
-		memset(&priv->mfunc.master.cmd_eqe, 0, dev->caps.eqe_size);
+		memset(&priv->mfunc.master.cmd_eqe, 0, sizeof(struct mlx4_eqe));
 		priv->mfunc.master.cmd_eqe.type = MLX4_EVENT_TYPE_CMD;
 		INIT_WORK(&priv->mfunc.master.comm_work,
 			  mlx4_master_comm_channel);
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index c344884793657..603d1c3d3b2ea 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -196,7 +196,7 @@ static void slave_event(struct mlx4_dev *dev, u8 slave, struct mlx4_eqe *eqe)
 		return;
 	}
 
-	memcpy(s_eqe, eqe, dev->caps.eqe_size - 1);
+	memcpy(s_eqe, eqe, sizeof(struct mlx4_eqe) - 1);
 	s_eqe->slave_id = slave;
 	/* ensure all information is written before setting the ownersip bit */
 	dma_wmb();

From 977bf062bba3eb8d03f66d5b4e227e5d7ebc1e08 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 27 Oct 2015 17:20:05 +0900
Subject: [PATCH 191/199] powerpc/dma: dma_set_coherent_mask() should not be
 GPL only

When turning this from inline to an exported function I was a bit
over-eager and made it GPL only. This prevents the use of pretty much
all non-GPL PCI driver which is a bit over the top. Let's bring it
back in line with other architecture.

Fixes: 817820b0226a ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 59503ed98e5fc..3f1472a78f393 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -303,7 +303,7 @@ int dma_set_coherent_mask(struct device *dev, u64 mask)
 	dev->coherent_dma_mask = mask;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(dma_set_coherent_mask);
+EXPORT_SYMBOL(dma_set_coherent_mask);
 
 #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
 

From 589cb22bbedacf325951014c07a35a2b01ca57f6 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Thu, 15 Oct 2015 13:55:53 +0100
Subject: [PATCH 192/199] arm64: compat: fix stxr failure case in SWP emulation

If the STXR instruction fails in the SWP emulation code, we leave *data
overwritten with the loaded value, therefore corrupting the data written
by a subsequent, successful attempt.

This patch re-jigs the code so that we only write back to *data once we
know that the update has happened.

Cc: <stable@vger.kernel.org>
Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Reported-by: Shengjiu Wang <shengjiu.wang@freescale.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/armv8_deprecated.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index bcee7abac68eb..937f5e58a4d34 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -284,21 +284,23 @@ static void register_insn_emulation_sysctl(struct ctl_table *table)
 	__asm__ __volatile__(					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN,	\
 		    CONFIG_ARM64_PAN)				\
-	"	mov		%w2, %w1\n"			\
-	"0:	ldxr"B"		%w1, [%3]\n"			\
-	"1:	stxr"B"		%w0, %w2, [%3]\n"		\
+	"0:	ldxr"B"		%w2, [%3]\n"			\
+	"1:	stxr"B"		%w0, %w1, [%3]\n"		\
 	"	cbz		%w0, 2f\n"			\
 	"	mov		%w0, %w4\n"			\
+	"	b		3f\n"				\
 	"2:\n"							\
+	"	mov		%w1, %w2\n"			\
+	"3:\n"							\
 	"	.pushsection	 .fixup,\"ax\"\n"		\
 	"	.align		2\n"				\
-	"3:	mov		%w0, %w5\n"			\
-	"	b		2b\n"				\
+	"4:	mov		%w0, %w5\n"			\
+	"	b		3b\n"				\
 	"	.popsection"					\
 	"	.pushsection	 __ex_table,\"a\"\n"		\
 	"	.align		3\n"				\
-	"	.quad		0b, 3b\n"			\
-	"	.quad		1b, 3b\n"			\
+	"	.quad		0b, 4b\n"			\
+	"	.quad		1b, 4b\n"			\
 	"	.popsection\n"					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\

From e13d918a19a7b6cba62b32884f5e336e764c2cc6 Mon Sep 17 00:00:00 2001
From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Date: Tue, 27 Oct 2015 17:29:10 +0000
Subject: [PATCH 193/199] arm64: kernel: fix tcr_el1.t0sz restore on systems
 with extended idmap

Commit dd006da21646 ("arm64: mm: increase VA range of identity map")
introduced a mechanism to extend the virtual memory map range
to support arm64 systems with system RAM located at very high offset,
where the identity mapping used to enable/disable the MMU requires
additional translation levels to map the physical memory at an equal
virtual offset.

The kernel detects at boot time the tcr_el1.t0sz value required by the
identity mapping and sets-up the tcr_el1.t0sz register field accordingly,
any time the identity map is required in the kernel (ie when enabling the
MMU).

After enabling the MMU, in the cold boot path the kernel resets the
tcr_el1.t0sz to its default value (ie the actual configuration value for
the system virtual address space) so that after enabling the MMU the
memory space translated by ttbr0_el1 is restored as expected.

Commit dd006da21646 ("arm64: mm: increase VA range of identity map")
also added code to set-up the tcr_el1.t0sz value when the kernel resumes
from low-power states with the MMU off through cpu_resume() in order to
effectively use the identity mapping to enable the MMU but failed to add
the code required to restore the tcr_el1.t0sz to its default value, when
the core returns to the kernel with the MMU enabled, so that the kernel
might end up running with tcr_el1.t0sz value set-up for the identity
mapping which can be lower than the value required by the actual virtual
address space, resulting in an erroneous set-up.

This patchs adds code in the resume path that restores the tcr_el1.t0sz
default value upon core resume, mirroring this way the cold boot path
behaviour therefore fixing the issue.

Cc: <stable@vger.kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: dd006da21646 ("arm64: mm: increase VA range of identity map")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/suspend.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 8297d502217e1..44ca4143b0132 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -80,17 +80,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	if (ret == 0) {
 		/*
 		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; restore the active_mm mappings in
-		 * TTBR0_EL1 unless the active_mm == &init_mm, in which case
-		 * the thread entered cpu_suspend with TTBR0_EL1 set to
-		 * reserved TTBR0 page tables and should be restored as such.
+		 * idmap to enable the MMU; set the TTBR0 to the reserved
+		 * page tables to prevent speculative TLB allocations, flush
+		 * the local tlb and set the default tcr_el1.t0sz so that
+		 * the TTBR0 address space set-up is properly restored.
+		 * If the current active_mm != &init_mm we entered cpu_suspend
+		 * with mappings in TTBR0 that must be restored, so we switch
+		 * them back to complete the address space configuration
+		 * restoration before returning.
 		 */
-		if (mm == &init_mm)
-			cpu_set_reserved_ttbr0();
-		else
-			cpu_switch_mm(mm->pgd, mm);
-
+		cpu_set_reserved_ttbr0();
 		flush_tlb_all();
+		cpu_set_default_tcr_t0sz();
+
+		if (mm != &init_mm)
+			cpu_switch_mm(mm->pgd, mm);
 
 		/*
 		 * Restore per-cpu offset before any kernel

From 9702970c7bd3e2d6fecb642a190269131d4ac16c Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Wed, 28 Oct 2015 16:56:13 +0000
Subject: [PATCH 194/199] Revert "ARM64: unwind: Fix PC calculation"

This reverts commit e306dfd06fcb44d21c80acb8e5a88d55f3d1cf63.

With this patch applied, we were the only architecture making this sort
of adjustment to the PC calculation in the unwinder. This causes
problems for ftrace, where the PC values are matched against the
contents of the stack frames in the callchain and fail to match any
records after the address adjustment.

Whilst there has been some effort to change ftrace to workaround this,
those patches are not yet ready for mainline and, since we're the odd
architecture in this regard, let's just step in line with other
architectures (like arch/arm/) for now.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/stacktrace.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 407991bf79f51..ccb6078ed9f20 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -48,11 +48,7 @@ int notrace unwind_frame(struct stackframe *frame)
 
 	frame->sp = fp + 0x10;
 	frame->fp = *(unsigned long *)(fp);
-	/*
-	 * -4 here because we care about the PC at time of bl,
-	 * not where the return will go.
-	 */
-	frame->pc = *(unsigned long *)(fp + 8) - 4;
+	frame->pc = *(unsigned long *)(fp + 8);
 
 	return 0;
 }

From d305c4773458fdd6ff9c52bfdea8c67fbd3b2072 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=89meric=20MASCHINO?= <emeric.maschino@gmail.com>
Date: Tue, 22 Sep 2015 23:58:48 +0200
Subject: [PATCH 195/199] [IA64] Wire up kcmp syscall
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

systemd > 218 fails to compile on ia64 with:

     error: ‘__NR_kcmp’ undeclared [1].

I've been told that this is because the kcmp syscall hasn't been wired up
for the ia64 arch [2].

The proposed patch thus wire up the kcmp syscall for the ia64 arch.

[1] https://bugs.gentoo.org/show_bug.cgi?id=560492
[2] https://bugs.gentoo.org/show_bug.cgi?id=560492#c17

Signed-off-by: Émeric MASCHINO <emeric.maschino@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/ia64/include/asm/unistd.h      | 2 +-
 arch/ia64/include/uapi/asm/unistd.h | 1 +
 arch/ia64/kernel/entry.S            | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h
index 99c96a5e6016b..db73390568c8d 100644
--- a/arch/ia64/include/asm/unistd.h
+++ b/arch/ia64/include/asm/unistd.h
@@ -11,7 +11,7 @@
 
 
 
-#define NR_syscalls			321 /* length of syscall table */
+#define NR_syscalls			322 /* length of syscall table */
 
 /*
  * The following defines stop scripts/checksyscalls.sh from complaining about
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 98e94e19a5a08..9038726e7d267 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -334,5 +334,6 @@
 #define __NR_execveat			1342
 #define __NR_userfaultfd		1343
 #define __NR_membarrier			1344
+#define __NR_kcmp			1345
 
 #endif /* _UAPI_ASM_IA64_UNISTD_H */
diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index 37cc7a65cd3ee..dcd97f84d0654 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1770,5 +1770,6 @@ sys_call_table:
 	data8 sys_execveat
 	data8 sys_userfaultfd
 	data8 sys_membarrier
+	data8 sys_kcmp				// 1345
 
 	.org sys_call_table + 8*NR_syscalls	// guard against failures to increase NR_syscalls

From 1e0d69a9cc9172d7896c2113f983a74f6e8ff303 Mon Sep 17 00:00:00 2001
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 28 Oct 2015 13:21:03 +0100
Subject: [PATCH 196/199] Revert "Merge branch 'ipv6-overflow-arith'"

Linus dislikes these changes. To not hold up the net-merge let's revert
it for now and fix the bug like Linus suggested.

This reverts commit ec3661b42257d9a06cf0d318175623ac7a660113, reversing
changes made to c80dbe04612986fd6104b4a1be21681b113b5ac9.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/compiler-gcc.h   |  4 ----
 include/linux/overflow-arith.h | 18 ------------------
 net/ipv6/ip6_output.c          |  6 +-----
 3 files changed, 1 insertion(+), 27 deletions(-)
 delete mode 100644 include/linux/overflow-arith.h

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 82c159e0532ae..dfaa7b3e9ae90 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -237,10 +237,6 @@
 #define KASAN_ABI_VERSION 3
 #endif
 
-#if GCC_VERSION >= 50000
-#define CC_HAVE_BUILTIN_OVERFLOW
-#endif
-
 #endif	/* gcc version >= 40000 specific checks */
 
 #if !defined(__noclone)
diff --git a/include/linux/overflow-arith.h b/include/linux/overflow-arith.h
deleted file mode 100644
index e12ccf854a70f..0000000000000
--- a/include/linux/overflow-arith.h
+++ /dev/null
@@ -1,18 +0,0 @@
-#pragma once
-
-#include <linux/kernel.h>
-
-#ifdef CC_HAVE_BUILTIN_OVERFLOW
-
-#define overflow_usub __builtin_usub_overflow
-
-#else
-
-static inline bool overflow_usub(unsigned int a, unsigned int b,
-				 unsigned int *res)
-{
-	*res = a - b;
-	return *res > a ? true : false;
-}
-
-#endif
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8dddb45c433e5..d03d6da772f3b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -28,7 +28,6 @@
 
 #include <linux/errno.h>
 #include <linux/kernel.h>
-#include <linux/overflow-arith.h>
 #include <linux/string.h>
 #include <linux/socket.h>
 #include <linux/net.h>
@@ -585,10 +584,7 @@ int ip6_fragment(struct sock *sk, struct sk_buff *skb,
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
-
-	if (overflow_usub(mtu, hlen + sizeof(struct frag_hdr), &mtu) ||
-	    mtu <= 7)
-		goto fail_toobig;
+	mtu -= hlen + sizeof(struct frag_hdr);
 
 	frag_id = ipv6_select_ident(net, &ipv6_hdr(skb)->daddr,
 				    &ipv6_hdr(skb)->saddr);

From 89bc7848a91bc99532f5c21b2885472ba710f249 Mon Sep 17 00:00:00 2001
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 28 Oct 2015 13:21:04 +0100
Subject: [PATCH 197/199] ipv6: protect mtu calculation of wrap-around and
 infinite loop by rounding issues

Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.

This is the second version of the patch which doesn't use the
overflow_usub function, which got reverted for now.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv6/ip6_output.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d03d6da772f3b..f84ec4e9b2de7 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -584,6 +584,8 @@ int ip6_fragment(struct sock *sk, struct sk_buff *skb,
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
+	if (mtu < hlen + sizeof(struct frag_hdr) + 8)
+		goto fail_toobig;
 	mtu -= hlen + sizeof(struct frag_hdr);
 
 	frag_id = ipv6_select_ident(net, &ipv6_hdr(skb)->daddr,

From 73effccb9196ccc0241c3fb51dfd8de1d14ae8ed Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date: Thu, 29 Oct 2015 15:07:25 +0100
Subject: [PATCH 198/199] arm64/efi: do not assume DRAM base is aligned to 2 MB

The current arm64 Image relocation code in the UEFI stub assumes that
the dram_base argument it receives is always a multiple of 2 MB. In
reality, it is simply the lowest start address of all RAM entries in
the UEFI memory map, which means it could be any multiple of 4 KB.

Since the arm64 kernel Image needs to reside TEXT_OFFSET bytes beyond
a 2 MB aligned base, or it will fail to boot, make sure we round dram_base
to 2 MB before using it to calculate the relocation address.

Fixes: e38457c361b30c5a ("arm64: efi: prefer AllocatePages() over efi_low_alloc() for vmlinux")
Reported-by: Timur Tabi <timur@codeaurora.org>
Tested-by: Timur Tabi <timur@codeaurora.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/efi-stub.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c
index 816120ece6bce..78dfbd34b6bff 100644
--- a/arch/arm64/kernel/efi-stub.c
+++ b/arch/arm64/kernel/efi-stub.c
@@ -25,10 +25,20 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	unsigned long kernel_size, kernel_memsize = 0;
 	unsigned long nr_pages;
 	void *old_image_addr = (void *)*image_addr;
+	unsigned long preferred_offset;
+
+	/*
+	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
+	 * a 2 MB aligned base, which itself may be lower than dram_base, as
+	 * long as the resulting offset equals or exceeds it.
+	 */
+	preferred_offset = round_down(dram_base, SZ_2M) + TEXT_OFFSET;
+	if (preferred_offset < dram_base)
+		preferred_offset += SZ_2M;
 
 	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != (dram_base + TEXT_OFFSET)) {
+	if (*image_addr != preferred_offset) {
 		kernel_memsize = kernel_size + (_end - _edata);
 
 		/*
@@ -42,7 +52,7 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
-		*image_addr = *reserve_addr = dram_base + TEXT_OFFSET;
+		*image_addr = *reserve_addr = preferred_offset;
 		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
 			   EFI_PAGE_SIZE;
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,

From bae818ee1577c27356093901a0ea48f672eda514 Mon Sep 17 00:00:00 2001
From: Ronny Hegewald <ronny.hegewald@online.de>
Date: Thu, 15 Oct 2015 18:50:46 +0000
Subject: [PATCH 199/199] rbd: require stable pages if message data CRCs are
 enabled

rbd requires stable pages, as it performs a crc of the page data before
they are send to the OSDs.

But since kernel 3.9 (patch 1d1d1a767206fbe5d4c69493b7e6d2a8d08cc0a0
"mm: only enforce stable page writes if the backing device requires
it") it is not assumed anymore that block devices require stable pages.

This patch sets the necessary flag to get stable pages back for rbd.

In a ceph installation that provides multiple ext4 formatted rbd
devices "bad crc" messages appeared regularly (ca 1 message every 1-2
minutes on every OSD that provided the data for the rbd) in the
OSD-logs before this patch. After this patch this messages are pretty
much gone (only ca 1-2 / month / OSD).

Cc: stable@vger.kernel.org # 3.9+, needs backporting
Signed-off-by: Ronny Hegewald <Ronny.Hegewald@online.de>
[idryomov@gmail.com: require stable pages only in crc case, changelog]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
---
 drivers/block/rbd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 6f26cf38c6f9c..128e7df5b8072 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3780,6 +3780,9 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 	blk_queue_max_discard_sectors(q, segment_size / SECTOR_SIZE);
 	q->limits.discard_zeroes_data = 1;
 
+	if (!ceph_test_opt(rbd_dev->rbd_client->client, NOCRC))
+		q->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
+
 	disk->queue = q;
 
 	q->queuedata = rbd_dev;