From 0d3e45bc6507bd1f8728bf586ebd16c2d9e40613 Mon Sep 17 00:00:00 2001
From: Dong Bo
Date: Fri, 26 Jan 2018 11:21:49 +0800
Subject: [PATCH 001/336] libata: Fix compile warning with ATA_DEBUG enabled

This fixes the following compile warning with ATA_DEBUG enabled, which
was detected by Linaro GCC 5.2-2015.11:

    drivers/ata/libata-scsi.c: In function 'ata_scsi_dump_cdb':
    ./include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 6 has type 'u64 {aka long long unsigned int}' [-Wformat=]

tj: Patch hand-applied and description trimmed.

Signed-off-by: Dong Bo
Signed-off-by: Tejun Heo
---
 drivers/ata/libata-scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 66be961c93a4e..d959b154de4f5 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4282,7 +4282,7 @@ static inline void ata_scsi_dump_cdb(struct ata_port *ap,
 #ifdef ATA_DEBUG
 	struct scsi_device *scsidev = cmd->device;
 
-	DPRINTK("CDB (%u:%d,%d,%d) %9ph\n",
+	DPRINTK("CDB (%u:%d,%d,%lld) %9ph\n",
 		ap->print_id,
 		scsidev->channel, scsidev->id, scsidev->lun,
 		cmd->cmnd);

From 3b61e5121d5c4d0ea79fe90ced8df2fe5cb67dc2 Mon Sep 17 00:00:00 2001
From: Stefan Roese
Date: Tue, 30 Jan 2018 11:02:55 +0100
Subject: [PATCH 002/336] ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine()

Exit directly with ENODEV if the AHCI controller is not available
anymore. Otherwise a delay of 500ms for each port is added to the remove
function while trying to issue a command on the non-existent controller.

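As a rough illustration of the check described above (a userspace sketch,
not the driver code): reads from a PCIe device that has gone away
typically come back as all ones, so a register value of 0xffffffff can be
treated as "device gone". The helper name stop_engine_action() and the
CMD_* bit values are hypothetical stand-ins chosen for this sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the port command bits; values are illustrative. */
#define CMD_START   (1u << 0)
#define CMD_LIST_ON (1u << 15)

/*
 * Decide what a stop-engine routine should do, given the value just read
 * from the port command register:
 *   0        engine already idle, nothing to do
 *   -ENODEV  register reads as all ones, device presumed gone
 *   1        engine running, caller should clear CMD_START and wait
 */
static int stop_engine_action(uint32_t cmd)
{
	if ((cmd & (CMD_START | CMD_LIST_ON)) == 0)
		return 0;
	if (cmd == 0xffffffffu)
		return -ENODEV;
	return 1;
}
```

The all-ones test comes after the idle test here only because an all-ones
value can never look idle; the order between the two does not matter.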
Signed-off-by: Stefan Roese
Cc: Tejun Heo
Signed-off-by: Tejun Heo
---
 drivers/ata/libahci.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index a0de7a38430c9..7adcf3caabd00 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -665,6 +665,16 @@ int ahci_stop_engine(struct ata_port *ap)
 	if ((tmp & (PORT_CMD_START | PORT_CMD_LIST_ON)) == 0)
 		return 0;
 
+	/*
+	 * Don't try to issue commands but return with ENODEV if the
+	 * AHCI controller not available anymore (e.g. due to PCIe hot
+	 * unplugging). Otherwise a 500ms delay for each port is added.
+	 */
+	if (tmp == 0xffffffff) {
+		dev_err(ap->host->dev, "AHCI controller unavailable!\n");
+		return -ENODEV;
+	}
+
 	/* setting HBA to idle */
 	tmp &= ~PORT_CMD_START;
 	writel(tmp, port_mmio + PORT_CMD);

From 9f2b51db5b551085e26c8af5fbe484d62b891ec9 Mon Sep 17 00:00:00 2001
From: Baruch Siach
Date: Mon, 5 Feb 2018 13:50:36 +0200
Subject: [PATCH 003/336] ata: libahci: fix comment indentation

Indent the numbered item with one space like all other items in the same
list.

Signed-off-by: Baruch Siach
Signed-off-by: Tejun Heo
---
 drivers/ata/libahci_platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/libahci_platform.c b/drivers/ata/libahci_platform.c
index 341d0ef82cbdd..30cc8f1a31e12 100644
--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -340,7 +340,7 @@ static int ahci_platform_get_regulator(struct ahci_host_priv *hpriv, u32 port,
  * 2) regulator for controlling the targets power (optional)
  * 3) 0 - AHCI_MAX_CLKS clocks, as specified in the devs devicetree node,
  *    or for non devicetree enabled platforms a single clock
- *4) phys (optional)
+ * 4) phys (optional)
  *
  * RETURNS:
  * The allocated ahci_host_priv on success, otherwise an ERR_PTR value

From 058f58e235cbe03e923b30ea7c49995a46a8725f Mon Sep 17 00:00:00 2001
From: Eric Biggers
Date: Sat, 3 Feb 2018 20:30:56 -0800
Subject: [PATCH 004/336] libata: fix length validation of ATAPI-relayed SCSI commands

syzkaller reported a crash in ata_bmdma_fill_sg() when writing to
/dev/sg1. The immediate cause was that the ATA command's scatterlist
was not DMA-mapped, which causes 'pi - 1' to underflow, resulting in a
write to 'qc->ap->bmdma_prd[0xffffffff]'.

Strangely though, the flag ATA_QCFLAG_DMAMAP was set in qc->flags. The
root cause is that when __ata_scsi_queuecmd() is preparing to relay a
SCSI command to an ATAPI device, it doesn't correctly validate the CDB
length before copying it into the 16-byte buffer 'cdb' in 'struct
ata_queued_cmd'. Namely, it validates the fixed CDB length expected
based on the SCSI opcode but not the actual CDB length, which can be
larger due to the use of the SG_NEXT_CMD_LEN ioctl. Since 'flags' is
the next member in ata_queued_cmd, a buffer overflow corrupts it.

Fix it by requiring that the actual CDB length be <= 16 (ATAPI_CDB_LEN).
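
The validation rule can be sketched in isolation as follows. This is a
minimal userspace model under stated assumptions, not the libata code:
struct queued_cmd, cdb_len_ok() and copy_cdb() are hypothetical names,
and the struct only mimics the detail that a flags word follows the
16-byte CDB buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CDB_BUF_LEN 16	/* size of the fixed destination buffer */

/* Hypothetical miniature of a queued command: flags sits right after cdb[]. */
struct queued_cmd {
	unsigned char cdb[CDB_BUF_LEN];
	unsigned long flags;
};

/*
 * expected_len: length implied by the opcode
 * actual_len:   length the caller really supplied
 * dev_cdb_len:  longest CDB the device accepts
 */
static bool cdb_len_ok(size_t expected_len, size_t actual_len,
		       size_t dev_cdb_len)
{
	return expected_len <= actual_len &&
	       expected_len <= dev_cdb_len &&
	       actual_len <= CDB_BUF_LEN;
}

/* Copy only after validation, so cdb[] can never overflow into flags. */
static bool copy_cdb(struct queued_cmd *qc, const unsigned char *cdb,
		     size_t expected_len, size_t actual_len,
		     size_t dev_cdb_len)
{
	if (!cdb_len_ok(expected_len, actual_len, dev_cdb_len))
		return false;
	memset(qc->cdb, 0, sizeof(qc->cdb));
	memcpy(qc->cdb, cdb, actual_len);
	return true;
}
```

With only the first two comparisons, a 17-byte CDB carrying a 12-byte
opcode would pass the check and the copy would spill one byte into
flags; the third comparison is what closes that gap in this model.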

[Really it seems the length should be required to be <= dev->cdb_len,
but the current behavior seems to have been intentionally introduced by
commit 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands
in 16-byte CDBs") to work around a userspace bug in mplayer. Probably
the workaround is no longer needed (mplayer was fixed in 2007), but
continuing to allow lengths up to 16 appears harmless for now.]

Here's a reproducer that works in QEMU when /dev/sg1 refers to the
CD-ROM drive that qemu-system-x86_64 creates by default:

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define SG_NEXT_CMD_LEN 0x2283

    int main()
    {
            char buf[53] = { [36] = 0x7e, [52] = 0x02 };
            int fd = open("/dev/sg1", O_RDWR);

            ioctl(fd, SG_NEXT_CMD_LEN, &(int){ 17 });
            write(fd, buf, sizeof(buf));
    }

The crash was:

    BUG: unable to handle kernel paging request at ffff8cb97db37ffc
    IP: ata_bmdma_fill_sg drivers/ata/libata-sff.c:2623 [inline]
    IP: ata_bmdma_qc_prep+0xa4/0xc0 drivers/ata/libata-sff.c:2727
    PGD fb6c067 P4D fb6c067 PUD 0
    Oops: 0002 [#1] SMP
    CPU: 1 PID: 150 Comm: syz_ata_bmdma_q Not tainted 4.15.0-next-20180202 #99
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    [...]
    Call Trace:
     ata_qc_issue+0x100/0x1d0 drivers/ata/libata-core.c:5421
     ata_scsi_translate+0xc9/0x1a0 drivers/ata/libata-scsi.c:2024
     __ata_scsi_queuecmd drivers/ata/libata-scsi.c:4326 [inline]
     ata_scsi_queuecmd+0x8c/0x210 drivers/ata/libata-scsi.c:4375
     scsi_dispatch_cmd+0xa2/0xe0 drivers/scsi/scsi_lib.c:1727
     scsi_request_fn+0x24c/0x530 drivers/scsi/scsi_lib.c:1865
     __blk_run_queue_uncond block/blk-core.c:412 [inline]
     __blk_run_queue+0x3a/0x60 block/blk-core.c:432
     blk_execute_rq_nowait+0x93/0xc0 block/blk-exec.c:78
     sg_common_write.isra.7+0x272/0x5a0 drivers/scsi/sg.c:806
     sg_write+0x1ef/0x340 drivers/scsi/sg.c:677
     __vfs_write+0x31/0x160 fs/read_write.c:480
     vfs_write+0xa7/0x160 fs/read_write.c:544
     SYSC_write fs/read_write.c:589 [inline]
     SyS_write+0x4d/0xc0 fs/read_write.c:581
     do_syscall_64+0x5e/0x110 arch/x86/entry/common.c:287
     entry_SYSCALL_64_after_hwframe+0x21/0x86

Fixes: 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs")
Reported-by: syzbot+1ff6f9fcc3c35f1c72a95e26528c8e7e3276e4da@syzkaller.appspotmail.com
Cc: # v2.6.24+
Signed-off-by: Eric Biggers
Signed-off-by: Tejun Heo
---
 drivers/ata/libata-scsi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index d959b154de4f5..9ae8986bae486 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4309,7 +4309,9 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd,
 		if (likely((scsi_op != ATA_16) || !atapi_passthru16)) {
 			/* relay SCSI command to ATAPI device */
 			int len = COMMAND_SIZE(scsi_op);
-			if (unlikely(len > scmd->cmd_len || len > dev->cdb_len))
+			if (unlikely(len > scmd->cmd_len ||
+				     len > dev->cdb_len ||
+				     scmd->cmd_len > ATAPI_CDB_LEN))
 				goto bad_cdb_len;
 
 			xlat_func = atapi_xlat;

From 9173e5e80729c8434b8d27531527c5245f4a5594 Mon Sep 17 00:00:00 2001
From: Eric Biggers
Date: Sat, 3 Feb 2018 20:33:27 -0800
Subject: [PATCH 005/336] libata: remove WARN() for DMA or PIO command without data

syzkaller hit a WARN() in ata_qc_issue() when writing to /dev/sg0. This
happened because it issued a READ_6 command with no data buffer.

Just remove the WARN(), as it doesn't appear to indicate a kernel bug.
The expected behavior is to fail the command, which the code does.

Here's a reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):

    #include <fcntl.h>
    #include <unistd.h>

    int main()
    {
            char buf[42] = { [36] = 0x8 /* READ_6 */ };

            write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
    }

Fixes: f92a26365a72 ("libata: change ATA_QCFLAG_DMAMAP semantics")
Reported-by: syzbot+f7b556d1766502a69d85071d2ff08bd87be53d0f@syzkaller.appspotmail.com
Cc: # v2.6.25+
Signed-off-by: Eric Biggers
Signed-off-by: Tejun Heo
---
 drivers/ata/libata-core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 3c09122bf0382..61b09968d0326 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5401,8 +5401,7 @@ void ata_qc_issue(struct ata_queued_cmd *qc)
 	 * We guarantee to LLDs that they will have at least one
 	 * non-zero sg if the command is a data command.
 	 */
-	if (WARN_ON_ONCE(ata_is_data(prot) &&
-			 (!qc->sg || !qc->n_elem || !qc->nbytes)))
+	if (ata_is_data(prot) && (!qc->sg || !qc->n_elem || !qc->nbytes))
 		goto sys_err;
 
 	if (ata_is_dma(prot) || (ata_is_pio(prot) &&

From 2c1ec6fda2d07044cda922ee25337cf5d4b429b3 Mon Sep 17 00:00:00 2001
From: Eric Biggers
Date: Sat, 3 Feb 2018 20:33:51 -0800
Subject: [PATCH 006/336] libata: don't try to pass through NCQ commands to non-NCQ devices

syzkaller hit a WARN() in ata_bmdma_qc_issue() when writing to /dev/sg0.
This happened because it issued an ATA pass-through command (ATA_16)
where the protocol field indicated that NCQ should be used -- but the
device did not support NCQ.

We could just remove the WARN() from libata-sff.c, but the real problem
seems to be that the SCSI -> ATA translation code passes through NCQ
commands without verifying that the device actually supports NCQ.

Fix this by adding the appropriate check to ata_scsi_pass_thru().

Here's a reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):

    #include <fcntl.h>
    #include <unistd.h>

    int main()
    {
            char buf[53] = { 0 };

            buf[36] = 0x85;		/* ATA_16 */
            buf[37] = (12 << 1);	/* FPDMA */
            buf[38] = 0x1;		/* Has data */
            buf[51] = 0xC8;		/* ATA_CMD_READ */

            write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
    }

Fixes: ee7fb331c3ac ("libata: add support for NCQ commands for SG interface")
Reported-by: syzbot+2f69ca28df61bdfc77cd36af2e789850355a221e@syzkaller.appspotmail.com
Cc: # v4.4+
Signed-off-by: Eric Biggers
Signed-off-by: Tejun Heo
---
 drivers/ata/libata-scsi.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 9ae8986bae486..89a9d4a2efc8a 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -3316,6 +3316,12 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
 		goto invalid_fld;
 	}
 
+	/* We may not issue NCQ commands to devices not supporting NCQ */
+	if (ata_is_ncq(tf->protocol) && !ata_ncq_enabled(dev)) {
+		fp = 1;
+		goto invalid_fld;
+	}
+
 	/* sanity check for pio multi commands */
 	if ((cdb[1] & 0xe0) && !is_multi_taskfile(tf)) {
 		fp = 1;

From da77d76b95a0e8940793f4f7fe12a4a2d2048e39 Mon Sep 17 00:00:00 2001
From: Khiem Nguyen
Date: Mon, 5 Feb 2018 04:18:51 +0900
Subject: [PATCH 007/336] sata_rcar: Reset SATA PHY when Salvator-X board resumes

Because power to the Salvator-X board is cut off during suspend, the
SATA PHY state needs to be reset on resume. Otherwise, the SATA
partition cannot be accessed anymore.

Signed-off-by: Khiem Nguyen
Signed-off-by: Hien Dang
[reinit phy in sata_rcar_resume() function on R-Car Gen3 only]
[factor out SATA module init sequence]
[fixed the prefix for the subject]
Signed-off-by: Yoshihiro Kaneko
Signed-off-by: Tejun Heo
---
 drivers/ata/sata_rcar.c | 63 ++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 23 deletions(-)

diff --git a/drivers/ata/sata_rcar.c b/drivers/ata/sata_rcar.c
index 80ee2f2a50d02..6f47ca34767d7 100644
--- a/drivers/ata/sata_rcar.c
+++ b/drivers/ata/sata_rcar.c
@@ -146,6 +146,7 @@
 enum sata_rcar_type {
 	RCAR_GEN1_SATA,
 	RCAR_GEN2_SATA,
+	RCAR_GEN3_SATA,
 	RCAR_R8A7790_ES1_SATA,
 };
 
@@ -784,26 +785,11 @@ static void sata_rcar_setup_port(struct ata_host *host)
 	ioaddr->command_addr	= ioaddr->cmd_addr + (ATA_REG_CMD << 2);
 }
 
-static void sata_rcar_init_controller(struct ata_host *host)
+static void sata_rcar_init_module(struct sata_rcar_priv *priv)
 {
-	struct sata_rcar_priv *priv = host->private_data;
 	void __iomem *base = priv->base;
 	u32 val;
 
-	/* reset and setup phy */
-	switch (priv->type) {
-	case RCAR_GEN1_SATA:
-		sata_rcar_gen1_phy_init(priv);
-		break;
-	case RCAR_GEN2_SATA:
-	case RCAR_R8A7790_ES1_SATA:
-		sata_rcar_gen2_phy_init(priv);
-		break;
-	default:
-		dev_warn(host->dev, "SATA phy is not initialized\n");
-		break;
-	}
-
 	/* SATA-IP reset state */
 	val = ioread32(base + ATAPI_CONTROL1_REG);
 	val |= ATAPI_CONTROL1_RESET;
@@ -824,10 +810,34 @@ static void sata_rcar_init_controller(struct ata_host *host)
 	/* ack and mask */
 	iowrite32(0, base + SATAINTSTAT_REG);
 	iowrite32(0x7ff, base + SATAINTMASK_REG);
+
 	/* enable interrupts */
 	iowrite32(ATAPI_INT_ENABLE_SATAINT, base + ATAPI_INT_ENABLE_REG);
 }
 
+static void sata_rcar_init_controller(struct ata_host *host)
+{
+	struct sata_rcar_priv *priv = host->private_data;
+	void __iomem *base = priv->base;
+
+	/* reset and setup phy */
+	switch (priv->type) {
+	case RCAR_GEN1_SATA:
+		sata_rcar_gen1_phy_init(priv);
+		break;
+	case RCAR_GEN2_SATA:
+	case RCAR_GEN3_SATA:
+	case RCAR_R8A7790_ES1_SATA:
+		sata_rcar_gen2_phy_init(priv);
+		break;
+	default:
+		dev_warn(host->dev, "SATA phy is not initialized\n");
+		break;
+	}
+
+	sata_rcar_init_module(priv);
+}
+
 static const struct of_device_id sata_rcar_match[] = {
 	{
 		/* Deprecated by "renesas,sata-r8a7779" */
@@ -856,7 +866,7 @@ static const struct of_device_id sata_rcar_match[] = {
 	},
 	{
 		.compatible = "renesas,sata-r8a7795",
-		.data = (void *)RCAR_GEN2_SATA
+		.data = (void *)RCAR_GEN3_SATA
 	},
 	{
 		.compatible = "renesas,rcar-gen2-sata",
@@ -864,7 +874,7 @@ static const struct of_device_id sata_rcar_match[] = {
 	},
 	{
 		.compatible = "renesas,rcar-gen3-sata",
-		.data = (void *)RCAR_GEN2_SATA
+		.data = (void *)RCAR_GEN3_SATA
 	},
 	{ },
 };
@@ -982,11 +992,18 @@ static int sata_rcar_resume(struct device *dev)
 	if (ret)
 		return ret;
 
-	/* ack and mask */
-	iowrite32(0, base + SATAINTSTAT_REG);
-	iowrite32(0x7ff, base + SATAINTMASK_REG);
-	/* enable interrupts */
-	iowrite32(ATAPI_INT_ENABLE_SATAINT, base + ATAPI_INT_ENABLE_REG);
+	if (priv->type == RCAR_GEN3_SATA) {
+		sata_rcar_gen2_phy_init(priv);
+		sata_rcar_init_module(priv);
+	} else {
+		/* ack and mask */
+		iowrite32(0, base + SATAINTSTAT_REG);
+		iowrite32(0x7ff, base + SATAINTMASK_REG);
+
+		/* enable interrupts */
+		iowrite32(ATAPI_INT_ENABLE_SATAINT,
+			  base + ATAPI_INT_ENABLE_REG);
+	}
 
 	ata_host_resume(host);
 

From c53593e5cb693d59d9e8b64fb3a79436bf99c3b3 Mon Sep 17 00:00:00 2001
From: Tejun Heo
Date: Mon, 22 Jan 2018 11:26:18 -0800
Subject: [PATCH 008/336] sched, cgroup: Don't reject lower cpu.max on ancestors

While adding cgroup2 interface for the cpu controller, 0d5936344f30
("sched: Implement interface for cgroup unified hierarchy") forgot to
update input validation and left it to reject cpu.max config if any
descendant has set a higher value.

cgroup2 officially supports delegation and a descendant must not be
able to restrict what its ancestors can configure.
For absolute limits such as cpu.max and memory.max, this means that the
config at each level should only act as the upper limit at that level
and shouldn't interfere with what other cgroups can configure.

This patch updates config validation on cgroup2 so that the cpu
controller follows the same convention.

Signed-off-by: Tejun Heo
Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")
Acked-by: Peter Zijlstra (Intel)
Cc: stable@vger.kernel.org # v4.15+
---
 kernel/sched/core.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bf724c1952eac..1bc6a694c84ff 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6678,13 +6678,18 @@ static int tg_cfs_schedulable_down(struct task_group *tg, void *data)
 		parent_quota = parent_b->hierarchical_quota;
 
 		/*
-		 * Ensure max(child_quota) <= parent_quota, inherit when no
+		 * Ensure max(child_quota) <= parent_quota.  On cgroup2,
+		 * always take the min.  On cgroup1, only inherit when no
 		 * limit is set:
 		 */
-		if (quota == RUNTIME_INF)
-			quota = parent_quota;
-		else if (parent_quota != RUNTIME_INF && quota > parent_quota)
-			return -EINVAL;
+		if (cgroup_subsys_on_dfl(cpu_cgrp_subsys)) {
+			quota = min(quota, parent_quota);
+		} else {
+			if (quota == RUNTIME_INF)
+				quota = parent_quota;
+			else if (parent_quota != RUNTIME_INF && quota > parent_quota)
+				return -EINVAL;
+		}
 	}
 	cfs_b->hierarchical_quota = quota;
 

From 685469e5bf9d31ccd9212be86f861a18fc213d05 Mon Sep 17 00:00:00 2001
From: Tejun Heo
Date: Mon, 12 Feb 2018 12:10:41 -0800
Subject: [PATCH 009/336] percpu: add Dennis Zhou as a percpu co-maintainer

Dennis rewrote the percpu area allocator some months ago, understands
most of the code base and has been responsive with the bug reports and
questions. Let's add him as a co-maintainer.

Signed-off-by: Tejun Heo
Acked-by: Christopher Lameter
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bdc260e36b7a..f384e22546d39 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10832,6 +10832,7 @@ F:	drivers/platform/x86/peaq-wmi.c
 PER-CPU MEMORY ALLOCATOR
 M:	Tejun Heo
 M:	Christoph Lameter
+M:	Dennis Zhou
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git
 S:	Maintained
 F:	include/linux/percpu*.h

From c662f77331c98018ed256501557b4dd67133fbd7 Mon Sep 17 00:00:00 2001
From: Paul Mackerras
Date: Tue, 13 Feb 2018 15:16:01 +1100
Subject: [PATCH 010/336] KVM: PPC: Fix compile error that occurs when CONFIG_ALTIVEC=n

Commit accb757d798c ("KVM: Move vcpu_load to arch-specific
kvm_arch_vcpu_ioctl_run", 2017-12-04) added a "goto out"
statement and an "out:" label to kvm_arch_vcpu_ioctl_run().
Since the only "goto out" is inside a CONFIG_VSX block,
compiling with CONFIG_VSX=n gives a warning that label "out"
is defined but not used, and because arch/powerpc is compiled
with -Werror, that becomes a compile error that makes the kernel
build fail.

Merge commit 1ab03c072feb ("Merge tag 'kvm-ppc-next-4.16-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc",
2018-02-09) added a similar block of code inside a #ifdef
CONFIG_ALTIVEC, with a "goto out" statement.

In order to make the build succeed, this adds a #ifdef around the
"out:" label. This is a minimal, ugly fix, to be replaced later
by a refactoring of the code. Since CONFIG_VSX depends on
CONFIG_ALTIVEC, it is sufficient to use #ifdef CONFIG_ALTIVEC here.

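The pattern described above can be reproduced in a few lines of plain C.
This is only an illustrative sketch: FEATURE_X is a hypothetical macro
standing in for the config option, and run() is a made-up function.

```c
/* FEATURE_X is a hypothetical stand-in for the config option. */
static int run(int value)
{
	int r = 0;

#ifdef FEATURE_X
	if (value < 0) {
		r = -1;
		goto out;	/* the only jump to "out" is compiled conditionally */
	}
#endif
	r = value * 2;

#ifdef FEATURE_X
out:	/* so the label is made conditional as well */
#endif
	return r;
}
```

Without the second #ifdef, building with FEATURE_X undefined and with
unused-label warnings enabled would flag "out" as defined but not used,
which is fatal under -Werror.
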
Fixes: accb757d798c ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_run")
Reported-by: Christian Zigotzky
Signed-off-by: Paul Mackerras
---
 arch/powerpc/kvm/powerpc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 403e642c78f51..0083142c2f848 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1608,7 +1608,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 	kvm_sigset_deactivate(vcpu);
 
+#ifdef CONFIG_ALTIVEC
 out:
+#endif
 	vcpu_put(vcpu);
 	return r;
 }

From 6df3877fc962c2bb3d0438633dfd24a185af6838 Mon Sep 17 00:00:00 2001
From: Paul Mackerras
Date: Tue, 13 Feb 2018 15:45:21 +1100
Subject: [PATCH 011/336] KVM: PPC: Book3S: Fix compile error that occurs with some gcc versions

Some versions of gcc generate a warning that the variable "emulated"
may be used uninitialized in function kvmppc_handle_load128_by2x64().
It would be used uninitialized if kvmppc_handle_load128_by2x64 was
ever called with vcpu->arch.mmio_vmx_copy_nums == 0, but neither of
the callers ever do that, so there is no actual bug. When gcc
generates a warning, it causes the build to fail because arch/powerpc
is compiled with -Werror.

This silences the warning by initializing "emulated" to EMULATE_DONE.

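The shape of the problem, reduced to a standalone sketch: a result
variable assigned only inside a loop that may run zero times. The names
handle_one(), handle_all() and enum result are hypothetical and exist
only for this illustration.

```c
enum result { RESULT_DONE, RESULT_FAIL };

/* Hypothetical per-chunk handler: fails on negative input. */
static enum result handle_one(int chunk)
{
	return chunk < 0 ? RESULT_FAIL : RESULT_DONE;
}

static enum result handle_all(const int *chunks, int n)
{
	/* Initialized so that n == 0 still yields a defined value. */
	enum result res = RESULT_DONE;

	while (n) {
		res = handle_one(*chunks++);
		if (res != RESULT_DONE)
			break;
		n--;
	}
	return res;
}
```

Even when every caller passes n > 0, the compiler cannot always prove
it, so the initializer is what keeps the zero-iteration path defined.
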
Fixes: 09f984961c13 ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions")
Reported-by: Michael Ellerman
Signed-off-by: Paul Mackerras
---
 arch/powerpc/kvm/powerpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 0083142c2f848..52c2053739862 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1345,7 +1345,7 @@ static int kvmppc_emulate_mmio_vsx_loadstore(struct kvm_vcpu *vcpu,
 int kvmppc_handle_load128_by2x64(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		unsigned int rt, int is_default_endian)
 {
-	enum emulation_result emulated;
+	enum emulation_result emulated = EMULATE_DONE;
 
 	while (vcpu->arch.mmio_vmx_copy_nums) {
 		emulated = __kvmppc_handle_load(run, vcpu, rt, 8,

From b1c7fe26e0497386d3ae2e2404d1fa7f93895405 Mon Sep 17 00:00:00 2001
From: Aishwarya Pant
Date: Tue, 13 Feb 2018 16:51:59 +0530
Subject: [PATCH 012/336] libata: transport: cleanup documentation of sysfs interface

Clean up the documentation of sysfs interfaces to be in the same format
as described in Documentation/ABI/README. This will be useful for
tracking changes in the ABI. Attributes are grouped by function (device,
link or port) and then by date added.

This patch also adds documentation for one attribute -
/sys/class/ata_port/ataX/port_no

Signed-off-by: Aishwarya Pant
Signed-off-by: Tejun Heo
---
 Documentation/ABI/testing/sysfs-ata | 171 ++++++++++++++++------------
 1 file changed, 100 insertions(+), 71 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-ata b/Documentation/ABI/testing/sysfs-ata
index aa4296498859e..9ab0ef1dd1c72 100644
--- a/Documentation/ABI/testing/sysfs-ata
+++ b/Documentation/ABI/testing/sysfs-ata
@@ -1,110 +1,139 @@
 What:		/sys/class/ata_...
-Date:		August 2008
-Contact:	Gwendal Grignou
 Description:
-
-Provide a place in sysfs for storing the ATA topology of the system.  This allows
-retrieving various information about ATA objects.
+ Provide a place in sysfs for storing the ATA topology of the + system. This allows retrieving various information about ATA + objects. Files under /sys/class/ata_port ------------------------------- - For each port, a directory ataX is created where X is the ata_port_id of - the port. The device parent is the ata host device. +For each port, a directory ataX is created where X is the ata_port_id of the +port. The device parent is the ata host device. -idle_irq (read) - Number of IRQ received by the port while idle [some ata HBA only]. +What: /sys/class/ata_port/ataX/nr_pmp_links +What: /sys/class/ata_port/ataX/idle_irq +Date: May, 2010 +KernelVersion: v2.6.37 +Contact: Gwendal Grignou +Description: + nr_pmp_links: (RO) If a SATA Port Multiplier (PM) is + connected, the number of links behind it. -nr_pmp_links (read) + idle_irq: (RO) Number of IRQ received by the port while + idle [some ata HBA only]. - If a SATA Port Multiplier (PM) is connected, number of link behind it. + +What: /sys/class/ata_port/ataX/port_no +Date: May, 2013 +KernelVersion: v3.11 +Contact: Gwendal Grignou +Description: + (RO) Host local port number. While registering host controller, + port numbers are tracked based upon number of ports available on + the controller. This attribute is needed by udev for composing + persistent links in /dev/disk/by-path. Files under /sys/class/ata_link ------------------------------- - Behind each port, there is a ata_link. If there is a SATA PM in the - topology, 15 ata_link objects are created. - - If a link is behind a port, the directory name is linkX, where X is - ata_port_id of the port. - If a link is behind a PM, its name is linkX.Y where X is ata_port_id - of the parent port and Y the PM port. +Behind each port, there is a ata_link. If there is a SATA PM in the topology, 15 +ata_link objects are created. -hw_sata_spd_limit +If a link is behind a port, the directory name is linkX, where X is ata_port_id +of the port. 
If a link is behind a PM, its name is linkX.Y where X is +ata_port_id of the parent port and Y the PM port. - Maximum speed supported by the connected SATA device. -sata_spd_limit +What: /sys/class/ata_link/linkX[.Y]/hw_sata_spd_limit +What: /sys/class/ata_link/linkX[.Y]/sata_spd_limit +What: /sys/class/ata_link/linkX[.Y]/sata_spd +Date: May, 2010 +KernelVersion: v2.6.37 +Contact: Gwendal Grignou +Description: + hw_sata_spd_limit: (RO) Maximum speed supported by the + connected SATA device. - Maximum speed imposed by libata. + sata_spd_limit: (RO) Maximum speed imposed by libata. -sata_spd + sata_spd: (RO) Current speed of the link + eg. 1.5, 3 Gbps etc. - Current speed of the link [1.5, 3Gps,...]. Files under /sys/class/ata_device --------------------------------- - Behind each link, up to two ata device are created. - The name of the directory is devX[.Y].Z where: - - X is ata_port_id of the port where the device is connected, - - Y the port of the PM if any, and - - Z the device id: for PATA, there is usually 2 devices [0,1], - only 1 for SATA. - -class - Device class. Can be "ata" for disk, "atapi" for packet device, - "pmp" for PM, or "none" if no device was found behind the link. - -dma_mode +Behind each link, up to two ata devices are created. +The name of the directory is devX[.Y].Z where: +- X is ata_port_id of the port where the device is connected, +- Y the port of the PM if any, and +- Z the device id: for PATA, there is usually 2 devices [0,1], only 1 for SATA. 
+ + +What: /sys/class/ata_device/devX[.Y].Z/spdn_cnt +What: /sys/class/ata_device/devX[.Y].Z/gscr +What: /sys/class/ata_device/devX[.Y].Z/ering +What: /sys/class/ata_device/devX[.Y].Z/id +What: /sys/class/ata_device/devX[.Y].Z/pio_mode +What: /sys/class/ata_device/devX[.Y].Z/xfer_mode +What: /sys/class/ata_device/devX[.Y].Z/dma_mode +What: /sys/class/ata_device/devX[.Y].Z/class +Date: May, 2010 +KernelVersion: v2.6.37 +Contact: Gwendal Grignou +Description: + spdn_cnt: (RO) Number of times libata decided to lower the + speed of link due to errors. - Transfer modes supported by the device when in DMA mode. - Mostly used by PATA device. + gscr: (RO) Cached result of the dump of PM GSCR + register. Valid registers are: -pio_mode + 0: SATA_PMP_GSCR_PROD_ID, + 1: SATA_PMP_GSCR_REV, + 2: SATA_PMP_GSCR_PORT_INFO, + 32: SATA_PMP_GSCR_ERROR, + 33: SATA_PMP_GSCR_ERROR_EN, + 64: SATA_PMP_GSCR_FEAT, + 96: SATA_PMP_GSCR_FEAT_EN, + 130: SATA_PMP_GSCR_SII_GPIO - Transfer modes supported by the device when in PIO mode. - Mostly used by PATA device. + Only valid if the device is a PM. -xfer_mode + ering: (RO) Formatted output of the error ring of the + device. - Current transfer mode. + id: (RO) Cached result of IDENTIFY command, as + described in ATA8 7.16 and 7.17. Only valid if + the device is not a PM. -id + pio_mode: (RO) Transfer modes supported by the device when + in PIO mode. Mostly used by PATA device. - Cached result of IDENTIFY command, as described in ATA8 7.16 and 7.17. - Only valid if the device is not a PM. + xfer_mode: (RO) Current transfer mode -gscr + dma_mode: (RO) Transfer modes supported by the device when + in DMA mode. Mostly used by PATA device. - Cached result of the dump of PM GSCR register. 
- Valid registers are: - 0: SATA_PMP_GSCR_PROD_ID, - 1: SATA_PMP_GSCR_REV, - 2: SATA_PMP_GSCR_PORT_INFO, - 32: SATA_PMP_GSCR_ERROR, - 33: SATA_PMP_GSCR_ERROR_EN, - 64: SATA_PMP_GSCR_FEAT, - 96: SATA_PMP_GSCR_FEAT_EN, - 130: SATA_PMP_GSCR_SII_GPIO - Only valid if the device is a PM. + class: (RO) Device class. Can be "ata" for disk, + "atapi" for packet device, "pmp" for PM, or + "none" if no device was found behind the link. -trim - Shows the DSM TRIM mode currently used by the device. Valid - values are: - unsupported: Drive does not support DSM TRIM - unqueued: Drive supports unqueued DSM TRIM only - queued: Drive supports queued DSM TRIM - forced_unqueued: Drive's queued DSM support is known to be - buggy and only unqueued TRIM commands - are sent +What: /sys/class/ata_device/devX[.Y].Z/trim +Date: May, 2015 +KernelVersion: v4.10 +Contact: Gwendal Grignou +Description: + (RO) Shows the DSM TRIM mode currently used by the device. Valid + values are: -spdn_cnt + unsupported: Drive does not support DSM TRIM - Number of time libata decided to lower the speed of link due to errors. + unqueued: Drive supports unqueued DSM TRIM only -ering + queued: Drive supports queued DSM TRIM - Formatted output of the error ring of the device. 
+			forced_unqueued: Drive's queued DSM support is known to
+					 be buggy and only unqueued TRIM commands
+					 are sent

From 8f8ca51dbb4da0457f57f83d94aea81931b0707a Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven
Date: Tue, 13 Feb 2018 13:43:23 +0100
Subject: [PATCH 013/336] ata: sata_rcar: Remove unused variable in sata_rcar_init_controller()

    drivers/ata/sata_rcar.c: In function 'sata_rcar_init_controller':
    drivers/ata/sata_rcar.c:821:8: warning: unused variable 'base' [-Wunused-variable]

Fixes: da77d76b95a0e894 ("sata_rcar: Reset SATA PHY when Salvator-X board resumes")
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Tejun Heo
---
 drivers/ata/sata_rcar.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/ata/sata_rcar.c b/drivers/ata/sata_rcar.c
index 6f47ca34767d7..6456e07db72a7 100644
--- a/drivers/ata/sata_rcar.c
+++ b/drivers/ata/sata_rcar.c
@@ -818,7 +818,6 @@ static void sata_rcar_init_module(struct sata_rcar_priv *priv)
 static void sata_rcar_init_controller(struct ata_host *host)
 {
 	struct sata_rcar_priv *priv = host->private_data;
-	void __iomem *base = priv->base;
 
 	/* reset and setup phy */
 	switch (priv->type) {

From 0a65e125150c227314dcd561a202a84228398449 Mon Sep 17 00:00:00 2001
From: Aishwarya Pant
Date: Tue, 13 Feb 2018 13:48:16 +0530
Subject: [PATCH 014/336] libata: update documentation for sysfs interfaces

Documentation has been added by parsing through git commit history and
reading code. This might be useful for scripting and tracking changes in
the ABI.

I do not have complete descriptions for the following 3 attributes; they
have been annotated with the comment [to be documented] -

	/sys/class/scsi_host/hostX/ahci_port_cmd
	/sys/class/scsi_host/hostX/ahci_host_caps
	/sys/class/scsi_host/hostX/ahci_host_cap2

Signed-off-by: Aishwarya Pant
Signed-off-by: Tejun Heo
---
 Documentation/ABI/testing/sysfs-block-device | 58 ++++++++++++
 .../ABI/testing/sysfs-class-scsi_host        | 89 +++++++++++++++++++
 2 files changed, 147 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-block-device

diff --git a/Documentation/ABI/testing/sysfs-block-device b/Documentation/ABI/testing/sysfs-block-device
new file mode 100644
index 0000000000000..82ef6eab042d3
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-block-device
@@ -0,0 +1,58 @@
+What:		/sys/block/*/device/sw_activity
+Date:		Jun, 2008
+KernelVersion:	v2.6.27
+Contact:	linux-ide@vger.kernel.org
+Description:
+		(RW) Used by drivers which support software controlled activity
+		LEDs.
+
+		It has the following valid values:
+
+		0	OFF - the LED is not activated on activity
+		1	BLINK_ON - the LED blinks on every 10ms when activity is
+			detected.
+		2	BLINK_OFF - the LED is on when idle, and blinks off
+			every 10ms when activity is detected.
+
+		Note that the user must turn sw_activity OFF it they wish to
+		control the activity LED via the em_message file.
+
+
+What:		/sys/block/*/device/unload_heads
+Date:		Sep, 2008
+KernelVersion:	v2.6.28
+Contact:	linux-ide@vger.kernel.org
+Description:
+		(RW) Hard disk shock protection
+
+		Writing an integer value to this file will take the heads of the
+		respective drive off the platter and block all I/O operations
+		for the specified number of milliseconds.
+
+		- If the device does not support the unload heads feature,
+		  access is denied with -EOPNOTSUPP.
+		- The maximal value accepted for a timeout is 30000
+		  milliseconds.
+ - A previously set timeout can be cancelled and disk can resume + normal operation immediately by specifying a timeout of 0. + - Some hard drives only comply with an earlier version of the + ATA standard, but support the unload feature nonetheless. + There is no safe way Linux can detect these devices, so this + is not enabled by default. If it is known that your device + does support the unload feature, then you can tell the kernel + to enable it by writing -1. It can be disabled again by + writing -2. + - Values below -2 are rejected with -EINVAL + + For more information, see + Documentation/laptops/disk-shock-protection.txt + + +What: /sys/block/*/device/ncq_prio_enable +Date: Oct, 2016 +KernelVersion: v4.10 +Contact: linux-ide@vger.kernel.org +Description: + (RW) Write to the file to turn on or off the SATA ncq (native + command queueing) support. By default this feature is turned + off. diff --git a/Documentation/ABI/testing/sysfs-class-scsi_host b/Documentation/ABI/testing/sysfs-class-scsi_host index 0eb255e7db123..bafc59fd7b69e 100644 --- a/Documentation/ABI/testing/sysfs-class-scsi_host +++ b/Documentation/ABI/testing/sysfs-class-scsi_host @@ -27,3 +27,92 @@ Description: This file contains the current status of the "SSD Smart Path" the direct i/o path to physical devices. This setting is controller wide, affecting all configured logical drives on the controller. This file is readable and writable. + +What: /sys/class/scsi_host/hostX/link_power_management_policy +Date: Oct, 2007 +KernelVersion: v2.6.24 +Contact: linux-ide@vger.kernel.org +Description: + (RW) This parameter allows the user to read and set the link + (interface) power management. + + There are four possible options: + + min_power: Tell the controller to try to make the link use the + least possible power when possible. This may sacrifice some + performance due to increased latency when coming out of lower + power states. + + max_performance: Generally, this means no power management. 
+ Tell the controller to have performance be a priority over power + management. + + medium_power: Tell the controller to enter a lower power state + when possible, but do not enter the lowest power state, thus + improving latency over min_power setting. + + med_power_with_dipm: Identical to the existing medium_power + setting except that it enables dipm (device initiated power + management) on top, which makes it match the Windows IRST (Intel + Rapid Storage Technology) driver settings. This setting is also + close to min_power, except that: + a) It does not use host-initiated slumber mode, but it does + allow device-initiated slumber + b) It does not enable low power device sleep mode (DevSlp). + +What: /sys/class/scsi_host/hostX/em_message +What: /sys/class/scsi_host/hostX/em_message_type +Date: Jun, 2008 +KernelVersion: v2.6.27 +Contact: linux-ide@vger.kernel.org +Description: + em_message: (RW) Enclosure management support. For the LED + protocol, writes and reads correspond to the LED message format + as defined in the AHCI spec. + + The user must turn sw_activity (under /sys/block/*/device/) OFF + if they wish to control the activity LED via the em_message + file. + + em_message_type: (RO) Displays the current enclosure management + protocol that is being used by the driver (e.g. LED, SAF-TE, + SES-2, SGPIO). + +What: /sys/class/scsi_host/hostX/ahci_port_cmd +What: /sys/class/scsi_host/hostX/ahci_host_caps +What: /sys/class/scsi_host/hostX/ahci_host_cap2 +Date: Mar, 2010 +KernelVersion: v2.6.35 +Contact: linux-ide@vger.kernel.org +Description: + [to be documented] + +What: /sys/class/scsi_host/hostX/ahci_host_version +Date: Mar, 2010 +KernelVersion: v2.6.35 +Contact: linux-ide@vger.kernel.org +Description: + (RO) Display the version of the AHCI spec implemented by the + host.
+ +What: /sys/class/scsi_host/hostX/em_buffer +Date: Apr, 2010 +KernelVersion: v2.6.35 +Contact: linux-ide@vger.kernel.org +Description: + (RW) Allows access to AHCI EM (enclosure management) buffer + directly if the host supports EM. + + For example, the AHCI driver supports SGPIO EM messages but the + SATA/AHCI specs do not define the SGPIO message format of the EM + buffer. Different hardware (HW) vendors may have different + definitions. With the em_buffer attribute, this issue can be + solved by allowing HW vendors to provide userland drivers and + tools for their SGPIO initiators. + +What: /sys/class/scsi_host/hostX/em_message_supported +Date: Oct, 2009 +KernelVersion: v2.6.39 +Contact: linux-ide@vger.kernel.org +Description: + (RO) Displays supported enclosure management message types. From c5489f9fc053c744c609f34b32efca395cc2fdad Mon Sep 17 00:00:00 2001 From: Michal Oleszczyk Date: Fri, 2 Feb 2018 13:10:29 +0100 Subject: [PATCH 015/336] sgtl5000: change digital_mute policy The current implementation mutes the codec globally (DAC block). That means that when the user routes sound not from I2S but from an AUX source (LINE_IN), it will also be muted by the ALSA core. This should not happen. Signed-off-by: Michal Oleszczyk Signed-off-by: Mark Brown --- sound/soc/codecs/sgtl5000.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c index e1ab5537d27a8..c445a0794a27e 100644 --- a/sound/soc/codecs/sgtl5000.c +++ b/sound/soc/codecs/sgtl5000.c @@ -529,10 +529,15 @@ static const struct snd_kcontrol_new sgtl5000_snd_controls[] = { static int sgtl5000_digital_mute(struct snd_soc_dai *codec_dai, int mute) { struct snd_soc_codec *codec = codec_dai->codec; - u16 adcdac_ctrl = SGTL5000_DAC_MUTE_LEFT | SGTL5000_DAC_MUTE_RIGHT; + u16 i2s_pwr = SGTL5000_I2S_IN_POWERUP; - snd_soc_update_bits(codec, SGTL5000_CHIP_ADCDAC_CTRL, - adcdac_ctrl, mute ?
adcdac_ctrl : 0); + /* + * During 'digital mute' do not mute DAC + * because LINE_IN would be muted as well. We want to mute + * only I2S block - this can be done by powering it off + */ + snd_soc_update_bits(codec, SGTL5000_CHIP_DIG_POWER, + i2s_pwr, mute ? 0 : i2s_pwr); return 0; } @@ -1237,6 +1242,10 @@ static int sgtl5000_probe(struct snd_soc_codec *codec) */ snd_soc_write(codec, SGTL5000_DAP_CTRL, 0); + /* Unmute DAC after start */ + snd_soc_update_bits(codec, SGTL5000_CHIP_ADCDAC_CTRL, + SGTL5000_DAC_MUTE_LEFT | SGTL5000_DAC_MUTE_RIGHT, 0); + return 0; err: From dbe7d4c6d11999bda20bcea2572263150ff231ef Mon Sep 17 00:00:00 2001 From: Sylwester Nawrocki Date: Mon, 5 Feb 2018 18:05:00 +0100 Subject: [PATCH 016/336] ASoC: samsung: Add the DT binding files entry to MAINTAINERS This patch adds missing DT binding files to the Samsung ASoC drivers entry. Signed-off-by: Sylwester Nawrocki Acked-by: Krzysztof Kozlowski Signed-off-by: Mark Brown --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 3bdc260e36b7a..2161c1df9de36 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12091,6 +12091,7 @@ M: Sylwester Nawrocki L: alsa-devel@alsa-project.org (moderated for non-subscribers) S: Supported F: sound/soc/samsung/ +F: Documentation/devicetree/bindings/sound/samsung* SAMSUNG EXYNOS PSEUDO RANDOM NUMBER GENERATOR (RNG) DRIVER M: Krzysztof Kozlowski From 764baba80168ad3adafb521d2ab483ccbc49e344 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Sun, 4 Feb 2018 15:35:09 +0200 Subject: [PATCH 017/336] ovl: hash non-dir by lower inode for fsnotify Commit 31747eda41ef ("ovl: hash directory inodes for fsnotify") fixed an issue of inotify watch on directory that stops getting events after dropping dentry caches.
A similar issue exists for non-dir non-upper files, for example: $ mkdir -p lower upper work merged $ touch lower/foo $ mount -t overlay -o lowerdir=lower,workdir=work,upperdir=upper none merged $ inotifywait merged/foo & $ echo 2 > /proc/sys/vm/drop_caches $ cat merged/foo inotifywait doesn't get the OPEN event, because ovl_lookup() called from 'cat' allocates a new overlay inode and does not reuse the watched inode. Fix this by hashing non-dir overlay inodes by lower real inode in the following cases that were not hashed before this change: - A non-upper overlay mount - A lower non-hardlink when index=off A helper ovl_hash_bylower() was added to put all the logic and documentation about which real inode an overlay inode is hashed by into one place. The issue dates back to initial version of overlayfs, but this patch depends on ovl_inode code that was introduced in kernel v4.13. Cc: #v4.13 Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi --- fs/overlayfs/inode.c | 58 ++++++++++++++++++++++++++++++-------------- 1 file changed, 40 insertions(+), 18 deletions(-) diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index fcd97b783fa1f..3b1bd469accdf 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -669,38 +669,59 @@ struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real, return inode; } +/* + * Does overlay inode need to be hashed by lower inode? 
+ */ +static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper, + struct dentry *lower, struct dentry *index) +{ + struct ovl_fs *ofs = sb->s_fs_info; + + /* No, if pure upper */ + if (!lower) + return false; + + /* Yes, if already indexed */ + if (index) + return true; + + /* Yes, if won't be copied up */ + if (!ofs->upper_mnt) + return true; + + /* No, if lower hardlink is or will be broken on copy up */ + if ((upper || !ovl_indexdir(sb)) && + !d_is_dir(lower) && d_inode(lower)->i_nlink > 1) + return false; + + /* No, if non-indexed upper with NFS export */ + if (sb->s_export_op && upper) + return false; + + /* Otherwise, hash by lower inode for fsnotify */ + return true; +} + struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry, struct dentry *lowerdentry, struct dentry *index, unsigned int numlower) { - struct ovl_fs *ofs = sb->s_fs_info; struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL; struct inode *inode; - /* Already indexed or could be indexed on copy up? */ - bool indexed = (index || (ovl_indexdir(sb) && !upperdentry)); - struct dentry *origin = indexed ? lowerdentry : NULL; + bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index); bool is_dir; - if (WARN_ON(upperdentry && indexed && !lowerdentry)) - return ERR_PTR(-EIO); - if (!realinode) realinode = d_inode(lowerdentry); /* - * Copy up origin (lower) may exist for non-indexed non-dir upper, but - * we must not use lower as hash key in that case. - * Hash non-dir that is or could be indexed by origin inode. - * Hash dir that is or could be merged by origin inode. - * Hash pure upper and non-indexed non-dir by upper inode. - * Hash non-indexed dir by upper inode for NFS export. + * Copy up origin (lower) may exist for non-indexed upper, but we must + * not use lower as hash key if this is a broken hardlink. 
*/ is_dir = S_ISDIR(realinode->i_mode); - if (is_dir && (indexed || !sb->s_export_op || !ofs->upper_mnt)) - origin = lowerdentry; - - if (upperdentry || origin) { - struct inode *key = d_inode(origin ?: upperdentry); + if (upperdentry || bylower) { + struct inode *key = d_inode(bylower ? lowerdentry : + upperdentry); unsigned int nlink = is_dir ? 1 : realinode->i_nlink; inode = iget5_locked(sb, (unsigned long) key, @@ -728,6 +749,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry, nlink = ovl_get_nlink(lowerdentry, upperdentry, nlink); set_nlink(inode, nlink); } else { + /* Lower hardlink that will be broken on copy up */ inode = new_inode(sb); if (!inode) goto out_nomem; From 2ca3c148a06244d46dcfc95c5965644c83a30b37 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Tue, 30 Jan 2018 13:31:09 +0200 Subject: [PATCH 018/336] ovl: check lower ancestry on encode of lower dir file handle This change relaxes copy up on encode of merge dir with lower layer > 1 and handles the case of encoding a merge dir with lower layer 1, where an ancestor is a non-indexed merge dir. In that case, decode of the lower file handle will not have been possible if the non-indexed ancestor is redirected before or after encode. Before encoding a non-upper directory file handle from real layer N, we need to check if it will be possible to reconnect an overlay dentry from the real lower decoded dentry. This is done by following the overlay ancestry up to a "layer N connected" ancestor and verifying that all parents along the way are "layer N connectable". If an ancestor that is NOT "layer N connectable" is found, we need to copy up an ancestor, which is "layer N connectable", thus making that ancestor "layer N connected". For example: layer 1: /a layer 2: /a/b/c The overlay dentry /a is NOT "layer 2 connectable", because if dir /a is copied up and renamed, upper dir /a will be indexed by lower dir /a from layer 1. 
The dir /a from layer 2 will never be indexed, so the algorithm in ovl_lookup_real_ancestor() (*) will not be able to lookup a connected overlay dentry from the connected lower dentry /a/b/c. To avoid this problem on decode time, we need to copy up an ancestor of /a/b/c, which is "layer 2 connectable", on encode time. That ancestor is /a/b. After copy up (and index) of /a/b, it will become "layer 2 connected" and when the time comes to decode the file handle from lower dentry /a/b/c, ovl_lookup_real_ancestor() will find the indexed ancestor /a/b and decoding a connected overlay dentry will be accomplished. (*) the algorithm in ovl_lookup_real_ancestor() can be improved to lookup an entry /a in the lower layers above layer N and find the indexed dir /a from layer 1. If that improvement is made, then the check for "layer N connected" will need to verify there are no redirects in lower layers above layer N. In the example above, /a will be "layer 2 connectable". However, if layer 2 dir /a is a target of a layer 1 redirect, then /a will NOT be "layer 2 connectable": layer 1: /A (redirect = /a) layer 2: /a/b/c Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi --- fs/overlayfs/export.c | 210 +++++++++++++++++++++++++++++++-------- fs/overlayfs/overlayfs.h | 1 + fs/overlayfs/super.c | 1 + 3 files changed, 168 insertions(+), 44 deletions(-) diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index bb94ce9da5c87..9df455ca59a8e 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -19,6 +19,142 @@ #include #include "overlayfs.h" +static int ovl_encode_maybe_copy_up(struct dentry *dentry) +{ + int err; + + if (ovl_dentry_upper(dentry)) + return 0; + + err = ovl_want_write(dentry); + if (!err) { + err = ovl_copy_up(dentry); + ovl_drop_write(dentry); + } + + if (err) { + pr_warn_ratelimited("overlayfs: failed to copy up on encode (%pd2, err=%i)\n", + dentry, err); + } + + return err; +} + +/* + * Before encoding a non-upper directory file handle 
from real layer N, we need + * to check if it will be possible to reconnect an overlay dentry from the real + * lower decoded dentry. This is done by following the overlay ancestry up to a + * "layer N connected" ancestor and verifying that all parents along the way are + * "layer N connectable". If an ancestor that is NOT "layer N connectable" is + * found, we need to copy up an ancestor, which is "layer N connectable", thus + * making that ancestor "layer N connected". For example: + * + * layer 1: /a + * layer 2: /a/b/c + * + * The overlay dentry /a is NOT "layer 2 connectable", because if dir /a is + * copied up and renamed, upper dir /a will be indexed by lower dir /a from + * layer 1. The dir /a from layer 2 will never be indexed, so the algorithm (*) + * in ovl_lookup_real_ancestor() will not be able to lookup a connected overlay + * dentry from the connected lower dentry /a/b/c. + * + * To avoid this problem on decode time, we need to copy up an ancestor of + * /a/b/c, which is "layer 2 connectable", on encode time. That ancestor is + * /a/b. After copy up (and index) of /a/b, it will become "layer 2 connected" + * and when the time comes to decode the file handle from lower dentry /a/b/c, + * ovl_lookup_real_ancestor() will find the indexed ancestor /a/b and decoding + * a connected overlay dentry will be accomplished. + * + * (*) the algorithm in ovl_lookup_real_ancestor() can be improved to lookup an + * entry /a in the lower layers above layer N and find the indexed dir /a from + * layer 1. If that improvement is made, then the check for "layer N connected" + * will need to verify there are no redirects in lower layers above N. In the + * example above, /a will be "layer 2 connectable". 
However, if layer 2 dir /a + * is a target of a layer 1 redirect, then /a will NOT be "layer 2 connectable": + * + * layer 1: /A (redirect = /a) + * layer 2: /a/b/c + */ + +/* Return the lowest layer for encoding a connectable file handle */ +static int ovl_connectable_layer(struct dentry *dentry) +{ + struct ovl_entry *oe = OVL_E(dentry); + + /* We can get overlay root from root of any layer */ + if (dentry == dentry->d_sb->s_root) + return oe->numlower; + + /* + * If it's an unindexed merge dir, then it's not connectable with any + * lower layer + */ + if (ovl_dentry_upper(dentry) && + !ovl_test_flag(OVL_INDEX, d_inode(dentry))) + return 0; + + /* We can get upper/overlay path from indexed/lower dentry */ + return oe->lowerstack[0].layer->idx; +} + +/* + * @dentry is "connected" if all ancestors up to root or a "connected" ancestor + * have the same uppermost lower layer as the origin's layer. We may need to + * copy up a "connectable" ancestor to make it "connected". A "connected" dentry + * cannot become non "connected", so cache positive result in dentry flags. + * + * Return the connected origin layer or < 0 on error. + */ +static int ovl_connect_layer(struct dentry *dentry) +{ + struct dentry *next, *parent = NULL; + int origin_layer; + int err = 0; + + if (WARN_ON(dentry == dentry->d_sb->s_root) || + WARN_ON(!ovl_dentry_lower(dentry))) + return -EIO; + + origin_layer = OVL_E(dentry)->lowerstack[0].layer->idx; + if (ovl_dentry_test_flag(OVL_E_CONNECTED, dentry)) + return origin_layer; + + /* Find the topmost origin layer connectable ancestor of @dentry */ + next = dget(dentry); + for (;;) { + parent = dget_parent(next); + if (WARN_ON(parent == next)) { + err = -EIO; + break; + } + + /* + * If @parent is not origin layer connectable, then copy up + * @next which is origin layer connectable and we are done. 
+ */ + if (ovl_connectable_layer(parent) < origin_layer) { + err = ovl_encode_maybe_copy_up(next); + break; + } + + /* If @parent is connected or indexed we are done */ + if (ovl_dentry_test_flag(OVL_E_CONNECTED, parent) || + ovl_test_flag(OVL_INDEX, d_inode(parent))) + break; + + dput(next); + next = parent; + } + + dput(parent); + dput(next); + + if (!err) + ovl_dentry_set_flag(OVL_E_CONNECTED, dentry); + + return err ?: origin_layer; +} + /* * We only need to encode origin if there is a chance that the same object was * encoded pre copy up and then we need to stay consistent with the same @@ -41,73 +177,59 @@ * L = lower file handle * * (*) Connecting an overlay dir from real lower dentry is not always - * possible when there are redirects in lower layers. To mitigate this case, - * we copy up the lower dir first and then encode an upper dir file handle. + * possible when there are redirects in lower layers and non-indexed merge dirs. + * To mitigate those case, we may copy up the lower dir ancestor before encode + * a lower dir file handle. + * + * Return 0 for upper file handle, > 0 for lower file handle or < 0 on error. */ -static bool ovl_should_encode_origin(struct dentry *dentry) +static int ovl_check_encode_origin(struct dentry *dentry) { struct ovl_fs *ofs = dentry->d_sb->s_fs_info; + /* Upper file handle for pure upper */ if (!ovl_dentry_lower(dentry)) - return false; + return 0; /* - * Decoding a merge dir, whose origin's parent is under a redirected - * lower dir is not always possible. As a simple aproximation, we do - * not encode lower dir file handles when overlay has multiple lower - * layers and origin is below the topmost lower layer. + * Upper file handle for non-indexed upper. * - * TODO: copy up only the parent that is under redirected lower. + * Root is never indexed, so if there's an upper layer, encode upper for + * root. 
*/ - if (d_is_dir(dentry) && ofs->upper_mnt && - OVL_E(dentry)->lowerstack[0].layer->idx > 1) - return false; - - /* Decoding a non-indexed upper from origin is not implemented */ if (ovl_dentry_upper(dentry) && !ovl_test_flag(OVL_INDEX, d_inode(dentry))) - return false; - - return true; -} - -static int ovl_encode_maybe_copy_up(struct dentry *dentry) -{ - int err; - - if (ovl_dentry_upper(dentry)) return 0; - err = ovl_want_write(dentry); - if (err) - return err; - - err = ovl_copy_up(dentry); + /* + * Decoding a merge dir, whose origin's ancestor is under a redirected + * lower dir or under a non-indexed upper is not always possible. + * ovl_connect_layer() will try to make origin's layer "connected" by + * copying up a "connectable" ancestor. + */ + if (d_is_dir(dentry) && ofs->upper_mnt) + return ovl_connect_layer(dentry); - ovl_drop_write(dentry); - return err; + /* Lower file handle for indexed and non-upper dir/non-dir */ + return 1; } static int ovl_d_to_fh(struct dentry *dentry, char *buf, int buflen) { - struct dentry *origin = ovl_dentry_lower(dentry); struct ovl_fh *fh = NULL; - int err; + int err, enc_lower; /* - * If we should not encode a lower dir file handle, copy up and encode - * an upper dir file handle. + * Check if we should encode a lower or upper file handle and maybe + * copy up an ancestor to make lower file handle connectable. */ - if (!ovl_should_encode_origin(dentry)) { - err = ovl_encode_maybe_copy_up(dentry); - if (err) - goto fail; - - origin = NULL; - } + err = enc_lower = ovl_check_encode_origin(dentry); + if (enc_lower < 0) + goto fail; - /* Encode an upper or origin file handle */ - fh = ovl_encode_fh(origin ?: ovl_dentry_upper(dentry), !origin); + /* Encode an upper or lower file handle */ + fh = ovl_encode_fh(enc_lower ? 
ovl_dentry_lower(dentry) : + ovl_dentry_upper(dentry), !enc_lower); err = PTR_ERR(fh); if (IS_ERR(fh)) goto fail; diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index 0df25a9c94bd7..225ff11711474 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -40,6 +40,7 @@ enum ovl_inode_flag { enum ovl_entry_flag { OVL_E_UPPER_ALIAS, OVL_E_OPAQUE, + OVL_E_CONNECTED, }; /* diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 9ee37c76091d6..7c24619ae7fc5 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -1359,6 +1359,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent) /* Root is always merge -> can have whiteouts */ ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry)); + ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry); ovl_inode_init(d_inode(root_dentry), upperpath.dentry, ovl_dentry_lower(root_dentry)); From 7168179fcf25f7812e8541decac686a91359e522 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Tue, 30 Jan 2018 14:30:50 +0200 Subject: [PATCH 019/336] ovl: check ERR_PTR() return value from ovl_lookup_real() Reported-by: Dan Carpenter Fixes: 061701540349 ("ovl: lookup indexed ancestor of lower dir") Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi --- fs/overlayfs/export.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index 9df455ca59a8e..97a916ea8b86a 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -477,8 +477,8 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb, dput(upper); } - if (!this) - return NULL; + if (IS_ERR_OR_NULL(this)) + return this; if (WARN_ON(ovl_dentry_real_at(this, layer->idx) != real)) { dput(this); From aba62a9e9a4064c5ea9deb33b5b1392f263cad24 Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Fri, 16 Feb 2018 12:45:13 -0200 Subject: [PATCH 020/336] MAINTAINERS: Add myself as sgtl5000 maintainer I would like helping maintaining and reviewing/testing 
sgtl5000 related patches. Signed-off-by: Fabio Estevam Signed-off-by: Mark Brown --- MAINTAINERS | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 3bdc260e36b7a..4e283d131def8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9921,6 +9921,13 @@ F: Documentation/ABI/stable/sysfs-bus-nvmem F: include/linux/nvmem-consumer.h F: include/linux/nvmem-provider.h +NXP SGTL5000 DRIVER +M: Fabio Estevam +L: alsa-devel@alsa-project.org (moderated for non-subscribers) +S: Maintained +F: Documentation/devicetree/bindings/sound/sgtl5000.txt +F: sound/soc/codecs/sgtl5000* + NXP TDA998X DRM DRIVER M: Russell King S: Supported From a8992973edbb2555e956b90f6fe97c4bc14d761d Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Fri, 16 Feb 2018 11:58:54 -0200 Subject: [PATCH 021/336] ASoC: sgtl5000: Fix suspend/resume Commit 8419caa72702 ("ASoC: sgtl5000: Do not disable regulators in SND_SOC_BIAS_OFF") causes the sgtl5000 to fail after a suspend/resume sequence: Playing WAVE '/media/a2002011001-e02.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo aplay: pcm_write:2051: write error: Input/output error The problem is caused by the fact that the aforementioned commit dropped the cache handling, so re-introduce the register map resync to fix the problem. 
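The resync idea can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: `struct toy_regmap` and the `toy_*` helpers are hypothetical names, not the regmap API, and the model assumes the chip forgets its registers while the bias is off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NREGS 4 /* hypothetical register count */

/* Toy stand-in for a cached register map; not the kernel's struct regmap. */
struct toy_regmap {
	uint16_t hw[TOY_NREGS];    /* what the simulated chip currently holds */
	uint16_t cache[TOY_NREGS]; /* the driver's software copy */
	bool cache_only;           /* true while the chip must not be touched */
};

/* Writes always update the cache; they reach hardware only when allowed. */
static void toy_write(struct toy_regmap *m, size_t reg, uint16_t val)
{
	assert(reg < TOY_NREGS);
	m->cache[reg] = val;
	if (!m->cache_only)
		m->hw[reg] = val;
}

/* Bias off: stop touching hardware; assume the chip loses its registers. */
static void toy_bias_off(struct toy_regmap *m)
{
	m->cache_only = true;
	memset(m->hw, 0, sizeof(m->hw));
}

/* Bias on: allow hardware access again and replay the cache (the resync). */
static void toy_bias_on(struct toy_regmap *m)
{
	m->cache_only = false;
	memcpy(m->hw, m->cache, sizeof(m->hw));
}
```

Without the replay step in `toy_bias_on()`, the simulated chip would come back with default register contents while the driver still believes its earlier settings are in effect, which is the kind of mismatch the commit message describes.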
Suggested-by: Mark Brown Signed-off-by: Fabio Estevam Signed-off-by: Mark Brown Cc: --- sound/soc/codecs/sgtl5000.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c index c445a0794a27e..c5c76ab8ccf10 100644 --- a/sound/soc/codecs/sgtl5000.c +++ b/sound/soc/codecs/sgtl5000.c @@ -876,15 +876,26 @@ static int sgtl5000_pcm_hw_params(struct snd_pcm_substream *substream, static int sgtl5000_set_bias_level(struct snd_soc_codec *codec, enum snd_soc_bias_level level) { + struct sgtl5000_priv *sgtl = snd_soc_codec_get_drvdata(codec); + int ret; + switch (level) { case SND_SOC_BIAS_ON: case SND_SOC_BIAS_PREPARE: case SND_SOC_BIAS_STANDBY: + regcache_cache_only(sgtl->regmap, false); + ret = regcache_sync(sgtl->regmap); + if (ret) { + regcache_cache_only(sgtl->regmap, true); + return ret; + } + snd_soc_update_bits(codec, SGTL5000_CHIP_ANA_POWER, SGTL5000_REFTOP_POWERUP, SGTL5000_REFTOP_POWERUP); break; case SND_SOC_BIAS_OFF: + regcache_cache_only(sgtl->regmap, true); snd_soc_update_bits(codec, SGTL5000_CHIP_ANA_POWER, SGTL5000_REFTOP_POWERUP, 0); break; From 50c330973c0c9f1e300b07bbab78d306dcc6e612 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Fri, 16 Feb 2018 16:57:56 +0000 Subject: [PATCH 022/336] irqchip/gic-v3-its: Fix misplaced __iomem annotations Save 26 lines worth of Sparse complaints by fixing up this minor mishap. The pointee lies in the __iomem space; the pointer does not. 
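As an analogy only (using the standard `const` qualifier rather than the Sparse `__iomem` annotation), the placement rule can be sketched in plain C. The function name `pointee_vs_pointer` is hypothetical. A qualifier written before the `*` applies to the pointee, one written after the `*` applies to the pointer variable itself:

```c
#include <assert.h>

/*
 * "const int *p"  : the pointee is const, the pointer may be re-aimed.
 * "int * const q" : the pointer is const, the pointee may be written.
 * By the same declarator rule, "void __iomem *base" annotates the
 * pointee, while "void * __iomem base" annotates the pointer variable.
 */
static int pointee_vs_pointer(void)
{
	int a = 1, b = 2;
	const int *p = &a;  /* *p is read-only through p */
	int * const q = &a; /* q itself cannot be reassigned */

	p = &b; /* allowed: only the pointee is qualified */
	*q = 5; /* allowed: only the pointer is qualified */

	assert(a == 5);
	return *p + *q; /* b + a */
}
```

Swapping either assignment (`*p = 5` or `q = &b`) is rejected by the compiler, which is the same class of mismatch Sparse reports when the annotation sits on the wrong side of the `*`.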
Signed-off-by: Robin Murphy Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 1d3056f537472..94b7d74d519f3 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -2495,7 +2495,7 @@ static int its_vpe_set_affinity(struct irq_data *d, static void its_vpe_schedule(struct its_vpe *vpe) { - void * __iomem vlpi_base = gic_data_rdist_vlpi_base(); + void __iomem *vlpi_base = gic_data_rdist_vlpi_base(); u64 val; /* Schedule the VPE */ @@ -2527,7 +2527,7 @@ static void its_vpe_schedule(struct its_vpe *vpe) static void its_vpe_deschedule(struct its_vpe *vpe) { - void * __iomem vlpi_base = gic_data_rdist_vlpi_base(); + void __iomem *vlpi_base = gic_data_rdist_vlpi_base(); u32 count = 1000000; /* 1s! */ bool clean; u64 val; From 9c7be59fc519af9081c46c48f06f2b8fadf55ad8 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Fri, 16 Feb 2018 10:48:20 +0100 Subject: [PATCH 023/336] libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs Various people have reported the Crucial MX100 512GB model not working with LPM set to min_power. I've now received a report that it also does not work with the new med_power_with_dipm level. It does work with medium_power, but that has no measurable power-savings and given the amount of people being bitten by the other levels not working, this commit just disables LPM altogether. Note all reporters of this have either the 512GB model (max capacity), or are not specifying their SSD's size. So for now this quirk assumes this is a problem with the 512GB model only. 
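A rough userspace sketch of a model-string quirk table follows. It assumes first-match-wins lookup and uses POSIX `fnmatch()` in place of the kernel's glob matcher; the `QUIRK_*` values, `struct quirk_entry`, `lookup_quirks()`, the second table entry, and the sample model strings are all hypothetical and only illustrate why a specific 512GB entry has to come before a broader pattern.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
enum {
	QUIRK_NO_NCQ_TRIM     = 1 << 0,
	QUIRK_ZERO_AFTER_TRIM = 1 << 1,
	QUIRK_NOLPM           = 1 << 2,
};

struct quirk_entry {
	const char *model_glob; /* shell-style pattern for the model string */
	unsigned int flags;
};

/* Order matters: the more specific pattern is listed first. */
static const struct quirk_entry quirks[] = {
	{ "Crucial_CT512MX100*",
	  QUIRK_NO_NCQ_TRIM | QUIRK_ZERO_AFTER_TRIM | QUIRK_NOLPM },
	{ "Crucial_CT*MX100*", /* hypothetical broader entry */
	  QUIRK_NO_NCQ_TRIM | QUIRK_ZERO_AFTER_TRIM },
};

/* Return the flags of the first matching entry, or 0 if none matches. */
static unsigned int lookup_quirks(const char *model)
{
	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++)
		if (fnmatch(quirks[i].model_glob, model, 0) == 0)
			return quirks[i].flags;
	return 0;
}
```

With this ordering, a sample string such as "Crucial_CT512MX100SSD1" picks up the no-LPM flag, while a smaller model matching only the broader pattern does not.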
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=89261 Buglink: https://github.com/linrunner/TLP/issues/84 Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede Signed-off-by: Tejun Heo --- drivers/ata/libata-core.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 61b09968d0326..28cad49fc846d 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4530,6 +4530,11 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { { "PIONEER DVD-RW DVR-212D", NULL, ATA_HORKAGE_NOSETXFER }, { "PIONEER DVD-RW DVR-216D", NULL, ATA_HORKAGE_NOSETXFER }, + /* The 512GB version of the MX100 has both queued TRIM and LPM issues */ + { "Crucial_CT512MX100*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | + ATA_HORKAGE_ZERO_AFTER_TRIM | + ATA_HORKAGE_NOLPM, }, + /* devices that don't properly handle queued TRIM commands */ { "Micron_M500_*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM, }, From 15d9f3d116c02a485441d758d9ca0a2e4f3b30be Mon Sep 17 00:00:00 2001 From: Dennis Zhou Date: Thu, 15 Feb 2018 10:08:14 -0600 Subject: [PATCH 024/336] percpu: match chunk allocator declarations with definitions At some point the function declaration parameters got out of sync with the function definitions in percpu-vm.c and percpu-km.c. This patch makes them match again. 
Signed-off-by: Dennis Zhou Acked-by: Christoph Lameter Signed-off-by: Tejun Heo --- mm/percpu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index 50e7fdf840551..e1ea410021739 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1277,8 +1277,10 @@ static void pcpu_chunk_depopulated(struct pcpu_chunk *chunk, * pcpu_addr_to_page - translate address to physical address * pcpu_verify_alloc_info - check alloc_info is acceptable during init */ -static int pcpu_populate_chunk(struct pcpu_chunk *chunk, int off, int size); -static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, int off, int size); +static int pcpu_populate_chunk(struct pcpu_chunk *chunk, + int page_start, int page_end); +static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, + int page_start, int page_end); static struct pcpu_chunk *pcpu_create_chunk(void); static void pcpu_destroy_chunk(struct pcpu_chunk *chunk); static struct page *pcpu_addr_to_page(void *addr); From 47504ee04b9241548ae2c28be7d0b01cff3b7aa6 Mon Sep 17 00:00:00 2001 From: Dennis Zhou Date: Fri, 16 Feb 2018 12:07:19 -0600 Subject: [PATCH 025/336] percpu: add __GFP_NORETRY semantics to the percpu balancing path Percpu memory using the vmalloc area based chunk allocator lazily populates chunks by first requesting the full virtual address space required for the chunk and subsequently adding pages as allocations come through. To ensure atomic allocations can succeed, a workqueue item is used to maintain a minimum number of empty pages. In certain scenarios, such as reported in [1], it is possible that physical memory becomes quite scarce which can result in either a rather long time spent trying to find free pages or worse, a kernel panic. This patch adds support for __GFP_NORETRY and __GFP_NOWARN passing them through to the underlying allocators. This should prevent any unnecessary panics potentially caused by the workqueue item. 
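The "additional flags" approach used in the diff (for example `gfp | GFP_KERNEL`) can be sketched in userspace as follows. Everything here is a hypothetical stand-in: `toy_gfp_t`, the `TOY_GFP_*` values, and the `toy_*` functions are not kernel APIs, and `malloc()` replaces the real page allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int toy_gfp_t;

/* Hypothetical flag values, for illustration only. */
#define TOY_GFP_KERNEL  0x01u
#define TOY_GFP_NORETRY 0x02u
#define TOY_GFP_NOWARN  0x04u

/* Records the flags the lowest layer was asked to honour. */
static toy_gfp_t toy_last_flags;

/* Stand-in for the underlying allocator. */
static void *toy_page_alloc(size_t size, toy_gfp_t gfp)
{
	toy_last_flags = gfp;
	return malloc(size);
}

/*
 * The caller passes only *extra* flags; the base flags are ORed in
 * here, so passing 0 keeps the previous behaviour unchanged.
 */
static void *toy_mem_zalloc(size_t size, toy_gfp_t extra)
{
	void *p = toy_page_alloc(size, extra | TOY_GFP_KERNEL);

	if (p)
		memset(p, 0, size);
	return p;
}
```

A balance-path caller would pass `TOY_GFP_NORETRY | TOY_GFP_NOWARN`, while existing callers pass 0; in both cases the base flags still reach the underlying allocator.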
The passing of gfp around is as additional flags rather than a full set of flags. The next patch will change these to caller passed semantics. V2: Added const modifier to gfp flags in the balance path. Removed an extra whitespace. [1] https://lkml.org/lkml/2018/2/12/551 Signed-off-by: Dennis Zhou Suggested-by: Daniel Borkmann Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com Acked-by: Christoph Lameter Signed-off-by: Tejun Heo --- mm/percpu-km.c | 8 ++++---- mm/percpu-vm.c | 18 +++++++++++------- mm/percpu.c | 44 +++++++++++++++++++++++++++----------------- 3 files changed, 42 insertions(+), 28 deletions(-) diff --git a/mm/percpu-km.c b/mm/percpu-km.c index d2a76642c4ae8..0d88d7bd57064 100644 --- a/mm/percpu-km.c +++ b/mm/percpu-km.c @@ -34,7 +34,7 @@ #include static int pcpu_populate_chunk(struct pcpu_chunk *chunk, - int page_start, int page_end) + int page_start, int page_end, gfp_t gfp) { return 0; } @@ -45,18 +45,18 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, /* nada */ } -static struct pcpu_chunk *pcpu_create_chunk(void) +static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) { const int nr_pages = pcpu_group_sizes[0] >> PAGE_SHIFT; struct pcpu_chunk *chunk; struct page *pages; int i; - chunk = pcpu_alloc_chunk(); + chunk = pcpu_alloc_chunk(gfp); if (!chunk) return NULL; - pages = alloc_pages(GFP_KERNEL, order_base_2(nr_pages)); + pages = alloc_pages(gfp | GFP_KERNEL, order_base_2(nr_pages)); if (!pages) { pcpu_free_chunk(chunk); return NULL; diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c index 9158e5a81391c..0af71eb2fff03 100644 --- a/mm/percpu-vm.c +++ b/mm/percpu-vm.c @@ -37,7 +37,7 @@ static struct page **pcpu_get_pages(void) lockdep_assert_held(&pcpu_alloc_mutex); if (!pages) - pages = pcpu_mem_zalloc(pages_size); + pages = pcpu_mem_zalloc(pages_size, 0); return pages; } @@ -73,18 +73,21 @@ static void pcpu_free_pages(struct pcpu_chunk *chunk, * @pages: array to put the allocated pages into, indexed by pcpu_page_idx() * 
@page_start: page index of the first page to be allocated * @page_end: page index of the last page to be allocated + 1 + * @gfp: allocation flags passed to the underlying allocator * * Allocate pages [@page_start,@page_end) into @pages for all units. * The allocation is for @chunk. Percpu core doesn't care about the * content of @pages and will pass it verbatim to pcpu_map_pages(). */ static int pcpu_alloc_pages(struct pcpu_chunk *chunk, - struct page **pages, int page_start, int page_end) + struct page **pages, int page_start, int page_end, + gfp_t gfp) { - const gfp_t gfp = GFP_KERNEL | __GFP_HIGHMEM; unsigned int cpu, tcpu; int i; + gfp |= GFP_KERNEL | __GFP_HIGHMEM; + for_each_possible_cpu(cpu) { for (i = page_start; i < page_end; i++) { struct page **pagep = &pages[pcpu_page_idx(cpu, i)]; @@ -262,6 +265,7 @@ static void pcpu_post_map_flush(struct pcpu_chunk *chunk, * @chunk: chunk of interest * @page_start: the start page * @page_end: the end page + * @gfp: allocation flags passed to the underlying memory allocator * * For each cpu, populate and map pages [@page_start,@page_end) into * @chunk. @@ -270,7 +274,7 @@ static void pcpu_post_map_flush(struct pcpu_chunk *chunk, * pcpu_alloc_mutex, does GFP_KERNEL allocation. 
*/ static int pcpu_populate_chunk(struct pcpu_chunk *chunk, - int page_start, int page_end) + int page_start, int page_end, gfp_t gfp) { struct page **pages; @@ -278,7 +282,7 @@ static int pcpu_populate_chunk(struct pcpu_chunk *chunk, if (!pages) return -ENOMEM; - if (pcpu_alloc_pages(chunk, pages, page_start, page_end)) + if (pcpu_alloc_pages(chunk, pages, page_start, page_end, gfp)) return -ENOMEM; if (pcpu_map_pages(chunk, pages, page_start, page_end)) { @@ -325,12 +329,12 @@ static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, pcpu_free_pages(chunk, pages, page_start, page_end); } -static struct pcpu_chunk *pcpu_create_chunk(void) +static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) { struct pcpu_chunk *chunk; struct vm_struct **vms; - chunk = pcpu_alloc_chunk(); + chunk = pcpu_alloc_chunk(gfp); if (!chunk) return NULL; diff --git a/mm/percpu.c b/mm/percpu.c index e1ea410021739..f97443d488a81 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -447,10 +447,12 @@ static void pcpu_next_fit_region(struct pcpu_chunk *chunk, int alloc_bits, /** * pcpu_mem_zalloc - allocate memory * @size: bytes to allocate + * @gfp: allocation flags * * Allocate @size bytes. If @size is smaller than PAGE_SIZE, - * kzalloc() is used; otherwise, vzalloc() is used. The returned - * memory is always zeroed. + * kzalloc() is used; otherwise, the equivalent of vzalloc() is used. + * This is to facilitate passing through whitelisted flags. The + * returned memory is always zeroed. * * CONTEXT: * Does GFP_KERNEL allocation. @@ -458,15 +460,16 @@ static void pcpu_next_fit_region(struct pcpu_chunk *chunk, int alloc_bits, * RETURNS: * Pointer to the allocated area on success, NULL on failure. 
*/ -static void *pcpu_mem_zalloc(size_t size) +static void *pcpu_mem_zalloc(size_t size, gfp_t gfp) { if (WARN_ON_ONCE(!slab_is_available())) return NULL; if (size <= PAGE_SIZE) - return kzalloc(size, GFP_KERNEL); + return kzalloc(size, gfp | GFP_KERNEL); else - return vzalloc(size); + return __vmalloc(size, gfp | GFP_KERNEL | __GFP_ZERO, + PAGE_KERNEL); } /** @@ -1154,12 +1157,12 @@ static struct pcpu_chunk * __init pcpu_alloc_first_chunk(unsigned long tmp_addr, return chunk; } -static struct pcpu_chunk *pcpu_alloc_chunk(void) +static struct pcpu_chunk *pcpu_alloc_chunk(gfp_t gfp) { struct pcpu_chunk *chunk; int region_bits; - chunk = pcpu_mem_zalloc(pcpu_chunk_struct_size); + chunk = pcpu_mem_zalloc(pcpu_chunk_struct_size, gfp); if (!chunk) return NULL; @@ -1168,17 +1171,17 @@ static struct pcpu_chunk *pcpu_alloc_chunk(void) region_bits = pcpu_chunk_map_bits(chunk); chunk->alloc_map = pcpu_mem_zalloc(BITS_TO_LONGS(region_bits) * - sizeof(chunk->alloc_map[0])); + sizeof(chunk->alloc_map[0]), gfp); if (!chunk->alloc_map) goto alloc_map_fail; chunk->bound_map = pcpu_mem_zalloc(BITS_TO_LONGS(region_bits + 1) * - sizeof(chunk->bound_map[0])); + sizeof(chunk->bound_map[0]), gfp); if (!chunk->bound_map) goto bound_map_fail; chunk->md_blocks = pcpu_mem_zalloc(pcpu_chunk_nr_blocks(chunk) * - sizeof(chunk->md_blocks[0])); + sizeof(chunk->md_blocks[0]), gfp); if (!chunk->md_blocks) goto md_blocks_fail; @@ -1278,10 +1281,10 @@ static void pcpu_chunk_depopulated(struct pcpu_chunk *chunk, * pcpu_verify_alloc_info - check alloc_info is acceptable during init */ static int pcpu_populate_chunk(struct pcpu_chunk *chunk, - int page_start, int page_end); + int page_start, int page_end, gfp_t gfp); static void pcpu_depopulate_chunk(struct pcpu_chunk *chunk, int page_start, int page_end); -static struct pcpu_chunk *pcpu_create_chunk(void); +static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp); static void pcpu_destroy_chunk(struct pcpu_chunk *chunk); static struct page 
*pcpu_addr_to_page(void *addr); static int __init pcpu_verify_alloc_info(const struct pcpu_alloc_info *ai); @@ -1423,7 +1426,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, } if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) { - chunk = pcpu_create_chunk(); + chunk = pcpu_create_chunk(0); if (!chunk) { err = "failed to allocate new chunk"; goto fail; @@ -1452,7 +1455,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, page_start, page_end) { WARN_ON(chunk->immutable); - ret = pcpu_populate_chunk(chunk, rs, re); + ret = pcpu_populate_chunk(chunk, rs, re, 0); spin_lock_irqsave(&pcpu_lock, flags); if (ret) { @@ -1563,10 +1566,17 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) * pcpu_balance_workfn - manage the amount of free chunks and populated pages * @work: unused * - * Reclaim all fully free chunks except for the first one. + * Reclaim all fully free chunks except for the first one. This is also + * responsible for maintaining the pool of empty populated pages. However, + * it is possible that this is called when physical memory is scarce causing + * OOM killer to be triggered. We should avoid doing so until an actual + * allocation causes the failure as it is possible that requests can be + * serviced from already backed regions. 
*/ static void pcpu_balance_workfn(struct work_struct *work) { + /* gfp flags passed to underlying allocators */ + const gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN; LIST_HEAD(to_free); struct list_head *free_head = &pcpu_slot[pcpu_nr_slots - 1]; struct pcpu_chunk *chunk, *next; @@ -1647,7 +1657,7 @@ static void pcpu_balance_workfn(struct work_struct *work) chunk->nr_pages) { int nr = min(re - rs, nr_to_pop); - ret = pcpu_populate_chunk(chunk, rs, rs + nr); + ret = pcpu_populate_chunk(chunk, rs, rs + nr, gfp); if (!ret) { nr_to_pop -= nr; spin_lock_irq(&pcpu_lock); @@ -1664,7 +1674,7 @@ static void pcpu_balance_workfn(struct work_struct *work) if (nr_to_pop) { /* ran out of chunks to populate, create a new one and retry */ - chunk = pcpu_create_chunk(); + chunk = pcpu_create_chunk(gfp); if (chunk) { spin_lock_irq(&pcpu_lock); pcpu_chunk_relocate(chunk, -1); From 554fef1c39ee148623a496e04569dabb11463406 Mon Sep 17 00:00:00 2001 From: Dennis Zhou Date: Fri, 16 Feb 2018 12:09:58 -0600 Subject: [PATCH 026/336] percpu: allow select gfp to be passed to underlying allocators The prior patch added support for passing gfp flags through to the underlying allocators. This patch allows users to pass along gfp flags (currently only __GFP_NORETRY and __GFP_NOWARN) to the underlying allocators. This should allow users to decide if they are ok with failing allocations recovering in a more graceful way. Additionally, gfp passing was done as additional flags in the previous patch. Instead, change this to caller passed semantics. GFP_KERNEL is also removed as the default flag. It continues to be used for internally caused underlying percpu allocations. V2: Removed gfp_percpu_mask in favor of doing it inline. Removed GFP_KERNEL as a default flag for __alloc_percpu_gfp. 
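The whitelisting described above can be sketched in user-space C. This is only an illustration under stated assumptions: the demo_* names and the flag values are hypothetical stand-ins, not the kernel's real gfp bits. It shows a caller-supplied flag set being reduced to the whitelisted flags before it is forwarded, and how the "atomic" decision falls out of whether the full GFP_KERNEL-style bits are present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for gfp bits; the kernel's real values differ. */
typedef uint32_t demo_gfp_t;
#define DEMO_GFP_KERNEL   0x07u  /* multi-bit composite, like GFP_KERNEL */
#define DEMO_GFP_NORETRY  0x10u
#define DEMO_GFP_NOWARN   0x20u
#define DEMO_GFP_ZERO     0x40u  /* an example of a flag that is not whitelisted */

/* Keep only the flags that may be forwarded to the backing allocators. */
static demo_gfp_t demo_pcpu_gfp(demo_gfp_t gfp)
{
	return gfp & (DEMO_GFP_KERNEL | DEMO_GFP_NORETRY | DEMO_GFP_NOWARN);
}

/* Anything short of the full GFP_KERNEL-style bit set is treated as atomic. */
static bool demo_is_atomic(demo_gfp_t gfp)
{
	return (gfp & DEMO_GFP_KERNEL) != DEMO_GFP_KERNEL;
}
```

With these stand-in values, a request carrying DEMO_GFP_ZERO has that bit dropped, while DEMO_GFP_NORETRY and DEMO_GFP_NOWARN pass through unchanged.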
Signed-off-by: Dennis Zhou Suggested-by: Daniel Borkmann Acked-by: Christoph Lameter Signed-off-by: Tejun Heo --- mm/percpu-km.c | 2 +- mm/percpu-vm.c | 4 ++-- mm/percpu.c | 16 +++++++--------- 3 files changed, 10 insertions(+), 12 deletions(-) diff --git a/mm/percpu-km.c b/mm/percpu-km.c index 0d88d7bd57064..38de70ab1a0d6 100644 --- a/mm/percpu-km.c +++ b/mm/percpu-km.c @@ -56,7 +56,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) if (!chunk) return NULL; - pages = alloc_pages(gfp | GFP_KERNEL, order_base_2(nr_pages)); + pages = alloc_pages(gfp, order_base_2(nr_pages)); if (!pages) { pcpu_free_chunk(chunk); return NULL; diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c index 0af71eb2fff03..d8078de912de3 100644 --- a/mm/percpu-vm.c +++ b/mm/percpu-vm.c @@ -37,7 +37,7 @@ static struct page **pcpu_get_pages(void) lockdep_assert_held(&pcpu_alloc_mutex); if (!pages) - pages = pcpu_mem_zalloc(pages_size, 0); + pages = pcpu_mem_zalloc(pages_size, GFP_KERNEL); return pages; } @@ -86,7 +86,7 @@ static int pcpu_alloc_pages(struct pcpu_chunk *chunk, unsigned int cpu, tcpu; int i; - gfp |= GFP_KERNEL | __GFP_HIGHMEM; + gfp |= __GFP_HIGHMEM; for_each_possible_cpu(cpu) { for (i = page_start; i < page_end; i++) { diff --git a/mm/percpu.c b/mm/percpu.c index f97443d488a81..fa3f854634a14 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -454,9 +454,6 @@ static void pcpu_next_fit_region(struct pcpu_chunk *chunk, int alloc_bits, * This is to facilitate passing through whitelisted flags. The * returned memory is always zeroed. * - * CONTEXT: - * Does GFP_KERNEL allocation. - * * RETURNS: * Pointer to the allocated area on success, NULL on failure. 
*/ @@ -466,10 +463,9 @@ static void *pcpu_mem_zalloc(size_t size, gfp_t gfp) return NULL; if (size <= PAGE_SIZE) - return kzalloc(size, gfp | GFP_KERNEL); + return kzalloc(size, gfp); else - return __vmalloc(size, gfp | GFP_KERNEL | __GFP_ZERO, - PAGE_KERNEL); + return __vmalloc(size, gfp | __GFP_ZERO, PAGE_KERNEL); } /** @@ -1344,6 +1340,8 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr) static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, gfp_t gfp) { + /* whitelisted flags that can be passed to the backing allocators */ + gfp_t pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN); bool is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL; bool do_warn = !(gfp & __GFP_NOWARN); static int warn_limit = 10; @@ -1426,7 +1424,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, } if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) { - chunk = pcpu_create_chunk(0); + chunk = pcpu_create_chunk(pcpu_gfp); if (!chunk) { err = "failed to allocate new chunk"; goto fail; @@ -1455,7 +1453,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, page_start, page_end) { WARN_ON(chunk->immutable); - ret = pcpu_populate_chunk(chunk, rs, re, 0); + ret = pcpu_populate_chunk(chunk, rs, re, pcpu_gfp); spin_lock_irqsave(&pcpu_lock, flags); if (ret) { @@ -1576,7 +1574,7 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) static void pcpu_balance_workfn(struct work_struct *work) { /* gfp flags passed to underlying allocators */ - const gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN; + const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; LIST_HEAD(to_free); struct list_head *free_head = &pcpu_slot[pcpu_nr_slots - 1]; struct pcpu_chunk *chunk, *next; From 2d30e9494f1ea320aaaad0cff9ddd92c87eac355 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sun, 18 Feb 2018 23:01:44 +0100 Subject: [PATCH 027/336] ASoC: rt5651: Fix regcache sync errors on resume The ALC5651 does not like multi-write 
accesses, avoid them. This fixes: rt5651 i2c-10EC5651:00: Unable to sync registers 0x27-0x28. -121 Errors on resume (and all registers after the registers in the error not being synced). Signed-off-by: Hans de Goede Signed-off-by: Mark Brown Cc: stable@vger.kernel.org --- sound/soc/codecs/rt5651.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/soc/codecs/rt5651.c b/sound/soc/codecs/rt5651.c index 831b297978a48..45a73049cf648 100644 --- a/sound/soc/codecs/rt5651.c +++ b/sound/soc/codecs/rt5651.c @@ -1722,6 +1722,7 @@ static const struct regmap_config rt5651_regmap = { .num_reg_defaults = ARRAY_SIZE(rt5651_reg), .ranges = rt5651_ranges, .num_ranges = ARRAY_SIZE(rt5651_ranges), + .use_single_rw = true, }; #if defined(CONFIG_OF) From 5e558f8afaec8957932b1dbe5aeff800f9fc6957 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Tue, 20 Feb 2018 16:19:05 +0200 Subject: [PATCH 028/336] ASoC: hdmi-codec: Fix module unloading caused kernel crash The hcp->chmap_info must not be freed in the hdmi_codec_remove() function, as doing so leads to a kernel crash: ALSA core's pcm_chmap_ctl_private_free() tries to free it again when the card is destroyed via snd_card_free. Commit cd6111b26280a ("ASoC: hdmi-codec: add channel mapping control") should not have added the kfree(hcp->chmap_info); to the hdmi_codec_remove function.
Signed-off-by: Peter Ujfalusi Reviewed-by: Jyri Sarha Tested-by: Jyri Sarha Signed-off-by: Mark Brown --- sound/soc/codecs/hdmi-codec.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index 5672e516bec37..c1830ccd3bb8e 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -798,12 +798,7 @@ static int hdmi_codec_probe(struct platform_device *pdev) static int hdmi_codec_remove(struct platform_device *pdev) { - struct device *dev = &pdev->dev; - struct hdmi_codec_priv *hcp; - - hcp = dev_get_drvdata(dev); - kfree(hcp->chmap_info); - snd_soc_unregister_codec(dev); + snd_soc_unregister_codec(&pdev->dev); return 0; } From b17e5729a630d8326a48ec34ef02e6b4464a6aef Mon Sep 17 00:00:00 2001 From: Kai-Heng Feng Date: Sun, 18 Feb 2018 22:17:09 +0800 Subject: [PATCH 029/336] libata: disable LPM for Crucial BX100 SSD 500GB drive After Laptop Mode Tools started to use min_power for LPM, a user found that the Crucial BX100 SSD can't be mounted. The Crucial BX100 SSD 500GB drive doesn't work well with min_power. This also happens with med_power_with_dipm. So let's disable LPM for the Crucial BX100 SSD 500GB drive.
BugLink: https://bugs.launchpad.net/bugs/1726930 Signed-off-by: Kai-Heng Feng Signed-off-by: Tejun Heo Cc: stable@vger.kernel.org --- drivers/ata/libata-core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 28cad49fc846d..cb789f8849aee 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4530,6 +4530,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { { "PIONEER DVD-RW DVR-212D", NULL, ATA_HORKAGE_NOSETXFER }, { "PIONEER DVD-RW DVR-216D", NULL, ATA_HORKAGE_NOSETXFER }, + /* Crucial BX100 SSD 500GB has broken LPM support */ + { "CT500BX100SSD1", "MU02", ATA_HORKAGE_NOLPM }, + /* The 512GB version of the MX100 has both queued TRIM and LPM issues */ { "Crucial_CT512MX100*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM | From d1897c9538edafd4ae6bbd03cc075962ddde2c21 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 21 Feb 2018 11:39:22 -0800 Subject: [PATCH 030/336] cgroup: fix rule checking for threaded mode switching A domain cgroup isn't allowed to be turned threaded if its subtree is populated or domain controllers are enabled. cgroup_enable_threaded() depended on cgroup_can_be_thread_root() test to enforce this rule. A parent which has populated domain descendants or have domain controllers enabled can't become a thread root, so the above rules are enforced automatically. However, for the root cgroup which can host mixed domain and threaded children, cgroup_can_be_thread_root() doesn't check any of those conditions and thus first level cgroups ends up escaping those rules. This patch fixes the bug by adding explicit checks for those rules in cgroup_enable_threaded(). 
Reported-by: Michael Kerrisk (man-pages) Signed-off-by: Tejun Heo Fixes: 8cfd8147df67 ("cgroup: implement cgroup v2 thread support") Cc: stable@vger.kernel.org # v4.14+ --- kernel/cgroup/cgroup.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 8cda3bc3ae228..4bfb2908ec157 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3183,6 +3183,16 @@ static int cgroup_enable_threaded(struct cgroup *cgrp) if (cgroup_is_threaded(cgrp)) return 0; + /* + * If @cgroup is populated or has domain controllers enabled, it + * can't be switched. While the below cgroup_can_be_thread_root() + * test can catch the same conditions, that's only when @parent is + * not mixable, so let's check it explicitly. + */ + if (cgroup_is_populated(cgrp) || + cgrp->subtree_control & ~cgrp_dfl_threaded_ss_mask) + return -EOPNOTSUPP; + /* we're joining the parent's domain, ensure its validity */ if (!cgroup_is_valid_domain(dom_cgrp) || !cgroup_can_be_thread_root(dom_cgrp)) From 1b22b4b28fd5fbc51855219e3238b3ab81da8466 Mon Sep 17 00:00:00 2001 From: Colin Ian King Date: Thu, 22 Feb 2018 17:50:12 +0000 Subject: [PATCH 031/336] MIPS: ath25: Check for kzalloc allocation failure Currently there is no null check on a failed allocation of board_data, and hence a null pointer dereference will occur. Fix this by checking for the out of memory null pointer.
Fixes: a7473717483e ("MIPS: ath25: add board configuration detection") Signed-off-by: Colin Ian King Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: # 3.19+ Patchwork: https://patchwork.linux-mips.org/patch/18657/ Signed-off-by: James Hogan --- arch/mips/ath25/board.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/mips/ath25/board.c b/arch/mips/ath25/board.c index 9ab48ff80c1c8..6d11ae581ea77 100644 --- a/arch/mips/ath25/board.c +++ b/arch/mips/ath25/board.c @@ -135,6 +135,8 @@ int __init ath25_find_config(phys_addr_t base, unsigned long size) } board_data = kzalloc(BOARD_CONFIG_BUFSZ, GFP_KERNEL); + if (!board_data) + goto error; ath25_board.config = (struct ath25_boarddata *)board_data; memcpy_fromio(board_data, bcfg, 0x100); if (broken_boarddata) { From 902f4d067a50ccf645a58dd5fb1d113b6e0f9b5b Mon Sep 17 00:00:00 2001 From: Colin Ian King Date: Thu, 22 Feb 2018 18:08:53 +0000 Subject: [PATCH 032/336] MIPS: OCTEON: irq: Check for null return on kzalloc allocation The allocation of host_data is not null checked, leading to a null pointer dereference if the allocation fails. Fix this by adding a null check and return with -ENOMEM. Fixes: 64b139f97c01 ("MIPS: OCTEON: irq: add CIB and other fixes") Signed-off-by: Colin Ian King Acked-by: David Daney Cc: Ralf Baechle Cc: "Steven J. 
Hill" Cc: linux-mips@linux-mips.org Cc: # 4.0+ Patchwork: https://patchwork.linux-mips.org/patch/18658/ Signed-off-by: James Hogan --- arch/mips/cavium-octeon/octeon-irq.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/mips/cavium-octeon/octeon-irq.c b/arch/mips/cavium-octeon/octeon-irq.c index 5b3a3f6a9ad31..d99f5242169e7 100644 --- a/arch/mips/cavium-octeon/octeon-irq.c +++ b/arch/mips/cavium-octeon/octeon-irq.c @@ -2277,6 +2277,8 @@ static int __init octeon_irq_init_cib(struct device_node *ciu_node, } host_data = kzalloc(sizeof(*host_data), GFP_KERNEL); + if (!host_data) + return -ENOMEM; raw_spin_lock_init(&host_data->lock); addr = of_get_address(ciu_node, 0, NULL, NULL); From accd4f36a7d11c2d54544007eb65e10604dcf2f5 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 23 Feb 2018 08:12:42 -0800 Subject: [PATCH 033/336] percpu: add a schedule point in pcpu_balance_workfn() When a large BPF percpu map is destroyed, I have seen pcpu_balance_workfn() holding cpu for hundreds of milliseconds. On KASAN config and 112 hyperthreads, average time to destroy a chunk is ~4 ms. [ 2489.841376] destroy chunk 1 in 4148689 ns ... 
[ 2490.093428] destroy chunk 32 in 4072718 ns Signed-off-by: Eric Dumazet Signed-off-by: Tejun Heo --- mm/percpu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/percpu.c b/mm/percpu.c index fa3f854634a14..36e7b65ba6cf3 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1610,6 +1610,7 @@ static void pcpu_balance_workfn(struct work_struct *work) spin_unlock_irq(&pcpu_lock); } pcpu_destroy_chunk(chunk); + cond_resched(); } /* From 3b821409632ab778d46e807516b457dfa72736ed Mon Sep 17 00:00:00 2001 From: Al Viro Date: Fri, 23 Feb 2018 20:47:17 -0500 Subject: [PATCH 034/336] lock_parent() needs to recheck if dentry got __dentry_kill'ed under it In case when dentry passed to lock_parent() is protected from freeing only by the fact that it's on a shrink list and trylock of parent fails, we could get hit by __dentry_kill() (and subsequent dentry_kill(parent)) between unlocking dentry and locking presumed parent. We need to recheck that dentry is alive once we lock both it and parent *and* postpone rcu_read_unlock() until after that point. Otherwise we could return a pointer to struct dentry that already is rcu-scheduled for freeing, with ->d_lock held on it; caller's subsequent attempt to unlock it can end up with memory corruption. 
Cc: stable@vger.kernel.org # 3.12+, counting backports Signed-off-by: Al Viro --- fs/dcache.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 7c38f39958bc3..32aaab21e648a 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -647,11 +647,16 @@ static inline struct dentry *lock_parent(struct dentry *dentry) spin_unlock(&parent->d_lock); goto again; } - rcu_read_unlock(); - if (parent != dentry) + if (parent != dentry) { spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED); - else + if (unlikely(dentry->d_lockref.count < 0)) { + spin_unlock(&parent->d_lock); + parent = NULL; + } + } else { parent = NULL; + } + rcu_read_unlock(); return parent; } From 015555fd4d2930bc0c86952c46ad88b3392f66e4 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Mon, 19 Feb 2018 14:55:54 +0000 Subject: [PATCH 035/336] fs: dcache: Avoid livelock between d_alloc_parallel and __d_add If d_alloc_parallel runs concurrently with __d_add, it is possible for d_alloc_parallel to continuously retry whilst i_dir_seq has been incremented to an odd value by __d_add: CPU0: __d_add n = start_dir_add(dir); cmpxchg(&dir->i_dir_seq, n, n + 1) == n CPU1: d_alloc_parallel retry: seq = smp_load_acquire(&parent->d_inode->i_dir_seq) & ~1; hlist_bl_lock(b); bit_spin_lock(0, (unsigned long *)b); // Always succeeds CPU0: __d_lookup_done(dentry) hlist_bl_lock bit_spin_lock(0, (unsigned long *)b); // Never succeeds CPU1: if (unlikely(parent->d_inode->i_dir_seq != seq)) { hlist_bl_unlock(b); goto retry; } Since the simple bit_spin_lock used to implement hlist_bl_lock does not provide any fairness guarantees, then CPU1 can starve CPU0 of the lock and prevent it from reaching end_dir_add(dir), therefore CPU1 cannot exit its retry loop because the sequence number always has the bottom bit set. 
This patch resolves the livelock by not taking hlist_bl_lock in d_alloc_parallel if the sequence counter is odd, since any subsequent masked comparison with i_dir_seq will fail anyway. Cc: Peter Zijlstra Cc: Al Viro Reported-by: Naresh Madhusudana Acked-by: Peter Zijlstra (Intel) Reviewed-by: Matthew Wilcox Signed-off-by: Will Deacon Signed-off-by: Al Viro --- fs/dcache.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index 32aaab21e648a..bde3b6662601e 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2479,7 +2479,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, retry: rcu_read_lock(); - seq = smp_load_acquire(&parent->d_inode->i_dir_seq) & ~1; + seq = smp_load_acquire(&parent->d_inode->i_dir_seq); r_seq = read_seqbegin(&rename_lock); dentry = __d_lookup_rcu(parent, name, &d_seq); if (unlikely(dentry)) { @@ -2500,6 +2500,12 @@ struct dentry *d_alloc_parallel(struct dentry *parent, rcu_read_unlock(); goto retry; } + + if (unlikely(seq & 1)) { + rcu_read_unlock(); + goto retry; + } + hlist_bl_lock(b); if (unlikely(parent->d_inode->i_dir_seq != seq)) { hlist_bl_unlock(b); From 8cc07c808c9d595e81cbe5aad419b7769eb2e5c9 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Mon, 19 Feb 2018 14:55:55 +0000 Subject: [PATCH 036/336] fs: dcache: Use READ_ONCE when accessing i_dir_seq i_dir_seq is subject to concurrent modification by a cmpxchg or store-release operation, so ensure that the relaxed access in d_alloc_parallel uses READ_ONCE. 
Reported-by: Peter Zijlstra Signed-off-by: Will Deacon Signed-off-by: Al Viro --- fs/dcache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index bde3b6662601e..8945e6cabd93f 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2507,7 +2507,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, } hlist_bl_lock(b); - if (unlikely(parent->d_inode->i_dir_seq != seq)) { + if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) { hlist_bl_unlock(b); rcu_read_unlock(); goto retry; From 192b2e742c06af399e8eecb4a1726520bfccece8 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Mon, 26 Feb 2018 13:17:07 +1100 Subject: [PATCH 037/336] selftests/powerpc: Skip tm-trap if transactional memory is not enabled Some processor revisions do not support transactional memory, and additionally kernel support can be disabled. In either case the tm-trap test should be skipped, otherwise it will fail with a SIGILL. Fixes: a08082f8e4e1 ("powerpc/selftests: Check endianness on trap in TM") Signed-off-by: Michael Ellerman --- tools/testing/selftests/powerpc/tm/tm-trap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/powerpc/tm/tm-trap.c b/tools/testing/selftests/powerpc/tm/tm-trap.c index 5d92c23ee6cbd..179d592f0073c 100644 --- a/tools/testing/selftests/powerpc/tm/tm-trap.c +++ b/tools/testing/selftests/powerpc/tm/tm-trap.c @@ -255,6 +255,8 @@ int tm_trap_test(void) struct sigaction trap_sa; + SKIP_IF(!have_htm()); + trap_sa.sa_flags = SA_SIGINFO; trap_sa.sa_sigaction = trap_signal_handler; sigaction(SIGTRAP, &trap_sa, NULL); From 5a3386790a172cf738194e1574f631cd43c6140a Mon Sep 17 00:00:00 2001 From: Yong Deng Date: Mon, 26 Feb 2018 10:43:52 +0800 Subject: [PATCH 038/336] ASoC: sun4i-i2s: Fix RX slot number of SUN8I I2S's RX slot number of SUN8I should be shifted 4 bit to left. 
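To see why the shift matters, here is a small user-space sketch. It assumes the slot value is applied with a masked read-modify-write in the style of regmap_update_bits(); the demo_* names are hypothetical and only mirror the shape of the driver macros. The RX field occupies bits 6:4, so an unshifted value is masked away entirely.

```c
#include <stdint.h>

/* Same shape as the kernel's GENMASK(h, l), restricted to 32-bit values. */
#define DEMO_GENMASK(h, l) ((~0u << (l)) & (~0u >> (31 - (h))))

#define DEMO_RX_SLOT_NUM_MASK       DEMO_GENMASK(6, 4)      /* 0x70 */
#define DEMO_RX_SLOT_NUM_OLD(chan)  ((chan) - 1)            /* lands in bits 2:0 */
#define DEMO_RX_SLOT_NUM_NEW(chan)  (((chan) - 1) << 4)     /* lands in bits 6:4 */

/* Masked update: only the bits selected by mask are taken from val. */
static uint32_t demo_update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | (val & mask);
}
```

Under that assumption, a two-channel capture programmed with the unshifted macro leaves the RX field at 0, while the shifted macro stores 1 in bits 6:4 (register value 0x10).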
Fixes: 7d2993811a1e ("ASoC: sun4i-i2s: Add support for H3") Signed-off-by: Yong Deng Reviewed-by: Chen-Yu Tsai Signed-off-by: Mark Brown Cc: stable@vger.kernel.org --- sound/soc/sunxi/sun4i-i2s.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c index dca1143c1150a..a4aa931ebfaef 100644 --- a/sound/soc/sunxi/sun4i-i2s.c +++ b/sound/soc/sunxi/sun4i-i2s.c @@ -104,7 +104,7 @@ #define SUN8I_I2S_CHAN_CFG_REG 0x30 #define SUN8I_I2S_CHAN_CFG_RX_SLOT_NUM_MASK GENMASK(6, 4) -#define SUN8I_I2S_CHAN_CFG_RX_SLOT_NUM(chan) (chan - 1) +#define SUN8I_I2S_CHAN_CFG_RX_SLOT_NUM(chan) ((chan - 1) << 4) #define SUN8I_I2S_CHAN_CFG_TX_SLOT_NUM_MASK GENMASK(2, 0) #define SUN8I_I2S_CHAN_CFG_TX_SLOT_NUM(chan) (chan - 1) From b5095f24e791c2d05da7cbb3d99e2b420b36a273 Mon Sep 17 00:00:00 2001 From: Fengguang Wu Date: Tue, 6 Feb 2018 00:25:16 +0800 Subject: [PATCH 039/336] ovl: fix ptr_ret.cocci warnings fs/overlayfs/export.c:459:10-16: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci Fixes: 4b91c30a5a19 ("ovl: lookup connected ancestor of dir in inode cache") CC: Amir Goldstein Signed-off-by: Fengguang Wu Signed-off-by: Miklos Szeredi --- fs/overlayfs/export.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index 97a916ea8b86a..87bd4148f4fb5 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -620,7 +620,7 @@ static struct dentry *ovl_lookup_real(struct super_block *sb, if (err == -ECHILD) { this = ovl_lookup_real_ancestor(sb, real, layer); - err = IS_ERR(this) ? 
PTR_ERR(this) : 0; + err = PTR_ERR_OR_ZERO(this); } if (!err) { dput(connected); From 753e8abc36b2c966caea075db0c845563c8a19bf Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Fri, 23 Feb 2018 18:04:48 +0000 Subject: [PATCH 040/336] arm64: mm: fix thinko in non-global page table attribute check The routine pgattr_change_is_safe() was extended in commit 4e6020565596 ("arm64: mm: Permit transitioning from Global to Non-Global without BBM") to permit changing the nG attribute from not set to set, but did so in a way that inadvertently disallows such changes if other permitted attribute changes take place at the same time. So update the code to take this into account. Fixes: 4e6020565596 ("arm64: mm: Permit transitioning from Global to ...") Cc: # 4.14.x- Acked-by: Mark Rutland Reviewed-by: Marc Zyngier Acked-by: Will Deacon Signed-off-by: Ard Biesheuvel Signed-off-by: Catalin Marinas --- arch/arm64/mm/mmu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 84a019f550229..8c704f1e53c22 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -108,7 +108,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new) * The following mapping attributes may be updated in live * kernel mappings without the need for break-before-make. 
*/ - static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE; + static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG; /* creating or taking down mappings is always safe */ if (old == 0 || new == 0) @@ -118,9 +118,9 @@ static bool pgattr_change_is_safe(u64 old, u64 new) if ((old | new) & PTE_CONT) return false; - /* Transitioning from Global to Non-Global is safe */ - if (((old ^ new) == PTE_NG) && (new & PTE_NG)) - return true; + /* Transitioning from Non-Global to Global is unsafe */ + if (old & ~new & PTE_NG) + return false; return ((old ^ new) & ~mask) == 0; } From d1fe96c0e4de78ba0cd336ea3df3b850d06b9b9a Mon Sep 17 00:00:00 2001 From: Vivek Goyal Date: Fri, 2 Feb 2018 10:23:24 -0500 Subject: [PATCH 041/336] ovl: redirect_dir=nofollow should not follow redirect for opaque lower redirect_dir=nofollow should not follow a redirect. But in a specific configuration it can still follow it. For example try this. $ mkdir -p lower0 lower1/foo upper work merged $ touch lower1/foo/lower-file.txt $ setfattr -n "trusted.overlay.opaque" -v "y" lower1/foo $ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=on none merged $ cd merged $ mv foo foo-renamed $ umount merged # mount again. This time with redirect_dir=nofollow $ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=nofollow none merged $ ls merged/foo-renamed/ # This lists lower-file.txt, while it should not have. Basically, we are doing redirect check after we check for d.stop. And if this is not last lower, and we find an opaque lower, d.stop will be set. ovl_lookup_single() if (!d->last && ovl_is_opaquedir(this)) { d->stop = d->opaque = true; goto out; } To fix this, first check redirect is allowed. And after that check if d.stop has been set or not. 
Signed-off-by: Vivek Goyal Fixes: 438c84c2f0c7 ("ovl: don't follow redirects if redirect_dir=off") Cc: #v4.15 Signed-off-by: Miklos Szeredi --- fs/overlayfs/namei.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index de3e6da1d5a51..70fcfcc684cc0 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -913,9 +913,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, stack[ctr].layer = lower.layer; ctr++; - if (d.stop) - break; - /* * Following redirects can have security consequences: it's like * a symlink into the lower layer without the permission checks. @@ -933,6 +930,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, goto out_put; } + if (d.stop) + break; + if (d.redirect && d.redirect[0] == '/' && poe != roe) { poe = roe; /* Find the current layer on the root dentry */ From d716d9b702bb759dd6fb50804f10a174bd156d71 Mon Sep 17 00:00:00 2001 From: Yoshihiro Shimoda Date: Wed, 14 Feb 2018 18:40:12 +0900 Subject: [PATCH 042/336] dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3 According to R-Car Gen3 Rev.0.80 manual, the DMATCR can be set to 16,777,215 as maximum. So, this patch fixes the max_chunk_size for safety on all of SoCs. Otherwise, a system may hang if the DMATCR is set to 0 on R-Car Gen3. 
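The arithmetic behind this can be sketched as follows. This is an illustration only: the demo_* names are hypothetical, the mask is taken as 16,777,215 (0x00ffffff) per the limit quoted above, and the register is modeled as keeping just the low 24 bits of the transfer count. A chunk of the old maximum size then produces a count of 0, which is the value the message warns about.

```c
#include <stdint.h>

#define DEMO_DMATCR_MASK 0x00ffffffu  /* 16,777,215, the largest count that fits */

/* Old limit: one transfer unit more than the count field can hold. */
static uint32_t demo_max_chunk_old(unsigned int xfer_shift)
{
	return (DEMO_DMATCR_MASK + 1u) << xfer_shift;
}

/* New limit: exactly the largest representable count. */
static uint32_t demo_max_chunk_new(unsigned int xfer_shift)
{
	return DEMO_DMATCR_MASK << xfer_shift;
}

/* Assumed register model: the count is the size in transfer units, truncated to 24 bits. */
static uint32_t demo_dmatcr(uint32_t chunk_size, unsigned int xfer_shift)
{
	return (chunk_size >> xfer_shift) & DEMO_DMATCR_MASK;
}
```

For example, with a transfer shift of 2 the old limit yields a count of 0, whereas the new limit yields 0x00ffffff.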
Signed-off-by: Yoshihiro Shimoda Reviewed-by: Simon Horman Signed-off-by: Vinod Koul --- drivers/dma/sh/rcar-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c index e3ff162c03fc6..d0cacdb0713ec 100644 --- a/drivers/dma/sh/rcar-dmac.c +++ b/drivers/dma/sh/rcar-dmac.c @@ -917,7 +917,7 @@ rcar_dmac_chan_prep_sg(struct rcar_dmac_chan *chan, struct scatterlist *sgl, rcar_dmac_chan_configure_desc(chan, desc); - max_chunk_size = (RCAR_DMATCR_MASK + 1) << desc->xfer_shift; + max_chunk_size = RCAR_DMATCR_MASK << desc->xfer_shift; /* * Allocate and fill the transfer chunk descriptors. We own the only From 084a804e01205bcd74cd0849bc72cb5c88f8e648 Mon Sep 17 00:00:00 2001 From: Roger Quadros Date: Tue, 27 Feb 2018 12:41:41 +0200 Subject: [PATCH 043/336] usb: dwc3: Fix lock-up on ID change during system suspend/resume To reproduce the lock up do the following - connect otg host adapter and a USB device to the dual-role port so that it is in host mode. - suspend to mem. - disconnect otg adapter. - resume the system. If we call dwc3_host_exit() before tasks are thawed xhci_plat_remove() seems to lock up at the second usb_remove_hcd() call. To work around this we queue the _dwc3_set_mode() work on the system_freezable_wq. 
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly") Cc: # v4.12+ Suggested-by: Manu Gautam Signed-off-by: Roger Quadros Signed-off-by: Felipe Balbi --- drivers/usb/dwc3/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index f1d838a4acd61..e94bf91cc58a8 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -175,7 +175,7 @@ void dwc3_set_mode(struct dwc3 *dwc, u32 mode) dwc->desired_dr_role = mode; spin_unlock_irqrestore(&dwc->lock, flags); - queue_work(system_power_efficient_wq, &dwc->drd_work); + queue_work(system_freezable_wq, &dwc->drd_work); } u32 dwc3_core_fifo_space(struct dwc3_ep *dep, u8 type) From d7789f5bcdb298c4a302db471b1b20f74a20de95 Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Wed, 28 Feb 2018 10:31:10 +0000 Subject: [PATCH 044/336] ASoC: wm_adsp: For TLV controls only register TLV get/set Normal 512-byte get/set of a TLV isn't supported but we were registering the normal get/set anyway and relying on omitting the SNDRV_CTL_ELEM_ACCESS_[READ|WRITE] flags to prevent them being called. Trouble is if this gets broken in the core ALSA code - as it has been since at least 4.14 - the standard get/set can be called unexpectedly and corrupt memory. There's no point providing functions that won't be called and it's a trivial change. The benefit is that if the ALSA core gets broken again we get a big fat immediate NULL dereference instead of a memory corruption timebomb. 
Signed-off-by: Richard Fitzgerald Signed-off-by: Mark Brown Cc: stable@vger.kernel.org --- sound/soc/codecs/wm_adsp.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c index 66e32f5d2917f..989d093abda7e 100644 --- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -1204,12 +1204,14 @@ static int wmfw_add_ctl(struct wm_adsp *dsp, struct wm_coeff_ctl *ctl) kcontrol->put = wm_coeff_put_acked; break; default: - kcontrol->get = wm_coeff_get; - kcontrol->put = wm_coeff_put; - - ctl->bytes_ext.max = ctl->len; - ctl->bytes_ext.get = wm_coeff_tlv_get; - ctl->bytes_ext.put = wm_coeff_tlv_put; + if (kcontrol->access & SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK) { + ctl->bytes_ext.max = ctl->len; + ctl->bytes_ext.get = wm_coeff_tlv_get; + ctl->bytes_ext.put = wm_coeff_tlv_put; + } else { + kcontrol->get = wm_coeff_get; + kcontrol->put = wm_coeff_put; + } break; } From 64c3f648c25d108f346fdc96c15180c6b7d250e9 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Fri, 23 Feb 2018 12:55:59 -0800 Subject: [PATCH 045/336] powerpc/boot: Fix random libfdt related build errors Once in a while I see build errors similar to the following when building images from a clean tree. Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed ------------ Error log: arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error: libfdt.h: No such file or directory Building powerpc:bamboo:smpdev:44x/bamboo_defconfig ... failed ------------ Error log: arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error: libfdt.h: No such file or directory arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error: libfdt.h: No such file or directory Rebuilds will succeed. Turns out that several source files in arch/powerpc/boot/ include libfdt.h, but Makefile dependencies are incomplete. Let's fix that. 
Signed-off-by: Guenter Roeck Signed-off-by: Michael Ellerman --- arch/powerpc/boot/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index ef6549e571571..26d5d2a5b8e99 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -101,7 +101,8 @@ $(addprefix $(obj)/,$(zlib-y)): \ libfdt := fdt.c fdt_ro.c fdt_wip.c fdt_sw.c fdt_rw.c fdt_strerror.c libfdtheader := fdt.h libfdt.h libfdt_internal.h -$(addprefix $(obj)/,$(libfdt) libfdt-wrapper.o simpleboot.o epapr.o opal.o): \ +$(addprefix $(obj)/,$(libfdt) libfdt-wrapper.o simpleboot.o epapr.o opal.o \ + treeboot-akebono.o treeboot-currituck.o treeboot-iss4xx.o): \ $(addprefix $(obj)/,$(libfdtheader)) src-wlib-y := string.S crt0.S stdio.c decompress.c main.c \ From b7abbd5a3533a31a1e7d4696ea275df543440c51 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Wed, 28 Feb 2018 15:15:56 +1100 Subject: [PATCH 046/336] selftests/powerpc: Fix missing clean of pmu/lib.o The tm-resched-dscr test links against pmu/lib.o, but we don't have a rule to clean pmu/lib.o. This can lead to a build break if you build for big endian and then little, or vice versa. Fix it by making tm-resched-dscr depend on pmu/lib.c, causing the code to be built directly in, meaning no .o is generated. 
Signed-off-by: Michael Ellerman --- tools/testing/selftests/powerpc/tm/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/powerpc/tm/Makefile b/tools/testing/selftests/powerpc/tm/Makefile index a23453943ad2b..5c72ff978f278 100644 --- a/tools/testing/selftests/powerpc/tm/Makefile +++ b/tools/testing/selftests/powerpc/tm/Makefile @@ -16,7 +16,7 @@ $(OUTPUT)/tm-syscall: tm-syscall-asm.S $(OUTPUT)/tm-syscall: CFLAGS += -I../../../../../usr/include $(OUTPUT)/tm-tmspr: CFLAGS += -pthread $(OUTPUT)/tm-vmx-unavail: CFLAGS += -pthread -m64 -$(OUTPUT)/tm-resched-dscr: ../pmu/lib.o +$(OUTPUT)/tm-resched-dscr: ../pmu/lib.c $(OUTPUT)/tm-unavailable: CFLAGS += -O0 -pthread -m64 -Wno-error=uninitialized -mvsx $(OUTPUT)/tm-trap: CFLAGS += -O0 -pthread -m64 From 28b0f8a6962a24ed21737578f3b1b07424635c9e Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Tue, 13 Feb 2018 07:38:08 -0800 Subject: [PATCH 047/336] tty: make n_tty_read() always abort if hangup is in progress A tty is hung up by __tty_hangup() setting file->f_op to hung_up_tty_fops, which is skipped on ttys whose write operation isn't tty_write(). This means that, for example, /dev/console whose write op is redirected_tty_write() is never actually marked hung up. Because n_tty_read() uses the hung up status to decide whether to abort the waiting readers, the lack of hung-up marking can lead to the following scenario. 1. A session contains two processes. The leader and its child. The child ignores SIGHUP. 2. The leader exits and starts disassociating from the controlling terminal (/dev/console). 3. __tty_hangup() skips setting f_op to hung_up_tty_fops. 4. SIGHUP is delivered and ignored. 5. tty_ldisc_hangup() is invoked. It wakes up the waits which should clear the read lockers of tty->ldisc_sem. 6. The reader wakes up but because tty_hung_up_p() is false, it doesn't abort and goes back to sleep while read-holding tty->ldisc_sem. 7. 
The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup() and is now stuck in D sleep indefinitely waiting for tty->ldisc_sem.

The following is Alan's explanation on why some ttys aren't hung up.

 http://lkml.kernel.org/r/20171101170908.6ad08580@alans-desktop

 1. It broke the serial consoles because they would hang up and close down the hardware. With tty_port that *should* be fixable properly for any cases remaining.

 2. The console layer was (and still is) completely broken and doesn't refcount properly. So if you turn on console hangups it breaks (as indeed does freeing consoles and half a dozen other things).

As neither can be fixed quickly, this patch works around the problem by introducing a new flag, TTY_HUPPING, which is used solely to tell n_tty_read() that hang-up is in progress for the console and the readers should be aborted regardless of the hung-up status of the device.

The following is a sample hung task warning caused by this issue.

  INFO: task agetty:2662 blocked for more than 120 seconds.
        Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty #28
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  0 2662 1 0x00000086
  Call Trace:
   __schedule+0x267/0x890
   schedule+0x36/0x80
   schedule_timeout+0x23c/0x2e0
   ldsem_down_write+0xce/0x1f6
   tty_ldisc_lock+0x16/0x30
   tty_ldisc_hangup+0xb3/0x1b0
   __tty_hangup+0x300/0x410
   disassociate_ctty+0x6c/0x290
   do_exit+0x7ef/0xb00
   do_group_exit+0x3f/0xa0
   get_signal+0x1b3/0x5d0
   do_signal+0x28/0x660
   exit_to_usermode_loop+0x46/0x86
   do_syscall_64+0x9c/0xb0
   entry_SYSCALL64_slow_path+0x25/0x25

The following is the repro. Run "$PROG /dev/console". The parent process hangs in D state.
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#include <termios.h>

int main(int argc, char **argv)
{
	struct sigaction sact = { .sa_handler = SIG_IGN };
	struct timespec ts1s = { .tv_sec = 1 };
	pid_t pid;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "test-hung-tty /dev/$TTY\n");
		return 1;
	}

	/* fork a child to ensure that it isn't already the session leader */
	pid = fork();
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (pid > 0) {
		/* top parent, wait for everyone */
		while (waitpid(-1, NULL, 0) >= 0)
			;
		if (errno != ECHILD)
			perror("waitpid");
		return 0;
	}

	/* new session, start a new session and set the controlling tty */
	if (setsid() < 0) {
		perror("setsid");
		return 1;
	}

	fd = open(argv[1], O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	if (ioctl(fd, TIOCSCTTY, 1) < 0) {
		perror("ioctl");
		return 1;
	}

	/* fork a child, sleep a bit and exit */
	pid = fork();
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (pid > 0) {
		nanosleep(&ts1s, NULL);
		printf("Session leader exiting\n");
		exit(0);
	}

	/*
	 * The child ignores SIGHUP and keeps reading from the controlling
	 * tty. Because SIGHUP is ignored, the child doesn't get killed on
	 * parent exit and the bug in n_tty makes the read(2) block the
	 * parent's control terminal hangup attempt. The parent ends up in
	 * D sleep until the child is explicitly killed.
*/ sigaction(SIGHUP, &sact, NULL); printf("Child reading tty\n"); while (1) { char buf[1024]; if (read(fd, buf, sizeof(buf)) < 0) { perror("read"); return 1; } } return 0; } Signed-off-by: Tejun Heo Cc: Alan Cox Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/tty/n_tty.c | 6 ++++++ drivers/tty/tty_io.c | 9 +++++++++ include/linux/tty.h | 1 + 3 files changed, 16 insertions(+) diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c index 5c0e59e8fe46b..cbe98bc2b9982 100644 --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -2180,6 +2180,12 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, } if (tty_hung_up_p(file)) break; + /* + * Abort readers for ttys which never actually + * get hung up. See __tty_hangup(). + */ + if (test_bit(TTY_HUPPING, &tty->flags)) + break; if (!timeout) break; if (file->f_flags & O_NONBLOCK) { diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index eb9133b472f48..63114ea35ec1b 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -586,6 +586,14 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session) return; } + /* + * Some console devices aren't actually hung up for technical and + * historical reasons, which can lead to indefinite interruptible + * sleep in n_tty_read(). The following explicitly tells + * n_tty_read() to abort readers. + */ + set_bit(TTY_HUPPING, &tty->flags); + /* inuse_filps is protected by the single tty lock, this really needs to change if we want to flush the workqueue with the lock held */ @@ -640,6 +648,7 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session) * from the ldisc side, which is now guaranteed. 
*/ set_bit(TTY_HUPPED, &tty->flags); + clear_bit(TTY_HUPPING, &tty->flags); tty_unlock(tty); if (f) diff --git a/include/linux/tty.h b/include/linux/tty.h index 0a6c71e0ad01e..47f8af22f2168 100644 --- a/include/linux/tty.h +++ b/include/linux/tty.h @@ -364,6 +364,7 @@ struct tty_file_private { #define TTY_PTY_LOCK 16 /* pty private */ #define TTY_NO_WRITE_SPLIT 17 /* Preserve write boundaries to driver */ #define TTY_HUPPED 18 /* Post driver->hangup() */ +#define TTY_HUPPING 19 /* Hangup in progress */ #define TTY_LDISC_HALTED 22 /* Line discipline is halted */ /* Values for tty->flow_change */ From fd63a8903a2c40425a9811c3371dd4d0f42c0ad3 Mon Sep 17 00:00:00 2001 From: Jonas Danielsson Date: Mon, 29 Jan 2018 12:39:15 +0100 Subject: [PATCH 048/336] tty/serial: atmel: add new version check for usart On our at91sam9260 based board the usart0 and usart1 ports report their versions (ATMEL_US_VERSION) as 0x10302. This version is not included in the current checks in the driver. Signed-off-by: Jonas Danielsson Acked-by: Richard Genoud Acked-by: Nicolas Ferre Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/atmel_serial.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c index df46a9e88c34d..e287fe8f10fc0 100644 --- a/drivers/tty/serial/atmel_serial.c +++ b/drivers/tty/serial/atmel_serial.c @@ -1734,6 +1734,7 @@ static void atmel_get_ip_name(struct uart_port *port) switch (version) { case 0x302: case 0x10213: + case 0x10302: dev_dbg(port->dev, "This version is usart\n"); atmel_port->has_frac_baudrate = true; atmel_port->has_hw_timer = true; From e7f3e99cb1a667d04d60d02957fbed58b50d4e5a Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Fri, 2 Feb 2018 20:39:13 +0200 Subject: [PATCH 049/336] serial: 8250_pci: Don't fail on multiport card class Do not fail on multiport cards in serial_pci_is_class_communication(). 
It restores behaviour for SUNIX multiport cards, that enumerated by class and have a custom board data. Moreover it allows users to reenumerate port-by-port from user space. Fixes: 7d8905d06405 ("serial: 8250_pci: Enable device after we check black list") Reported-by: Nikola Ciprich Signed-off-by: Andy Shevchenko Tested-by: Nikola Ciprich Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_pci.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c index 54adf8d563501..d580625acc793 100644 --- a/drivers/tty/serial/8250/8250_pci.c +++ b/drivers/tty/serial/8250/8250_pci.c @@ -3387,11 +3387,9 @@ static int serial_pci_is_class_communication(struct pci_dev *dev) /* * If it is not a communications device or the programming * interface is greater than 6, give up. - * - * (Should we try to make guesses for multiport serial devices - * later?) */ if ((((dev->class >> 8) != PCI_CLASS_COMMUNICATION_SERIAL) && + ((dev->class >> 8) != PCI_CLASS_COMMUNICATION_MULTISERIAL) && ((dev->class >> 8) != PCI_CLASS_COMMUNICATION_MODEM)) || (dev->class & 0xff) > 6) return -ENODEV; @@ -3428,6 +3426,12 @@ serial_pci_guess_board(struct pci_dev *dev, struct pciserial_board *board) { int num_iomem, num_port, first_port = -1, i; + /* + * Should we try to make guesses for multiport serial devices later? + */ + if ((dev->class >> 8) == PCI_CLASS_COMMUNICATION_MULTISERIAL) + return -ENODEV; + num_iomem = num_port = 0; for (i = 0; i < PCI_NUM_BAR_RESOURCES; i++) { if (pci_resource_flags(dev, i) & IORESOURCE_IO) { From 714569064adee3c114a2a6490735b94abe269068 Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior Date: Sat, 3 Feb 2018 12:27:23 +0100 Subject: [PATCH 050/336] serial: core: mark port as initialized in autoconfig This is a followup on 44117a1d1732 ("serial: core: mark port as initialized after successful IRQ change"). 
Nikola has been using autoconfig via setserial and reported a crash similar to what I fixed in the earlier mentioned commit. Here I do the same fixup for the autoconfig. I wasn't sure that this is the right approach. Nikola confirmed that it fixes his crash. Fixes: b3b576461864 ("tty: serial_core: convert uart_open to use tty_port_open") Link: http://lkml.kernel.org/r/20180131072000.GD1853@localhost.localdomain Reported-by: Nikola Ciprich Tested-by: Nikola Ciprich Cc: Signed-off-by: Sebastian Andrzej Siewior Tested-by: Nikola Ciprich Acked-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index c8dde56b532b2..35b9201db3b4b 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -1144,6 +1144,8 @@ static int uart_do_autoconfig(struct tty_struct *tty,struct uart_state *state) uport->ops->config_port(uport, flags); ret = uart_startup(tty, state, 1); + if (ret == 0) + tty_port_set_initialized(port, true); if (ret > 0) ret = 0; } From 1f66dd36bb18437397ea0d7882c52f7e3c476e15 Mon Sep 17 00:00:00 2001 From: Greentime Hu Date: Tue, 13 Feb 2018 17:09:08 +0800 Subject: [PATCH 051/336] earlycon: add reg-offset to physical address before mapping It will get the wrong virtual address because port->mapbase is not added the correct reg-offset yet. 
We have to update it before earlycon_map() is called Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann Cc: Peter Hurley Cc: stable@vger.kernel.org Fixes: 088da2a17619 ("of: earlycon: Initialize port fields from DT properties") Acked-by: Rob Herring Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/earlycon.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c index 870e84fb6e39e..a24278380fec2 100644 --- a/drivers/tty/serial/earlycon.c +++ b/drivers/tty/serial/earlycon.c @@ -245,11 +245,12 @@ int __init of_setup_earlycon(const struct earlycon_id *match, } port->mapbase = addr; port->uartclk = BASE_BAUD * 16; - port->membase = earlycon_map(port->mapbase, SZ_4K); val = of_get_flat_dt_prop(node, "reg-offset", NULL); if (val) port->mapbase += be32_to_cpu(*val); + port->membase = earlycon_map(port->mapbase, SZ_4K); + val = of_get_flat_dt_prop(node, "reg-shift", NULL); if (val) port->regshift = be32_to_cpu(*val); From 9f2068f35729948bde84d87a40d135015911345d Mon Sep 17 00:00:00 2001 From: Nikola Ciprich Date: Tue, 13 Feb 2018 15:04:46 +0100 Subject: [PATCH 052/336] serial: 8250_pci: Add Brainboxes UC-260 4 port serial device Add PCI ids for two variants of Brainboxes UC-260 quad port PCI serial cards. 
Suggested-by: Andy Shevchenko Signed-off-by: Nikola Ciprich Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_pci.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c index d580625acc793..a93f77ab3da08 100644 --- a/drivers/tty/serial/8250/8250_pci.c +++ b/drivers/tty/serial/8250/8250_pci.c @@ -4702,6 +4702,17 @@ static const struct pci_device_id serial_pci_tbl[] = { { PCI_VENDOR_ID_INTASHIELD, PCI_DEVICE_ID_INTASHIELD_IS400, PCI_ANY_ID, PCI_ANY_ID, 0, 0, /* 135a.0dc0 */ pbn_b2_4_115200 }, + /* + * BrainBoxes UC-260 + */ + { PCI_VENDOR_ID_INTASHIELD, 0x0D21, + PCI_ANY_ID, PCI_ANY_ID, + PCI_CLASS_COMMUNICATION_MULTISERIAL << 8, 0xffff00, + pbn_b2_4_115200 }, + { PCI_VENDOR_ID_INTASHIELD, 0x0E34, + PCI_ANY_ID, PCI_ANY_ID, + PCI_CLASS_COMMUNICATION_MULTISERIAL << 8, 0xffff00, + pbn_b2_4_115200 }, /* * Perle PCI-RAS cards */ From 7842055bfce4bf0170d0f61df8b2add8399697be Mon Sep 17 00:00:00 2001 From: Ulrich Hecht Date: Thu, 15 Feb 2018 13:02:27 +0100 Subject: [PATCH 053/336] serial: sh-sci: prevent lockup on full TTY buffers When the TTY buffers fill up to the configured maximum, a system lockup occurs: [ 598.820128] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 598.825796] 0-...!: (1 GPs behind) idle=5a6/2/0 softirq=1974/1974 fqs=1 [ 598.832577] (detected by 3, t=62517 jiffies, g=296, c=295, q=126) [ 598.838755] Task dump for CPU 0: [ 598.841977] swapper/0 R running task 0 0 0 0x00000022 [ 598.849023] Call trace: [ 598.851476] __switch_to+0x98/0xb0 [ 598.854870] (null) This can be prevented by doing a dummy read of the RX data register. This issue affects both HSCIF and SCIF ports. Reported for R-Car H3 ES2.0; reproduced and fixed on H3 ES1.1. Probably affects other R-Car platforms as well. 
Reported-by: Yoshihiro Shimoda Signed-off-by: Ulrich Hecht Reviewed-by: Geert Uytterhoeven Cc: stable Tested-by: Nguyen Viet Dung Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/sh-sci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c index 7257c078e1554..44adf9db38f89 100644 --- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -885,6 +885,8 @@ static void sci_receive_chars(struct uart_port *port) /* Tell the rest of the system the news. New characters! */ tty_flip_buffer_push(tport); } else { + /* TTY buffers full; read from RX reg to prevent lockup */ + serial_port_in(port, SCxRDR); serial_port_in(port, SCxSR); /* dummy read */ sci_clear_SCxSR(port, SCxSR_RDxF_CLEAR(port)); } From 5d7f77ec72d10c421bc33958f06a5583f2d27ed6 Mon Sep 17 00:00:00 2001 From: phil eichinger Date: Mon, 19 Feb 2018 10:24:15 +0100 Subject: [PATCH 054/336] serial: imx: fix bogus dev_err MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Only one of the two is really required, not both: * have_rtscts or * have_rtsgpio In imx_rs485_config() this is done correctly, so RS485 is working, just the error message is false. 
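As a standalone illustration (not part of the patch), the two conditions can be compared directly. The function names demo_warns_old() and demo_warns_new() are made up for this sketch; only the boolean expressions mirror the check described above.

```c
#include <stdbool.h>

/* Check before the patch: complains unless BOTH RTS sources exist. */
static bool demo_warns_old(bool have_rtscts, bool have_rtsgpio)
{
	return !have_rtscts || !have_rtsgpio;
}

/* Check after the patch: complains only when NEITHER source exists. */
static bool demo_warns_new(bool have_rtscts, bool have_rtsgpio)
{
	return !have_rtscts && !have_rtsgpio;
}
```

For a port with have_rtscts set and no RTS GPIO, the old expression is true (bogus error) while the new one is false; both agree when neither source is present.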
Signed-off-by: Phil Eichinger Reviewed-by: Fabio Estevam Fixes: b8f3bff057b0 ("serial: imx: Support common rs485 binding for RTS polarity" Acked-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c index 1d7ca382bc12b..a33c685af9907 100644 --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -2093,7 +2093,7 @@ static int serial_imx_probe(struct platform_device *pdev) uart_get_rs485_mode(&pdev->dev, &sport->port.rs485); if (sport->port.rs485.flags & SER_RS485_ENABLED && - (!sport->have_rtscts || !sport->have_rtsgpio)) + (!sport->have_rtscts && !sport->have_rtsgpio)) dev_err(&pdev->dev, "no RTS control, disabling rs485\n"); imx_rs485_config(&sport->port, &sport->port.rs485); From b08e5fd90bfc7553d36fa42a03fb7f5e82d252eb Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Mon, 26 Feb 2018 16:10:56 +0000 Subject: [PATCH 055/336] arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook Commit 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug") moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks, regardless of whether they are implemented as PPIs or SPIs. This can lead to us sleeping from atomic context due to disable_irq blocking: | BUG: sleeping function called from invalid context at kernel/irq/manage.c:112 | in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1 | no locks held by migration/1/15. 
| irq event stamp: 192 | hardirqs last enabled at (191): [<00000000803c2507>] | _raw_spin_unlock_irq+0x2c/0x4c | hardirqs last disabled at (192): [<000000007f57ad28>] multi_cpu_stop+0x9c/0x140 | softirqs last enabled at (0): [<0000000004ee1b58>] | copy_process.isra.77.part.78+0x43c/0x1504 | softirqs last disabled at (0): [< (null)>] (null) | CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651 | Hardware name: Renesas Salvator-X board based on r8a7796 (DT) | Call trace: | dump_backtrace+0x0/0x140 | show_stack+0x14/0x1c | dump_stack+0xb4/0xf0 | ___might_sleep+0x1fc/0x218 | __might_sleep+0x70/0x80 | synchronize_irq+0x40/0xa8 | disable_irq+0x20/0x2c | arm_perf_teardown_cpu+0x80/0xac Since the interrupt is always CPU-affine and this code is running with interrupts disabled, we can just use disable_irq_nosync as we know there isn't a concurrent invocation of the handler to worry about. Fixes: 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug") Reported-by: Geert Uytterhoeven Tested-by: Geert Uytterhoeven Acked-by: Mark Rutland Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas --- drivers/perf/arm_pmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 0c2ed11c06030..f63db346c2197 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -638,7 +638,7 @@ static int arm_perf_teardown_cpu(unsigned int cpu, struct hlist_node *node) if (irq_is_percpu_devid(irq)) disable_percpu_irq(irq); else - disable_irq(irq); + disable_irq_nosync(irq); } per_cpu(cpu_armpmu, cpu) = NULL; From da343b6d90e11132f1e917d865d88ee35d6e6d00 Mon Sep 17 00:00:00 2001 From: Sergey Gorenko Date: Sun, 25 Feb 2018 13:39:48 +0200 Subject: [PATCH 056/336] IB/mlx5: Fix incorrect size of klms in the memory region The value of mr->ndescs greater than mr->max_descs is set in the function mlx5_ib_sg_to_klms() if sg_nents is greater than mr->max_descs. 
This is an invalid value and it causes the following error when registering mr: mlx5_0:dump_cqe:276:(pid 193): dump error cqe 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 0f 00 78 06 25 00 00 8b 08 1e 8f d3 Cc: # 4.5 Fixes: b005d3164713 ("mlx5: Add arbitrary sg list support") Signed-off-by: Sergey Gorenko Tested-by: Laurence Oberman Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx5/mr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 556e015678de2..1961c6a454372 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1816,7 +1816,6 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr, mr->ibmr.iova = sg_dma_address(sg) + sg_offset; mr->ibmr.length = 0; - mr->ndescs = sg_nents; for_each_sg(sgl, sg, sg_nents, i) { if (unlikely(i >= mr->max_descs)) @@ -1828,6 +1827,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr, sg_offset = 0; } + mr->ndescs = i; if (sg_offset_p) *sg_offset_p = sg_offset; From e7b169f34403becd3c9fd3b6e46614ab788f2187 Mon Sep 17 00:00:00 2001 From: Noa Osherovich Date: Sun, 25 Feb 2018 13:39:51 +0200 Subject: [PATCH 057/336] IB/mlx5: Avoid passing an invalid QP type to firmware During QP creation, the mlx5 driver translates the QP type to an internal value which is passed on to FW. There was no check to make sure that the translated value is valid, and -EINVAL was coerced into the mailbox command. Current firmware refuses this as an invalid QP type, but future/past firmware may do something else. 
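The failure mode can be sketched in plain C, outside the driver. Everything below is hypothetical (the demo_* names, the 8-bit field width and the type codes are assumptions for illustration, not the mlx5 definitions); it only shows how a negative errno gets truncated into a command field when the translation result is not checked first.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical QP types; the last one has no firmware equivalent. */
enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_UD, DEMO_QPT_BOGUS };

/* Translate to a made-up firmware code, or -EINVAL if unknown. */
static int demo_to_st(enum demo_qp_type type)
{
	switch (type) {
	case DEMO_QPT_RC:
		return 0x0;
	case DEMO_QPT_UD:
		return 0x2;
	default:
		return -EINVAL;
	}
}

/* Model of writing a value into an (assumed) 8-bit command field. */
static uint32_t demo_set_field8(int value)
{
	return (uint32_t)value & 0xff;
}

/* Validate the translation up front, before anything is built. */
static int demo_create_qp(enum demo_qp_type type, uint32_t *st_field)
{
	int st = demo_to_st(type);

	if (st < 0)
		return -EINVAL;

	*st_field = demo_set_field8(st);
	return 0;
}
```

Without the early check, demo_set_field8(demo_to_st(DEMO_QPT_BOGUS)) would place the low byte of -EINVAL in the field; with it, demo_create_qp() fails before the field is written.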
Fixes: 09a7d9eca1a6c ('{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc') Reviewed-by: Ilya Lesokhin Signed-off-by: Noa Osherovich Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx5/qp.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 39d24bf694a86..e8d7eaf0670ca 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1584,6 +1584,7 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd, u32 uidx = MLX5_IB_DEFAULT_UIDX; struct mlx5_ib_create_qp ucmd; struct mlx5_ib_qp_base *base; + int mlx5_st; void *qpc; u32 *in; int err; @@ -1592,6 +1593,10 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd, spin_lock_init(&qp->sq.lock); spin_lock_init(&qp->rq.lock); + mlx5_st = to_mlx5_st(init_attr->qp_type); + if (mlx5_st < 0) + return -EINVAL; + if (init_attr->rwq_ind_tbl) { if (!udata) return -ENOSYS; @@ -1753,7 +1758,7 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd, qpc = MLX5_ADDR_OF(create_qp_in, in, qpc); - MLX5_SET(qpc, qpc, st, to_mlx5_st(init_attr->qp_type)); + MLX5_SET(qpc, qpc, st, mlx5_st); MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); if (init_attr->qp_type != MLX5_IB_QPT_REG_UMR) From aba462134634b502d720e15b23154f21cfa277e5 Mon Sep 17 00:00:00 2001 From: Daniel Jurgens Date: Sun, 25 Feb 2018 13:39:53 +0200 Subject: [PATCH 058/336] {net, IB}/mlx5: Raise fatal IB event when sys error occurs All other mlx5_events report the port number as 1 based, which is how FW reports it in the port event EQE. Reporting 0 for this event causes mlx5_ib to not raise a fatal event notification to registered clients due to a seemingly invalid port. All switch cases in mlx5_ib_event that go through the port check are supposed to set the port now, so just do it once at variable declaration. 
Fixes: 89d44f0a6c73("net/mlx5_core: Add pci error handlers to mlx5_core driver") Reviewed-by: Majd Dibbiny Signed-off-by: Daniel Jurgens Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx5/main.c | 11 ++--------- drivers/net/ethernet/mellanox/mlx5/core/health.c | 2 +- 2 files changed, 3 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 4236c80868200..bab38c6647d73 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3263,7 +3263,7 @@ static void mlx5_ib_handle_event(struct work_struct *_work) struct mlx5_ib_dev *ibdev; struct ib_event ibev; bool fatal = false; - u8 port = 0; + u8 port = (u8)work->param; if (mlx5_core_is_mp_slave(work->dev)) { ibdev = mlx5_ib_get_ibdev_from_mpi(work->context); @@ -3283,8 +3283,6 @@ static void mlx5_ib_handle_event(struct work_struct *_work) case MLX5_DEV_EVENT_PORT_UP: case MLX5_DEV_EVENT_PORT_DOWN: case MLX5_DEV_EVENT_PORT_INITIALIZED: - port = (u8)work->param; - /* In RoCE, port up/down events are handled in * mlx5_netdev_event(). 
*/ @@ -3298,24 +3296,19 @@ static void mlx5_ib_handle_event(struct work_struct *_work) case MLX5_DEV_EVENT_LID_CHANGE: ibev.event = IB_EVENT_LID_CHANGE; - port = (u8)work->param; break; case MLX5_DEV_EVENT_PKEY_CHANGE: ibev.event = IB_EVENT_PKEY_CHANGE; - port = (u8)work->param; - schedule_work(&ibdev->devr.ports[port - 1].pkey_change_work); break; case MLX5_DEV_EVENT_GUID_CHANGE: ibev.event = IB_EVENT_GID_CHANGE; - port = (u8)work->param; break; case MLX5_DEV_EVENT_CLIENT_REREG: ibev.event = IB_EVENT_CLIENT_REREGISTER; - port = (u8)work->param; break; case MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT: schedule_work(&ibdev->delay_drop.delay_drop_work); @@ -3327,7 +3320,7 @@ static void mlx5_ib_handle_event(struct work_struct *_work) ibev.device = &ibdev->ib_dev; ibev.element.port_num = port; - if (port < 1 || port > ibdev->num_ports) { + if (!rdma_is_port_valid(&ibdev->ib_dev, port)) { mlx5_ib_warn(ibdev, "warning: event on port %d\n", port); goto out; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c index 21d29f7936f6c..d39b0b7011b2d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -124,7 +124,7 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force) trigger_cmd_completions(dev); } - mlx5_core_event(dev, MLX5_DEV_EVENT_SYS_ERROR, 0); + mlx5_core_event(dev, MLX5_DEV_EVENT_SYS_ERROR, 1); mlx5_core_err(dev, "end\n"); unlock: From 65389322b28f81cc137b60a41044c2d958a7b950 Mon Sep 17 00:00:00 2001 From: Moni Shoua Date: Sun, 25 Feb 2018 13:39:54 +0200 Subject: [PATCH 059/336] IB/mlx: Set slid to zero in Ethernet completion struct IB spec says that a lid should be ignored when link layer is Ethernet, for example when building or parsing a CM request message (CA17-34). However, since ib_lid_be16() and ib_lid_cpu16() validates the slid, not only when link layer is IB, we set the slid to zero to prevent false warnings in the kernel log. 
Fixes: 62ede7779904 ("Add OPA extended LID support") Reviewed-by: Majd Dibbiny Signed-off-by: Moni Shoua Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx4/cq.c | 4 +++- drivers/infiniband/hw/mlx5/cq.c | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 9a566ee3ceffe..82adc0d1d30ef 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -601,6 +601,7 @@ static void use_tunnel_data(struct mlx4_ib_qp *qp, struct mlx4_ib_cq *cq, struct wc->dlid_path_bits = 0; if (is_eth) { + wc->slid = 0; wc->vlan_id = be16_to_cpu(hdr->tun.sl_vid); memcpy(&(wc->smac[0]), (char *)&hdr->tun.mac_31_0, 4); memcpy(&(wc->smac[4]), (char *)&hdr->tun.slid_mac_47_32, 2); @@ -851,7 +852,6 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, } } - wc->slid = be16_to_cpu(cqe->rlid); g_mlpath_rqpn = be32_to_cpu(cqe->g_mlpath_rqpn); wc->src_qp = g_mlpath_rqpn & 0xffffff; wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; @@ -860,6 +860,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->wc_flags |= mlx4_ib_ipoib_csum_ok(cqe->status, cqe->checksum) ? 
IB_WC_IP_CSUM_OK : 0; if (is_eth) { + wc->slid = 0; wc->sl = be16_to_cpu(cqe->sl_vid) >> 13; if (be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_CVLAN_PRESENT_MASK) { @@ -871,6 +872,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, memcpy(wc->smac, cqe->smac, ETH_ALEN); wc->wc_flags |= (IB_WC_WITH_VLAN | IB_WC_WITH_SMAC); } else { + wc->slid = be16_to_cpu(cqe->rlid); wc->sl = be16_to_cpu(cqe->sl_vid) >> 12; wc->vlan_id = 0xffff; } diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index 5b974fb97611b..b5cfdaa9c7c8c 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -226,7 +226,6 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, wc->ex.invalidate_rkey = be32_to_cpu(cqe->imm_inval_pkey); break; } - wc->slid = be16_to_cpu(cqe->slid); wc->src_qp = be32_to_cpu(cqe->flags_rqpn) & 0xffffff; wc->dlid_path_bits = cqe->ml_path; g = (be32_to_cpu(cqe->flags_rqpn) >> 28) & 3; @@ -241,10 +240,12 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, } if (ll != IB_LINK_LAYER_ETHERNET) { + wc->slid = be16_to_cpu(cqe->slid); wc->sl = (be32_to_cpu(cqe->flags_rqpn) >> 24) & 0xf; return; } + wc->slid = 0; vlan_present = cqe->l4_l3_hdr_type & 0x1; roce_packet_type = (be32_to_cpu(cqe->flags_rqpn) >> 24) & 0x3; if (vlan_present) { From 2fb4f4eadd180a50112618dd9c5fef7fc50d4f08 Mon Sep 17 00:00:00 2001 From: Parav Pandit Date: Sun, 25 Feb 2018 13:39:56 +0200 Subject: [PATCH 060/336] IB/core: Fix missing RDMA cgroups release in case of failure to register device During IB device registration process, if query_device() fails or if ib_core fails to registers sysfs entries, rdma cgroup cleanup is skipped. 
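As a rough illustration of why the jump target matters: error labels in this style fall through in reverse setup order, so a failure must jump to the label that undoes the most recent successful step. The sketch below is not the ib_core code; register_sketch(), undo_cgroup(), undo_cache() and the fail_at parameter are hypothetical names used only to make the unwind order visible.

```c
#include <string.h>

/* Hypothetical unwind log: each teardown appends one letter. */
static char unwind_log[8];

static void log_step(char c)
{
	size_t n = strlen(unwind_log);

	if (n + 1 < sizeof(unwind_log)) {
		unwind_log[n] = c;
		unwind_log[n + 1] = '\0';
	}
}

static void undo_cgroup(void) { log_step('g'); }	/* stands in for the cgroup unregister */
static void undo_cache(void)  { log_step('c'); }	/* stands in for the cache cleanup */

/*
 * fail_at: 0 = success,
 *          1 = the cgroup step itself fails (nothing of it to undo),
 *          2 = a later step fails (cgroup state must be undone too).
 */
static int register_sketch(int fail_at)
{
	int ret;

	unwind_log[0] = '\0';

	/* cache setup is assumed to have succeeded before this point */
	if (fail_at == 1) {
		ret = -1;
		goto cache_cleanup;
	}
	if (fail_at == 2) {
		ret = -2;
		goto cg_cleanup;	/* jumping to cache_cleanup here would leak */
	}
	return 0;

cg_cleanup:
	undo_cgroup();
cache_cleanup:
	undo_cache();
	return ret;
}
```

With fail_at == 2 the log reads "gc": the cgroup state is released first, then the cache. Jumping straight to the lower label would leave the cgroup step undone, which is the kind of leak described above.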
Cc: # v4.2+ Fixes: 4be3a4fa51f4 ("IB/core: Fix kernel crash during fail to initialize device") Reviewed-by: Daniel Jurgens Signed-off-by: Parav Pandit Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/device.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index e8010e73a1cf4..bb065c9449be4 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -536,14 +536,14 @@ int ib_register_device(struct ib_device *device, ret = device->query_device(device, &device->attrs, &uhw); if (ret) { pr_warn("Couldn't query the device attributes\n"); - goto cache_cleanup; + goto cg_cleanup; } ret = ib_device_register_sysfs(device, port_callback); if (ret) { pr_warn("Couldn't register device %s with driver model\n", device->name); - goto cache_cleanup; + goto cg_cleanup; } device->reg_state = IB_DEV_REGISTERED; @@ -559,6 +559,8 @@ int ib_register_device(struct ib_device *device, mutex_unlock(&device_mutex); return 0; +cg_cleanup: + ib_device_unregister_rdmacg(device); cache_cleanup: ib_cache_cleanup_one(device); ib_cache_release_one(device); From a45bc17b360d75fac9ced85e99fda14bf38b4dc3 Mon Sep 17 00:00:00 2001 From: Devesh Sharma Date: Mon, 26 Feb 2018 01:51:37 -0800 Subject: [PATCH 061/336] RDMA/bnxt_re: Unconditionly fence non wire memory operations HW requires an unconditional fence for all non-wire memory operations through SQ. This guarantees the completion of these memory operations.
Signed-off-by: Devesh Sharma Signed-off-by: Selvin Xavier Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 643174d949a8c..755f1ccd82bbf 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -2227,10 +2227,13 @@ static int bnxt_re_build_inv_wqe(struct ib_send_wr *wr, wqe->type = BNXT_QPLIB_SWQE_TYPE_LOCAL_INV; wqe->local_inv.inv_l_key = wr->ex.invalidate_rkey; + /* Need unconditional fence for local invalidate + * opcode to work as expected. + */ + wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE; + if (wr->send_flags & IB_SEND_SIGNALED) wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP; - if (wr->send_flags & IB_SEND_FENCE) - wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE; if (wr->send_flags & IB_SEND_SOLICITED) wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT; @@ -2251,8 +2254,12 @@ static int bnxt_re_build_reg_wqe(struct ib_reg_wr *wr, wqe->frmr.levels = qplib_frpl->hwq.level + 1; wqe->type = BNXT_QPLIB_SWQE_TYPE_REG_MR; - if (wr->wr.send_flags & IB_SEND_FENCE) - wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE; + /* Need unconditional fence for reg_mr + * opcode to function as expected. + */ + + wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE; + if (wr->wr.send_flags & IB_SEND_SIGNALED) wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP; From c354dff00db8df80f271418d8392065e10ffffb6 Mon Sep 17 00:00:00 2001 From: Devesh Sharma Date: Mon, 26 Feb 2018 01:51:38 -0800 Subject: [PATCH 062/336] RDMA/bnxt_re: Fix incorrect DB offset calculation To support host systems with non 4K page size, l2_db_size shall be calculated with 4096 instead of PAGE_SIZE. Also, supply the host page size to FW during initialization. 
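The arithmetic behind both changes can be sketched as follows. This is only an illustration, assuming the firmware reports the doorbell space in 4 KiB units minus one and expects the page size as log2(host page size) - 12; the helper names l2_db_size() and log2_dbr_pg_size() are made up for the example.

```c
#include <stdint.h>

#define DBR_BASE_PAGE_SHIFT 12	/* 4 KiB base page, mirrors RCFW_DBR_BASE_PAGE_SHIFT */

/* Doorbell region size in bytes, independent of the host PAGE_SIZE. */
static uint64_t l2_db_size(uint32_t l2_db_space_size)
{
	return ((uint64_t)l2_db_space_size + 1) << DBR_BASE_PAGE_SHIFT;
}

/* Value supplied to firmware: log2(host page size) - 12. */
static uint16_t log2_dbr_pg_size(unsigned int host_page_shift)
{
	return (uint16_t)(host_page_shift - DBR_BASE_PAGE_SHIFT);
}
```

For a reported value of 3 this gives 16 KiB on any host. Multiplying by PAGE_SIZE would instead give 256 KiB on a 64K-page host. A 64K-page host (page shift 16) passes 4 to firmware, and a 4K-page host passes 0.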
Signed-off-by: Devesh Sharma Signed-off-by: Selvin Xavier Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 6 +++++- drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 1 + drivers/infiniband/hw/bnxt_re/qplib_sp.c | 3 ++- drivers/infiniband/hw/bnxt_re/roce_hsi.h | 25 +++++++++++++++++++++- 4 files changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 8329ec6a79469..14d153d4013ca 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -460,7 +460,11 @@ int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw, int rc; RCFW_CMD_PREP(req, INITIALIZE_FW, cmd_flags); - + /* Supply (log-base-2-of-host-page-size - base-page-shift) + * to bono to adjust the doorbell page sizes. + */ + req.log2_dbr_pg_size = cpu_to_le16(PAGE_SHIFT - + RCFW_DBR_BASE_PAGE_SHIFT); /* * VFs need not setup the HW context area, PF * shall setup this area for VF. 
Skipping the diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h index 6bee6e3636ea4..c7cce2e4185e6 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h @@ -49,6 +49,7 @@ #define RCFW_COMM_SIZE 0x104 #define RCFW_DBR_PCI_BAR_REGION 2 +#define RCFW_DBR_BASE_PAGE_SHIFT 12 #define RCFW_CMD_PREP(req, CMD, cmd_flags) \ do { \ diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 03057983341f7..ee98e5efef846 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -139,7 +139,8 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, attr->max_pkey = le32_to_cpu(sb->max_pkeys); attr->max_inline_data = le32_to_cpu(sb->max_inline_data); - attr->l2_db_size = (sb->l2_db_space_size + 1) * PAGE_SIZE; + attr->l2_db_size = (sb->l2_db_space_size + 1) * + (0x01 << RCFW_DBR_BASE_PAGE_SHIFT); attr->max_sgid = le32_to_cpu(sb->max_gid); bnxt_qplib_query_version(rcfw, attr->fw_ver); diff --git a/drivers/infiniband/hw/bnxt_re/roce_hsi.h b/drivers/infiniband/hw/bnxt_re/roce_hsi.h index 2d7ea096a2474..3e5a4f760d0eb 100644 --- a/drivers/infiniband/hw/bnxt_re/roce_hsi.h +++ b/drivers/infiniband/hw/bnxt_re/roce_hsi.h @@ -1761,7 +1761,30 @@ struct cmdq_initialize_fw { #define CMDQ_INITIALIZE_FW_TIM_PG_SIZE_PG_2M (0x3UL << 4) #define CMDQ_INITIALIZE_FW_TIM_PG_SIZE_PG_8M (0x4UL << 4) #define CMDQ_INITIALIZE_FW_TIM_PG_SIZE_PG_1G (0x5UL << 4) - __le16 reserved16; + /* This value is (log-base-2-of-DBR-page-size - 12). + * 0 for 4KB. HW supported values are enumerated below. 
+ */ + __le16 log2_dbr_pg_size; + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_MASK 0xfUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_SFT 0 + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_4K 0x0UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_8K 0x1UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_16K 0x2UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_32K 0x3UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_64K 0x4UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_128K 0x5UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_256K 0x6UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_512K 0x7UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_1M 0x8UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_2M 0x9UL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_4M 0xaUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_8M 0xbUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_16M 0xcUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_32M 0xdUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_64M 0xeUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_128M 0xfUL + #define CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_LAST \ + CMDQ_INITIALIZE_FW_LOG2_DBR_PG_SIZE_PG_128M __le64 qpc_page_dir; __le64 mrw_page_dir; __le64 srq_page_dir; From 497158aa5f520db50452ef928c0f955cb42f2e77 Mon Sep 17 00:00:00 2001 From: Selvin Xavier Date: Mon, 26 Feb 2018 01:51:39 -0800 Subject: [PATCH 063/336] RDMA/bnxt_re: Fix the ib_reg failure cleanup Release the netdev references in the cleanup path. Invokes the cleanup routines if bnxt_re_ib_reg fails. 
Signed-off-by: Selvin Xavier Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/bnxt_re/main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c index 33a448036c2eb..604c805ceaa7a 100644 --- a/drivers/infiniband/hw/bnxt_re/main.c +++ b/drivers/infiniband/hw/bnxt_re/main.c @@ -1416,9 +1416,12 @@ static void bnxt_re_task(struct work_struct *work) switch (re_work->event) { case NETDEV_REGISTER: rc = bnxt_re_ib_reg(rdev); - if (rc) + if (rc) { dev_err(rdev_to_dev(rdev), "Failed to register with IB: %#x", rc); + bnxt_re_remove_one(rdev); + bnxt_re_dev_unreg(rdev); + } break; case NETDEV_UP: bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, From 4cd482c12be473ae507eba232a8374c798233e42 Mon Sep 17 00:00:00 2001 From: Muneendra Kumar M Date: Tue, 27 Feb 2018 21:51:49 -0800 Subject: [PATCH 064/336] IB/core : Add null pointer check in addr_resolve dev_get_by_index is being called in addr_resolve function which returns NULL and NULL pointer access leads to kernel crash. 
Following call trace is observed while running rdma_lat test application [ 146.173149] BUG: unable to handle kernel NULL pointer dereference at 00000000000004a0 [ 146.173198] IP: addr_resolve+0x9e/0x3e0 [ib_core] [ 146.173221] PGD 0 P4D 0 [ 146.173869] Oops: 0000 [#1] SMP PTI [ 146.182859] CPU: 8 PID: 127 Comm: kworker/8:1 Tainted: G O 4.15.0-rc6+ #18 [ 146.183758] Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01KN179, BIOS-[TCE132H-2.50]- 10/11/2017 [ 146.184691] Workqueue: ib_cm cm_work_handler [ib_cm] [ 146.185632] RIP: 0010:addr_resolve+0x9e/0x3e0 [ib_core] [ 146.186584] RSP: 0018:ffffc9000362faa0 EFLAGS: 00010246 [ 146.187521] RAX: 000000000000001b RBX: ffffc9000362fc08 RCX: 0000000000000006 [ 146.188472] RDX: 0000000000000000 RSI: 0000000000000096 RDI : ffff88087fc16990 [ 146.189427] RBP: ffffc9000362fb18 R08: 00000000ffffff9d R09: 00000000000004ac [ 146.190392] R10: 00000000000001e7 R11: 0000000000000001 R12: ffff88086af2e090 [ 146.191361] R13: 0000000000000000 R14: 0000000000000001 R15: 00000000ffffff9d [ 146.192327] FS: 0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000 [ 146.193301] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 146.194274] CR2: 00000000000004a0 CR3: 000000000220a002 CR4: 00000000003606e0 [ 146.195258] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 146.196256] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 146.197231] Call Trace: [ 146.198209] ? rdma_addr_register_client+0x30/0x30 [ib_core] [ 146.199199] rdma_resolve_ip+0x1af/0x280 [ib_core] [ 146.200196] rdma_addr_find_l2_eth_by_grh+0x154/0x2b0 [ib_core] The below patch adds the missing NULL pointer check returned by dev_get_by_index before accessing the netdev to avoid kernel crash. We observed the below crash when we try to do the below test. 
server client --------- --------- |1.1.1.1|<----rxe-channel--->|1.1.1.2| --------- --------- On server: rdma_lat -c -n 2 -s 1024 On client:rdma_lat 1.1.1.1 -c -n 2 -s 1024 Fixes: 200298326b27 ("IB/core: Validate route when we init ah") Signed-off-by: Muneendra Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/addr.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index a5b4cf030c11b..9183d148d6444 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -550,18 +550,13 @@ static int addr_resolve(struct sockaddr *src_in, dst_release(dst); } - if (ndev->flags & IFF_LOOPBACK) { - ret = rdma_translate_ip(dst_in, addr); - /* - * Put the loopback device and get the translated - * device instead. - */ + if (ndev) { + if (ndev->flags & IFF_LOOPBACK) + ret = rdma_translate_ip(dst_in, addr); + else + addr->bound_dev_if = ndev->ifindex; dev_put(ndev); - ndev = dev_get_by_index(addr->net, addr->bound_dev_if); - } else { - addr->bound_dev_if = ndev->ifindex; } - dev_put(ndev); return ret; } From e64b6afa98f3629d0c0c46233bbdbe8acdb56f06 Mon Sep 17 00:00:00 2001 From: Giulio Benetti Date: Wed, 28 Feb 2018 17:46:53 +0100 Subject: [PATCH 065/336] drm/sun4i: Fix dclk_set_phase Phase value is not shifted before writing. 
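To see why the unshifted value has no effect, recall that a masked read-modify-write only keeps the bits of the value that fall inside the mask. The sketch below models that behaviour in plain C; update_bits(), set_phase_unshifted() and set_phase_shifted() are illustrative names, not driver functions, and the model assumes the usual (old & ~mask) | (val & mask) semantics.

```c
#include <stdint.h>

#define PHASE_SHIFT 28
#define PHASE_MASK  (UINT32_C(3) << PHASE_SHIFT)	/* same bits as GENMASK(29, 28) */

/* Model of a masked register update: only bits inside mask can change. */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Behaviour before the fix: the value sits in bits 1:0 and is masked away. */
static uint32_t set_phase_unshifted(uint32_t reg, int degrees)
{
	return update_bits(reg, PHASE_MASK, (uint32_t)(degrees / 120));
}

/* Behaviour after the fix: the value is moved into bits 29:28 first. */
static uint32_t set_phase_shifted(uint32_t reg, int degrees)
{
	return update_bits(reg, PHASE_MASK, (uint32_t)(degrees / 120) << PHASE_SHIFT);
}
```

In this model the unshifted variant always clears the phase field, whatever phase is requested, while the shifted variant stores 2 in bits 29:28 for a request of 240 degrees.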
Shift the value left by 28 bits so that it lands in the phase field (bits 29:28). Signed-off-by: Giulio Benetti Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/1519836413-35023-1-git-send-email-giulio.benetti@micronovasrl.com --- drivers/gpu/drm/sun4i/sun4i_dotclock.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/sun4i/sun4i_dotclock.c b/drivers/gpu/drm/sun4i/sun4i_dotclock.c index 023f39bda633d..e36004fbe4536 100644 --- a/drivers/gpu/drm/sun4i/sun4i_dotclock.c +++ b/drivers/gpu/drm/sun4i/sun4i_dotclock.c @@ -132,10 +132,13 @@ static int sun4i_dclk_get_phase(struct clk_hw *hw) static int sun4i_dclk_set_phase(struct clk_hw *hw, int degrees) { struct sun4i_dclk *dclk = hw_to_dclk(hw); + u32 val = degrees / 120; + + val <<= 28; regmap_update_bits(dclk->regmap, SUN4I_TCON0_IO_POL_REG, GENMASK(29, 28), - degrees / 120); + val); return 0; } From 09a0fb67536a49af19f2bfc632100e9de91fe526 Mon Sep 17 00:00:00 2001 From: Christian Borntraeger Date: Wed, 28 Feb 2018 18:44:34 +0000 Subject: [PATCH 066/336] KVM: s390: provide io interrupt kvm_stat We already count io interrupts, but we forgot to print them.
Signed-off-by: Christian Borntraeger Fixes: d8346b7d9b ("KVM: s390: Support for I/O interrupts.") Reviewed-by: Cornelia Huck Reviewed-by: David Hildenbrand Signed-off-by: Christian Borntraeger --- arch/s390/kvm/kvm-s390.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 77d7818130db4..df19f158347e0 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -86,6 +86,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { { "deliver_prefix_signal", VCPU_STAT(deliver_prefix_signal) }, { "deliver_restart_signal", VCPU_STAT(deliver_restart_signal) }, { "deliver_program_interruption", VCPU_STAT(deliver_program_int) }, + { "deliver_io_interrupt", VCPU_STAT(deliver_io_int) }, { "exit_wait_state", VCPU_STAT(exit_wait_state) }, { "instruction_epsw", VCPU_STAT(instruction_epsw) }, { "instruction_gs", VCPU_STAT(instruction_gs) }, From 651438bb0af5213f1f70d66e75bf11d08cb5537a Mon Sep 17 00:00:00 2001 From: Wen Xiong Date: Thu, 15 Feb 2018 14:05:10 -0600 Subject: [PATCH 067/336] nvme-pci: Fix EEH failure on ppc Triggering PPC EEH detection and handling requires a memory mapped read failure. The NVMe driver removed the periodic health check MMIO, so there's no early detection mechanism to trigger the recovery. Instead, the detection now happens when the nvme driver handles an IO timeout event. This takes the pci channel offline, so we do not want the driver to proceed with escalating its own recovery efforts that may conflict with the EEH handler. This patch ensures the driver will observe the channel was set to offline after a failed MMIO read and resets the IO timer so the EEH handler has a chance to recover the device. 
Signed-off-by: Wen Xiong [updated change log] Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 5933a5c732e83..e5ce07f4966f6 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1153,12 +1153,6 @@ static bool nvme_should_reset(struct nvme_dev *dev, u32 csts) if (!(csts & NVME_CSTS_CFS) && !nssro) return false; - /* If PCI error recovery process is happening, we cannot reset or - * the recovery mechanism will surely fail. - */ - if (pci_channel_offline(to_pci_dev(dev->dev))) - return false; - return true; } @@ -1189,6 +1183,13 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) struct nvme_command cmd; u32 csts = readl(dev->bar + NVME_REG_CSTS); + /* If PCI error recovery process is happening, we cannot reset or + * the recovery mechanism will surely fail. + */ + mb(); + if (pci_channel_offline(to_pci_dev(dev->dev))) + return BLK_EH_RESET_TIMER; + /* * Reset immediately if the controller is failed */ From cb57469c9573f6018cd1302953dd45d6e05aba7b Mon Sep 17 00:00:00 2001 From: Joel Fernandes Date: Fri, 16 Feb 2018 11:02:01 -0800 Subject: [PATCH 068/336] staging: android: ashmem: Fix lockdep issue during llseek ashmem_mutex create a chain of dependencies like so: (1) mmap syscall -> mmap_sem -> (acquired) ashmem_mmap ashmem_mutex (try to acquire) (block) (2) llseek syscall -> ashmem_llseek -> ashmem_mutex -> (acquired) inode_lock -> inode->i_rwsem (try to acquire) (block) (3) getdents -> iterate_dir -> inode_lock -> inode->i_rwsem (acquired) copy_to_user -> mmap_sem (try to acquire) There is a lock ordering created between mmap_sem and inode->i_rwsem causing a lockdep splat [2] during a syzcaller test, this patch fixes the issue by unlocking the mutex earlier. Functionally that's Ok since we don't need to protect vfs_llseek. 
[1] https://patchwork.kernel.org/patch/10185031/ [2] https://lkml.org/lkml/2018/1/10/48 Acked-by: Todd Kjos Cc: Arve Hjonnevag Cc: stable@vger.kernel.org Reported-by: syzbot+8ec30bb7bf1a981a2012@syzkaller.appspotmail.com Signed-off-by: Joel Fernandes Acked-by: Greg Hackmann Signed-off-by: Greg Kroah-Hartman --- drivers/staging/android/ashmem.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index 6dbba5aff1911..d5450365769cb 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -326,24 +326,23 @@ static loff_t ashmem_llseek(struct file *file, loff_t offset, int origin) mutex_lock(&ashmem_mutex); if (asma->size == 0) { - ret = -EINVAL; - goto out; + mutex_unlock(&ashmem_mutex); + return -EINVAL; } if (!asma->file) { - ret = -EBADF; - goto out; + mutex_unlock(&ashmem_mutex); + return -EBADF; } + mutex_unlock(&ashmem_mutex); + ret = vfs_llseek(asma->file, offset, origin); if (ret < 0) - goto out; + return ret; /** Copy f_pos from backing file, since f_ops->llseek() sets it */ file->f_pos = asma->file->f_pos; - -out: - mutex_unlock(&ashmem_mutex); return ret; } From 16ccfff2897613007b5eda9e29d65303c6280026 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 6 Feb 2018 20:17:42 +0800 Subject: [PATCH 069/336] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs") has switched to do irq vectors spread among all possible CPUs, so pass num_possible_cpus() as max vecotrs to be assigned. For example, in a 8 cores system, 0~3 online, 4~8 offline/not present, see 'lscpu': [ming@box]$lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 2 Socket(s): 2 NUMA node(s): 2 ... NUMA node0 CPU(s): 0-3 NUMA node1 CPU(s): ... 
1) before this patch, follows the allocated vectors and their affinity: irq 47, cpu list 0,4 irq 48, cpu list 1,6 irq 49, cpu list 2,5 irq 50, cpu list 3,7 2) after this patch, follows the allocated vectors and their affinity: irq 43, cpu list 0 irq 44, cpu list 1 irq 45, cpu list 2 irq 46, cpu list 3 irq 47, cpu list 4 irq 48, cpu list 6 irq 49, cpu list 5 irq 50, cpu list 7 Cc: Keith Busch Cc: Sagi Grimberg Cc: Thomas Gleixner Cc: Christoph Hellwig Signed-off-by: Ming Lei Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index e5ce07f4966f6..b6f43b738f03a 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1914,7 +1914,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev) int result, nr_io_queues; unsigned long size; - nr_io_queues = num_present_cpus(); + nr_io_queues = num_possible_cpus(); result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues); if (result < 0) return result; From 6f54120e17e311fd7ac42b9ec2a0611caa5b46ad Mon Sep 17 00:00:00 2001 From: Jason Yan Date: Wed, 28 Feb 2018 09:11:10 +0800 Subject: [PATCH 070/336] ata: do not schedule hot plug if it is a sas host We've got a kernel panic when using sata disk with sas controller: [115946.152283] Unable to handle kernel NULL pointer dereference at virtual address 000007d8 [115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G W OEL 4.14.0 #1 [115946.232925] Workqueue: events ata_scsi_hotplug [115946.237938] task: ffff8021ee50b180 task.stack: ffff00000d5d0000 [115946.244717] PC is at sas_find_dev_by_rphy+0x44/0x114 [115946.250224] LR is at sas_find_dev_by_rphy+0x3c/0x114 ...... 
[115946.355701] Process kworker/0:1 (pid: 22175, stack limit = 0xffff00000d5d0000) [115946.363369] Call trace: [115946.456356] [] sas_find_dev_by_rphy+0x44/0x114 [115946.462908] [] sas_target_alloc+0x20/0x5c [115946.469408] [] scsi_alloc_target+0x250/0x308 [115946.475781] [] __scsi_add_device+0xb0/0x154 [115946.481991] [] ata_scsi_scan_host+0x180/0x218 [115946.488367] [] ata_scsi_hotplug+0xb0/0xcc [115946.494801] [] process_one_work+0x144/0x390 [115946.501115] [] worker_thread+0x144/0x418 [115946.507093] [] kthread+0x10c/0x138 [115946.512792] [] ret_from_fork+0x10/0x18 We found that Ding Xiang has reported a similar bug before: https://patchwork.kernel.org/patch/9179817/ And this bug still exists in mainline. Since libsas handles hotplug and device adding/removing itself, do not need to schedule ata hot plug task here if it is a sas host. Signed-off-by: Jason Yan Cc: Ding Xiang Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo --- drivers/ata/libata-eh.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 11c3137d7b0af..c016829a38fd2 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -815,7 +815,8 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap) if (ap->pflags & ATA_PFLAG_LOADING) ap->pflags &= ~ATA_PFLAG_LOADING; - else if (ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) + else if ((ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) && + !(ap->flags & ATA_FLAG_SAS_HOST)) schedule_delayed_work(&ap->hotplug_task, 0); if (ap->pflags & ATA_PFLAG_RECOVERED) From 172ed391f6e40f799273e005405041b57c343cf7 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 1 Mar 2018 14:10:31 -0800 Subject: [PATCH 071/336] xfs: don't allocate COW blocks for zeroing holes or unwritten extents The iomap zeroing interface is smart enough to skip zeroing holes or unwritten extents. Don't subvert this logic for reflink files. 
Signed-off-by: Christoph Hellwig Reviewed-by: Dave Chinner Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_iomap.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 66e1edbfb2b2b..4e771e0f11702 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -955,6 +955,13 @@ static inline bool imap_needs_alloc(struct inode *inode, (IS_DAX(inode) && imap->br_state == XFS_EXT_UNWRITTEN); } +static inline bool needs_cow_for_zeroing(struct xfs_bmbt_irec *imap, int nimaps) +{ + return nimaps && + imap->br_startblock != HOLESTARTBLOCK && + imap->br_state != XFS_EXT_UNWRITTEN; +} + static inline bool need_excl_ilock(struct xfs_inode *ip, unsigned flags) { /* @@ -1024,7 +1031,9 @@ xfs_file_iomap_begin( goto out_unlock; } - if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { + if (xfs_is_reflink_inode(ip) && + ((flags & IOMAP_WRITE) || + ((flags & IOMAP_ZERO) && needs_cow_for_zeroing(&imap, nimaps)))) { if (flags & IOMAP_DIRECT) { /* * A reflinked inode will result in CoW alloc. From af5b5afe9ac68406892fa343fafba4ea988c3c69 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 1 Mar 2018 14:12:12 -0800 Subject: [PATCH 072/336] xfs: don't start out with the exclusive ilock for direct I/O There is no reason to take the ilock exclusively at the start of xfs_file_iomap_begin for direct I/O, given that it will be demoted just before calling xfs_iomap_write_direct anyway. Signed-off-by: Christoph Hellwig Reviewed-by: Dave Chinner Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. 
Wong --- fs/xfs/xfs_iomap.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 4e771e0f11702..ee01859b77a57 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -965,13 +965,11 @@ static inline bool needs_cow_for_zeroing(struct xfs_bmbt_irec *imap, int nimaps) static inline bool need_excl_ilock(struct xfs_inode *ip, unsigned flags) { /* - * COW writes will allocate delalloc space, so we need to make sure - * to take the lock exclusively here. + * COW writes may allocate delalloc space or convert unwritten COW + * extents, so we need to make sure to take the lock exclusively here. */ if (xfs_is_reflink_inode(ip) && (flags & (IOMAP_WRITE | IOMAP_ZERO))) return true; - if ((flags & IOMAP_DIRECT) && (flags & IOMAP_WRITE)) - return true; return false; } From ff3d8b9c4cb95180ae6ef9eed28409840525b9fa Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 1 Mar 2018 14:12:45 -0800 Subject: [PATCH 073/336] xfs: don't block on the ilock for RWF_NOWAIT Fix xfs_file_iomap_begin to trylock the ilock if IOMAP_NOWAIT is passed, so that we don't block io_submit callers. Signed-off-by: Christoph Hellwig Reviewed-by: Dave Chinner Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_iomap.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index ee01859b77a57..046469fcc1b8a 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -970,6 +970,15 @@ static inline bool need_excl_ilock(struct xfs_inode *ip, unsigned flags) */ if (xfs_is_reflink_inode(ip) && (flags & (IOMAP_WRITE | IOMAP_ZERO))) return true; + + /* + * Extents not yet cached requires exclusive access, don't block. + * This is an opencoded xfs_ilock_data_map_shared() to cater for the + * non-blocking behaviour. 
+ */ + if (ip->i_d.di_format == XFS_DINODE_FMT_BTREE && + !(ip->i_df.if_flags & XFS_IFEXTENTS)) + return true; return false; } @@ -998,16 +1007,18 @@ xfs_file_iomap_begin( return xfs_file_iomap_begin_delay(inode, offset, length, iomap); } - if (need_excl_ilock(ip, flags)) { + if (need_excl_ilock(ip, flags)) lockmode = XFS_ILOCK_EXCL; - xfs_ilock(ip, XFS_ILOCK_EXCL); - } else { - lockmode = xfs_ilock_data_map_shared(ip); - } + else + lockmode = XFS_ILOCK_SHARED; - if ((flags & IOMAP_NOWAIT) && !(ip->i_df.if_flags & XFS_IFEXTENTS)) { - error = -EAGAIN; - goto out_unlock; + if (flags & IOMAP_NOWAIT) { + if (!(ip->i_df.if_flags & XFS_IFEXTENTS)) + return -EAGAIN; + if (!xfs_ilock_nowait(ip, lockmode)) + return -EAGAIN; + } else { + xfs_ilock(ip, lockmode); } ASSERT(offset <= mp->m_super->s_maxbytes); From cd4a6f3ab4d80cb919d15897eb3cbc85c2009d4b Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Mon, 26 Feb 2018 15:22:22 +1100 Subject: [PATCH 074/336] selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable The subpage_prot syscall is only functional when the system is using the Hash MMU. Since commit 5b2b80714796 ("powerpc/mm: Invalidate subpage_prot() system call on radix platforms") it returns ENOENT when the Radix MMU is active. Currently this just makes the test fail. Additionally the syscall is not available if the kernel is built with 4K pages, or if CONFIG_PPC_SUBPAGE_PROT=n, in which case it returns ENOSYS because the syscall is missing entirely. So check explicitly for ENOENT and ENOSYS and skip if we see either of those. 
Signed-off-by: Michael Ellerman --- tools/testing/selftests/powerpc/mm/subpage_prot.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/tools/testing/selftests/powerpc/mm/subpage_prot.c b/tools/testing/selftests/powerpc/mm/subpage_prot.c index 35ade7406dcdb..3ae77ba93208f 100644 --- a/tools/testing/selftests/powerpc/mm/subpage_prot.c +++ b/tools/testing/selftests/powerpc/mm/subpage_prot.c @@ -135,6 +135,16 @@ static int run_test(void *addr, unsigned long size) return 0; } +static int syscall_available(void) +{ + int rc; + + errno = 0; + rc = syscall(__NR_subpage_prot, 0, 0, 0); + + return rc == 0 || (errno != ENOENT && errno != ENOSYS); +} + int test_anon(void) { unsigned long align; @@ -145,6 +155,8 @@ int test_anon(void) void *mallocblock; unsigned long mallocsize; + SKIP_IF(!syscall_available()); + if (getpagesize() != 0x10000) { fprintf(stderr, "Kernel page size must be 64K!\n"); return 1; @@ -180,6 +192,8 @@ int test_file(void) off_t filesize; int fd; + SKIP_IF(!syscall_available()); + fd = open(file_name, O_RDWR); if (fd == -1) { perror("failed to open file"); From 07c5ccd70ad702e561fcda8e4df494f098a42742 Mon Sep 17 00:00:00 2001 From: Alastair D'Silva Date: Thu, 22 Feb 2018 15:17:38 +1100 Subject: [PATCH 075/336] ocxl: Add get_metadata IOCTL to share OCXL information to userspace Some required information is not exposed to userspace currently (eg. the PASID), pass this information back, along with other information which is currently communicated via sysfs, which saves some parsing effort in userspace. 
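The metadata struct added below is commented as totalling 16 u64 words. A quick userspace check of that layout might look like the following sketch; struct metadata_sketch is a stand-in that uses fixed-width C types in place of __u16/__u8/__u32/__u64, and it assumes natural alignment with no packing attributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the ioctl metadata layout; not the uapi header itself. */
struct metadata_sketch {
	uint16_t version;		/* struct version */
	uint8_t  afu_version_major;
	uint8_t  afu_version_minor;
	uint32_t pasid;			/* completes the first 8 bytes */
	uint64_t pp_mmio_size;
	uint64_t global_mmio_size;
	uint64_t reserved[13];		/* pads the struct to 16 * 8 bytes */
};

/* 8 (header) + 8 + 8 + 13 * 8 = 128 bytes, i.e. 16 u64 words. */
static_assert(sizeof(struct metadata_sketch) == 16 * sizeof(uint64_t),
	      "metadata layout should total 16 u64 words");
static_assert(offsetof(struct metadata_sketch, pp_mmio_size) == 8,
	      "first 64-bit field should start right after the 8-byte header");
```

Because the small fields fill the first 8 bytes exactly, no implicit padding is inserted, and the reserved array leaves room for later versions to add fields without changing the ioctl size.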
Signed-off-by: Alastair D'Silva Acked-by: Andrew Donnellan Acked-by: Frederic Barrat Signed-off-by: Michael Ellerman --- drivers/misc/ocxl/file.c | 27 +++++++++++++++++++++++++++ include/uapi/misc/ocxl.h | 17 +++++++++++++++++ 2 files changed, 44 insertions(+) diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index 337462e1569fe..038509e5d031f 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -102,10 +102,32 @@ static long afu_ioctl_attach(struct ocxl_context *ctx, return rc; } +static long afu_ioctl_get_metadata(struct ocxl_context *ctx, + struct ocxl_ioctl_metadata __user *uarg) +{ + struct ocxl_ioctl_metadata arg; + + memset(&arg, 0, sizeof(arg)); + + arg.version = 0; + + arg.afu_version_major = ctx->afu->config.version_major; + arg.afu_version_minor = ctx->afu->config.version_minor; + arg.pasid = ctx->pasid; + arg.pp_mmio_size = ctx->afu->config.pp_mmio_stride; + arg.global_mmio_size = ctx->afu->config.global_mmio_size; + + if (copy_to_user(uarg, &arg, sizeof(arg))) + return -EFAULT; + + return 0; +} + #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" : \ x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" : \ x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" : \ x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" : \ + x == OCXL_IOCTL_GET_METADATA ? 
"GET_METADATA" : \ "UNKNOWN") static long afu_ioctl(struct file *file, unsigned int cmd, @@ -159,6 +181,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd, irq_fd.eventfd); break; + case OCXL_IOCTL_GET_METADATA: + rc = afu_ioctl_get_metadata(ctx, + (struct ocxl_ioctl_metadata __user *) args); + break; + default: rc = -EINVAL; } diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h index 4b0b0b756f3ee..0af83d80fb3ea 100644 --- a/include/uapi/misc/ocxl.h +++ b/include/uapi/misc/ocxl.h @@ -32,6 +32,22 @@ struct ocxl_ioctl_attach { __u64 reserved3; }; +struct ocxl_ioctl_metadata { + __u16 version; // struct version, always backwards compatible + + // Version 0 fields + __u8 afu_version_major; + __u8 afu_version_minor; + __u32 pasid; // PASID assigned to the current context + + __u64 pp_mmio_size; // Per PASID MMIO size + __u64 global_mmio_size; + + // End version 0 fields + + __u64 reserved[13]; // Total of 16*u64 +}; + struct ocxl_ioctl_irq_fd { __u64 irq_offset; __s32 eventfd; @@ -45,5 +61,6 @@ struct ocxl_ioctl_irq_fd { #define OCXL_IOCTL_IRQ_ALLOC _IOR(OCXL_MAGIC, 0x11, __u64) #define OCXL_IOCTL_IRQ_FREE _IOW(OCXL_MAGIC, 0x12, __u64) #define OCXL_IOCTL_IRQ_SET_FD _IOW(OCXL_MAGIC, 0x13, struct ocxl_ioctl_irq_fd) +#define OCXL_IOCTL_GET_METADATA _IOR(OCXL_MAGIC, 0x14, struct ocxl_ioctl_metadata) #endif /* _UAPI_MISC_OCXL_H */ From e7666d046ac0eda535282a5fd3b188f31d0f4afd Mon Sep 17 00:00:00 2001 From: Alastair D'Silva Date: Thu, 22 Feb 2018 15:17:39 +1100 Subject: [PATCH 076/336] ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL Signed-off-by: Alastair D'Silva Acked-by: Andrew Donnellan Acked-by: Frederic Barrat Signed-off-by: Michael Ellerman --- Documentation/accelerators/ocxl.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst index 4f7af841d935a..ddcc58d01cfbc 100644 --- a/Documentation/accelerators/ocxl.rst +++ b/Documentation/accelerators/ocxl.rst @@ -152,6 
+152,11 @@ OCXL_IOCTL_IRQ_SET_FD: Associate an event fd to an AFU interrupt so that the user process can be notified when the AFU sends an interrupt. +OCXL_IOCTL_GET_METADATA: + + Obtains configuration information from the card, such at the size of + MMIO areas, the AFU version, and the PASID for the current context. + mmap ---- From c3856aeb29402e94ad9b3879030165cc6a4fdc56 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 23 Feb 2018 21:21:12 +1100 Subject: [PATCH 077/336] KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler This fixes several bugs in the radix page fault handler relating to the way large pages in the memory backing the guest were handled. First, the check for large pages only checked for explicit huge pages and missed transparent huge pages. Then the check that the addresses (host virtual vs. guest physical) had appropriate alignment was wrong, meaning that the code never put a large page in the partition scoped radix tree; it was always demoted to a small page. Fixing this exposed bugs in kvmppc_create_pte(). We were never invalidating a 2MB PTE, which meant that if a page was initially faulted in without write permission and the guest then attempted to store to it, we would never update the PTE to have write permission. If we find a valid 2MB PTE in the PMD, we need to clear it and do a TLB invalidation before installing either the new 2MB PTE or a pointer to a page table page. This also corrects an assumption that get_user_pages_fast would set the _PAGE_DIRTY bit if we are writing, which is not true. Instead we mark the page dirty explicitly with set_page_dirty_lock(). This also means we don't need the dirty bit set on the host PTE when providing write access on a read fault. 
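Illustration only, not part of the patch: a minimal user-space sketch of why the old alignment test could never select a large page, while the corrected one can. It assumes 64kB base pages and 2MB PMD-level pages; the DEMO_* constants and the helper names old_check() and new_check() are hypothetical stand-ins for the kernel's PAGE_*/PMD_* macros and the two expressions in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry for this sketch: 64kB base pages, 2MB PMD-level pages. */
#define DEMO_PAGE_SHIFT 16
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))
#define DEMO_PMD_SHIFT  21
#define DEMO_PMD_SIZE   (UINT64_C(1) << DEMO_PMD_SHIFT)
#define DEMO_PMD_MASK   (~(DEMO_PMD_SIZE - 1))

/*
 * Old test: PMD_MASK & PAGE_MASK keeps the bits at and above the 2MB
 * boundary, so it compares the 2MB frame numbers of two unrelated
 * address spaces, which in general do not match.
 */
static bool old_check(uint64_t gpa, uint64_t hva)
{
	return (gpa & DEMO_PMD_MASK & DEMO_PAGE_MASK) ==
	       (hva & DEMO_PMD_MASK & DEMO_PAGE_MASK);
}

/*
 * New test: PMD_SIZE - PAGE_SIZE keeps only the page-granular offset
 * inside the 2MB region, which is what must agree between gpa and hva.
 */
static bool new_check(uint64_t gpa, uint64_t hva)
{
	return (gpa & (DEMO_PMD_SIZE - DEMO_PAGE_SIZE)) ==
	       (hva & (DEMO_PMD_SIZE - DEMO_PAGE_SIZE));
}
```

With a guest physical address and a host virtual address that share the same offset inside their respective 2MB regions, old_check() still rejects the pair because the upper bits differ, whereas new_check() accepts it.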
Signed-off-by: Paul Mackerras --- arch/powerpc/kvm/book3s_64_mmu_radix.c | 69 ++++++++++++++++---------- 1 file changed, 43 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 0c854816e653e..5cb4e4687107e 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -195,6 +195,12 @@ static void kvmppc_pte_free(pte_t *ptep) kmem_cache_free(kvm_pte_cache, ptep); } +/* Like pmd_huge() and pmd_large(), but works regardless of config options */ +static inline int pmd_is_leaf(pmd_t pmd) +{ + return !!(pmd_val(pmd) & _PAGE_PTE); +} + static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, unsigned int level, unsigned long mmu_seq) { @@ -219,7 +225,7 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, else new_pmd = pmd_alloc_one(kvm->mm, gpa); - if (level == 0 && !(pmd && pmd_present(*pmd))) + if (level == 0 && !(pmd && pmd_present(*pmd) && !pmd_is_leaf(*pmd))) new_ptep = kvmppc_pte_alloc(); /* Check if we might have been invalidated; let the guest retry if so */ @@ -244,12 +250,30 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, new_pmd = NULL; } pmd = pmd_offset(pud, gpa); - if (pmd_large(*pmd)) { - /* Someone else has instantiated a large page here; retry */ - ret = -EAGAIN; - goto out_unlock; - } - if (level == 1 && !pmd_none(*pmd)) { + if (pmd_is_leaf(*pmd)) { + unsigned long lgpa = gpa & PMD_MASK; + + /* + * If we raced with another CPU which has just put + * a 2MB pte in after we saw a pte page, try again. 
+ */ + if (level == 0 && !new_ptep) { + ret = -EAGAIN; + goto out_unlock; + } + /* Valid 2MB page here already, remove it */ + old = kvmppc_radix_update_pte(kvm, pmdp_ptep(pmd), + ~0UL, 0, lgpa, PMD_SHIFT); + kvmppc_radix_tlbie_page(kvm, lgpa, PMD_SHIFT); + if (old & _PAGE_DIRTY) { + unsigned long gfn = lgpa >> PAGE_SHIFT; + struct kvm_memory_slot *memslot; + memslot = gfn_to_memslot(kvm, gfn); + if (memslot && memslot->dirty_bitmap) + kvmppc_update_dirty_map(memslot, + gfn, PMD_SIZE); + } + } else if (level == 1 && !pmd_none(*pmd)) { /* * There's a page table page here, but we wanted * to install a large page. Tell the caller and let @@ -412,28 +436,24 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, } else { page = pages[0]; pfn = page_to_pfn(page); - if (PageHuge(page)) { - page = compound_head(page); - pte_size <<= compound_order(page); + if (PageCompound(page)) { + pte_size <<= compound_order(compound_head(page)); /* See if we can insert a 2MB large-page PTE here */ if (pte_size >= PMD_SIZE && - (gpa & PMD_MASK & PAGE_MASK) == - (hva & PMD_MASK & PAGE_MASK)) { + (gpa & (PMD_SIZE - PAGE_SIZE)) == + (hva & (PMD_SIZE - PAGE_SIZE))) { level = 1; pfn &= ~((PMD_SIZE >> PAGE_SHIFT) - 1); } } /* See if we can provide write access */ if (writing) { - /* - * We assume gup_fast has set dirty on the host PTE. 
- */ pgflags |= _PAGE_WRITE; } else { local_irq_save(flags); ptep = find_current_mm_pte(current->mm->pgd, hva, NULL, NULL); - if (ptep && pte_write(*ptep) && pte_dirty(*ptep)) + if (ptep && pte_write(*ptep)) pgflags |= _PAGE_WRITE; local_irq_restore(flags); } @@ -459,18 +479,15 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, pte = pfn_pte(pfn, __pgprot(pgflags)); ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq); } - if (ret == 0 || ret == -EAGAIN) - ret = RESUME_GUEST; if (page) { - /* - * We drop pages[0] here, not page because page might - * have been set to the head page of a compound, but - * we have to drop the reference on the correct tail - * page to match the get inside gup() - */ - put_page(pages[0]); + if (!ret && (pgflags & _PAGE_WRITE)) + set_page_dirty_lock(page); + put_page(page); } + + if (ret == 0 || ret == -EAGAIN) + ret = RESUME_GUEST; return ret; } @@ -644,7 +661,7 @@ void kvmppc_free_radix(struct kvm *kvm) continue; pmd = pmd_offset(pud, 0); for (im = 0; im < PTRS_PER_PMD; ++im, ++pmd) { - if (pmd_huge(*pmd)) { + if (pmd_is_leaf(*pmd)) { pmd_clear(pmd); continue; } From debd574f4195e205ba505b25e19b2b797f4bcd94 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 2 Mar 2018 15:38:04 +1100 Subject: [PATCH 078/336] KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing The current code for initializing the VRMA (virtual real memory area) for HPT guests requires the page size of the backing memory to be one of 4kB, 64kB or 16MB. With a radix host we have the possibility that the backing memory page size can be 2MB or 1GB. In these cases, if the guest switches to HPT mode, KVM will not initialize the VRMA and the guest will fail to run. In fact it is not necessary that the VRMA page size is the same as the backing memory page size; any VRMA page size less than or equal to the backing memory page size is acceptable. 
Therefore we now choose the largest page size out of the set {4k, 64k, 16M} which is not larger than the backing memory page size. Signed-off-by: Paul Mackerras --- arch/powerpc/kvm/book3s_hv.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 89707354c2efd..b4a538b29da55 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3656,15 +3656,17 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu) goto up_out; psize = vma_kernel_pagesize(vma); - porder = __ilog2(psize); up_read(&current->mm->mmap_sem); /* We can handle 4k, 64k or 16M pages in the VRMA */ - err = -EINVAL; - if (!(psize == 0x1000 || psize == 0x10000 || - psize == 0x1000000)) - goto out_srcu; + if (psize >= 0x1000000) + psize = 0x1000000; + else if (psize >= 0x10000) + psize = 0x10000; + else + psize = 0x1000; + porder = __ilog2(psize); senc = slb_pgsize_encoding(psize); kvm->arch.vrma_slb_v = senc | SLB_VSID_B_1T | From f3e5feeb92a163c935659b7222a32965276c1c23 Mon Sep 17 00:00:00 2001 From: Jernej Skrabec Date: Thu, 1 Mar 2018 22:34:32 +0100 Subject: [PATCH 079/336] drm/sun4i: Release exclusive clock lock when disabling TCON Currently exclusive TCON clock lock is never released, which, for example, prevents changing resolution on HDMI. In order to fix that, release clock when disabling TCON. TCON is always disabled first before new mode is set. 
Signed-off-by: Jernej Skrabec Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/20180301213442.16677-7-jernej.skrabec@siol.net --- drivers/gpu/drm/sun4i/sun4i_tcon.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c index b3960118deb9e..ade197b1a9ac6 100644 --- a/drivers/gpu/drm/sun4i/sun4i_tcon.c +++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c @@ -101,10 +101,12 @@ static void sun4i_tcon_channel_set_status(struct sun4i_tcon *tcon, int channel, return; } - if (enabled) + if (enabled) { clk_prepare_enable(clk); - else + } else { + clk_rate_exclusive_put(clk); clk_disable_unprepare(clk); + } } static void sun4i_tcon_lvds_set_status(struct sun4i_tcon *tcon, From d5078193e56bb24f4593f00102a3b5e07bb84ee0 Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Fri, 2 Mar 2018 13:05:36 +0800 Subject: [PATCH 080/336] ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines With the alc289, pin 0x1b is the Headphone-Mic, so we should assign ALC269_FIXUP_DELL4_MIC_NO_PRESENCE rather than ALC225_FIXUP_DELL1_MIC_NO_PRESENCE to it. This change was suggested by Kailang of Realtek and has been verified on the machine. 
Fixes: 3f2f7c553d07 ("ALSA: hda - Fix headset mic detection problem for two Dell machines") Cc: Kailang Yang Cc: Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b9c93fa0a51c6..7a9a867029276 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6872,7 +6872,7 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x12, 0x90a60120}, {0x14, 0x90170110}, {0x21, 0x0321101f}), - SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC225_FIXUP_DELL1_MIC_NO_PRESENCE, + SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, {0x12, 0xb7a60130}, {0x14, 0x90170110}, {0x21, 0x04211020}), From fde9fc766e96c494b82931b1d270a9a751be07c0 Mon Sep 17 00:00:00 2001 From: Matt Redfearn Date: Mon, 19 Feb 2018 16:55:06 +0000 Subject: [PATCH 081/336] signals: Move put_compat_sigset to compat.h to silence hardened usercopy Since commit afcc90f8621e ("usercopy: WARN() on slab cache usercopy region violations"), MIPS systems booting with a compat root filesystem emit a warning when copying compat siginfo to userspace: WARNING: CPU: 0 PID: 953 at mm/usercopy.c:81 usercopy_warn+0x98/0xe8 Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLAB object 'task_struct' (offset 1432, size 16)! 
Modules linked in: CPU: 0 PID: 953 Comm: S01logging Not tainted 4.16.0-rc2 #10 Stack : ffffffff808c0000 0000000000000000 0000000000000001 65ac85163f3bdc4a 65ac85163f3bdc4a 0000000000000000 90000000ff667ab8 ffffffff808c0000 00000000000003f8 ffffffff808d0000 00000000000000d1 0000000000000000 000000000000003c 0000000000000000 ffffffff808c8ca8 ffffffff808d0000 ffffffff808d0000 ffffffff80810000 fffffc0000000000 ffffffff80785c30 0000000000000009 0000000000000051 90000000ff667eb0 90000000ff667db0 000000007fe0d938 0000000000000018 ffffffff80449958 0000000020052798 ffffffff808c0000 90000000ff664000 90000000ff667ab0 00000000100c0000 ffffffff80698810 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ffffffff8010d02c 65ac85163f3bdc4a ... Call Trace: [] show_stack+0x9c/0x130 [] dump_stack+0x90/0xd0 [] __warn+0x100/0x118 [] warn_slowpath_fmt+0x4c/0x70 [] usercopy_warn+0x98/0xe8 [] __check_object_size+0xfc/0x250 [] put_compat_sigset+0x30/0x88 [] setup_rt_frame_n32+0xc4/0x160 [] do_signal+0x19c/0x230 [] do_notify_resume+0x60/0x78 [] work_notifysig+0x10/0x18 ---[ end trace 88fffbf69147f48a ]--- Commit 5905429ad856 ("fork: Provide usercopy whitelisting for task_struct") noted that: "While the blocked and saved_sigmask fields of task_struct are copied to userspace (via sigmask_to_save() and setup_rt_frame()), it is always copied with a static length (i.e. sizeof(sigset_t))." However, this is not true in the case of compat signals, whose sigset is copied by put_compat_sigset and receives size as an argument. At most call sites, put_compat_sigset is copying a sigset from the current task_struct. This triggers a warning when CONFIG_HARDENED_USERCOPY is active. However, by marking this function as static inline, the warning can be avoided because in all of these cases the size is constant at compile time, which is allowed. 
The only site where this is not the case is handling the rt_sigpending syscall, but there the copy is being made from a stack local variable so does not trigger the warning. Move put_compat_sigset to compat.h, and mark it static inline. This fixes the WARN on MIPS. Fixes: afcc90f8621e ("usercopy: WARN() on slab cache usercopy region violations") Signed-off-by: Matt Redfearn Acked-by: Kees Cook Cc: "Dmitry V . Levin" Cc: Al Viro Cc: kernel-hardening@lists.openwall.com Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18639/ Signed-off-by: James Hogan --- include/linux/compat.h | 26 ++++++++++++++++++++++++-- kernel/compat.c | 19 ------------------- 2 files changed, 24 insertions(+), 21 deletions(-) diff --git a/include/linux/compat.h b/include/linux/compat.h index 8a9643857c4a1..c4139c7a0de00 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -17,6 +17,7 @@ #include #include #include /* for aio_context_t */ +#include #include #include @@ -550,8 +551,29 @@ asmlinkage long compat_sys_settimeofday(struct compat_timeval __user *tv, asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp); extern int get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat); -extern int put_compat_sigset(compat_sigset_t __user *compat, - const sigset_t *set, unsigned int size); + +/* + * Defined inline such that size can be compile time constant, which avoids + * CONFIG_HARDENED_USERCOPY complaining about copies from task_struct + */ +static inline int +put_compat_sigset(compat_sigset_t __user *compat, const sigset_t *set, + unsigned int size) +{ + /* size <= sizeof(compat_sigset_t) <= sizeof(sigset_t) */ +#ifdef __BIG_ENDIAN + compat_sigset_t v; + switch (_NSIG_WORDS) { + case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3]; + case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2]; + case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1]; + case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = 
set->sig[0]; + } + return copy_to_user(compat, &v, size) ? -EFAULT : 0; +#else + return copy_to_user(compat, set, size) ? -EFAULT : 0; +#endif +} asmlinkage long compat_sys_migrate_pages(compat_pid_t pid, compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes, diff --git a/kernel/compat.c b/kernel/compat.c index 3247fe761f601..3f5fa8902e7dc 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -488,25 +488,6 @@ get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat) } EXPORT_SYMBOL_GPL(get_compat_sigset); -int -put_compat_sigset(compat_sigset_t __user *compat, const sigset_t *set, - unsigned int size) -{ - /* size <= sizeof(compat_sigset_t) <= sizeof(sigset_t) */ -#ifdef __BIG_ENDIAN - compat_sigset_t v; - switch (_NSIG_WORDS) { - case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3]; - case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2]; - case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1]; - case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = set->sig[0]; - } - return copy_to_user(compat, &v, size) ? -EFAULT : 0; -#else - return copy_to_user(compat, set, size) ? -EFAULT : 0; -#endif -} - #ifdef CONFIG_NUMA COMPAT_SYSCALL_DEFINE6(move_pages, pid_t, pid, compat_ulong_t, nr_pages, compat_uptr_t __user *, pages32, From 61bd0f66ff92d5ce765ff9850fd3cbfec773c560 Mon Sep 17 00:00:00 2001 From: Laurent Vivier Date: Fri, 2 Mar 2018 11:51:56 +0100 Subject: [PATCH 082/336] KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN Since commit 8b24e69fc47e ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry"), if CONFIG_VIRT_CPU_ACCOUNTING_GEN is set, the guest time is not accounted to guest time and user time, but instead to system time. This is because guest_enter()/guest_exit() are called while interrupts are disabled and the tick counter cannot be updated between them. 
To fix that, move guest_exit() after local_irq_enable(), and as guest_enter() is called with IRQ disabled, call guest_enter_irqoff() instead. Fixes: 8b24e69fc47e ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry") Signed-off-by: Laurent Vivier Reviewed-by: Paolo Bonzini Signed-off-by: Paul Mackerras --- arch/powerpc/kvm/book3s_hv.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index b4a538b29da55..9cb9448163c4b 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2885,7 +2885,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) */ trace_hardirqs_on(); - guest_enter(); + guest_enter_irqoff(); srcu_idx = srcu_read_lock(&vc->kvm->srcu); @@ -2893,8 +2893,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) srcu_read_unlock(&vc->kvm->srcu, srcu_idx); - guest_exit(); - trace_hardirqs_off(); set_irq_happened(trap); @@ -2937,6 +2935,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) kvmppc_set_host_core(pcpu); local_irq_enable(); + guest_exit(); /* Let secondaries go back to the offline loop */ for (i = 0; i < controlled_threads; ++i) { From 7bd3e7b743956afbec30fb525bc3c5e22e3d475c Mon Sep 17 00:00:00 2001 From: Igor Pylypiv Date: Wed, 28 Feb 2018 00:59:12 -0800 Subject: [PATCH 083/336] watchdog: f71808e_wdt: Fix magic close handling Watchdog close is "expected" when any byte is 'V' not just the last one. Writing "V" to the device fails because the last byte is the end of string. $ echo V > /dev/watchdog f71808e_wdt: Unexpected close, not stopping watchdog! 
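Illustration only, not part of the patch: a minimal user-space sketch of the two scanning rules, assuming the flag starts out cleared before each write. The function names expect_close_last_byte() and expect_close_any_byte() are hypothetical; the driver itself reads each byte with get_user().

```c
#include <stdbool.h>
#include <stddef.h>

/* Old behaviour: every byte overwrites the flag, so only the last
 * byte of the write decides whether a close is expected. */
static bool expect_close_last_byte(const char *buf, size_t len)
{
	bool expect_close = false;
	size_t i;

	for (i = 0; i < len; i++)
		expect_close = (buf[i] == 'V');
	return expect_close;
}

/* New behaviour: any 'V' anywhere in the write arms the magic close. */
static bool expect_close_any_byte(const char *buf, size_t len)
{
	bool expect_close = false;
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] == 'V')
			expect_close = true;
	return expect_close;
}
```

For the two bytes that "echo V" writes ('V' followed by a newline), the first function returns false and the second returns true.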
Signed-off-by: Igor Pylypiv Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck --- drivers/watchdog/f71808e_wdt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c index e0678c14480f2..3a33c5344bd5e 100644 --- a/drivers/watchdog/f71808e_wdt.c +++ b/drivers/watchdog/f71808e_wdt.c @@ -566,7 +566,8 @@ static ssize_t watchdog_write(struct file *file, const char __user *buf, char c; if (get_user(c, buf + i)) return -EFAULT; - expect_close = (c == 'V'); + if (c == 'V') + expect_close = true; } /* Properly order writes across fork()ed processes */ From 93ac3deb7c220cbcec032a967220a1f109d58431 Mon Sep 17 00:00:00 2001 From: Jayachandran C Date: Wed, 28 Feb 2018 02:52:20 -0800 Subject: [PATCH 084/336] watchdog: sbsa: use 32-bit read for WCV According to SBSA spec v3.1 section 5.3: All registers are 32 bits in size and should be accessed using 32-bit reads and writes. If an access size other than 32 bits is used then the results are IMPLEMENTATION DEFINED. [...] The Generic Watchdog is little-endian The current code uses readq to read the watchdog compare register which does a 64-bit access. This fails on ThunderX2 which does not implement 64-bit access to this register. Fix this by using lo_hi_readq() that does two 32-bit reads. 
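Illustration only, not part of the patch: a minimal user-space sketch of assembling a 64-bit value from two 32-bit reads, low word first, for a little-endian register layout. sketch_readl() and sketch_lo_hi_readq() are hypothetical stand-ins that operate on ordinary memory; real driver code would use the kernel MMIO accessors.

```c
#include <stdint.h>

/* Stand-in for readl(): a plain 32-bit load from a register block. */
static uint32_t sketch_readl(const volatile uint32_t *addr)
{
	return *addr;
}

/* Build the 64-bit value from two 32-bit accesses: the low word at the
 * base address first, then the high word at the next 32-bit slot. */
static uint64_t sketch_lo_hi_readq(const volatile uint32_t *addr)
{
	uint32_t low = sketch_readl(addr);
	uint32_t high = sketch_readl(addr + 1);

	return (uint64_t)low | ((uint64_t)high << 32);
}
```

Because the two halves are fetched separately, the result is not a single atomic access; the sketch only shows how every access stays 32 bits wide.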
Signed-off-by: Jayachandran C Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck --- drivers/watchdog/sbsa_gwdt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c index 316c2eb122d23..e8bd9887c5663 100644 --- a/drivers/watchdog/sbsa_gwdt.c +++ b/drivers/watchdog/sbsa_gwdt.c @@ -50,6 +50,7 @@ */ #include +#include #include #include #include @@ -159,7 +160,7 @@ static unsigned int sbsa_gwdt_get_timeleft(struct watchdog_device *wdd) !(readl(gwdt->control_base + SBSA_GWDT_WCS) & SBSA_GWDT_WCS_WS0)) timeleft += readl(gwdt->control_base + SBSA_GWDT_WOR); - timeleft += readq(gwdt->control_base + SBSA_GWDT_WCV) - + timeleft += lo_hi_readq(gwdt->control_base + SBSA_GWDT_WCV) - arch_counter_get_cntvct(); do_div(timeleft, gwdt->clk); From 2b3d89b402b085b08498e896c65267a145bed486 Mon Sep 17 00:00:00 2001 From: Jerry Hoemann Date: Sun, 25 Feb 2018 20:22:20 -0700 Subject: [PATCH 085/336] watchdog: hpwdt: Remove legacy NMI sourcing. Gen8 and prior ProLiant systems supported the "CRU" interface to firmware. This interface allows Linux to "call back" into firmware to source the cause of an NMI. This feature isn't fully utilized as the actual source of the NMI isn't printed; the driver only indicates that the source couldn't be determined when the call fails. With the advent of Gen9, iCRU replaces the CRU. The call back feature is no longer available in firmware. To be compatible and not attempt to call back into firmware on systems not supporting CRU, the SMBIOS table is consulted to determine if it is safe to make the call back or not. This results in about half of the driver code being devoted to either making CRU calls or determining if it is safe to make CRU calls. As noted, the driver isn't really using the results of the CRU calls. Furthermore, as a consequence of the Spectre security issue, the BIOS/EFI calls are being wrapped into a Spectre-disabling section. 
Removing the call back in hpwdt_pretimeout assists in this effort. As the CRU sourcing of the NMI isn't required for handling the NMI and there are security concerns with making the call back, remove the legacy (pre Gen9) NMI sourcing and the DMI code to determine if the system had the CRU interface. Signed-off-by: Jerry Hoemann Acked-by: Ingo Molnar Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck --- drivers/watchdog/hpwdt.c | 501 +-------------------------------------- 1 file changed, 9 insertions(+), 492 deletions(-) diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c index f1f00dfc0e68c..b0a158073abd5 100644 --- a/drivers/watchdog/hpwdt.c +++ b/drivers/watchdog/hpwdt.c @@ -28,16 +28,7 @@ #include #include #include -#ifdef CONFIG_HPWDT_NMI_DECODING -#include -#include -#include -#include -#include -#include -#endif /* CONFIG_HPWDT_NMI_DECODING */ #include -#include #define HPWDT_VERSION "1.4.0" #define SECS_TO_TICKS(secs) ((secs) * 1000 / 128) @@ -48,6 +39,9 @@ static unsigned int soft_margin = DEFAULT_MARGIN; /* in seconds */ static unsigned int reload; /* the computed soft_margin */ static bool nowayout = WATCHDOG_NOWAYOUT; +#ifdef CONFIG_HPWDT_NMI_DECODING +static unsigned int allow_kdump = 1; +#endif static char expect_release; static unsigned long hpwdt_is_open; @@ -63,373 +57,6 @@ static const struct pci_device_id hpwdt_devices[] = { }; MODULE_DEVICE_TABLE(pci, hpwdt_devices); -#ifdef CONFIG_HPWDT_NMI_DECODING -#define PCI_BIOS32_SD_VALUE 0x5F32335F /* "_32_" */ -#define CRU_BIOS_SIGNATURE_VALUE 0x55524324 -#define PCI_BIOS32_PARAGRAPH_LEN 16 -#define PCI_ROM_BASE1 0x000F0000 -#define ROM_SIZE 0x10000 - -struct bios32_service_dir { - u32 signature; - u32 entry_point; - u8 revision; - u8 length; - u8 checksum; - u8 reserved[5]; -}; - -/* type 212 */ -struct smbios_cru64_info { - u8 type; - u8 byte_length; - u16 handle; - u32 signature; - u64 physical_address; - u32 double_length; - u32 double_offset; -}; 
-#define SMBIOS_CRU64_INFORMATION 212 - -/* type 219 */ -struct smbios_proliant_info { - u8 type; - u8 byte_length; - u16 handle; - u32 power_features; - u32 omega_features; - u32 reserved; - u32 misc_features; -}; -#define SMBIOS_ICRU_INFORMATION 219 - - -struct cmn_registers { - union { - struct { - u8 ral; - u8 rah; - u16 rea2; - }; - u32 reax; - } u1; - union { - struct { - u8 rbl; - u8 rbh; - u8 reb2l; - u8 reb2h; - }; - u32 rebx; - } u2; - union { - struct { - u8 rcl; - u8 rch; - u16 rec2; - }; - u32 recx; - } u3; - union { - struct { - u8 rdl; - u8 rdh; - u16 red2; - }; - u32 redx; - } u4; - - u32 resi; - u32 redi; - u16 rds; - u16 res; - u32 reflags; -} __attribute__((packed)); - -static unsigned int hpwdt_nmi_decoding; -static unsigned int allow_kdump = 1; -static unsigned int is_icru; -static unsigned int is_uefi; -static DEFINE_SPINLOCK(rom_lock); -static void *cru_rom_addr; -static struct cmn_registers cmn_regs; - -extern asmlinkage void asminline_call(struct cmn_registers *pi86Regs, - unsigned long *pRomEntry); - -#ifdef CONFIG_X86_32 -/* --32 Bit Bios------------------------------------------------------------ */ - -#define HPWDT_ARCH 32 - -asm(".text \n\t" - ".align 4 \n\t" - ".globl asminline_call \n" - "asminline_call: \n\t" - "pushl %ebp \n\t" - "movl %esp, %ebp \n\t" - "pusha \n\t" - "pushf \n\t" - "push %es \n\t" - "push %ds \n\t" - "pop %es \n\t" - "movl 8(%ebp),%eax \n\t" - "movl 4(%eax),%ebx \n\t" - "movl 8(%eax),%ecx \n\t" - "movl 12(%eax),%edx \n\t" - "movl 16(%eax),%esi \n\t" - "movl 20(%eax),%edi \n\t" - "movl (%eax),%eax \n\t" - "push %cs \n\t" - "call *12(%ebp) \n\t" - "pushf \n\t" - "pushl %eax \n\t" - "movl 8(%ebp),%eax \n\t" - "movl %ebx,4(%eax) \n\t" - "movl %ecx,8(%eax) \n\t" - "movl %edx,12(%eax) \n\t" - "movl %esi,16(%eax) \n\t" - "movl %edi,20(%eax) \n\t" - "movw %ds,24(%eax) \n\t" - "movw %es,26(%eax) \n\t" - "popl %ebx \n\t" - "movl %ebx,(%eax) \n\t" - "popl %ebx \n\t" - "movl %ebx,28(%eax) \n\t" - "pop %es \n\t" - "popf \n\t" 
- "popa \n\t" - "leave \n\t" - "ret \n\t" - ".previous"); - - -/* - * cru_detect - * - * Routine Description: - * This function uses the 32-bit BIOS Service Directory record to - * search for a $CRU record. - * - * Return Value: - * 0 : SUCCESS - * <0 : FAILURE - */ -static int cru_detect(unsigned long map_entry, - unsigned long map_offset) -{ - void *bios32_map; - unsigned long *bios32_entrypoint; - unsigned long cru_physical_address; - unsigned long cru_length; - unsigned long physical_bios_base = 0; - unsigned long physical_bios_offset = 0; - int retval = -ENODEV; - - bios32_map = ioremap(map_entry, (2 * PAGE_SIZE)); - - if (bios32_map == NULL) - return -ENODEV; - - bios32_entrypoint = bios32_map + map_offset; - - cmn_regs.u1.reax = CRU_BIOS_SIGNATURE_VALUE; - - set_memory_x((unsigned long)bios32_map, 2); - asminline_call(&cmn_regs, bios32_entrypoint); - - if (cmn_regs.u1.ral != 0) { - pr_warn("Call succeeded but with an error: 0x%x\n", - cmn_regs.u1.ral); - } else { - physical_bios_base = cmn_regs.u2.rebx; - physical_bios_offset = cmn_regs.u4.redx; - cru_length = cmn_regs.u3.recx; - cru_physical_address = - physical_bios_base + physical_bios_offset; - - /* If the values look OK, then map it in. */ - if ((physical_bios_base + physical_bios_offset)) { - cru_rom_addr = - ioremap(cru_physical_address, cru_length); - if (cru_rom_addr) { - set_memory_x((unsigned long)cru_rom_addr & PAGE_MASK, - (cru_length + PAGE_SIZE - 1) >> PAGE_SHIFT); - retval = 0; - } - } - - pr_debug("CRU Base Address: 0x%lx\n", physical_bios_base); - pr_debug("CRU Offset Address: 0x%lx\n", physical_bios_offset); - pr_debug("CRU Length: 0x%lx\n", cru_length); - pr_debug("CRU Mapped Address: %p\n", &cru_rom_addr); - } - iounmap(bios32_map); - return retval; -} - -/* - * bios_checksum - */ -static int bios_checksum(const char __iomem *ptr, int len) -{ - char sum = 0; - int i; - - /* - * calculate checksum of size bytes. This should add up - * to zero if we have a valid header. 
- */ - for (i = 0; i < len; i++) - sum += ptr[i]; - - return ((sum == 0) && (len > 0)); -} - -/* - * bios32_present - * - * Routine Description: - * This function finds the 32-bit BIOS Service Directory - * - * Return Value: - * 0 : SUCCESS - * <0 : FAILURE - */ -static int bios32_present(const char __iomem *p) -{ - struct bios32_service_dir *bios_32_ptr; - int length; - unsigned long map_entry, map_offset; - - bios_32_ptr = (struct bios32_service_dir *) p; - - /* - * Search for signature by checking equal to the swizzled value - * instead of calling another routine to perform a strcmp. - */ - if (bios_32_ptr->signature == PCI_BIOS32_SD_VALUE) { - length = bios_32_ptr->length * PCI_BIOS32_PARAGRAPH_LEN; - if (bios_checksum(p, length)) { - /* - * According to the spec, we're looking for the - * first 4KB-aligned address below the entrypoint - * listed in the header. The Service Directory code - * is guaranteed to occupy no more than 2 4KB pages. - */ - map_entry = bios_32_ptr->entry_point & ~(PAGE_SIZE - 1); - map_offset = bios_32_ptr->entry_point - map_entry; - - return cru_detect(map_entry, map_offset); - } - } - return -ENODEV; -} - -static int detect_cru_service(void) -{ - char __iomem *p, *q; - int rc = -1; - - /* - * Search from 0x0f0000 through 0x0fffff, inclusive. 
- */ - p = ioremap(PCI_ROM_BASE1, ROM_SIZE); - if (p == NULL) - return -ENOMEM; - - for (q = p; q < p + ROM_SIZE; q += 16) { - rc = bios32_present(q); - if (!rc) - break; - } - iounmap(p); - return rc; -} -/* ------------------------------------------------------------------------- */ -#endif /* CONFIG_X86_32 */ -#ifdef CONFIG_X86_64 -/* --64 Bit Bios------------------------------------------------------------ */ - -#define HPWDT_ARCH 64 - -asm(".text \n\t" - ".align 4 \n\t" - ".globl asminline_call \n\t" - ".type asminline_call, @function \n\t" - "asminline_call: \n\t" - FRAME_BEGIN - "pushq %rax \n\t" - "pushq %rbx \n\t" - "pushq %rdx \n\t" - "pushq %r12 \n\t" - "pushq %r9 \n\t" - "movq %rsi, %r12 \n\t" - "movq %rdi, %r9 \n\t" - "movl 4(%r9),%ebx \n\t" - "movl 8(%r9),%ecx \n\t" - "movl 12(%r9),%edx \n\t" - "movl 16(%r9),%esi \n\t" - "movl 20(%r9),%edi \n\t" - "movl (%r9),%eax \n\t" - "call *%r12 \n\t" - "pushfq \n\t" - "popq %r12 \n\t" - "movl %eax, (%r9) \n\t" - "movl %ebx, 4(%r9) \n\t" - "movl %ecx, 8(%r9) \n\t" - "movl %edx, 12(%r9) \n\t" - "movl %esi, 16(%r9) \n\t" - "movl %edi, 20(%r9) \n\t" - "movq %r12, %rax \n\t" - "movl %eax, 28(%r9) \n\t" - "popq %r9 \n\t" - "popq %r12 \n\t" - "popq %rdx \n\t" - "popq %rbx \n\t" - "popq %rax \n\t" - FRAME_END - "ret \n\t" - ".previous"); - -/* - * dmi_find_cru - * - * Routine Description: - * This function checks whether or not a SMBIOS/DMI record is - * the 64bit CRU info or not - */ -static void dmi_find_cru(const struct dmi_header *dm, void *dummy) -{ - struct smbios_cru64_info *smbios_cru64_ptr; - unsigned long cru_physical_address; - - if (dm->type == SMBIOS_CRU64_INFORMATION) { - smbios_cru64_ptr = (struct smbios_cru64_info *) dm; - if (smbios_cru64_ptr->signature == CRU_BIOS_SIGNATURE_VALUE) { - cru_physical_address = - smbios_cru64_ptr->physical_address + - smbios_cru64_ptr->double_offset; - cru_rom_addr = ioremap(cru_physical_address, - smbios_cru64_ptr->double_length); - set_memory_x((unsigned 
long)cru_rom_addr & PAGE_MASK, - smbios_cru64_ptr->double_length >> PAGE_SHIFT); - } - } -} - -static int detect_cru_service(void) -{ - cru_rom_addr = NULL; - - dmi_walk(dmi_find_cru, NULL); - - /* if cru_rom_addr has been set then we found a CRU service */ - return ((cru_rom_addr != NULL) ? 0 : -ENODEV); -} -/* ------------------------------------------------------------------------- */ -#endif /* CONFIG_X86_64 */ -#endif /* CONFIG_HPWDT_NMI_DECODING */ /* * Watchdog operations @@ -486,30 +113,12 @@ static int hpwdt_my_nmi(void) */ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs) { - unsigned long rom_pl; - static int die_nmi_called; - - if (!hpwdt_nmi_decoding) - return NMI_DONE; - if ((ulReason == NMI_UNKNOWN) && !hpwdt_my_nmi()) return NMI_DONE; - spin_lock_irqsave(&rom_lock, rom_pl); - if (!die_nmi_called && !is_icru && !is_uefi) - asminline_call(&cmn_regs, cru_rom_addr); - die_nmi_called = 1; - spin_unlock_irqrestore(&rom_lock, rom_pl); - if (allow_kdump) hpwdt_stop(); - if (!is_icru && !is_uefi) { - if (cmn_regs.u1.ral == 0) { - nmi_panic(regs, "An NMI occurred, but unable to determine source.\n"); - return NMI_HANDLED; - } - } nmi_panic(regs, "An NMI occurred. Depending on your system the reason " "for the NMI is logged in any one of the following " "resources:\n" @@ -675,84 +284,11 @@ static struct miscdevice hpwdt_miscdev = { * Init & Exit */ -#ifdef CONFIG_HPWDT_NMI_DECODING -#ifdef CONFIG_X86_LOCAL_APIC -static void hpwdt_check_nmi_decoding(struct pci_dev *dev) -{ - /* - * If nmi_watchdog is turned off then we can turn on - * our nmi decoding capability. - */ - hpwdt_nmi_decoding = 1; -} -#else -static void hpwdt_check_nmi_decoding(struct pci_dev *dev) -{ - dev_warn(&dev->dev, "NMI decoding is disabled. " - "Your kernel does not support a NMI Watchdog.\n"); -} -#endif /* CONFIG_X86_LOCAL_APIC */ - -/* - * dmi_find_icru - * - * Routine Description: - * This function checks whether or not we are on an iCRU-based server. 
- * This check is independent of architecture and needs to be made for - * any ProLiant system. - */ -static void dmi_find_icru(const struct dmi_header *dm, void *dummy) -{ - struct smbios_proliant_info *smbios_proliant_ptr; - - if (dm->type == SMBIOS_ICRU_INFORMATION) { - smbios_proliant_ptr = (struct smbios_proliant_info *) dm; - if (smbios_proliant_ptr->misc_features & 0x01) - is_icru = 1; - if (smbios_proliant_ptr->misc_features & 0x1400) - is_uefi = 1; - } -} static int hpwdt_init_nmi_decoding(struct pci_dev *dev) { +#ifdef CONFIG_HPWDT_NMI_DECODING int retval; - - /* - * On typical CRU-based systems we need to map that service in - * the BIOS. For 32 bit Operating Systems we need to go through - * the 32 Bit BIOS Service Directory. For 64 bit Operating - * Systems we get that service through SMBIOS. - * - * On systems that support the new iCRU service all we need to - * do is call dmi_walk to get the supported flag value and skip - * the old cru detect code. - */ - dmi_walk(dmi_find_icru, NULL); - if (!is_icru && !is_uefi) { - - /* - * We need to map the ROM to get the CRU service. - * For 32 bit Operating Systems we need to go through the 32 Bit - * BIOS Service Directory - * For 64 bit Operating Systems we get that service through SMBIOS. - */ - retval = detect_cru_service(); - if (retval < 0) { - dev_warn(&dev->dev, - "Unable to detect the %d Bit CRU Service.\n", - HPWDT_ARCH); - return retval; - } - - /* - * We know this is the only CRU call we need to make so lets keep as - * few instructions as possible once the NMI comes in. 
- */ - cmn_regs.u1.rah = 0x0D; - cmn_regs.u1.ral = 0x02; - } - /* * Only one function can register for NMI_UNKNOWN */ @@ -780,44 +316,25 @@ static int hpwdt_init_nmi_decoding(struct pci_dev *dev) dev_warn(&dev->dev, "Unable to register a die notifier (err=%d).\n", retval); - if (cru_rom_addr) - iounmap(cru_rom_addr); return retval; +#endif /* CONFIG_HPWDT_NMI_DECODING */ + return 0; } static void hpwdt_exit_nmi_decoding(void) { +#ifdef CONFIG_HPWDT_NMI_DECODING unregister_nmi_handler(NMI_UNKNOWN, "hpwdt"); unregister_nmi_handler(NMI_SERR, "hpwdt"); unregister_nmi_handler(NMI_IO_CHECK, "hpwdt"); - if (cru_rom_addr) - iounmap(cru_rom_addr); -} -#else /* !CONFIG_HPWDT_NMI_DECODING */ -static void hpwdt_check_nmi_decoding(struct pci_dev *dev) -{ -} - -static int hpwdt_init_nmi_decoding(struct pci_dev *dev) -{ - return 0; +#endif } -static void hpwdt_exit_nmi_decoding(void) -{ -} -#endif /* CONFIG_HPWDT_NMI_DECODING */ - static int hpwdt_init_one(struct pci_dev *dev, const struct pci_device_id *ent) { int retval; - /* - * Check if we can do NMI decoding or not - */ - hpwdt_check_nmi_decoding(dev); - /* * First let's find out if we are on an iLO2+ server. We will * not run on a legacy ASM box. @@ -922,6 +439,6 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=" #ifdef CONFIG_HPWDT_NMI_DECODING module_param(allow_kdump, int, 0); MODULE_PARM_DESC(allow_kdump, "Start a kernel dump after NMI occurs"); -#endif /* !CONFIG_HPWDT_NMI_DECODING */ +#endif /* CONFIG_HPWDT_NMI_DECODING */ module_pci_driver(hpwdt_driver); From 317660940fd9dddd3201c2f92e25c27902c753fa Mon Sep 17 00:00:00 2001 From: Kan Liang Date: Fri, 2 Mar 2018 07:22:30 -0800 Subject: [PATCH 086/336] perf/x86/intel/uncore: Fix Skylake UPI event format There is no event extension (bit 21) for SKX UPI, so use 'event' instead of 'event_ext'. 
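For illustration only, the difference between the two format attributes can be modelled in plain C. The sketch below assumes 'event' covers config bits 0-7 and 'event_ext' covers config bits 0-7 plus bit 21; the exact bit ranges and the helper scatter_bits() are assumptions made for this example, not code from the driver.

```c
#include <stdint.h>

/* Assumed layouts: "event" = config:0-7, "event_ext" = config:0-7,21. */
#define EVENT_MASK	0xffULL
#define EVENT_EXT_MASK	(0xffULL | (1ULL << 21))

/*
 * Hypothetical helper: spread the low bits of @value over the set bits
 * of @mask, lowest bit first. Value bits that do not fit are dropped.
 */
static uint64_t scatter_bits(uint64_t value, uint64_t mask)
{
	uint64_t out = 0;
	int bit;

	for (bit = 0; bit < 64; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		if (value & 1)
			out |= 1ULL << bit;
		value >>= 1;
	}
	return out;
}
```

In this model a nine-bit event value would reach config bit 21 through the 'event_ext' layout, whereas the 'event' layout keeps only the low eight bits, which is the behaviour wanted for a unit without the extension bit.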
Reported-by: Stephane Eranian Signed-off-by: Kan Liang Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Vince Weaver Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1520004150-4855-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/uncore_snbep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index 6d8044ab10607..22ec65bc033a9 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -3606,7 +3606,7 @@ static struct intel_uncore_type skx_uncore_imc = { }; static struct attribute *skx_upi_uncore_formats_attr[] = { - &format_attr_event_ext.attr, + &format_attr_event.attr, &format_attr_umask_ext.attr, &format_attr_edge.attr, &format_attr_inv.attr, From 28b2182dad43f6f8fcbd167539a26714fd12bd64 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Fri, 2 Mar 2018 11:36:32 +0100 Subject: [PATCH 087/336] ahci: Add PCI-id for the Highpoint Rocketraid 644L card Like the Highpoint Rocketraid 642L and cards using a Marvell 88SE9235 controller in general, this RAID card also supports AHCI mode and, short of a custom driver, this is the only way to make it work under Linux. Note that even though the card is called the 644L, it has a product-id of 0x0645.
Cc: stable@vger.kernel.org BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106 Signed-off-by: Hans de Goede Signed-off-by: Tejun Heo Acked-by: Bjorn Helgaas --- drivers/ata/ahci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 355a95a83a340..1ff17799769d0 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -550,7 +550,9 @@ static const struct pci_device_id ahci_pci_tbl[] = { .driver_data = board_ahci_yes_fbs }, { PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9230), .driver_data = board_ahci_yes_fbs }, - { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642), + { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642), /* highpoint rocketraid 642L */ + .driver_data = board_ahci_yes_fbs }, + { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0645), /* highpoint rocketraid 644L */ .driver_data = board_ahci_yes_fbs }, /* Promise */ From 1903be8222b7c278ca897c129ce477c1dd6403a8 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Fri, 2 Mar 2018 11:36:33 +0100 Subject: [PATCH 088/336] PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L The Highpoint RocketRAID 644L uses a Marvell 88SE9235 controller; as with other Marvell controllers, it needs a function 1 DMA alias quirk. Note that the RocketRAID 642L uses the same Marvell 88SE9235 controller and is already listed with a function 1 DMA alias quirk.
Cc: stable@vger.kernel.org BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106 Signed-off-by: Hans de Goede Acked-by: Bjorn Helgaas Signed-off-by: Tejun Heo --- drivers/pci/quirks.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index fc734014206fb..b1a3a36073b44 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3901,6 +3901,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642, quirk_dma_func1_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645, + quirk_dma_func1_alias); /* https://bugs.gentoo.org/show_bug.cgi?id=497630 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB388_ESD, From 9ac79ba9c77d8595157bbdc4327919f8ee062426 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Mon, 12 Feb 2018 14:55:13 +0100 Subject: [PATCH 089/336] gpio: rcar: Use wakeup_path i.s.o. explicit clock handling Since commit ab82fa7da4dce5c7 ("gpio: rcar: Prevent module clock disable when wake-up is enabled"), when a GPIO is used for wakeup, the GPIO block's module clock (if exists) is manually kept running during system suspend, to make sure the device stays active. However, this explicit clock handling is merely a workaround for a failure to properly communicate wakeup information to the device core. Instead, set the device's power.wakeup_path field, to indicate this device is part of the wakeup path. Depending on the PM Domain's active_wakeup configuration, the genpd core code will keep the device enabled (and the clock running) during system suspend when needed. This allows for the removal of all explicit clock handling code from the driver. 
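As a rough userspace model of the bookkeeping described above (not the driver code itself), the idea can be sketched with C11 atomics. The struct bank_model and both helpers are hypothetical names, and the on_wakeup_path flag merely stands in for the device's power.wakeup_path field, which the PM core is assumed to manage across suspend cycles.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-bank driver state. */
struct bank_model {
	atomic_int wakeup_path;	/* number of GPIO IRQs armed for wakeup */
	bool on_wakeup_path;	/* models the device's power.wakeup_path */
};

/* Models the irq_set_wake callback: only count, never touch a clock. */
static void bank_set_wake(struct bank_model *b, bool on)
{
	if (on)
		atomic_fetch_add(&b->wakeup_path, 1);
	else
		atomic_fetch_sub(&b->wakeup_path, 1);
}

/*
 * Models the suspend callback: report whether the bank is part of the
 * wakeup path and leave the keep-powered decision to the PM domain.
 */
static void bank_suspend(struct bank_model *b)
{
	b->on_wakeup_path = atomic_load(&b->wakeup_path) != 0;
}
```

The point of the design is that the driver only reports a fact (some GPIO on this bank is a wakeup source); whether the module clock stays on is then decided in one place, by the PM domain code.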
Signed-off-by: Geert Uytterhoeven Signed-off-by: Linus Walleij --- drivers/gpio/gpio-rcar.c | 38 ++++++++++++++++---------------------- 1 file changed, 16 insertions(+), 22 deletions(-) diff --git a/drivers/gpio/gpio-rcar.c b/drivers/gpio/gpio-rcar.c index e76de57dd617d..ebaea8b1594b7 100644 --- a/drivers/gpio/gpio-rcar.c +++ b/drivers/gpio/gpio-rcar.c @@ -14,7 +14,6 @@ * GNU General Public License for more details. */ -#include #include #include #include @@ -37,10 +36,9 @@ struct gpio_rcar_priv { struct platform_device *pdev; struct gpio_chip gpio_chip; struct irq_chip irq_chip; - struct clk *clk; unsigned int irq_parent; + atomic_t wakeup_path; bool has_both_edge_trigger; - bool needs_clk; }; #define IOINTSEL 0x00 /* General IO/Interrupt Switching Register */ @@ -186,13 +184,10 @@ static int gpio_rcar_irq_set_wake(struct irq_data *d, unsigned int on) } } - if (!p->clk) - return 0; - if (on) - clk_enable(p->clk); + atomic_inc(&p->wakeup_path); else - clk_disable(p->clk); + atomic_dec(&p->wakeup_path); return 0; } @@ -330,17 +325,14 @@ static int gpio_rcar_direction_output(struct gpio_chip *chip, unsigned offset, struct gpio_rcar_info { bool has_both_edge_trigger; - bool needs_clk; }; static const struct gpio_rcar_info gpio_rcar_info_gen1 = { .has_both_edge_trigger = false, - .needs_clk = false, }; static const struct gpio_rcar_info gpio_rcar_info_gen2 = { .has_both_edge_trigger = true, - .needs_clk = true, }; static const struct of_device_id gpio_rcar_of_table[] = { @@ -403,7 +395,6 @@ static int gpio_rcar_parse_dt(struct gpio_rcar_priv *p, unsigned int *npins) ret = of_parse_phandle_with_fixed_args(np, "gpio-ranges", 3, 0, &args); *npins = ret == 0 ? 
args.args[2] : RCAR_MAX_GPIO_PER_BANK; p->has_both_edge_trigger = info->has_both_edge_trigger; - p->needs_clk = info->needs_clk; if (*npins == 0 || *npins > RCAR_MAX_GPIO_PER_BANK) { dev_warn(&p->pdev->dev, @@ -440,16 +431,6 @@ static int gpio_rcar_probe(struct platform_device *pdev) platform_set_drvdata(pdev, p); - p->clk = devm_clk_get(dev, NULL); - if (IS_ERR(p->clk)) { - if (p->needs_clk) { - dev_err(dev, "unable to get clock\n"); - ret = PTR_ERR(p->clk); - goto err0; - } - p->clk = NULL; - } - pm_runtime_enable(dev); irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0); @@ -531,11 +512,24 @@ static int gpio_rcar_remove(struct platform_device *pdev) return 0; } +static int __maybe_unused gpio_rcar_suspend(struct device *dev) +{ + struct gpio_rcar_priv *p = dev_get_drvdata(dev); + + if (atomic_read(&p->wakeup_path)) + device_set_wakeup_path(dev); + + return 0; +} + +static SIMPLE_DEV_PM_OPS(gpio_rcar_pm_ops, gpio_rcar_suspend, NULL); + static struct platform_driver gpio_rcar_device_driver = { .probe = gpio_rcar_probe, .remove = gpio_rcar_remove, .driver = { .name = "gpio_rcar", + .pm = &gpio_rcar_pm_ops, .of_match_table = of_match_ptr(gpio_rcar_of_table), } }; From 1a087f032111a88e826877449dfb93ceb22b78b9 Mon Sep 17 00:00:00 2001 From: Xinyong Date: Fri, 2 Mar 2018 19:20:07 +0800 Subject: [PATCH 090/336] usb: gadget: f_fs: Fix use-after-free in ffs_fs_kill_sb() While debugging a kernel crash in functionfs, I found that ffs_data.ref had overflowed: when functionfs is unmounted, ffs_data is put twice. Commit 43938613c6fd ("drivers, usb: convert ffs_data.ref from atomic_t to refcount_t") can avoid the refcount overflow, but that is still risky in some situations. There is no need to put ffs_data in ffs_fs_kill_sb(), since it is already put in ffs_data_closed(). The issue can be reproduced on a Mediatek mt6763 SoC using ffs for the ADB device. A KASAN-enabled configuration reports a use-after-free error.
BUG: KASAN: use-after-free in refcount_dec_and_test+0x14/0xe0 at addr ffffffc0579386a0 Read of size 4 by task umount/4650 ==================================================== BUG kmalloc-512 (Tainted: P W O ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in ffs_fs_mount+0x194/0x844 age=22856 cpu=2 pid=566 alloc_debug_processing+0x1ac/0x1e8 ___slab_alloc.constprop.63+0x640/0x648 __slab_alloc.isra.57.constprop.62+0x24/0x34 kmem_cache_alloc_trace+0x1a8/0x2bc ffs_fs_mount+0x194/0x844 mount_fs+0x6c/0x1d0 vfs_kern_mount+0x50/0x1b4 do_mount+0x258/0x1034 INFO: Freed in ffs_data_put+0x25c/0x320 age=0 cpu=3 pid=4650 free_debug_processing+0x22c/0x434 __slab_free+0x2d8/0x3a0 kfree+0x254/0x264 ffs_data_put+0x25c/0x320 ffs_data_closed+0x124/0x15c ffs_fs_kill_sb+0xb8/0x110 deactivate_locked_super+0x6c/0x98 deactivate_super+0xb0/0xbc INFO: Object 0xffffffc057938600 @offset=1536 fp=0x (null) ...... Call trace: [] dump_backtrace+0x0/0x250 [] show_stack+0x14/0x1c [] dump_stack+0xa0/0xc8 [] print_trailer+0x158/0x260 [] object_err+0x3c/0x40 [] kasan_report_error+0x2a8/0x754 [] kasan_report+0x5c/0x60 [] __asan_load4+0x70/0x88 [] refcount_dec_and_test+0x14/0xe0 [] ffs_data_put+0x80/0x320 [] ffs_fs_kill_sb+0xc8/0x110 [] deactivate_locked_super+0x6c/0x98 [] deactivate_super+0xb0/0xbc [] cleanup_mnt+0x64/0xec [] __cleanup_mnt+0x10/0x18 [] task_work_run+0xcc/0x124 [] do_notify_resume+0x60/0x70 [] work_pending+0x10/0x14 Cc: stable@vger.kernel.org Signed-off-by: Xinyong Signed-off-by: Felipe Balbi --- drivers/usb/gadget/function/f_fs.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index c2592d883f67c..d2428a9e89003 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1538,7 +1538,6 @@ ffs_fs_kill_sb(struct super_block *sb) if (sb->s_fs_info) { ffs_release_dev(sb->s_fs_info); ffs_data_closed(sb->s_fs_info); 
- ffs_data_put(sb->s_fs_info); } } From 4c437920fa216f66f6a5d469cae2a0360cc2d9c7 Mon Sep 17 00:00:00 2001 From: Amelie Delaunay Date: Thu, 1 Mar 2018 11:05:34 +0100 Subject: [PATCH 091/336] dt-bindings: usb: fix the STM32F7 DWC2 OTG HS core binding This patch fixes the binding documentation for the DWC2 controller in HS mode found on the STMicroelectronics STM32F7 SoC. The earlier v2 patch [1] had been acked by Rob Herring, but v1 was merged. [1] https://patchwork.kernel.org/patch/9925575/ Fixes: 000777dadc7e ("dt-bindings: usb: Document the STM32F7xx DWC2 ...") Signed-off-by: Amelie Delaunay Reviewed-by: Rob Herring Signed-off-by: Felipe Balbi --- Documentation/devicetree/bindings/usb/dwc2.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/usb/dwc2.txt b/Documentation/devicetree/bindings/usb/dwc2.txt index e64d903bcbe81..46da5f1844608 100644 --- a/Documentation/devicetree/bindings/usb/dwc2.txt +++ b/Documentation/devicetree/bindings/usb/dwc2.txt @@ -19,7 +19,7 @@ Required properties: configured in FS mode; - "st,stm32f4x9-hsotg": The DWC2 USB HS controller instance in STM32F4x9 SoCs configured in HS mode; - - "st,stm32f7xx-hsotg": The DWC2 USB HS controller instance in STM32F7xx SoCs + - "st,stm32f7-hsotg": The DWC2 USB HS controller instance in STM32F7 SoCs configured in HS mode; - reg : Should contain 1 register range (address and length) - interrupts : Should contain 1 interrupt From 1a149e3554e0324a3d551dfb327bdb67b150a320 Mon Sep 17 00:00:00 2001 From: Amelie Delaunay Date: Thu, 1 Mar 2018 11:05:35 +0100 Subject: [PATCH 092/336] usb: dwc2: fix STM32F7 USB OTG HS compatible This patch fixes the compatible string for STM32F7 USB OTG HS and consistently renames the dwc2_set_stm32f7xx_hsotg_params() function. The earlier v2 patch [1] had been acked by Paul Young, but v1 was merged.
[1] https://patchwork.kernel.org/patch/9925573/ Fixes: d8fae8b93682 ("usb: dwc2: add support for STM32F7xx USB OTG HS") Signed-off-by: Amelie Delaunay Signed-off-by: Felipe Balbi --- drivers/usb/dwc2/params.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c index 03fd20f0b4961..c4a47496d2fb9 100644 --- a/drivers/usb/dwc2/params.c +++ b/drivers/usb/dwc2/params.c @@ -137,7 +137,7 @@ static void dwc2_set_stm32f4x9_fsotg_params(struct dwc2_hsotg *hsotg) p->activate_stm_fs_transceiver = true; } -static void dwc2_set_stm32f7xx_hsotg_params(struct dwc2_hsotg *hsotg) +static void dwc2_set_stm32f7_hsotg_params(struct dwc2_hsotg *hsotg) { struct dwc2_core_params *p = &hsotg->params; @@ -164,8 +164,8 @@ const struct of_device_id dwc2_of_match_table[] = { { .compatible = "st,stm32f4x9-fsotg", .data = dwc2_set_stm32f4x9_fsotg_params }, { .compatible = "st,stm32f4x9-hsotg" }, - { .compatible = "st,stm32f7xx-hsotg", - .data = dwc2_set_stm32f7xx_hsotg_params }, + { .compatible = "st,stm32f7-hsotg", + .data = dwc2_set_stm32f7_hsotg_params }, {}, }; MODULE_DEVICE_TABLE(of, dwc2_of_match_table); From 54f02945f703404cdf17c9618316b3d3387fa072 Mon Sep 17 00:00:00 2001 From: Yoshihiro Shimoda Date: Tue, 27 Feb 2018 17:16:02 +0900 Subject: [PATCH 093/336] usb: renesas_usbhs: add binding for r8a77965 This patch adds binding for r8a77965 (R-Car M3-N). 
Signed-off-by: Yoshihiro Shimoda Signed-off-by: Felipe Balbi --- Documentation/devicetree/bindings/usb/renesas_usbhs.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/usb/renesas_usbhs.txt b/Documentation/devicetree/bindings/usb/renesas_usbhs.txt index d060172f15291..43960faf5a88c 100644 --- a/Documentation/devicetree/bindings/usb/renesas_usbhs.txt +++ b/Documentation/devicetree/bindings/usb/renesas_usbhs.txt @@ -12,6 +12,7 @@ Required properties: - "renesas,usbhs-r8a7794" for r8a7794 (R-Car E2) compatible device - "renesas,usbhs-r8a7795" for r8a7795 (R-Car H3) compatible device - "renesas,usbhs-r8a7796" for r8a7796 (R-Car M3-W) compatible device + - "renesas,usbhs-r8a77965" for r8a77965 (R-Car M3-N) compatible device - "renesas,usbhs-r8a77995" for r8a77995 (R-Car D3) compatible device - "renesas,usbhs-r7s72100" for r7s72100 (RZ/A1) compatible device - "renesas,rcar-gen2-usbhs" for R-Car Gen2 or RZ/G1 compatible devices From c6ba5084ce0d00d4a005b0577d9e764d39b638e1 Mon Sep 17 00:00:00 2001 From: Yoshihiro Shimoda Date: Tue, 27 Feb 2018 17:16:03 +0900 Subject: [PATCH 094/336] usb: gadget: udc: renesas_usb3: add binging for r8a77965 This patch adds binding for r8a77965 (R-Car M3-N). 
Signed-off-by: Yoshihiro Shimoda Signed-off-by: Felipe Balbi --- Documentation/devicetree/bindings/usb/renesas_usb3.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/usb/renesas_usb3.txt b/Documentation/devicetree/bindings/usb/renesas_usb3.txt index 87a45e2f9b7f9..2c071bb5801e7 100644 --- a/Documentation/devicetree/bindings/usb/renesas_usb3.txt +++ b/Documentation/devicetree/bindings/usb/renesas_usb3.txt @@ -4,6 +4,7 @@ Required properties: - compatible: Must contain one of the following: - "renesas,r8a7795-usb3-peri" - "renesas,r8a7796-usb3-peri" + - "renesas,r8a77965-usb3-peri" - "renesas,rcar-gen3-usb3-peri" for a generic R-Car Gen3 compatible device From 6cfc70c4321bde35cb132831cba4685821e65065 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Thu, 1 Mar 2018 10:37:41 +0800 Subject: [PATCH 095/336] MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_PARPORT Commit a211a0820d3c ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_PARPORT select down to various platforms, but doesn't add it to Loongson64 platforms which need it, so add the selects to these platforms too. 
Fixes: a211a0820d3c ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level") Signed-off-by: Huacai Chen Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18703/ Signed-off-by: James Hogan --- arch/mips/loongson64/Kconfig | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig index bc2fdbfa8223c..12812a8b640cf 100644 --- a/arch/mips/loongson64/Kconfig +++ b/arch/mips/loongson64/Kconfig @@ -7,6 +7,7 @@ choice config LEMOTE_FULOONG2E bool "Lemote Fuloong(2e) mini-PC" select ARCH_SPARSEMEM_ENABLE + select ARCH_MIGHT_HAVE_PC_PARPORT select CEVT_R4K select CSRC_R4K select SYS_HAS_CPU_LOONGSON2E @@ -33,6 +34,7 @@ config LEMOTE_FULOONG2E config LEMOTE_MACH2F bool "Lemote Loongson 2F family machines" select ARCH_SPARSEMEM_ENABLE + select ARCH_MIGHT_HAVE_PC_PARPORT select BOARD_SCACHE select BOOT_ELF32 select CEVT_R4K if ! MIPS_EXTERNAL_TIMER @@ -62,6 +64,7 @@ config LEMOTE_MACH2F config LOONGSON_MACH3X bool "Generic Loongson 3 family machines" select ARCH_SPARSEMEM_ENABLE + select ARCH_MIGHT_HAVE_PC_PARPORT select GENERIC_ISA_DMA_SUPPORT_BROKEN select BOOT_ELF32 select BOARD_SCACHE From ee2515d95f9a12e04a3863916ae45831438210ce Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Thu, 1 Mar 2018 10:37:42 +0800 Subject: [PATCH 096/336] MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_SERIO Commit 7a407aa5e0d3 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_SERIO select down to various platforms, but doesn't add it to Loongson64 platforms which need it, so add the selects to these platforms too. 
Fixes: 7a407aa5e0d3 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level") Signed-off-by: Huacai Chen Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18704/ Signed-off-by: James Hogan --- arch/mips/loongson64/Kconfig | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig index 12812a8b640cf..72af0c1839698 100644 --- a/arch/mips/loongson64/Kconfig +++ b/arch/mips/loongson64/Kconfig @@ -8,6 +8,7 @@ config LEMOTE_FULOONG2E bool "Lemote Fuloong(2e) mini-PC" select ARCH_SPARSEMEM_ENABLE select ARCH_MIGHT_HAVE_PC_PARPORT + select ARCH_MIGHT_HAVE_PC_SERIO select CEVT_R4K select CSRC_R4K select SYS_HAS_CPU_LOONGSON2E @@ -35,6 +36,7 @@ config LEMOTE_MACH2F bool "Lemote Loongson 2F family machines" select ARCH_SPARSEMEM_ENABLE select ARCH_MIGHT_HAVE_PC_PARPORT + select ARCH_MIGHT_HAVE_PC_SERIO select BOARD_SCACHE select BOOT_ELF32 select CEVT_R4K if ! MIPS_EXTERNAL_TIMER @@ -65,6 +67,7 @@ config LOONGSON_MACH3X bool "Generic Loongson 3 family machines" select ARCH_SPARSEMEM_ENABLE select ARCH_MIGHT_HAVE_PC_PARPORT + select ARCH_MIGHT_HAVE_PC_SERIO select GENERIC_ISA_DMA_SUPPORT_BROKEN select BOOT_ELF32 select BOARD_SCACHE From 14a596a7e6fd9c5baa6b2cfc57962e2c3bda6c69 Mon Sep 17 00:00:00 2001 From: Rasmus Villemoes Date: Wed, 28 Feb 2018 20:17:35 +0100 Subject: [PATCH 097/336] fixdep: remove stale references to uml-config.h uml-config.h hasn't existed in this decade (87e299e5c750 - x86, um: get rid of uml-config.h). The few remaining UML_CONFIG instances are defined directly in terms of their real CONFIG symbol in common-offsets.h, so unlike when the symbols got defined via a sed script, anything that uses UML_CONFIG_FOO now should also automatically pick up a dependency on CONFIG_FOO via the normal fixdep mechanism (since common-offsets.h should at least recursively be a dependency). 
Hence I believe we should actually be able to ignore the HELLO_CONFIG_BOOM cases. Cc: Al Viro Cc: Richard Weinberger Cc: user-mode-linux-devel@lists.sourceforge.net Signed-off-by: Rasmus Villemoes Signed-off-by: Masahiro Yamada --- scripts/basic/fixdep.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c index fa3d39b6f23bb..d7fbe545dd5d4 100644 --- a/scripts/basic/fixdep.c +++ b/scripts/basic/fixdep.c @@ -93,14 +93,6 @@ * (Note: it'd be easy to port over the complete mkdep state machine, * but I don't think the added complexity is worth it) */ -/* - * Note 2: if somebody writes HELLO_CONFIG_BOOM in a file, it will depend onto - * CONFIG_BOOM. This could seem a bug (not too hard to fix), but please do not - * fix it! Some UserModeLinux files (look at arch/um/) call CONFIG_BOOM as - * UML_CONFIG_BOOM, to avoid conflicts with /usr/include/linux/autoconf.h, - * through arch/um/include/uml-config.h; this fixdep "bug" makes sure that - * those files will have correct dependencies. - */ #include #include @@ -286,7 +278,6 @@ static int is_ignored_file(const char *s, int len) { return str_ends_with(s, len, "include/generated/autoconf.h") || str_ends_with(s, len, "include/generated/autoksyms.h") || - str_ends_with(s, len, "arch/um/include/uml-config.h") || str_ends_with(s, len, "include/linux/kconfig.h") || str_ends_with(s, len, ".ver"); } From 5b8ad96d1a4421ffe417e647a65064aad1e84fb4 Mon Sep 17 00:00:00 2001 From: Rasmus Villemoes Date: Wed, 28 Feb 2018 20:17:36 +0100 Subject: [PATCH 098/336] fixdep: remove some false CONFIG_ matches The string CONFIG_ quite often appears after other alphanumerics, meaning that that instance cannot be referencing a Kconfig symbol. Omitting these means make has fewer files to stat() when deciding what needs to be rebuilt - for a defconfig build, this seems to remove about 2% of the (wildcard ...) lines from the .o.cmd files. 
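To make the rule concrete, here is a small standalone sketch of the matching logic; it is only an illustration, and count_config_refs() is a made-up helper rather than anything in fixdep. It counts the CONFIG_ occurrences that would still be treated as candidate Kconfig symbols, i.e. those not preceded by an alphanumeric character or underscore and followed by a non-empty name.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not part of fixdep). */
static int count_config_refs(const char *buf)
{
	const char *start = buf;
	const char *p = buf;
	int n = 0;

	while ((p = strstr(p, "CONFIG_"))) {
		/* Preceded by [A-Za-z0-9_]: part of a longer identifier. */
		int embedded = p > start &&
			(isalnum((unsigned char)p[-1]) || p[-1] == '_');

		p += 7;
		/* Count only prefixes followed by a non-empty symbol name. */
		if (!embedded && (isalnum((unsigned char)*p) || *p == '_'))
			n++;
	}
	return n;
}
```

Under these assumptions '#ifdef CONFIG_SMP' contributes one reference, while HELLO_CONFIG_BOOM or an include guard such as __LINUX_KCONFIG_H contributes none.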
Signed-off-by: Rasmus Villemoes Signed-off-by: Masahiro Yamada --- scripts/basic/fixdep.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c index d7fbe545dd5d4..1b21870d6e7f9 100644 --- a/scripts/basic/fixdep.c +++ b/scripts/basic/fixdep.c @@ -225,8 +225,13 @@ static int str_ends_with(const char *s, int slen, const char *sub) static void parse_config_file(const char *p) { const char *q, *r; + const char *start = p; while ((p = strstr(p, "CONFIG_"))) { + if (p > start && (isalnum(p[-1]) || p[-1] == '_')) { + p += 7; + continue; + } p += 7; q = p; while (*q && (isalnum(*q) || *q == '_')) From 638e69cf2230737655fcb5ee9879c2fab7679187 Mon Sep 17 00:00:00 2001 From: Rasmus Villemoes Date: Wed, 28 Feb 2018 20:17:37 +0100 Subject: [PATCH 099/336] fixdep: do not ignore kconfig.h kconfig.h was excluded from consideration by fixdep by 6a5be57f0f00 (fixdep: fix extraneous dependencies) to avoid some false positive hits (1) include/config/.h (2) include/config/h.h (3) include/config/foo.h (1) occurred because kconfig.h contains the string CONFIG_ in a comment. However, since dee81e988674 (fixdep: faster CONFIG_ search), we have a check that the part after CONFIG_ is non-empty, so this does not happen anymore (and CONFIG_ appears by itself elsewhere, so that check is worthwhile). (2) comes from the include guard, __LINUX_KCONFIG_H. But with the previous patch, we no longer match that either. 
That leaves (3), which amounts to one [1] false dependency (aka stat() call done by make), which I think we can live with: We've already had one case [2] where the lack of include/linux/kconfig.h in the .o.cmd file caused a missing rebuild, and while I originally thought we should just put kconfig.h in the dependency list without parsing it for the CONFIG_ pattern, we actually do have some real CONFIG_ symbols mentioned in it, and one can imagine some translation unit that just does '#ifdef __BIG_ENDIAN' but doesn't through some other header actually depend on CONFIG_CPU_BIG_ENDIAN - so changing the target endianness could end up rebuilding the world, minus that small TU. Quoting Linus, ... when missing dependencies cause a missed re-compile, the resulting bugs can be _really_ subtle. [1] well, two, we now also have CONFIG_BOOGER/booger.h - we could change that to FOO if we care [2] https://lkml.org/lkml/2018/2/22/838 Cc: Linus Torvalds Signed-off-by: Rasmus Villemoes Signed-off-by: Masahiro Yamada --- scripts/basic/fixdep.c | 1 - 1 file changed, 1 deletion(-) diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c index 1b21870d6e7f9..449b68c4c90cb 100644 --- a/scripts/basic/fixdep.c +++ b/scripts/basic/fixdep.c @@ -283,7 +283,6 @@ static int is_ignored_file(const char *s, int len) { return str_ends_with(s, len, "include/generated/autoconf.h") || str_ends_with(s, len, "include/generated/autoksyms.h") || - str_ends_with(s, len, "include/linux/kconfig.h") || str_ends_with(s, len, ".ver"); } From f6d3f35e006496c282ccbb67494d90b04f6cba10 Mon Sep 17 00:00:00 2001 From: Sangwon Hong Date: Mon, 12 Feb 2018 04:37:44 +0900 Subject: [PATCH 100/336] perf kallsyms: Fix the usage on the man page First, all man pages highlight only perf and its subcommands, except that of 'perf kallsyms', which highlights the full usage. Fix it so that only the commands are underlined. Second, options can be omitted when executing 'perf kallsyms', so add square brackets between