…ted MC > CONFIG_EDAC_SKX: > > Support for error detection and correction the Intel > Skylake server Integrated Memory Controllers. If your > system has non-volatile DIMMs you should also manually > select CONFIG_ACPI_NFIT. Fixes: #1700
libata should also support all these devices. Uniform Multi-Platform E-IDE driver ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports legacy IDE will be removed in 2021, please switch to libata Report any missing HW support to linux-ide@vger.kernel.org Probing IDE interface ide0... ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 legacy IDE will be removed in 2021, please switch to libata Report any missing HW support to linux-ide@vger.kernel.org Probing IDE interface ide1... Fixes: #1707
> SECURITY_LOCKDOWN_LSM > > Build support for an LSM that enforces a coarse kernel lockdown > behaviour. Do not enable it by default. Fixes: #1710
> CONFIG_X86_X2APIC: > > This enables x2apic support on CPUs that have this feature. > > This allows 32-bit apic IDs (so it can support very large systems), > and accesses the local apic via MSRs not via mmio. On the Dell PowerEdge T640/04WYPY, BIOS 2.4.8 11/27/2019, Linux crashes on start-up. [ 3.862327] ACPI: Core revision 20190816 [ 3.869551] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635855245 ns [ 3.878797] APIC: Switch to symmetric I/O mode setup [ 3.883893] Switched APIC routing to physical flat. [ 3.888904] ------------[ cut here ]------------ [ 3.893641] kernel BUG at arch/x86/kernel/apic/apic.c:1616! [ 3.899347] invalid opcode: 0000 [#1] SMP NOPTI [ 3.903990] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.14.mx64.317 #1 [ 3.910803] Hardware name: Dell Inc. PowerEdge T640/04WYPY, BIOS 2.4.8 11/27/2019 [ 3.918448] RIP: 0010:setup_local_APIC+0x32e/0x390 [ 3.923356] Code: 68 70 2e 01 be 00 07 01 00 bf 50 03 00 00 48 8b 40 10 e8 15 9e db 00 eb a9 be 00 04 01 00 bf 60 03 00 00 e8 04 9e db 00 eb bb <0f> 0b e8 5b 3a 00 00 [ 3.942300] RSP: 0000:ffffffff82403e88 EFLAGS: 00010246 [ 3.947641] RAX: 0000000000000000 RBX: 00000000000000ff RCX: ffffffff82454128 [ 3.955787] RDX: 0000000000000000 RSI: 00000000fffffeff RDI: 0000000000000020 [ 3.963031] RBP: ffffffffffffffff R08: 00000000000001c4 R09: 0734073407370739 [ 3.970277] R10: ffffffff82573000 R11: 0720072007730765 R12: ffffffff82a4a920 [ 3.977522] R13: 0000000000000000 R14: ffff88c07fff0e80 R15: 0000000000000000 [ 3.984766] FS: 0000000000000000(0000) GS:ffff889fffc00000(0000) knlGS:0000000000000000 [ 3.993014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.998876] CR2: ffff88c07ffff000 CR3: 000000000240a001 CR4: 00000000000606b0 [ 4.006121] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.013365] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.020611] Call Trace: [ 4.023184] apic_intr_mode_init+0x1d2/0x1ec [ 4.027573] 
x86_late_time_init+0x17/0x1c [ 4.031706] start_kernel+0x41f/0x4d3 [ 4.035491] secondary_startup_64+0xa4/0xb0 [ 4.039797] Modules linked in: [ 4.042997] ---[ end trace c3629ce2e87a638c ]--- [ 4.047746] RIP: 0010:setup_local_APIC+0x32e/0x390 [ 4.052663] Code: 68 70 2e 01 be 00 07 01 00 bf 50 03 00 00 48 8b 40 10 e8 15 9e db 00 eb a9 be 00 04 01 00 bf 60 03 00 00 e8 04 9e db 00 eb bb <0f> 0b e8 5b 3a 00 00 [ 4.071617] RSP: 0000:ffffffff82403e88 EFLAGS: 00010246 [ 4.076966] RAX: 0000000000000000 RBX: 00000000000000ff RCX: ffffffff82454128 [ 4.084219] RDX: 0000000000000000 RSI: 00000000fffffeff RDI: 0000000000000020 [ 4.091475] RBP: ffffffffffffffff R08: 00000000000001c4 R09: 0734073407370739 [ 4.098738] R10: ffffffff82573000 R11: 0720072007730765 R12: ffffffff82a4a920 [ 4.106000] R13: 0000000000000000 R14: ffff88c07fff0e80 R15: 0000000000000000 [ 4.113252] FS: 0000000000000000(0000) GS:ffff889fffc00000(0000) knlGS:0000000000000000 [ 4.121509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.127380] CR2: ffff88c07ffff000 CR3: 000000000240a001 CR4: 00000000000606b0 [ 4.134632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.141887] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.149142] Kernel panic - not syncing: Attempted to kill the idle task! [ 4.155968] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- The reason is the code below in `arch/x86/kernel/apic/apic.c`. /* * Double-check whether this APIC is really registered. * This is meaningless in clustered apic mode, so we skip it. */ BUG_ON(!apic->apic_id_registered()); With `acpi=off noapic` the panic below is shown. [ 2.577272] Kernel panic - not syncing: BIOS has enabled x2apic but kernel doesn't support x2apic, please disable x2apic in BIOS. With `nosmp` it also crashes at the same spot. 
[ 3.705437] APIC: SMP mode deactivated [ 3.709189] APIC: Switch to symmetric I/O mode setup in no SMP routine [ 3.715712] ------------[ cut here ]------------ [ 3.720320] kernel BUG at arch/x86/kernel/apic/apic.c:1616! Selecting X2APIC support in Linux fixes the crashes/panics. Disabling x2APIC in the Dell firmware also gets a Linux kernel without X2APIC support to boot, but some posts on the Web claim that x2APIC is more efficient [1]. [1]: https://serverfault.com/questions/873664/when-to-use-x2apic-mode
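The mismatch described in this commit message (firmware enables x2APIC, kernel built without X2APIC support) can be spotted by looking up `CONFIG_X86_X2APIC` in the kernel configuration. The following is only a minimal sketch: the helper name `config_value` and the sample configuration text are made up for illustration, and it assumes the usual `.config` text format.

```python
def config_value(config_text, option):
    """Return the value of a Kconfig option from .config-style text, or None if unset."""
    for line in config_text.splitlines():
        line = line.strip()
        # Lines like "# CONFIG_FOO is not set" are comments and mean "unset".
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value
    return None


# Illustrative sample only, not taken from the real bee file.
sample = """\
CONFIG_X86_64=y
# CONFIG_X86_X2APIC is not set
CONFIG_EDAC=m
"""

print(config_value(sample, "CONFIG_X86_X2APIC"))  # None
print(config_value(sample, "CONFIG_EDAC"))        # m
```

A result of `None` for `CONFIG_X86_X2APIC` on a machine whose firmware enables x2APIC would match the situation that leads to the crash above.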
> CONFIG_INTEL_IOMMU: > > DMA remapping (DMAR) devices support enables independent address > translations for Direct Memory Access (DMA) from devices. > These DMA remapping devices are reported via ACPI tables > and include PCI device scope covered by these DMA > remapping devices.
> CONFIG_AMD_IOMMU_V2: > > This option enables support for the AMD IOMMUv2 features of the IOMMU > hardware. Select this option if you want to use devices that support > the PCI PRI and PASID interface.
b4d2c61
to
539d669
Compare
I propose to disable the SUSPEND setting, since this option alone doesn't help a lot. (We also have these freeze issues...)
Another one would be support for another USB serial gadget, USB_SERIAL_PL2303.
As for the Nvidia driver, there are some fixes concerning OpenCL pending.
Cheers,
Thomas
…On February 21, 2020 3:52:37 PM GMT+01:00, Paul Menzel ***@***.***> wrote:
Enable X2APIC support to not crash on a Dell PowerEdge T640 with x2APIC
enabled in the firmware.
Build the EDAC drivers as modules, disable the deprecated legacy IDE
driver, and enable the Intel IOMMU driver.
Tested on *hactar*.
You can view, comment on, or merge this pull request online at:
#1714
-- Commit Summary --
* linux: Add LTS version 5.4.21
* linux-5.4.21-323: Convert EDAC drivers to modules
* linux-5.4.21-323: Select EDAC driver for Intel Skylake server
Integrated MC
* linux-5.4.21-323: Disable deprecated legacy IDE driver
* linux-5.4.21-323: Select *Basic module for enforcing kernel lockdown*
* linux-5.4.21-323: Select X2APIC
* linux-5.4.21-323: Support Intel IOMMU using DMA Remapping Devices
* linux-5.4.21-323: Select *AMD IOMMU Version 2 driver*
* nvidia_linux: Build 440.44 for Linux 5.4.21-323
-- File Changes --
A linux-5.4.21-323.bee (900)
A nvidia_linux-5.4.21-323-440.44-0.bee (60)
-- Patch Links --
https://github.molgen.mpg.de/mariux64/bee-files/pull/1714.patch
https://github.molgen.mpg.de/mariux64/bee-files/pull/1714.diff
--
Thomas Kreitler - Information Retrieval
kreitler@molgen.mpg.de
49/30/8413 1702
|
The USB FTDI Single Port Serial Driver is not needed for booting, so build it as a module instead of directly into the kernel.
Do not build the USB ZyXEL omni.net LCD Plus Driver as no users are known. If the driver is needed, it should be built as a module.
We have such a device now, so build the USB Prolific 2303 Single Port Serial Driver as a module as requested in the merge/pull request comments [1]. [1]: #1714
539d669
to
283da8c
Compare
Besides the freezing issue having nothing to do with this, MarIuX can also be used on laptops.
Done.
Unrelated to this merge/pull request. |
nope, it cannot. |
Either I dreamt it, or you did use it on a Dell laptop. |
My answer was exactly as specific as your objection. The laptops we run mariux on do not SUSPEND, so we're not talking about mariux and laptops, we're talking about mariux and suspend. And there never was and never will be suspend support on mariux. |
Some thoughts about the new kernel - Part 1 Concerning CONFIG_SUSPEND Listed are the PM options used in mariux and in a decent distro (Slackware in this case; the other major players are similar). My conclusions follow below.
On mariux it's mainly CONFIG_SUSPEND set to YES at the top level; all the other features are not considered. I doubt that this will yield a usable and stable configuration for the whole power-saving scenario. To me this is more like pushing one out of ten buttons and seeing what happens. Also, this mainly applies to laptops, in no case to our servers, and for the workstations I don't see a real reason (N.B. is there a 'Wake-On-Dist' functionality?). So, in short: CONFIG_SUSPEND -> No |
Some thoughts about the new kernel - Part 2&3
CONFIG_USB_SERIAL_PL2303, ACK :) Peter and I intend to take a look at ceph-fs in the near future; on the kernel side the necessary options are already set, so no changes are needed.
There are problems with the OpenCL support, but these issues need to be fixed in nvidia_current-440.44/nvidia-mxlinks, so the kernel module build is unaffected. |
Not really. CONFIG_SUSPEND is set because it is the default. In Dec 2017 we changed from implicitly inheriting the configuration from previous builds to explicitly enumerating the options in the bee file. The configuration in the bee file is the output of "make savedefconfig", so these are the lines where we deviate from the defaults. Our idea since then has been that ideally each line there should have some justification, documented in the commit that changed the line. We started from an inherited configuration jam, where nobody knew where the individual options came from or why they were set the way they were ("Kettenmaß"). We might not yet have a commit for each and every line, but we do have a commit for every line that has been touched since then, and I'd guess these are most of them.
The procedure to get a change, its implied changes, and the implied changes of any kernel upgrade into the bee file in the correct format is not really trivial, and maybe I should write a cookbook for it.
Anyway: Is it agreed that we shouldn't change from the Linux defaults without good reason? If so, I'd ask for an argument to override CONFIG_SUSPEND from the default "yes" to "no". We don't need an argument to not touch it.
"We think that the workstation freezes may have something to do with it, and by disabling it, we might avoid freezing workstations" would certainly be a valid argument. But is it true? Do we really think the workstation might be trying to go into suspend mode? Do we have any indication for that? Another possible argument (as an example) would be "If we don't include the SUSPEND code, we'd save so and so many GB of memory which the user could use."
However, these statements are not valid arguments for me:
And
I'd say, that the Linux default configuration has the best chance of being usable and coherent.
Right, but considering that CONFIG_SUSPEND=yes is the default, this argument goes more to the other side, doesn't it? |
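To make the "lines where we deviate from the defaults" idea concrete, here is a minimal sketch that compares two `.config`-style texts and lists the options whose values differ. It is only a textual comparison and an assumption about how one might inspect the difference; the real "make savedefconfig" works on the Kconfig dependency tree. The helper names `parse_config` and `deviations` and the sample values are hypothetical.

```python
def parse_config(text):
    """Parse .config-style text into {option: value}; '# X is not set' maps to 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def deviations(defaults, current):
    """Return {option: (default, current)} for every option whose value differs."""
    names = sorted(set(defaults) | set(current))
    return {
        name: (defaults.get(name, "n"), current.get(name, "n"))
        for name in names
        if defaults.get(name, "n") != current.get(name, "n")
    }


# Illustrative values only; they are not the real defaults or the real bee file.
defaults = parse_config("CONFIG_SUSPEND=y\n# CONFIG_X86_X2APIC is not set\n")
current = parse_config("CONFIG_SUSPEND=y\nCONFIG_X86_X2APIC=y\n")

print(deviations(defaults, current))  # {'CONFIG_X86_X2APIC': ('n', 'y')}
```

In this toy comparison CONFIG_SUSPEND does not show up, because it has the same value on both sides; only options that were actually changed appear, which is the property the bee file is meant to have.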
@donald, thank you for writing the elaborate explanation. That was my thinking too. I propose to accept this merge/pull request, as it’s already running on hactar, and kindly ask you to open an issue for disabling suspend, if somebody still thinks it would be useful for MarIuX. |
Sure, CONFIG_SUSPEND=y is the default, but if you compare the config state after several rounds of 'savedefconfig', and a plain 'make defconfig'
Then I might state, 'yes we are using a default option', but we simply ignore other defaults from the same context.
When it comes to chances, personally I wouldn't bet too much in this case, since it is not the 'Linux default-configuration'; it's a config using defaults. |
I think, this is because,
Maybe it would be more logical to base our settings in the bee file on top of the It doesn't make a difference for SUSPEND, though, which is "yes" after
Is there a difference between "the 'Linux default-configuration' " and "a config using defaults" when the defaults come from Linux and there is no other source aside from that? |
I don't see that we reach an agreement here. Your argumentation goes And yes, there is a difference. I would recheck/readjust the current stage of the evolved config (ie ours) and the current default set (kernel, distros) -- at least from time to time. |
Enable X2APIC support to not crash on a Dell PowerEdge T640 with x2APIC enabled in the firmware.
Build and add EDAC and USB serial drivers as modules, disable the deprecated legacy IDE driver, and enable the Intel IOMMU driver.
Tested on hactar.