Kernel: Dmesg overflow at startup #1849
Comments
Are you referring to the problems on sang and mouches?
What size in KB will 19 be? The first logs were missing on sang even with |
Google 'CONFIG_LOG_BUF_SHIFT' ? |
But you already knew the value, so why let all other team members search to find out it’s 2^19 = 512 KiB? Why 512 KiB, and not 2 MiB? As it’s run-time configurable, the Linux configuration shouldn’t be changed in my opinion, and our GRUB configuration should be adapted to use |
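For readers who do not want to do the shift arithmetic by hand, here is a minimal sketch of the conversion. It relies only on the fact that the kernel log buffer size is 2 raised to the CONFIG_LOG_BUF_SHIFT value, in bytes; the function name `log_buf_bytes` is made up for this illustration.

```python
def log_buf_bytes(shift: int) -> int:
    """Return the buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT value."""
    return 1 << shift


if __name__ == "__main__":
    # 17 is the current value, 19 the proposed one, 21 the 2 MiB alternative
    # mentioned in this thread.
    for shift in (17, 19, 21):
        size = log_buf_bytes(shift)
        print(f"CONFIG_LOG_BUF_SHIFT={shift}: {size} bytes = {size // 1024} KiB")
```

So shift 17 corresponds to 128 KiB, 19 to 512 KiB, and 21 to 2 MiB.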
@thomas, is there a machine where log messages are lost with default settings but not with CONFIG_LOG_BUF_SHIFT=19 ? Is there a strong argument to put it either into the kernel config or into grub.cfg? I can only find very minor arguments. |
At least mouches dmesg started once around second 2.0 after a regular boot, so yes, messages were lost. To me this means the buffer has become too small in the meantime, even if it's not reproducible (and like it or not, look at other distros). Also, we have often had the situation where the buffer was stuffed with output from a 'spammy' service/driver; a larger buffer would offer some more chances to catch the cause. I strongly opt against further cramming the kernel command line (s.b.). A manually chosen value is also stupid, since further mechanics within the kernel adjust the buffer size; for example, each core adds a certain amount of buffer space. Using the bitshifted-value * N.B. in an ideal world the command line would boil down to 'root=801' -- ok, not everything is sda1 :) |
I've put that into |
As written, it has nothing to do with this. Currently, the buffer is at least 128 KB big, and currently on mouches the actual messages use up around 70 KB.
Debian uses:
After a long run-time we are going to run into that situation. So, why choose 512 KB as the base?
What is your solution for older kernels, which were built with the old limit?
(Please try not to use words like stupid in the future, as, especially in written context, they might be insulting.) At 4 KB for each CPU, we’d need 1500 CPUs to reach the 2 MB limit. I am not against increasing the buffer size, but the issue title is about the overflow at start-up, and that is not solved by the proposed solution. So that we don’t talk past each other, I’d welcome it if the purpose of this issue were clarified. |
Again, booting with CONFIG_LOG_BUF_SHIFT=17 lets the remaining dmesg output start at second 4+, booting with CONFIG_LOG_BUF_SHIFT=19 lets one see the whole startup. Recently used by others: As for the 'older kernels', hopefully we don't need to boot new machines with an ancient kernel, so the problem will grow out with time. |
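The symptom described here (dmesg output starting at second 4+ instead of near 0) can be checked mechanically. The following is only a sketch: it assumes dmesg lines carry the usual `[seconds.microseconds]` prefix, and the helper names, the 1-second threshold, and the sample lines are all made up for illustration, not taken from sang or mouches.

```python
import re
from typing import Iterable, Optional

# Matches the leading "[    4.212345]" timestamp of a dmesg line.
_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def first_timestamp(lines: Iterable[str]) -> Optional[float]:
    """Return the timestamp of the first line that has one, else None."""
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match:
            return float(match.group(1))
    return None


def early_messages_lost(lines: Iterable[str], threshold_s: float = 1.0) -> bool:
    """Guess whether the ring buffer wrapped: the oldest line is later than threshold_s."""
    ts = first_timestamp(lines)
    return ts is not None and ts > threshold_s


if __name__ == "__main__":
    # Hypothetical sample lines, not real output from any machine in this thread.
    sample = [
        "[    4.212345] pci 0000:00:1f.0: example message",
        "[    4.212400] pci 0000:00:1f.1: example message",
    ]
    print(first_timestamp(sample), early_messages_lost(sample))
```

On a real machine, the lines could come from the output of `dmesg` captured right after boot; the threshold is a judgment call and would need tuning.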
It does. I've tested it:
|
I looked more into it, and the problem is that these two-socket Intel Skylake-E systems have over 280 PCI devices, compared to 50 on other 16 CPU systems, resulting in a lot more messages.
That |
I still think using the command line parameter is the better solution, as it will improve all systems. Changing the build configuration only fixes the Linux kernels built from now on. |
I would agree to both solutions but didn't want to wait until there is a consensus (which will never happen), because I'd like to switch the default kernel to latest and greatest asap and @thomas asked for the proposed change to be in the default kernel. I've included the config in #1855, so we can switch the default kernel without delay because of this minor issue. The kernel option doesn't hurt much and an additional command line option can still be proposed and discussed. |
Startup messages are lost when booting a new server. This is caused by a kernel ring buffer that is too small.
Current CONFIG_LOG_BUF_SHIFT=17 should be changed to CONFIG_LOG_BUF_SHIFT=19.