Linux 5.4.39: BUG: unable to handle page fault for address: ffffffffffffffb0 #2062

Open
pmenzel opened this issue Apr 8, 2021 · 2 comments

@pmenzel
Collaborator

pmenzel commented Apr 8, 2021

Building a Python 3.8.9 environment on invidia (running Linux 5.4.39), with the pkg-scripts located on hopp:/amd/hopp/1/home/edv/, resulted in:

[Wed Apr  7 21:39:56 2021] BUG: unable to handle page fault for address: ffffffffffffffb0
[Wed Apr  7 21:39:56 2021] #PF: supervisor read access in kernel mode
[Wed Apr  7 21:39:56 2021] #PF: error_code(0x0000) - not-present page
[Wed Apr  7 21:39:56 2021] PGD 240c067 P4D 240c067 PUD 240e067 PMD 0
[Wed Apr  7 21:39:56 2021] Oops: 0000 [#1] SMP PTI
[Wed Apr  7 21:39:56 2021] CPU: 3 PID: 75078 Comm: kworker/u164:3 Kdump: loaded Tainted: G        W         5.4.39.mx64.334 #1
[Wed Apr  7 21:39:56 2021] Hardware name: Dell Inc. PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013
[Wed Apr  7 21:39:56 2021] Workqueue: rpciod rpc_async_schedule [sunrpc]
[Wed Apr  7 21:39:56 2021] RIP: 0010:nfs4_get_valid_delegation+0x6/0x30 [nfsv4]
[Wed Apr  7 21:39:56 2021] Code: ff ff ff eb 9b e8 fa 39 bf e0 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 f0 80 4f 48 08 c3 0f 1f 44 00 00 66 66 66 66 90 53 <48> 8b 5f b0 31 f6 48 89 df e8 ac fa ff ff 84 c0 b8 00 00 00 00 48
[Wed Apr  7 21:39:57 2021] RSP: 0018:ffffc90019c67df8 EFLAGS: 00010246
[Wed Apr  7 21:39:57 2021] RAX: ffff890543e9e3c0 RBX: ffff89553972d400 RCX: 0000000000000004
[Wed Apr  7 21:39:57 2021] RDX: 0000000000008100 RSI: 0000000000000001 RDI: 0000000000000000
[Wed Apr  7 21:39:57 2021] RBP: ffff894b30e37c00 R08: 0000646f69637072 R09: 8080808080808080
[Wed Apr  7 21:39:57 2021] R10: ffffc9001b103830 R11: fefefefefefefeff R12: 0000000000000004
[Wed Apr  7 21:39:57 2021] R13: ffff8884ebb25800 R14: ffffffffa00b7cb0 R15: 0000000000000000
[Wed Apr  7 21:39:57 2021] FS:  0000000000000000(0000) GS:ffff897f7f600000(0000) knlGS:0000000000000000
[Wed Apr  7 21:39:57 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Apr  7 21:39:57 2021] CR2: ffffffffffffffb0 CR3: 000000ff64e6e003 CR4: 00000000000206e0
[Wed Apr  7 21:39:57 2021] Call Trace:
[Wed Apr  7 21:39:57 2021]  nfs4_open_prepare+0x80/0x1d0 [nfsv4]
[Wed Apr  7 21:39:57 2021]  ? __rpc_atrun+0x20/0x20 [sunrpc]
[Wed Apr  7 21:39:57 2021]  __rpc_execute+0x8c/0x420 [sunrpc]
[Wed Apr  7 21:39:57 2021]  ? __switch_to_asm+0x34/0x70
[Wed Apr  7 21:39:57 2021]  rpc_async_schedule+0x29/0x40 [sunrpc]
[Wed Apr  7 21:39:57 2021]  process_one_work+0x1e5/0x410
[Wed Apr  7 21:39:57 2021]  worker_thread+0x2d/0x3c0
[Wed Apr  7 21:39:57 2021]  ? cancel_delayed_work+0x90/0x90
[Wed Apr  7 21:39:57 2021]  kthread+0x117/0x130
[Wed Apr  7 21:39:57 2021]  ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Apr  7 21:39:57 2021]  ret_from_fork+0x35/0x40
[Wed Apr  7 21:39:57 2021] Modules linked in: overlay af_packet msr nfsv4 nfs rpcsec_gss_krb5 ext4 mbcache jbd2 8021q garp stp mrp llc input_leds mgag200 drm_vram_helper ttm drm_kms_helper drm kvm hid_led led_class fb_sys_fops syscopyarea sysfillrect sysimgblt ixgbe irqbypass smartpqi wmi i7core_edac 3w_sas crc32c_intel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables unix ipv6 nf_defrag_ipv6 autofs4
[Wed Apr  7 21:39:57 2021] CR2: ffffffffffffffb0
[Wed Apr  7 21:39:57 2021] ---[ end trace b1a60fdb4ff7f969 ]---
[Wed Apr  7 21:39:57 2021] RIP: 0010:nfs4_get_valid_delegation+0x6/0x30 [nfsv4]
[Wed Apr  7 21:39:57 2021] Code: ff ff ff eb 9b e8 fa 39 bf e0 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 f0 80 4f 48 08 c3 0f 1f 44 00 00 66 66 66 66 90 53 <48> 8b 5f b0 31 f6 48 89 df e8 ac fa ff ff 84 c0 b8 00 00 00 00 48
[Wed Apr  7 21:39:57 2021] RSP: 0018:ffffc90019c67df8 EFLAGS: 00010246
[Wed Apr  7 21:39:57 2021] RAX: ffff890543e9e3c0 RBX: ffff89553972d400 RCX: 0000000000000004
[Wed Apr  7 21:39:57 2021] RDX: 0000000000008100 RSI: 0000000000000001 RDI: 0000000000000000
[Wed Apr  7 21:39:57 2021] RBP: ffff894b30e37c00 R08: 0000646f69637072 R09: 8080808080808080
[Wed Apr  7 21:39:57 2021] R10: ffffc9001b103830 R11: fefefefefefefeff R12: 0000000000000004
[Wed Apr  7 21:39:57 2021] R13: ffff8884ebb25800 R14: ffffffffa00b7cb0 R15: 0000000000000000
[Wed Apr  7 21:39:57 2021] FS:  0000000000000000(0000) GS:ffff897f7f600000(0000) knlGS:0000000000000000
[Wed Apr  7 21:39:57 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Apr  7 21:39:57 2021] CR2: ffffffffffffffb0 CR3: 000000ff64e6e003 CR4: 00000000000206e0
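
For anyone reading the trace: the bytes marked `<48> 8b 5f b0` in the `Code:` line are the instruction RIP pointed at. The following is a minimal sketch, assuming standard x86-64 instruction encoding, that decodes just this one instruction form and checks it against the RDI and CR2 values from the dump. The helper names (`to_signed64`, `decode_mov_disp8`) are hypothetical and only serve this illustration; nothing here comes from the kernel sources.

```python
# Sketch: relate the faulting address (CR2) to the faulting instruction bytes.
# All values are copied from the oops above. The decoder is deliberately
# narrow: it only understands `REX.W 8B /r` with an 8-bit displacement.

MASK64 = (1 << 64) - 1
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def to_signed64(value: int) -> int:
    """Interpret a 64-bit value as a two's-complement signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


def decode_mov_disp8(code: bytes):
    """Decode `mov r64, [r64 + disp8]` (bytes: 48 8b modrm disp8)."""
    rex, opcode, modrm, disp = code[:4]
    if rex != 0x48 or opcode != 0x8B or (modrm >> 6) != 0b01:
        raise ValueError("not a REX.W mov r64, [reg + disp8]")
    if (modrm & 0b111) == 0b100:
        raise ValueError("SIB addressing is not handled in this sketch")
    dest = REG64[(modrm >> 3) & 0b111]   # destination register
    base = REG64[modrm & 0b111]          # base register of the memory operand
    disp8 = disp - 256 if disp >= 128 else disp  # sign-extend the displacement
    return dest, base, disp8


cr2 = 0xFFFFFFFFFFFFFFB0              # faulting address from the oops
rdi = 0x0000000000000000              # RDI from the register dump
faulting_bytes = bytes.fromhex("488b5fb0")  # bytes at RIP: <48> 8b 5f b0

dest, base, disp8 = decode_mov_disp8(faulting_bytes)
effective = (rdi + disp8) & MASK64    # address the load would touch

print(f"CR2 as signed offset: {to_signed64(cr2):#x}")
print(f"instruction: mov {dest}, [{base}{disp8:+#x}]")
print(f"effective address matches CR2: {effective == cr2}")
```

If the check holds, the fault is consistent with a load at a small negative offset from a NULL pointer held in RDI, i.e. `nfs4_get_valid_delegation` may have been called with a NULL argument. That is an inference from the register dump only and has not been confirmed in this thread.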
@donald
Collaborator

donald commented Apr 8, 2021

hopp has zillions of memory errors logged. Probably unrelated, though.
Can this be reproduced (does it happen every time)?
Anyway, no need to spend effort on an old kernel.
