Upgrade from 5.22.1 to 5.22.2 killed my system

grooveman
Registered Member
Posts
54
Karma
0
Hi.

About 1 week ago, I updated my system (which had been working beautifully for years under KDE Neon). Immediately after the update, my system started displaying all kinds of strangeness.

At first, I thought it was my network, because none of my network shares were accessible and the network monitor was complaining. While poking around the system to figure this out, I found it would lock up after about 2-3 minutes. During this 2-3 minute grace period, I would be subjected to all manner of weirdness in the KDE interface: staccato mouse movements, artifacts on the screens, failure to paint windows properly, etc. After some troubleshooting I found that the network was not my problem. Very mysteriously, during my troubleshooting the network stopped complaining, but instead I was getting continuous CPU stall complaints and nvidia card errors on my VTs. It was very difficult to see these things and troubleshoot them, because my grace period before lockup seemed to be getting shorter and shorter. Eventually, I removed all the nvidia elements from the system. No more stalling CPUs, but the system was still exhibiting all the same bizarre behaviors that would always end in a lockup, now within a couple of seconds of login.

During the week, I'd apply any updates that came my way, in hopes that they would fix this issue. They didn't. I reinstalled the Nvidia drivers. That didn't seem to either help or hinder the situation. I disabled compositing on the desktop, which bought a few minutes again before lockup, but all the same really weird behaviors (mentioned above) were still occurring in the interface, with the new addition that the keyboard would stop sending input (the mouse still worked, but I couldn't type anything into a field on the screen). I also lost the ability to view my VTs somewhere along the way.

Now, before people start suggesting this is hardware -- I've already ruled that out by booting to alternate USB/DVD media and by booting to the previous KDE Neon/Ubuntu kernel. Using the previous kernel, I cannot get a good resolution, because it isn't loading the nvidia driver -- but the system is stable, with no complaints of any kind. This problem seems to relate to the kernel itself, or the nvidia driver's goodness of fit with the 5.8 kernel. I've tried logging in using the wayland option, just to see if this helped. Of course, it didn't. I imagine there are some other hoops I have to jump through for that to work right... but I'm not interested in that right now.

I have plenty of free hard disk space and plenty of memory. My video card is an nvidia GTX 970, using the 460 proprietary driver. I had been using the PPA driver for a long time, but currently I'm on the mainstream Ubuntu driver as part of this troubleshooting process.

I appreciate any help.

Thanks.

My CPU:
Code:
cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 48
model name      : AMD Athlon(tm) X4 860K Quad Core Processor
stepping        : 1
microcode       : 0x6003104
cpu MHz         : 1696.041
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 16
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc cpb hw_pstate ssbd vmmcall fsgsbase bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold overflow_recov
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7385.78
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 48
model name      : AMD Athlon(tm) X4 860K Quad Core Processor
stepping        : 1
microcode       : 0x6003104
cpu MHz         : 1696.350
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 2
apicid          : 17
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc cpb hw_pstate ssbd vmmcall fsgsbase bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold overflow_recov
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7385.78
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]

processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 48
model name      : AMD Athlon(tm) X4 860K Quad Core Processor
stepping        : 1
microcode       : 0x6003104
cpu MHz         : 1985.590
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 2
apicid          : 18
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc cpb hw_pstate ssbd vmmcall fsgsbase bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold overflow_recov
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7385.78
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 48
model name      : AMD Athlon(tm) X4 860K Quad Core Processor
stepping        : 1
microcode       : 0x6003104
cpu MHz         : 1898.251
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 2
apicid          : 19
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc cpb hw_pstate ssbd vmmcall fsgsbase bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold overflow_recov
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7385.78
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]


Here is some of the latest babble from my syslog:
Code:
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1732, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1733, resource id: 37748802, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1734, resource id: 37748802, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1735, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1736, resource id: 37748804, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1737, resource id: 37748804, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1738, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1739, resource id: 37748806, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1740, resource id: 37748806, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1741, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1742, resource id: 37748808, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1743, resource id: 37748808, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1744, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1745, resource id: 37748810, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1746, resource id: 37748810, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1747, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1748, resource id: 37748812, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1749, resource id: 37748812, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1750, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1751, resource id: 37748814, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1752, resource id: 37748814, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 2 (BadValue), sequence: 1753, resource id: 0, major code: 53 (CreatePixmap), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1754, resource id: 37748816, major code: 55 (CreateGC), minor code: 0
Jul  3 14:03:36 HTPC plasmashell[1382]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 1755, resource id: 37748816, major code: 72 (PutImage), minor code: 0
Jul  3 14:03:36 HTPC rtkit-daemon[1229]: Supervising 2 threads of 2 processes of 1 users.
Jul  3 14:03:36 HTPC rtkit-daemon[1229]: Supervising 2 threads of 2 processes of 1 users.
Jul  3 14:03:37 HTPC plasmashell[1382]: org.kde.plasma.pulseaudio: No object for name "alsa_output.pci-0000_01_00.1.hdmi-stereo-extra1.monitor"
Jul  3 14:03:37 HTPC plasmashell[1382]: kf.kio.widgets: Invalid url ""
Jul  3 14:03:37 HTPC plasmashell[1382]: kf.kio.widgets: Invalid url ""
Jul  3 14:03:37 HTPC plasmashell[1382]: kf.solid.frontend.devicemanager: Couldn't get StorageAccess for """" - File doesn't exist
Jul  3 14:03:37 HTPC kdeinit5[1910]: Qt: Session management error: networkIdsList argument is NULL
Jul  3 14:03:37 HTPC rtkit-daemon[1229]: Supervising 2 threads of 2 processes of 1 users.
Jul  3 14:03:37 HTPC rtkit-daemon[1229]: Supervising 2 threads of 2 processes of 1 users.
Jul  3 14:03:45 HTPC systemd[1]: systemd-fsckd.service: Succeeded.
Jul  3 14:03:46 HTPC systemd-timesyncd[648]: Initial synchronization to time server 91.189.89.198:123 (ntp.ubuntu.com).
Jul  3 14:03:51 HTPC packagekitd[1463]: Starting pkgProblemResolver with broken count: 0
Jul  3 14:03:51 HTPC packagekitd[1463]: Starting 2 pkgProblemResolver with broken count: 0
Jul  3 14:03:51 HTPC packagekitd[1463]: Done
Jul  3 14:03:51 HTPC PackageKit: get-updates transaction /14056_cdccbbde from uid 1000 finished with success after 736ms
Jul  3 14:03:52 HTPC dbus-daemon[833]: [system] Activating via systemd: service name='org.freedesktop.locale1' unit='dbus-org.freedesktop.locale1.service' requested by ':1.81' (uid=1000 pid=1418 comm="/usr/lib/x86_64-linux-gnu/libexec/DiscoverNotifier" label="unconfined")
Jul  3 14:03:52 HTPC systemd[1]: Starting Locale Service...
Jul  3 14:03:55 HTPC kded5[1326]: kf.bluezqt: PendingCall Error: "Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken."
Jul  3 14:04:02 HTPC xdg-desktop-portal-kde[1606]: xdp-kde-background: GetAppState called: no parameters
Jul  3 14:04:02 HTPC systemd[1]: systemd-hostnamed.service: State 'stop-sigterm' timed out. Killing.
Jul  3 14:04:02 HTPC systemd[1]: systemd-hostnamed.service: Killing process 932 (systemd-hostnam) with signal SIGKILL.
Jul  3 14:04:05 HTPC kernel: [   56.165259] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [alsa-sink-HDMI :1704]
Jul  3 14:04:05 HTPC kernel: [   56.165264] Modules linked in: md4 cmac nls_utf8 cifs libarc4 fscache libdes snd_hda_codec_hdmi nvidia_uvm(OE) binfmt_misc nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio nvidia_drm(POE) snd_hda_intel snd_intel_dspcfg nvidia_modeset(POE) snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi nvidia(POE) joydev snd_seq edac_mce_amd input_leds kvm_amd ccp snd_seq_device snd_timer kvm k10temp fam15h_power efi_pstore snd drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ahci i2c_piix4 libahci xhci_pci r8169 xhci_pci_renesas realtek video
Jul  3 14:04:05 HTPC kernel: [   56.165302] CPU: 1 PID: 1704 Comm: alsa-sink-HDMI  Tainted: P           OE     5.8.0-59-generic #66~20.04.1-Ubuntu
Jul  3 14:04:05 HTPC kernel: [   56.165303] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./F2A88XM-D3H, BIOS F6 05/28/2014
Jul  3 14:04:05 HTPC kernel: [   56.165308] RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
Jul  3 14:04:05 HTPC kernel: [   56.165311] Code: ff 7f 5b 44 89 f0 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 49 89 f8 b8 00
Jul  3 14:04:05 HTPC kernel: [   56.165312] RSP: 0018:ffffb87e82ddbd38 EFLAGS: 00000246
Jul  3 14:04:05 HTPC kernel: [   56.165314] RAX: 0000000000000000 RBX: ffff97a5a2a18ca4 RCX: 0000000000000000
Jul  3 14:04:05 HTPC kernel: [   56.165315] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
Jul  3 14:04:05 HTPC kernel: [   56.165316] RBP: ffffb87e82ddbd38 R08: ffff97a5a2a18c00 R09: ffff97a5b5801b60
Jul  3 14:04:05 HTPC kernel: [   56.165317] R10: 0000000000000000 R11: ffffffffaaa6a908 R12: 0000000000000001
Jul  3 14:04:05 HTPC kernel: [   56.165317] R13: ffff97a5a2a18c28 R14: 0000000000000246 R15: ffff97a5a2a18c00
Jul  3 14:04:05 HTPC kernel: [   56.165319] FS:  00007f5e90707700(0000) GS:ffff97a5b6c80000(0000) knlGS:0000000000000000
Jul  3 14:04:05 HTPC kernel: [   56.165320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  3 14:04:05 HTPC kernel: [   56.165321] CR2: 00007fc19d8a3810 CR3: 000000022e2c2000 CR4: 00000000000406e0
Jul  3 14:04:05 HTPC kernel: [   56.165322] Call Trace:
Jul  3 14:04:05 HTPC kernel: [   56.165329]  __synchronize_hardirq+0x80/0xb0
Jul  3 14:04:05 HTPC kernel: [   56.165331]  synchronize_irq+0x39/0xb0
Jul  3 14:04:05 HTPC kernel: [   56.165334]  ? unmap_region+0xf9/0x130
Jul  3 14:04:05 HTPC kernel: [   56.165346]  snd_pcm_sync_stop+0x5c/0x60 [snd_pcm]
Jul  3 14:04:05 HTPC kernel: [   56.165351]  do_hw_free+0x1d/0x60 [snd_pcm]
Jul  3 14:04:05 HTPC kernel: [   56.165355]  snd_pcm_common_ioctl+0x763/0xf20 [snd_pcm]
Jul  3 14:04:05 HTPC kernel: [   56.165358]  ? vm_area_free+0x18/0x20
Jul  3 14:04:05 HTPC kernel: [   56.165360]  ? __do_munmap+0x346/0x540
Jul  3 14:04:05 HTPC kernel: [   56.165365]  snd_pcm_ioctl+0x27/0x40 [snd_pcm]
Jul  3 14:04:05 HTPC kernel: [   56.165368]  ksys_ioctl+0x9d/0xd0
Jul  3 14:04:05 HTPC kernel: [   56.165370]  __x64_sys_ioctl+0x1a/0x20
Jul  3 14:04:05 HTPC kernel: [   56.165373]  do_syscall_64+0x49/0xc0
Jul  3 14:04:05 HTPC kernel: [   56.165375]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul  3 14:04:05 HTPC kernel: [   56.165377] RIP: 0033:0x7f5e9593750b
Jul  3 14:04:05 HTPC kernel: [   56.165379] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
Jul  3 14:04:05 HTPC kernel: [   56.165380] RSP: 002b:00007f5e90706b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jul  3 14:04:05 HTPC kernel: [   56.165382] RAX: ffffffffffffffda RBX: 00005575c78b9ac0 RCX: 00007f5e9593750b
Jul  3 14:04:05 HTPC kernel: [   56.165382] RDX: 0000000000000000 RSI: 0000000000004112 RDI: 0000000000000016
Jul  3 14:04:05 HTPC kernel: [   56.165383] RBP: 00005575c78b79a0 R08: 0000000000000000 R09: 0000000000000010
Jul  3 14:04:05 HTPC kernel: [   56.165384] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Jul  3 14:04:05 HTPC kernel: [   56.165385] R13: 0000000000000000 R14: 00005575c77fe250 R15: 00005575c77fe250
Jul  3 14:04:17 HTPC dbus-daemon[833]: [system] Activating via systemd: service name='org.freedesktop.locale1' unit='dbus-org.freedesktop.locale1.service' requested by ':1.81' (uid=1000 pid=1418 comm="/usr/lib/x86_64-linux-gnu/libexec/DiscoverNotifier" label="unconfined")
Jul  3 14:04:18 HTPC systemd[1]: systemd-hostnamed.service: Processes still around after SIGKILL. Ignoring.
Jul  3 14:04:33 HTPC kernel: [   84.156315] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [alsa-sink-HDMI :1704]
Jul  3 14:04:33 HTPC kernel: [   84.156319] Modules linked in: md4 cmac nls_utf8 cifs libarc4 fscache libdes snd_hda_codec_hdmi nvidia_uvm(OE) binfmt_misc nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic l
grooveman
Registered Member
Posts
54
Karma
0
5.22.3 came out today.... It fixed everything.
grooveman
Registered Member
Posts
54
Karma
0
I spoke too soon. It seemed to work for a little while there... now it is locking up all the time... makes no sense to me... Here I am again on the previous kernel, and things are working fine (except for my nvidia drivers -- I'm working in 600x800 resolution). Going to the new kernel kills the stability of my system...
koffeinfriedhof
Registered Member
Posts
608
Karma
4
Hi!

To find out what is causing the issues, you could have a look at the changes at kernel.org, or first look at `journalctl -b -xep err` to get the relevant error messages.
If your system runs with nvidia drivers, be sure not to choose Wayland and stay on X11 as the display server.
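
As a rough sketch of the log check suggested above: because the machine may freeze before the log can be read, the saved syslog (or the previous boot's journal) can be summarized afterwards from the stable kernel. The helper name `summarize_lockups` is made up for illustration, and reading an earlier boot with `journalctl -b -1` assumes the journal is stored persistently on this system.

```shell
# Hypothetical helper: count kernel "soft lockup" reports per CPU and task.
# Reads log text on stdin, so it works with a saved syslog or a journalctl pipe.
summarize_lockups() {
    awk '
        /soft lockup/ {
            cpu = "CPU#?"
            if (match($0, /CPU#[0-9]+/)) cpu = substr($0, RSTART, RLENGTH)
            task = "?"
            i = index($0, "! [")          # start of the "[comm :pid]" part
            if (i > 0) {
                task = substr($0, i + 3)
                sub(/\]$/, "", task)      # drop the closing bracket
            }
            count[cpu " " task]++
        }
        END { for (k in count) printf "%d %s\n", count[k], k }
    ' | sort
}
```

Possible usage from the working kernel: `summarize_lockups < /var/log/syslog`, or `journalctl -b -1 -k --no-pager | summarize_lockups`. Each output line is a count followed by the CPU and the stuck task, which makes it easier to see whether the same task (here the HDMI audio sink) is involved every time.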
grooveman
Registered Member
Posts
54
Karma
0
koffeinfriedhof wrote: Hi!

To find out what is causing the issues, you could have a look at the changes at kernel.org, or first look at `journalctl -b -xep err` to get the relevant error messages.


This is exceedingly difficult to do, since it typically locks up within seconds now after booting. However, I did get it to fire off once, and it only kicked out the same things I have already listed in my post above: namely the stalled CPU. I tried to get it into a text file so I could reboot to the old kernel and post it exactly, but it just is not stable enough... but I assure you, it is information that is only redundant to what I posted in the logs.

The lock ups are as strange as ever... the mouse will move across the screen, but most things are not clickable... the things I can click usually fail to manifest on the screen.. sometimes I can see the window dimensions, but the window never paints... instead it retains whatever would have been behind it within those dimensions... almost as if it is a carbon-copy of what is beneath it. When this happens, the window is not moveable. The 'x' button is still there and I can click it, but nothing happens. I cannot access VTs, and I cannot reboot the system. I have to do a hard reset.

I have pulled the pc apart and reset everything physically -- and gave it a good clean-out while I was at it... even though I'm confident this isn't hardware related.

Again, booting to the old kernel works fine.

I am not using wayland, I'm sure of that.

Any assistance is appreciated. Thank you.
koffeinfriedhof
Registered Member
Posts
608
Karma
4
grooveman wrote: This is exceedingly difficult to do, since it typically locks up within seconds now after booting.

Depending on your bootloader¹, you can stop the boot before SDDM starts by booting into text mode. For systemd there are some kernel command line options like:
systemd.log_level=debug and/or systemd.unit=multi-user.target or systemd.unit=rescue.target
As I do not know grub2 very well, I'd try multi-user first: on the grub screen, select the latest kernel and press 'e'. The kernel command line containing 'quiet splash' or similar must be changed carefully. Remove quiet and splash to see more messages and insert systemd.unit=multi-user.target instead. Then press F10 to continue booting with these settings.

You should now have a fully working system (without GUI). Log in and collect the information as usual, saving it to a file²: either the whole log of the current boot or just the errors, including the latest entries. The messages above are cut off and do not show the system start, which (hopefully) contains more information. You could also save the output of `journalctl -k` (basically the journal equivalent of the dmesg command).

¹,²:
    1 → normally grub2
    2 → e.g. `journalctl -b > ~/informations`, use >> instead of > to append to a file
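
To make the grub edit described above a bit more concrete, here is a small sketch that only prints what the edited `linux` line would look like; it does not touch the bootloader. The function name `edit_cmdline` and the example kernel path and `root=` value are placeholders, not taken from the affected system.

```shell
# Hypothetical helper: show what the edited GRUB "linux" line would look like.
# It drops "quiet" and "splash" and appends the text-mode target.
edit_cmdline() (
    set -f                      # no filename globbing while splitting words
    out=""
    for word in $1; do
        case "$word" in
            quiet|splash)   ;;  # removed so boot messages stay visible
            systemd.unit=*) ;;  # replaced by the target appended below
            *) out="${out:+$out }$word" ;;
        esac
    done
    printf '%s systemd.unit=multi-user.target\n' "$out"
)
```

For example, `edit_cmdline 'linux /boot/vmlinuz root=/dev/sda1 ro quiet splash'` prints the same line without `quiet splash` and with `systemd.unit=multi-user.target` at the end. The change made in the grub editor applies to that one boot only, so a normal reboot returns to the graphical login.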
grooveman
Registered Member
Posts
54
Karma
0
Hi, yeah, that is essentially what I'm doing. But there are no problems (and hence no error output) when I don't have the gui up, so journalctl gives me nothing.
koffeinfriedhof
Registered Member
Posts
608
Karma
4
And what happens if you start an X server manually after startup?
grooveman
Registered Member
Posts
54
Karma
0
I just tried starting X manually... and it is even worse... cannot even get past the giant "K" splash screen before plasma desktop loads.

For S's and G's I installed fluxbox, to see if it would fare any better. It did not. It exhibited the same behavior.

As much as I'd like to know what is going on here, I'm thinking it is time for a reinstall...
koffeinfriedhof
Registered Member
Posts
608
Karma
4
If X and Wayland get stuck with every user (you should create a clean one to test), there could be an issue with the graphical stack. To check this, you could compare the modules loaded into the kernel with `lsmod` under the working and the failing kernel. Perhaps it is a driver issue. What graphics card(s) do you use? Are there any acpi(d) errors?
Code:
lspci -nnk | grep -A3 "\[03..\]:"

will show the graphics card and its drivers.
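
A possible way to do the `lsmod` comparison mentioned above is to save the listing once under each kernel and diff the module names. This is only a sketch: the helper names `module_names` and `modules_diff` and the file names in the usage note are invented for illustration.

```shell
# Hypothetical helpers for comparing two saved `lsmod` listings.
module_names() {
    # Print the sorted module names, skipping the "Module Size Used by" header.
    awk 'NR > 1 { print $1 }' "$1" | sort -u
}

modules_diff() (
    a=$(mktemp) || exit 1
    b=$(mktemp) || { rm -f "$a"; exit 1; }
    module_names "$1" > "$a"
    module_names "$2" > "$b"
    # comm needs sorted input; -23 keeps lines unique to the first file,
    # -13 keeps lines unique to the second.
    comm -23 "$a" "$b" | sed 's/^/only in first:  /'
    comm -13 "$a" "$b" | sed 's/^/only in second: /'
    rm -f "$a" "$b"
)
```

Usage would be something like `lsmod > ~/lsmod-old.txt` under the working kernel, `lsmod > ~/lsmod-new.txt` under the failing one (from text mode if necessary), then `modules_diff ~/lsmod-old.txt ~/lsmod-new.txt`.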
grooveman
Registered Member
Posts
54
Karma
0
I just downloaded a fresh copy of the KDE neon installation media. The exact same problems are present with the live boot of the installation media itself. If I boot with nomodeset, it works, but the resolution is terrible (as one might expect).

I downloaded a copy of Manjaro and did the same: it works perfectly, even with the proprietary drivers. It seems pretty clear to me that this is a bug, it just isn't biting very many people. I think it is a hardware + kernel version + nvidia module issue. The current Manjaro installation media is on kernel 5.10 and uses the nvidia driver 465.

I can try what you suggest, but given that it is a stock install, and that it happens even with the installation media, this reeks of a bug. I will file a bug report, then try to delve a little deeper. The problem is, I've already lost a few weeks with this, and I need this to work. I don't know how much time I can put into this... Time is just a scarce commodity these days (which is why this system is KDE neon and not Gentoo!).
grooveman
Registered Member
Posts
54
Karma
0
I cannot do this under kernel 5.8 due to the instability... but here is the output under kernel 5.4:


Code:
 lspci -nnk | grep -A3 "\[03..\]:"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd GM204 [GeForce GTX 970] [1458:3683]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
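
For comparing this between kernels, a small filter can pull just the bound driver out of the `lspci -nnk` output. This is a sketch only; the function name `gpu_driver_in_use` is hypothetical, and it assumes the usual layout in which device lines start at the left margin and detail lines are indented.

```shell
# Hypothetical helper: print the kernel driver bound to display-class devices
# (PCI class 03xx), reading `lspci -nnk` output on stdin.
gpu_driver_in_use() {
    awk '
        # A device header starts in column one; note whether it is class 03xx.
        /^[0-9a-f]/ { in_gpu = ($0 ~ /\[03[0-9a-f][0-9a-f]\]:/) }
        # Within a display device, print only the driver name.
        in_gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use: */, "")
            print
        }
    '
}
```

With the output above, `lspci -nnk | gpu_driver_in_use` would print `nvidia`. If nothing is printed, no kernel driver is bound to the card in that boot.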
koffeinfriedhof
Registered Member
Posts
608
Karma
4
Try switching to nouveau; your card seems to be mostly supported → https://nouveau.freedesktop.org/FeatureMatrix.html

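Code:
# Quote the pattern so the shell passes it to apt unexpanded
sudo apt purge 'nvidia*'
# Move an existing X config aside (this step fails harmlessly if there is none)
sudo mv /etc/X11/xorg.conf{,_old}
sudo reboot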
grooveman
Registered Member
Posts
54
Karma
0
I appreciate your input, but I do not want to switch to nouveau. I prefer the proprietary driver.
koffeinfriedhof
Registered Member
Posts
608
Karma
4
Then you can just try the dkms-driver or the default one.

