Check trust #130

donald · 2020-07-07T12:08:55Z

Show an alert on the lightdm greeter (login screen) and the getty prompt if the system has lost trust (was removed from the "amd" group).

Requires mariux64/bee-files#1845
Addresses mariux64/bee-files#1842

Use sudo systemctl enable getty-checktrust after installation.

pmenzel

Looks good. Using clusterd sounds like overkill though.

Wouldn’t trying to access a directory like /scratch/tmp be enough?

pmenzel · 2020-07-07T13:35:36Z

lightdm/show-trust-warning

@@ -0,0 +1,27 @@
+#! /usr/bin/bash
+
+(while true; do xdotool search --sync --name bla windowraise; sleep 1; done) &


Could you please extend the commit message (or add a comment here) to explain what xdotool is needed for?

The greeter and potentially the dialog box are two windows on an X11 screen without a window manager. They run in parallel and the order in which they realize their windows is random. So the stacking order is random, too. Typically the greeter needs a longer time to realize its window, which is fullscreen, and would completely cover the error dialog.

xdotool waits until the dialog window is realized and then raises it above the login window. Because the dialog is small and centered, both windows are then visible.

Under normal circumstances this would not need to be a loop, because only the first dialog is affected. If the user clicks "OK", the conditions are reevaluated and typically another dialog will be raised. But this time it appears after the greeter window and is thereby on top by itself. Without a window manager, users can't change the layout. But who knows, people are creative. Perhaps a user is asleep with their head on the "Return" key and is able to dismiss more than one error dialog before the greeter has finished its setup. Or maybe when you enter a username with more than 127 funny Unicode characters, the login dialog segfaults but has an internal signal handler to restart from scratch....

donald · 2020-07-07T14:16:41Z

Wouldn’t trying to access a directory like /scratch/tmp be enough?

There are a million other reasons why access to /scratch/tmp might fail. Maybe "moep" is down or has problems with dead processors, maybe automount has not started yet or someone misedited the map, maybe the network cable is not plugged in, ....

Using clusterd sounds like overkill though.

In what regard? I'm sure an automount of a remote NFS filesystem consumes a lot more resources on client and server and network than a single short reply over TCP. netcat -w 1 $host 236 </dev/null : 8 packets. ls -ld /scratch/tmp : 105 packets plus 19 to dismount after the timeout. :-)

donald · 2020-07-08T07:21:25Z

"consumes a lot more ressources" "and server"

I take that back. Although NFS has more complexity, it is handled in-kernel. clusterd needs a fork and several system calls.

Another problem that came to my mind with the test-mounting alternative is that the timeouts are rather long, undefined and unlimited.

donald · 2020-07-08T07:27:11Z

Perhaps we should retry forever until we get a verdict (e.g. when the network cable is not plugged in or all systems and network equipment are restarting after a power drop).
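The retry idea could look roughly like the sketch below. It assumes a command that prints a verdict or nothing at all, which is how the checktrust script in this PR is described. The function name wait_for_verdict and the RETRY_DELAY variable are made up for illustration and are not taken from the PR.

```bash
#!/usr/bin/bash

# Run the given command until it prints a verdict, then print that verdict.
# Empty output means "unknown" (e.g. network not up yet), so we try again.
# RETRY_DELAY (seconds between attempts) is a hypothetical knob.
wait_for_verdict() {
    local verdict
    local delay=${RETRY_DELAY:-5}
    while true; do
        # Ignore the exit status; only the output carries the verdict.
        verdict=$("$@") || true
        if [ -n "$verdict" ]; then
            echo "$verdict"
            return 0
        fi
        sleep "$delay"
    done
}
```

A caller would use it as wait_for_verdict some-check-command and then act on the printed verdict. The loop has no upper bound on purpose, matching the "retry forever" suggestion.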

donald · 2020-07-09T11:06:10Z

I'll clean this up a bit.

checktrust/checktrust

pmenzel · 2020-07-09T11:11:20Z

So, /node/issue.d is for storing issues? What other issues could there be?

Add a very simple tcp service on port 236 to clusterd which can be used by other hosts to query, if they are still trusted. clusterd replies with either "I trust you\n" or "I don't trust you\n" depending on whether the connecting host has the amd hostconfig flag or not. After sending the message, clusterd will hang up.
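From the client side, the protocol described above might be exercised as in this sketch. The two reply strings and the netcat invocation are taken from this thread; the function names classify_reply and query_trust are hypothetical, and this is not the code from the PR.

```bash
#!/usr/bin/bash

# Map one reply line from the trust service to a verdict.
# Prints "trusted", "not trusted", or nothing for anything else.
classify_reply() {
    case "$1" in
        "I trust you")       echo "trusted" ;;
        "I don't trust you") echo "not trusted" ;;
    esac
}

# Query a single host on port 236 and print the verdict (if any).
# Command substitution strips the trailing newline of the reply.
query_trust() {
    local reply
    reply=$(netcat -w 1 "$1" 236 </dev/null 2>/dev/null) || true
    classify_reply "$reply"
}
```

Anything that is not one of the two known replies (timeout, refused connection, garbage) yields no output, which leaves room for the "unknown" state.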

Add a script to determine whether the system has lost the trust of other systems. Query a few remote systems which are supposed to be online most of the time. Note, that this script has a tristate result (trusted, not trusted, unknown) so we don't communicate the result via exit status, but output "trusted", "not trusted" or nothing.
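A minimal sketch of the tristate idea, assuming that the first host giving a clear answer decides the verdict (the PR text does not say how several answers are combined). The helper name ask_host is hypothetical, and the hosts are passed as arguments because the real host list is not shown here.

```bash
#!/usr/bin/bash

# Print the raw reply of one host, or nothing if it cannot be reached.
ask_host() {
    netcat -w 1 "$1" 236 </dev/null 2>/dev/null || true
}

# Ask the given hosts in order and stop at the first clear answer.
# Output: "trusted", "not trusted", or nothing when nobody answered.
# The exit status is always 0; the verdict is carried by the output only.
checktrust() {
    local host reply
    for host in "$@"; do
        reply=$(ask_host "$host")
        case "$reply" in
            "I trust you")       echo "trusted";     return 0 ;;
            "I don't trust you") echo "not trusted"; return 0 ;;
        esac
    done
    return 0
}
```

Carrying the result in the output rather than the exit status keeps the three states distinct: a caller compares the output against the two strings and treats an empty string as "no verdict yet".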

Install three new files into the system:

- /etc/xdg/lightdm/lightdm.conf.d/50-use-wrapper.conf
- /usr/libexec/lightdm-greeter-wrapper
- /usr/libexec/lightdm-show-trust-warning

The first file adds a configuration option to lightdm to invoke the greeter via a wrapper. The second file is the wrapper script, which forks off the third script before exec-ing into the greeter. The third script uses /usr/sbin/trustcheck to find out whether we lost trust of the other nodes. If it gets a negative verdict, it shows a dialog on top of the login screen to alert the user about the condition. If it doesn't get a verdict, it keeps asking (e.g. when the network is not plugged in). xdotool is used to raise the dialog above the (full screen) login window. This has to be done in a loop, because we don't know how long the login window needs to appear and pop up in front of the dialog.
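The "fork, then exec" step of the wrapper can be sketched as follows. This assumes that lightdm passes the greeter command line to the wrapper as arguments; the function name run_greeter and the WARNING_SCRIPT variable are hypothetical, and the actual wrapper from the PR is not reproduced here.

```bash
#!/usr/bin/bash

# Path of the warning script as named in the commit message above.
# Overridable through the environment for testing (an assumption of this sketch).
WARNING_SCRIPT=${WARNING_SCRIPT:-/usr/libexec/lightdm-show-trust-warning}

# Start the warning script in the background, then replace this shell
# with the greeter command line that was passed in as arguments.
run_greeter() {
    "$WARNING_SCRIPT" &
    exec "$@"
}
```

A wrapper script would end with run_greeter "$@". Because of the exec, the greeter keeps the wrapper's process ID, so lightdm supervises the greeter itself while the warning script runs alongside it.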

Create a service "checktrust" which is run before getty is started. If this service detects that the system has lost trust, a warning message is dropped into /node/issue.d/notrust.issue. Create a symlink for agetty in /etc/issue.d to the (possibly nonexistent) file in the /node path. agetty shows this message before the login prompt.

checktrust-for-getty: Use checktrust command
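The issue-file handling of such a service might look like this sketch. The function name update_issue_file and the warning text are placeholders, and removing a stale file when trust is not lost is an assumption that the commit message does not spell out.

```bash
#!/usr/bin/bash

# Create or remove the issue file depending on the verdict.
# $1: verdict ("trusted", "not trusted" or empty), $2: issue file path.
update_issue_file() {
    local verdict=$1 issue_file=$2
    if [ "$verdict" = "not trusted" ]; then
        mkdir -p "$(dirname "$issue_file")"
        # Placeholder text; the real message is not quoted in this PR.
        echo "WARNING: this system is no longer trusted by other hosts." >"$issue_file"
    else
        # Assumption: a stale warning is removed when trust is not lost.
        rm -f "$issue_file"
    fi
}
```

The service would call it with the verdict it obtained and /node/issue.d/notrust.issue as the path.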

donald · 2020-07-09T11:54:52Z

Jul 09 13:39:45 sigusr2.molgen.mpg.de systemd[1]: Starting Check Mariux64 trust for getty...
Jul 09 13:39:45 sigusr2.molgen.mpg.de getty-checktrust[327]: afk: forward host lookup failed: Host name lookup failure : Resource temporarily unavailable
Jul 09 13:39:51 sigusr2.molgen.mpg.de systemd[1]: Started Check Mariux64 trust for getty.

This might be the same thing as what is happening to exportfs (mariux64/bee-files#1841).
It's reproducible: the error goes away when 127.0.0.1 is removed from /etc/resolv.conf; otherwise it happens every time. After=unbound.service doesn't help.

It also doesn't happen when afk is put into /etc/hosts. This is only relevant during testing now, because currently only afk has the modified clusterd. When this is merged, every system will have the modified clusterd and wtf already is in /etc/hosts (for whatever reason).

donald · 2020-07-09T12:26:29Z

So, /node/issue.d is for storing issues? What other issues could there be?

We invented "/node" for things we need in the filesystem but which differ from node to node. Now we also have "/etc/local" for static configuration, so "/node" is left for things which are dynamically generated. These things could now be in /var/run as well.

"/etc/issue.d" is from agetty and "/node/issue.d" augments it. The exact semantic ("what other issues could be there") is defined by agetty. Because we want one of the files to be dynamic (computetd during boot or later, not disted) we dist the symlink to "/node/" in "/etc".

pmenzel reviewed Jul 7, 2020

donald added Not ready for merge and removed Not ready for merge labels Jul 8, 2020

donald force-pushed the check-trust branch 5 times, most recently from a6e3f4b to 89efc53 Compare July 9, 2020 10:58

pmenzel reviewed Jul 9, 2020

donald added 2 commits July 9, 2020 13:28

install.sh: Add function to install a symlink

0cef711

donald force-pushed the check-trust branch from 89efc53 to d22044d Compare July 9, 2020 11:33

donald added 3 commits July 9, 2020 13:35

donald force-pushed the check-trust branch from d22044d to a018d40 Compare July 9, 2020 11:35

donald removed the Not ready for merge label Jul 10, 2020

donald merged commit c96ad4b into master Jul 10, 2020

donald mentioned this pull request Jul 10, 2020

mariux64: A workstation should not start up unusuable, because it lost trust after downtime mariux64/bee-files#1842

Closed
