Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Harald Freudenberger says: ==================== This is a complete rework of the protected key AES (PAES) implementation. The goal of this rework is to implement the 4 modes (ecb, cbc, ctr, xts) in a real asynchronous fashion: - init(), exit() and setkey() are synchronous and don't allocate any memory. - the encrypt/decrypt functions first try to do the job in a synchronous manner. If this fails, for example the protected key got invalid caused by a guest suspend/resume or guest migration action, the encrypt/decrypt is transferred to an instance of the crypto engine (see below) for asynchronous processing. These postponed requests are then handled by the crypto engine by invoking the do_one_request() callback but may of course again run into a still not converted key or the key is getting invalid. If the key is still not converted, the first thread does the conversion and updates the key status in the transformation context. The conversion is invoked via pkey API with a new flag PKEY_XFLAG_NOMEMALLOC. Note that once there is an active requests enqueued to get async processed via crypto engine, further requests also need to go via crypto engine to keep the request sequence. This patch together with the pkey/zcrypt/AP extensions to support the new PKEY_XFLAG_NOMEMMALOC should toughen the paes crypto algorithms to truly meet the requirements for in-kernel skcipher implementations and the usage patterns for the dm-crypt and dm-integrity layers. The new flag PKEY_XFLAG_NOMEMALLOC tells the PKEY layer (and subsidiary layers) that it must not allocate any memory causing IO operations. Note that the patches for this pkey/zcrypt/AP extensions are currently in the features branch but may be seen in the master branch with the next merge. There is still some confusion about the way how paes treats the key within the transformation context. The tfm context may be shared by multiple requests running en/decryption with the same key. So the tfm context is supposed to be read-only. The s390 protected key support is in fact an encrypted key with the wrapping key sitting in the firmware. On each invocation of a protected key instruction the firmware unwraps the pkey and performs the operation. Part of the protected key is a hash about the wrapping key used - so the firmware is able to detect if a protected key matches to the wrapping key or not. If there is a mismatch the cpacf operation fails with cc 1 (key invalid). Such a situation can occur for example with a kvm live guest migration to another machine where the guest simple awakens in a new environment. As the wrapping key is NOT transfered, after the reawakening all protected key cpacf operations fail with "key invalid". There exist other situations where a protected key cpacf operation may run into "key invalid" and thus the code needs to be prepared for such cpacf failures. The recovery is simple: via pkey API the source key material (in real cases this is usually a secure key bound to a HSM) needs to generate a new protected key which is the wrapped by the wrapping key of the current firmware. So the paes tfms hold the source key material to be able to re-generate the protected key at any time. A naive implementation would hold the protected key in some kind of running context (for example the request context) and only the source key would be stored in the tfm context. But the derivation of the protected key from the source key is an expensive and time consuming process often involving interaction with a crypto card. And such a naive implementation would then for every tfm in use trigger the derivation process individual. So why not store the protected key in tfm context and only the very first process hitting the "invalid key" cc runs the derivation and updates the protected key stored in the tfm. The only really important thing is that the protected key update and cloning from this value needs to be done in a atomic fashion. Please note that there are still race conditions where the protected key stored in the tfm may get updated by an (outdated) protected key value. This is not an issue and the code handles this correctly by again re-deriving the protected key. The only fact that matters, is that the protected key must always be in a state where the cpacf instructions can figure out if it is valid (the hash part of the protected key matches to the hash of the wrapping key) or invalid (and refuse the crypto operation with "invalid key"). Changelog: v1 - first version. Applied and tested on top of the mentioned pkey/zcrypt/AP changes. Selftests and multithreaded testcases executed via AP_ALG interface run successful and even instrumented code (with some sleeps to force asynch pathes) ran fine. Code is good enough for a first code review and collecting feedback. v2 - A new patch which does a slight rework of the cpacf_pcc() inline function to return the condition code. A rework of the paes implementation based on feedback from Herbert and Ingo: - the spinlock is now consequently used to protect updates and changes on the protected key and protected key state within the transformation context. - setkey() is now synchronous - the walk is now held in the request context and thus the postponing of a request to the engine and later processing can continue at exactly the same state. - the param block needed for the cpacf instructions is constructed once and held in the request context. - if a request can't get handled synchronous, it is postponed for asynch processing via an instance of the crpyto engine. With v2 comes a patch which updates the crypto engine docu in Documentation/crypto. Feel free to use it or drop it or do some rework - at least it needs some review. v2 was only posted internal to collect some feedback within IBM. v3 - Slight improvements based on feedback from Finn. v4 - With feedback from Holger and Herbert Xu. Holger gave some good hints about better readability of the code and I picked nearly all his suggestions. Herbert noted that once a request goes via engine to keep the sequence as long as there are requests enqueued the following requests should also go via engine. This is now realized via a via_engine_ctr atomic counter in the tfm context. Stress tested with lots of debug code to run through all the failure paths of the code. Looks good. v5 - Fixed two typos and 1 too long line in the commit message found by Holger. Added Acked-by and Reviewed-by. Removed patch #3 which updates the crypto engine docu - this will go separate. All prepared for picking in the s390 subsystem. ==================== Link: https://lore.kernel.org/r/20250514090955.72370-1-freude@linux.ibm.com/ Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
- Loading branch information