Commit 5f8fad0647 for openssl.org
commit 5f8fad06475fae024609cf09a1bb2ca8c74b44d6
Author: Milan Broz <gmazyland@gmail.com>
Date: Wed Mar 11 22:40:27 2026 +0100
Align Windows RCU implementation to the pthread variant
Unlike the pthread variant, the Windows RCU implementation uses
broadcast instead of targeted signal calls in some places,
unnecessarily increasing the number of CPU cycles used.
retire_qp should wake up only one thread to proceed, not all of
them. update_qp signals after increasing writers_alloced, so
waking all threads there does not make sense either.
The speedup is significant on lhash_test when running on many CPUs
(on 32 cores, run time drops from 6:20 to 1:40 minutes on the test hardware).
Co-Authored-By: Claude Opus 4.6 Extended <noreply@anthropic.com>
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Nikola Pajkovsky <nikolap@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
MergeDate: Fri Mar 13 17:25:47 2026
(Merged from https://github.com/openssl/openssl/pull/30388)
diff --git a/crypto/threads_win.c b/crypto/threads_win.c
index 8b45363572..f96c625a3b 100644
--- a/crypto/threads_win.c
+++ b/crypto/threads_win.c
@@ -348,7 +348,7 @@ static struct rcu_qp *update_qp(CRYPTO_RCU_LOCK *lock, uint32_t *curr_id)
#endif
/* wake up any waiters */
- ossl_crypto_condvar_broadcast(lock->alloc_signal);
+ ossl_crypto_condvar_signal(lock->alloc_signal);
ossl_crypto_mutex_unlock(lock->alloc_lock);
return &lock->qp_group[current_idx];
}
@@ -358,7 +358,7 @@ static void retire_qp(CRYPTO_RCU_LOCK *lock,
{
ossl_crypto_mutex_lock(lock->alloc_lock);
lock->writers_alloced--;
- ossl_crypto_condvar_broadcast(lock->alloc_signal);
+ ossl_crypto_condvar_signal(lock->alloc_signal);
ossl_crypto_mutex_unlock(lock->alloc_lock);
}
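
The reasoning in the commit message can be illustrated with a minimal pthread sketch of the same allocation pattern. This is not the OpenSSL code: struct demo_lock, demo_alloc, demo_retire and demo_run are hypothetical names, and the wait condition (at least two unallocated slots) is an assumption about how the writers_alloced counter is used. The point it shows is that every waiter rechecks the same predicate under the mutex, so waking a single waiter per state change is enough for progress.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the allocation state kept in an RCU lock. */
struct demo_lock {
    pthread_mutex_t alloc_lock;
    pthread_cond_t alloc_signal;
    uint32_t group_count;     /* total number of slots */
    uint32_t writers_alloced; /* slots currently held by writers */
    uint32_t max_seen;        /* highest writers_alloced observed */
};

/* Take a slot, waiting until at least two slots are unallocated. */
static void demo_alloc(struct demo_lock *l)
{
    pthread_mutex_lock(&l->alloc_lock);
    while (l->group_count - l->writers_alloced < 2)
        pthread_cond_wait(&l->alloc_signal, &l->alloc_lock);
    l->writers_alloced++;
    if (l->writers_alloced > l->max_seen)
        l->max_seen = l->writers_alloced;
    /* Wake one waiter; it rechecks the predicate under the mutex. */
    pthread_cond_signal(&l->alloc_signal);
    pthread_mutex_unlock(&l->alloc_lock);
}

/* Give a slot back. One slot is freed, so one waiter is woken. */
static void demo_retire(struct demo_lock *l)
{
    pthread_mutex_lock(&l->alloc_lock);
    assert(l->writers_alloced > 0);
    l->writers_alloced--;
    pthread_cond_signal(&l->alloc_signal);
    pthread_mutex_unlock(&l->alloc_lock);
}

struct demo_arg {
    struct demo_lock *lock;
    int iters;
};

static void *demo_writer(void *p)
{
    struct demo_arg *a = p;

    for (int i = 0; i < a->iters; i++) {
        demo_alloc(a->lock);
        demo_retire(a->lock);
    }
    return NULL;
}

/*
 * Run up to 64 writer threads against one lock.
 * Returns the largest number of concurrently allocated slots,
 * or 0 if the arguments are out of range.
 */
static uint32_t demo_run(int nthreads, int iters, uint32_t group_count)
{
    struct demo_lock l = { .group_count = group_count };
    struct demo_arg arg = { &l, iters };
    pthread_t tid[64];
    int started = 0;

    if (nthreads < 1 || nthreads > 64 || group_count < 2)
        return 0;
    pthread_mutex_init(&l.alloc_lock, NULL);
    pthread_cond_init(&l.alloc_signal, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[started], NULL, demo_writer, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    assert(l.writers_alloced == 0); /* every slot was returned */
    pthread_cond_destroy(&l.alloc_signal);
    pthread_mutex_destroy(&l.alloc_lock);
    return l.max_seen;
}
```

Calling demo_run(8, 2000, 3) from a test program should return without hanging, with the result never exceeding group_count - 1. Replacing both pthread_cond_signal calls with pthread_cond_broadcast would remain correct, but every state change would then wake all waiters, most of which would immediately go back to sleep.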