Commit 5f8fad0647 for openssl.org

commit 5f8fad06475fae024609cf09a1bb2ca8c74b44d6
Author: Milan Broz <gmazyland@gmail.com>
Date:   Wed Mar 11 22:40:27 2026 +0100

    Align Windows RCU implementation to the pthread variant

    Unlike the pthread variant, the Windows RCU implementation uses
    broadcast instead of targeted signal calls in some places, which
    wakes threads unnecessarily and wastes CPU cycles.

    retire_qp should wake up only one waiting thread to proceed, not
    all of them.  update_qp signals after increasing writers_alloced,
    so waking all threads there does not make sense either.

    The speedup is significant on lhash_test, running on many CPUs
    (on 32 cores, a speedup from 6:20 to 1:40 minutes on test hw).

    Co-Authored-By: Claude Opus 4.6 Extended <noreply@anthropic.com>

    Signed-off-by: Milan Broz <gmazyland@gmail.com>

    Reviewed-by: Saša Nedvědický <sashan@openssl.org>
    Reviewed-by: Nikola Pajkovsky <nikolap@openssl.org>
    Reviewed-by: Tomas Mraz <tomas@openssl.org>
    MergeDate: Fri Mar 13 17:25:47 2026
    (Merged from https://github.com/openssl/openssl/pull/30388)

diff --git a/crypto/threads_win.c b/crypto/threads_win.c
index 8b45363572..f96c625a3b 100644
--- a/crypto/threads_win.c
+++ b/crypto/threads_win.c
@@ -348,7 +348,7 @@ static struct rcu_qp *update_qp(CRYPTO_RCU_LOCK *lock, uint32_t *curr_id)
 #endif

     /* wake up any waiters */
-    ossl_crypto_condvar_broadcast(lock->alloc_signal);
+    ossl_crypto_condvar_signal(lock->alloc_signal);
     ossl_crypto_mutex_unlock(lock->alloc_lock);
     return &lock->qp_group[current_idx];
 }
@@ -358,7 +358,7 @@ static void retire_qp(CRYPTO_RCU_LOCK *lock,
 {
     ossl_crypto_mutex_lock(lock->alloc_lock);
     lock->writers_alloced--;
-    ossl_crypto_condvar_broadcast(lock->alloc_signal);
+    ossl_crypto_condvar_signal(lock->alloc_signal);
     ossl_crypto_mutex_unlock(lock->alloc_lock);
 }
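
The signal-instead-of-broadcast pattern that this change aligns with can be sketched with plain pthreads. This is a minimal illustration under stated assumptions, not the OpenSSL implementation: struct demo_lock, demo_update_qp, demo_retire_qp, demo_writer and demo_run are hypothetical names, and only the alloc_lock, alloc_signal and writers_alloced bookkeeping is modelled (the completed counter is added purely for the demonstration). Every waiter blocks on the same predicate and re-checks it after waking, so waking a single thread per state change is sufficient for all writers to make progress.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_MAX_THREADS 64

/* Hypothetical stand-in for the allocation fields of an RCU lock. */
struct demo_lock {
    pthread_mutex_t alloc_lock;
    pthread_cond_t alloc_signal;
    uint32_t group_count;      /* number of quiescent-point slots */
    uint32_t writers_alloced;  /* slots currently held by writers */
    uint64_t completed;        /* demo-only: finished allocate/retire cycles */
};

struct demo_arg {
    struct demo_lock *lock;
    int iterations;
};

/* Take a slot, waiting until at least two slots are unallocated. */
static void demo_update_qp(struct demo_lock *lock)
{
    pthread_mutex_lock(&lock->alloc_lock);
    while (lock->group_count - lock->writers_alloced < 2)
        pthread_cond_wait(&lock->alloc_signal, &lock->alloc_lock);
    lock->writers_alloced++;
    /* Targeted wake-up: one waiter re-checks the predicate. */
    pthread_cond_signal(&lock->alloc_signal);
    pthread_mutex_unlock(&lock->alloc_lock);
}

/* Give the slot back; exactly one slot is freed, so wake one waiter. */
static void demo_retire_qp(struct demo_lock *lock)
{
    pthread_mutex_lock(&lock->alloc_lock);
    assert(lock->writers_alloced > 0);
    lock->writers_alloced--;
    lock->completed++;
    pthread_cond_signal(&lock->alloc_signal);
    pthread_mutex_unlock(&lock->alloc_lock);
}

static void *demo_writer(void *p)
{
    struct demo_arg *arg = p;

    for (int i = 0; i < arg->iterations; i++) {
        demo_update_qp(arg->lock);
        demo_retire_qp(arg->lock);
    }
    return NULL;
}

/*
 * Run nthreads writers, each doing `iterations` allocate/retire cycles.
 * Returns the number of completed cycles (fewer than requested only if
 * some threads could not be created), or 0 for invalid arguments.
 */
static uint64_t demo_run(int nthreads, int iterations)
{
    struct demo_lock lock;
    struct demo_arg arg;
    pthread_t tids[DEMO_MAX_THREADS];
    uint64_t total;
    int started = 0;

    if (nthreads < 1 || nthreads > DEMO_MAX_THREADS || iterations < 0)
        return 0;

    pthread_mutex_init(&lock.alloc_lock, NULL);
    pthread_cond_init(&lock.alloc_signal, NULL);
    lock.group_count = 2;   /* with two slots, one writer holds a slot at a time */
    lock.writers_alloced = 0;
    lock.completed = 0;
    arg.lock = &lock;
    arg.iterations = iterations;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, demo_writer, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    assert(lock.writers_alloced == 0);
    total = lock.completed;
    pthread_cond_destroy(&lock.alloc_signal);
    pthread_mutex_destroy(&lock.alloc_lock);
    return total;
}
```

Calling demo_run(8, 1000) from a test harness exercises eight contending writers. With a broadcast in place of each signal the result would be the same, but every state change would wake all waiters only for most of them to go back to sleep, which is the overhead the commit message describes.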