Commit 544ddbb637 for qemu.org

commit 544ddbb6373d61292a0e2dc269809cd6bd5edec6
Author: Kevin Wolf <kwolf@redhat.com>
Date:   Wed Mar 4 13:28:00 2026 +0100

    block: Never drop BLOCK_IO_ERROR with action=stop for rate limiting

    Commit 2155d2dd introduced rate limiting for BLOCK_IO_ERROR to emit an
    event only once a second. This makes sense for cases in which the guest
    keeps running and can submit more requests that would possibly also fail
    because there is a problem with the backend.

    However, if the error policy is configured so that the VM is stopped on
    errors, this is both unnecessary because stopping the VM means that the
    guest can't issue more requests and in fact harmful because stopping the
    VM is an important state change that management tools need to keep track
    of even if it happens more than once in a given second. If an event is
    dropped, the management tool would see a VM randomly going to paused
    state without an associated error, so it has a hard time deciding how to
    handle the situation.

    This patch disables rate limiting for action=stop by not relying on the
    event type alone any more in monitor_qapi_event_queue_no_reenter(), but
    checking action for BLOCK_IO_ERROR, too. If the error is reported to the
    guest or ignored, the rate limiting stays in place.

    Fixes: 2155d2dd7f73 ('block-backend: per-device throttling of BLOCK_IO_ERROR reports')
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Message-ID: <20260304122800.51923-1-kwolf@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>

diff --git a/monitor/monitor.c b/monitor/monitor.c
index 1273eb7260..37fa674cfe 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -367,14 +367,33 @@ monitor_qapi_event_queue_no_reenter(QAPIEvent event, QDict *qdict)
 {
     MonitorQAPIEventConf *evconf;
     MonitorQAPIEventState *evstate;
+    bool throttled;

     assert(event < QAPI_EVENT__MAX);
     evconf = &monitor_qapi_event_conf[event];
     trace_monitor_protocol_event_queue(event, qdict, evconf->rate);
+    throttled = evconf->rate;
+
+    /*
+     * Rate limit BLOCK_IO_ERROR only for action != "stop".
+     *
+     * If the VM is stopped after an I/O error, this is important information
+     * for the management tool to keep track of the state of QEMU and we can't
+     * merge any events. At the same time, stopping the VM means that the guest
+     * can't send additional requests and the number of events is already
+     * limited, so we can do without rate limiting.
+     */
+    if (event == QAPI_EVENT_BLOCK_IO_ERROR) {
+        QDict *data = qobject_to(QDict, qdict_get(qdict, "data"));
+        const char *action = qdict_get_str(data, "action");
+        if (!strcmp(action, "stop")) {
+            throttled = false;
+        }
+    }

     QEMU_LOCK_GUARD(&monitor_lock);

-    if (!evconf->rate) {
+    if (!throttled) {
         /* Unthrottled event */
         monitor_qapi_event_emit(event, qdict);
     } else {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b66bf316e2..da0b36a375 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5794,7 +5794,7 @@
 # .. note:: If action is "stop", a `STOP` event will eventually follow
 #    the `BLOCK_IO_ERROR` event.
 #
-# .. note:: This event is rate-limited.
+# .. note:: This event is rate-limited, except if action is "stop".
 #
 # Since: 0.13
 #