Commit 814e8828 for xz

commit 814e8828b57af30b529aee15132f4066ddb8fcf0
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   Wed May 20 01:25:41 2026 +0300

    po4a/update-po: Work around an issue with non-ASCII chars in tables

    Non-ASCII chars in tables (.TS ... .TE) result in warnings:

        man --warnings po4a/man/uk/xz.1 > /dev/null
        troff:<standard input>:464: warning: special character 'u0411\[tbl' not defined
        troff:<standard input>:1157: warning: special character 'u044\[tbl' not defined
        troff:<standard input>:1682: warning: special character 'u0434\[tbl' not defined

    In table contexts, the groff Unicode escape for a non-ASCII character
    (for example, the one for U+0435) can end up immediately followed by
    internal table markup, so groff reads a string like 'u0435\[tbl' as
    a single special character name. Work around this by inserting
    a dummy character (\&) after every non-ASCII character between
    .TS and .TE so that the markup is no longer parsed as part of
    the character name.

    Co-authored-by: Otto Kekäläinen <otto@debian.org>
    Partially-fixes: https://github.com/tukaani-project/xz/pull/220

diff --git a/po4a/update-po b/po4a/update-po
index ef72f836..710b0ed8 100755
--- a/po4a/update-po
+++ b/po4a/update-po
@@ -84,6 +84,12 @@ po4a --force --verbose \
 # This way they won't get included in distribution tarballs.
 rm -f -- *.po.authors

+# Non-ASCII characters in tables can cause groff to merge Unicode escapes
+# with internal table markup like [tbl], creating invalid character names.
+# Insert \& (a dummy character) after every non-ASCII character between
+# .TS and .TE to prevent this.
+perl -i -CSD -pe 's/([^\x00-\x7F])/\1\\&/g if /^\.TS/ .. /^\.TE/' man/*/*.1
+
 # Add the customized POT header which contains the SPDX license
 # identifier and spells out the license name instead of saying
 # "the same license as the XZ Utils package".