Commit 814e8828 for xz
commit 814e8828b57af30b529aee15132f4066ddb8fcf0
Author: Lasse Collin <lasse.collin@tukaani.org>
Date: Wed May 20 01:25:41 2026 +0300
po4a/update-po: Workaround an issue with non-ASCII chars in tables

Non-ASCII chars in tables (.TS ... .TE) result in warnings:

    man --warnings po4a/man/uk/xz.1 > /dev/null
    troff:<standard input>:464: warning: special character 'u0411\[tbl' not defined
    troff:<standard input>:1157: warning: special character 'u044\[tbl' not defined
    troff:<standard input>:1682: warning: special character 'u0434\[tbl' not defined

po4a generates groff Unicode escapes like '\u0435' followed immediately
by '[' in table contexts, causing groff to interpret the full string
'u0435[tbl' as a single special character name. Fix this by inserting
a dummy character (\&) after every Unicode escape so that the bracket
is treated as literal text instead of part of the character name.

Co-authored-by: Otto Kekäläinen <otto@debian.org>
Partially-fixes: https://github.com/tukaani-project/xz/pull/220
diff --git a/po4a/update-po b/po4a/update-po
index ef72f836..710b0ed8 100755
--- a/po4a/update-po
+++ b/po4a/update-po
@@ -84,6 +84,12 @@ po4a --force --verbose \
# This way they won't get included in distribution tarballs.
rm -f -- *.po.authors
+# Non-ASCII characters in table data can cause groff to merge UTF-8 escapes
+# with internal table markup like [tbl], creating invalid character names.
+# Insert \& (a dummy character) after non-ASCII characters in table data
+# rows to prevent this.
+perl -i -CSD -pe 's/([^\x00-\x7F])/\1\\&/g if /^\.TS/ .. /^\.TE/' man/*/*.1
+
# Add the customized POT header which contains the SPDX license
# identifier and spells out the license name instead of saying
# "the same license as the XZ Utils package".