Ticket #47321

\leavevmode together with \everypar{\strut} causes \strutbox to have zero dimensions

Open Date: 2023-02-07 06:14 Last Update: 2023-04-09 07:56

Reporter:
Owner: (None)
Type:
Status: Closed
Component: (None)
MileStone: (None)
Priority: 5 - Medium
Severity: 5 - Medium
Resolution: None
File: None

Details

  • The original issue occurred when building the Japanese version of the Python docs, which uses the LaTeX output of Sphinx.
  • It was reported at <https://groups.google.com/g/sphinx-users/c/9sKyElODU5I>
  • There the problem was reduced to a small example with titlesec, an unnumbered section and \texttt.
  • The present file reduces it even further and includes an analysis of the cause of the problem.

Regards, Jean-François B. (jfbu)

Details:

  1. \leavevmode expands to \unhbox \voidb@x.
  2. \unhbox is redefined in luatexja-core.sty, and so is \unhcopy (the original \unhbox primitive is preserved as \ltj@@orig@unhbox).
  3. The modified \unhbox stores the box number in the temporary count register \ltj@tempcnta, and the modified \unhcopy uses this same register.
  4. When the \everypar tokens execute an \unhcopy, \ltj@tempcnta is overwritten before the pending \unhbox primitive has read it, so the \unhbox ends up emptying some other box as a side effect. With \strut that box is \strutbox.
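
The four steps above can be imitated without LuaTeX-ja. The following is only a sketch under assumptions: the names \my@tempcnt, \my@orig@unhbox, \my@orig@unhcopy, \my@unhbox and \my@unhcopy are hypothetical and are not LuaTeX-ja code, and the wrappers are stripped-down imitations of the real ones (no Lua calls, no \globaldefs handling). If the analysis is right, the second printed height should be 0.0pt.

```latex
\documentclass{article}
\makeatletter
% Hypothetical scratch register shared by both wrappers (stands in for \ltj@tempcnta).
\newcount\my@tempcnt
% Keep the primitives under new names before redefining them.
\let\my@orig@unhbox\unhbox
\let\my@orig@unhcopy\unhcopy
% Each wrapper first stores the box number, then calls the primitive on the register.
\protected\def\unhbox{\afterassignment\my@unhbox\my@tempcnt}
\protected\def\my@unhbox{\my@orig@unhbox\my@tempcnt}
\protected\def\unhcopy{\afterassignment\my@unhcopy\my@tempcnt}
\protected\def\my@unhcopy{\my@orig@unhcopy\my@tempcnt}
\makeatother
\begin{document}
Before: \the\ht\strutbox\par
\everypar{\strut}
% In vertical mode the \unhbox primitive is put back, the paragraph starts,
% \everypar runs \strut (hence the \unhcopy wrapper, which overwrites the register),
% and only then does the primitive read \my@tempcnt.
\leavevmode After: \the\ht\strutbox
\end{document}
```

The point of the sketch is the ordering: the register is assigned before the paragraph starts, but it is read only after \everypar has run.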

Additional remarks:

  • I wondered whether there was an interaction with the "para" hooks added to LaTeX about 2 years ago, but then I tried a final TeXLive2019, which did not yet have the "para" hooks, and the problem was there too (and much easier to analyse via \tracingmacros1). The log trace with my TL2019 is given below. We can see that \ltj@tempcnta ends up being overwritten when the \strut expands to an \unhcopy.
  • I wonder if there is a problem with \ltj@reset@globaldefs being used without \ltj@restore@globaldefs, but I am not knowledgeable in lualatex.

  % ****************************** log trace with TeXLive 2019
  % ****************************** (without the "para" hooks)
  % ****************************** \unhbox/\unhcopy slightly different with TL2022
  \leavevmode ->\unhbox \voidb@x
  \unhbox ->\ltj@reset@globaldefs \afterassignment \ltj@@unhbox \ltj@tempcnta
  \ltj@reset@globaldefs ->\luafunction \ltj@reset@globaldefs@inner
  \ltj@@unhbox ->\directlua {luatexja.direction.unbox_check_dir()}\ltj@@orig@unhb
  ox \ltj@tempcnta
  \eh@prepar ->
  \strut ->\protect \strut
  \strut ->\relax \ifnum \ltjgetparameter {direction}=1 \ifmmode \copy \dstrutbo
  x \else \unhcopy \dstrutbox \fi \else \ifnum \ltjgetparameter {direction}=4 \if
  mmode \copy \ystrutbox \else \unhcopy \ystrutbox \fi \else \ifmmode \copy \tstr
  utbox \else \unhcopy \tstrutbox \fi \fi \fi
  \ltjgetparameter #1->\directlua {luatexja.base.start_time_measure('get_par')}\i
  fcsname ltj@@array@param/#1\endcsname \expandafter \ltx@firstoftwo \else \expan
  dafter \ltx@secondoftwo \fi {\ltj@@getparam@two {#1}}{\ltj@@getparam@one {#1}}
  #1<-direction
  \ltx@secondoftwo #1#2->#2
  #1<-\ltj@@getparam@two {direction}
  #2<-\ltj@@getparam@one {direction}
  \ltj@@getparam@one #1->\directlua {luatexja.ext_get_parameter_unary('#1')}
  #1<-direction
  \ltjgetparameter #1->\directlua {luatexja.base.start_time_measure('get_par')}\i
  fcsname ltj@@array@param/#1\endcsname \expandafter \ltx@firstoftwo \else \expan
  dafter \ltx@secondoftwo \fi {\ltj@@getparam@two {#1}}{\ltj@@getparam@one {#1}}
  #1<-direction
  \ltx@secondoftwo #1#2->#2
  #1<-\ltj@@getparam@two {direction}
  #2<-\ltj@@getparam@one {direction}
  \ltj@@getparam@one #1->\directlua {luatexja.ext_get_parameter_unary('#1')}
  #1<-direction
  \unhcopy ->\ltj@reset@globaldefs \afterassignment \ltj@@unhcopy \ltj@tempcntb
  \ltj@reset@globaldefs ->\luafunction \ltj@reset@globaldefs@inner
  \ltj@@unhcopy ->\directlua {luatexja.direction.unbox_check_dir(true)}\ltj@@orig
  @unhcopy \ltj@tempcntb \directlua {luatexja.direction.uncopy_restore_whatsit()}
  \eh@postpar ->

File demonstrating the issue

  \documentclass{ltjsbook}
  % TO BE COMPILED WITH lualatex OF A TEXLIVE 2022 INSTALLATION
  % February 6, 2023
  % partial analysis by jfbu:
  % Bug: the \strut in the \everypar changes the value of \ltj@tempcnta
  % that \unhbox \ltj@tempcnta relies on, because the register is
  % overwritten by a usage of \unhcopy. Then the box which is unhbox'ed
  % is not \voidb@x but ends up being \strutbox!
  % This bug was observed under these conditions:
  % - usage of titlesec
  % - for either \section* (i.e. unnumbered) or say \subsubsection
  %   (with secnumdepth=2)
  % - with a heading starting with \texttt{...} (not \ttfamily)
  % Indeed titlesec puts a \strut in \everypar, and if the title
  % is unnumbered the first thing will be a \leavevmode inserted
  % by \texttt. A numbered entry uses \noindent I think and does
  % not trigger this bug. A \ttfamily does no \leavevmode and does
  % not trigger the bug.
  % Build failure arises with Sphinx when a table contains
  % a multirow merged cell, because Sphinx wants to divide by the
  % height of the \strutbox, so this causes a division by zero
  % which is reported by TeX as an arithmetic overflow.
  % Original issue was reported at
  % <https://groups.google.com/g/sphinx-users/c/9sKyElODU5I>
  \begin{document}
  START OF TEST
  \the\ht\strutbox
  \everypar{\strut}
  % \tracingmacros1
  \leavevmode TEST
  % \tracingmacros0
  %\showthe\ht\strutbox
  \the\ht\strutbox% HAS BECOME ZERO PT!
  % To confirm the analysis I modify \unhcopy
  \makeatletter
  % this is syntax for a TeXLive 2022, up to date as of Feb 6, 2023
  \protected\def\ltj@@unhcopy{%
  \ltj@reset@globaldefs
  \afterassignment\ltj@@unhcopy@
  \ltj@tempcntb %<<<<<<<<<<<<<<<<<<<<<< cntb in place of cnta
  }
  \protected\def\ltj@@unhcopy@{%
  \directlua{luatexja.direction.unbox_check_dir(true)}%
  \ltj@@orig@unhcopy\ltj@tempcntb %<<<<<<<<<<<<<<<<< cntb in place of cnta
  \directlua{luatexja.direction.uncopy_restore_whatsit()}%
  }
  \let\unhcopy\ltj@@unhcopy
  RESET STRUTBOX
  \normalsize % triggers reset of \strutbox
  \the\ht\strutbox
  \everypar{\strut}
  \leavevmode TEST WITH PATCH
  %\showthe\ht\strutbox
  \the\ht\strutbox% IS NOW UNMODIFIED
  \end{document}

Ticket History

2023-02-07 06:14 Updated by: jfbu
  • New Ticket "\leavevmode together with \everypar{\strut} causes \strutbox to have zero dimensions" created
2023-02-07 06:34 Updated by: jfbu
Comment

By accident I posted the TeX trace obtained with a patched version of \unhcopy which used \ltj@tempcntb. The real TeX trace ends like this:

  \unhcopy ->\ltj@reset@globaldefs \afterassignment \ltj@@unhcopy \ltj@tempcnta
  \ltj@reset@globaldefs ->\luafunction \ltj@reset@globaldefs@inner
  \ltj@@unhcopy ->\directlua {luatexja.direction.unbox_check_dir(true)}\ltj@@orig
  @unhcopy \ltj@tempcnta \directlua {luatexja.direction.uncopy_restore_whatsit()}

i.e. it uses \ltj@tempcnta, which is the origin of the issue.

It was a bit mind-boggling to me that the value of \ltj@tempcnta would change between the start and the end of \unhbox \ltj@tempcnta, but this is indeed what happens above: \unhcopy reassigns \ltj@tempcnta before it has been used by the original \unhbox TeX primitive...

2023-02-07 20:20 Updated by: h7k
Comment

Thanks for the report. This is mind-boggling to me, too...

> a patched version of \unhcopy which used \ltj@tempcntb

I came up with another approach: guarding LuaTeX-ja's \unhcopy with \begingroup...\endgroup (with similar guards for \unvcopy, \unhbox, \unvbox).

  \let\ltj@@orig@unhcopy\unhcopy
  \protected\def\ltj@@unhcopy{\begingroup\ltj@reset@globaldefs\afterassignment\ltj@@unhcopy@\ltj@tempcnta}
  \protected\def\ltj@@unhcopy@{%
  \directlua{luatexja.direction.unbox_check_dir(true)}%
  \ltj@@orig@unhcopy\ltj@tempcnta
  \directlua{luatexja.direction.uncopy_restore_whatsit()}\endgroup}

This makes the assignment to \ltj@tempcnta by \unhcopy local to the group, so the previous value is restored at \endgroup:

  \ltj@@unhbox@ ->\ltj@@lua@unboxcheckdir \ltj@@orig@unhbox \ltj@tempcnta \endgro
  up
  {expandable luacall 51}
  {\unhbox} % <== "\unhbox \voidb@x" (\leavevmode) in vmode
  % ... start a paragraph and re-read this command
  ...
  \strut ->\protect \strut
  ...
  \unhcopy ->\begingroup \ltj@reset@globaldefs \afterassignment \ltj@@unhcopy@ \l
  tj@tempcnta
  {\begingroup}
  {reassigning \nolocalwhatsits=0}
  {reassigning \nolocaldirs=0}
  {luacall 32}
  {reassigning \globaldefs=0}
  {\afterassignment}
  {\count187}
  {changing \count187=10}
  {into \count187=11}
  \ltj@@unhcopy@ ->\directlua {luatexja.direction.unbox_check_dir(true)}\ltj@@ori
  g@unhcopy \ltj@tempcnta \directlua {luatexja.direction.uncopy_restore_whatsit()
  }\endgroup
  {\directlua}
  {\unhcopy}
  {\directlua}
  {\endgroup}
  {restoring \count187=10}
  ...
  {\unhbox} % <== "\unhbox \voidb@x" (\leavevmode) in hmode

2023-02-07 23:08 Updated by: jfbu
Comment

Reply to h7k

> Thanks for the report. This is mind-boggling to me, too...
>
> I came up with another approach: guarding LuaTeX-ja's \unhcopy by \begingroup...\endgroup (similar guards with \unvcopy, \unhbox, \unvbox).

I think the scope-limiting approach via groups is fine (in theory it is limited to 255 levels of nesting, if I remember correctly from TeX; I do not know about LuaTeX, and anyway this is not a truly limiting factor in practice). At any rate, my patch with \ltj@tempcntb was only for me to confirm that I had understood the issue. It is definitely not a robust fix: for example, \unhbox can be used in the \everypar too, not only \unhcopy. Your groups, if I understand correctly, also fix such a situation of nested usage, and that is the whole point.

I had thought of still another approach, via \expandafter:

  \protected\def\ltj@@unhbox@{\ltj@@lua@unboxcheckdir\expandafter\ltj@@orig@unhbox\the\ltj@tempcnta\relax}

Other things should probably be modified in this spirit, for example:

  \protected\def\ltj@@unhcopy@{%
  \directlua{luatexja.direction.unbox_check_dir(true)}%
  \expandafter\ltj@@orig@unhcopy\the\ltj@tempcnta\relax
  \directlua{luatexja.direction.uncopy_restore_whatsit()}}

and perhaps similar changes should be made to \unvbox/\unvcopy. After all, \everypar tokens may cause many things to happen.

As I don't know the LuaTeX-ja code base, I do not know whether the \begingroup/\endgroup may have any side effect, but I am quite confident that the above will not have any.
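
To illustrate why the \expandafter variant does not depend on what \everypar does, here is a self-contained sketch outside LuaTeX-ja. All \my@... names are hypothetical and the wrappers omit the Lua calls, so this is an imitation under assumptions rather than the actual code. The register is converted to explicit digit tokens before the primitive runs, so a later reassignment of the register cannot change which box gets unboxed. Under these assumptions the two printed heights should be identical.

```latex
\documentclass{article}
\makeatletter
% Hypothetical scratch register and saved primitives (not LuaTeX-ja names).
\newcount\my@tempcnt
\let\my@orig@unhbox\unhbox
\let\my@orig@unhcopy\unhcopy
\protected\def\unhbox{\afterassignment\my@unhbox\my@tempcnt}
\protected\def\unhcopy{\afterassignment\my@unhcopy\my@tempcnt}
% \the\my@tempcnt is expanded first, so the primitive sees digits followed by \relax
% and no longer consults the register when it is re-read after the paragraph starts.
\protected\def\my@unhbox{\expandafter\my@orig@unhbox\the\my@tempcnt\relax}
\protected\def\my@unhcopy{\expandafter\my@orig@unhcopy\the\my@tempcnt\relax}
\makeatother
\begin{document}
Before: \the\ht\strutbox\par
\everypar{\strut}
% \everypar still overwrites \my@tempcnt via \unhcopy, but the digits for \voidb@x
% are already in the input stream.
\leavevmode After: \the\ht\strutbox
\end{document}
```

Unlike a group-based guard, this leaves the register modified after the call, which is harmless as long as no caller relies on the scratch value surviving.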

I wanted to show a trace (where I had only patched \ltj@@unhbox@), but to show what happens I had to replace \ltj@@orig@unhbox with one more layer which picks up the digit tokens until \relax and passes them to the primitive \unhbox:

  \leavevmode ->\unhbox \voidb@x
  \unhbox ->\ltj@reset@globaldefs \afterassignment \ltj@@unhbox@ \ltj@tempcnta
  {luacall 32}
  {\afterassignment}
  {\count187}
  \ltj@@unhbox@ ->\ltj@@lua@unboxcheckdir \expandafter \debug@unhbox \the \ltj@te
  mpcnta \relax
  \debug@unhbox #1\relax ->\ltj@@orig@unhbox #1\relax
  #1<-10
  {\unhbox}

(I used

  \tracingcommands1
  \tracingmacros1

)

2023-02-07 23:21 Updated by: jfbu
Comment

Reply to jfbu

> Reply to h7k
>
> > Thanks for the report. This is mind-boggling to me, too...
> >
> > I came up with another approach: guarding LuaTeX-ja's \unhcopy by \begingroup...\endgroup (similar guards with \unvcopy, \unhbox, \unvbox).
>
> ...
>
> I had thought of still another approach like this, via some \expandafter.
>
>   \protected\def\ltj@@unhbox@{\ltj@@lua@unboxcheckdir\expandafter\ltj@@orig@unhbox\the\ltj@tempcnta\relax}

A variant is to insert a space token rather than a \relax. I am not sure which is more efficient, but anyhow the \expandafter is probably the costlier bit.

To use a space token as delimiter of the explicit digit tokens one can do:

  \def\ltj@@unhbox@ #1% will redefine itself with #1 = space token
  {%
  \protected\def\ltj@@unhbox@{\ltj@@lua@unboxcheckdir\expandafter\ltj@@orig@unhbox\the\ltj@tempcnta#1}%
  }%
  \ltj@@unhbox@{ }

or alternatively with an \edef and some \unexpanded.
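
The \edef route might look like the following sketch. It is not LuaTeX-ja code: the \my@... names are hypothetical and the Lua calls are left out. It assumes LaTeX's \space, which expands to a single space token inside the \edef, while \unexpanded keeps everything else untouched. The trailing space then terminates the digits and is absorbed by TeX's number scanning.

```latex
\documentclass{article}
\makeatletter
% Hypothetical scratch register and saved primitives (not LuaTeX-ja names).
\newcount\my@tempcnt
\let\my@orig@unhbox\unhbox
\let\my@orig@unhcopy\unhcopy
\protected\def\unhbox{\afterassignment\my@unhbox\my@tempcnt}
\protected\def\unhcopy{\afterassignment\my@unhcopy\my@tempcnt}
% The replacement text ends with an explicit space token contributed by \space.
\protected\edef\my@unhbox{%
  \unexpanded{\expandafter\my@orig@unhbox\the\my@tempcnt}\space}
\protected\edef\my@unhcopy{%
  \unexpanded{\expandafter\my@orig@unhcopy\the\my@tempcnt}\space}
\makeatother
\begin{document}
Before: \the\ht\strutbox\par
\everypar{\strut}
\leavevmode After: \the\ht\strutbox
\end{document}
```

Compared with the self-redefining macro, this avoids the temporary one-argument definition at the cost of relying on \space and \unexpanded.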

(Edited, 2023-02-07 23:31 Updated by: jfbu)
2023-02-10 19:37 Updated by: h7k
Comment

Reply to jfbu

> I had thought of still another approach like this, via some \expandafter.
>
>   \protected\def\ltj@@unhbox@{\ltj@@lua@unboxcheckdir\expandafter\ltj@@orig@unhbox\the\ltj@tempcnta\relax}
>
> A variant is to insert a space token rather than a \relax. I am not sure which is more efficient, but anyhow the \expandafter is probably the costlier bit.

Thanks. It seems that your \expandafter...\the\ltj@tempcnta\relax approach is slightly faster than both the \expandafter...\the\ltj@tempcnta(space token) and the \begingroup...\endgroup approaches.

I'll apply the \expandafter...\the\ltj@tempcnta\relax approach to the other commands in a few days.

2023-04-09 07:56 Updated by: h7k
  • Status Update from Open to Closed
Comment

This issue is fixed in 20230211.0, so I close this ticket.
