Description
Currently, XFA (even with LTO) produces code like "load value from memory location A into reg0, load value from memory location B into reg1, and then iterate from reg0 to reg1" -- even if the values in both memory locations are 0.
It Would Be Nice If XFA_LEN could be available to the compiler, as that would make loops over 0-length arrays just go away. Then, many "but XFA is optional here" ifdefs could go away.
This should not need any adjustments in how XFA is used (but allow the above simplification).
How it could be done
- After a build, have a Very Small Shell Script (may actually be Python) pick out all the XFA symbols from the linked binary using `nm` and process them into XFA lengths
- In the following build, enable the "use static XFA lengths" flag. That will
  - replace the XFA_LEN macro with one that looks into the extracted lengths
  - put an assertion into the build process that the extracted lengths have not changed (and fail the build process after the above extraction step with an "I am LaTeX please run me again")
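
The extraction and the "lengths have not changed" check could be sketched roughly as below. This is only an illustration under several assumptions: that `nm` is run in its default BSD-style output format (address, type, name), that each XFA shows up as a start symbol `name` and an end symbol `name_end`, and that a generated header with hypothetical `XFA_STATIC_BYTES_<name>` macros is an acceptable way to hand the numbers to the compiler. Since `nm` only knows addresses, the sketch records byte lengths and leaves the division by the element size to the macro.

```python
import re

# Only names that are valid C identifiers can become macro names; this also
# drops compiler-generated symbols such as "foo.constprop.0".
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_nm(nm_output):
    """Map symbol name -> address from BSD-style `nm` output."""
    symbols = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # undefined symbols have no address column
        address, _kind, name = fields
        try:
            symbols[name] = int(address, 16)
        except ValueError:
            continue  # not an address, skip the line
    return symbols


def xfa_byte_lengths(symbols, suffix="_end"):
    """Byte length of every `name` / `name_end` pair (assumed XFA layout).

    Any unrelated symbol pair following the same naming pattern is picked
    up as well; a real script would probably also filter by section.
    """
    lengths = {}
    for name, end in symbols.items():
        if not name.endswith(suffix):
            continue
        start_name = name[: -len(suffix)]
        if start_name in symbols and IDENT.match(start_name):
            lengths[start_name] = end - symbols[start_name]
    return lengths


def render_header(lengths):
    """Render the lengths as a C header (macro names are hypothetical)."""
    lines = ["/* generated from nm output; do not edit */", "#pragma once"]
    for name in sorted(lengths):
        lines.append(f"#define XFA_STATIC_BYTES_{name} {lengths[name]}u")
    return "\n".join(lines) + "\n"


def lengths_unchanged(previous_header, lengths):
    """True if the header used for this build still matches the binary."""
    return previous_header == render_header(lengths)
```

With such a header, the static variant of the macro might look like `XFA_STATIC_BYTES_##name / sizeof(type)`, and the build would fail with the "please run me again" message whenever `lengths_unchanged` returns false for the header the build was compiled against.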
If we're courageous, the error could trigger a recursion in make, and we wouldn't need a "use static XFA lengths" flag any more, at the cost of doubling initial build times.
Possible extension
This approach would allow loops over zero-length arrays to be optimized out and some simple unrolling (for lengths of 1 or 2), but no inlining (as the items are extern).
If the extraction works well, a second stage could be considered where not only XFA_LEN is extracted into the preprocessor, but also XFA_USE_CONST could expand to the actual content (which is extracted and asserted to be equal across builds). Then (for example in auto_init) the actual functions can be inlined.
This is trickier not only because it involves more than counting and expressing-in-a-header, but also because realizing its potential benefits (an XFA_USE_CONST array may not need to be actually emitted to the ROM) is harder. Thus, it's out of scope for the first iteration.
Useful links
Came up around #16061 (comment)
I didn't find any discussion of something like this in the original XFA PR #15002.
Road map / status
Right now I can't implement this, so I'm letting it sit to collect comments or serve as a starting point if anyone wants this faster than me.
Alternatives
If LTO becomes a lot cleverer, the benefits of this might just vanish.