XFA: Make accessible to the optimizer #16075

@chrysn

Description

Currently, XFA (even with LTO) produces code like "load value from memory location A into reg0, load value from memory location B into reg1, and then iterate from reg0 to reg1" -- even if the values in both memory locations are 0.

It Would Be Nice If XFA_LEN could be available to the compiler, as that would make loops over 0-length arrays just go away. Then, many "but XFA is optional here" ifdefs could go away.

This should not need any adjustments in how XFA is used (but allow the above simplification).

How it could be done

  • After a build, have a Very Small Shell Script (may actually be Python) pick out all the XFA symbols from the linked binary using nm and process them into a list of XFA lengths (for example a generated header)
  • In the following build, enable the "use static XFA lengths" flag. That will
    • replace the XFA_LEN macro with one that looks into the extracted lengths
    • put an assertion into the build process that the extracted lengths have not changed (and fail the build process after the above extraction step with an "I am LaTeX please run me again")
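
The extraction and assertion steps above could be sketched roughly as follows. This is only a sketch under assumptions, not a proposed implementation: it assumes that every XFA `name` shows up in the `nm` output as a start symbol `name` and an end symbol `name_end`, and the `XFA_STATIC_BYTES_` prefix and all function names are made up for illustration. Since `nm` knows nothing about element types, the script emits sizes in bytes, and the replacement XFA_LEN macro would still have to divide by `sizeof(type)`.

```python
import re

# Assumption: each XFA `name` has a start symbol `name` and an end symbol
# `name_end`, so the array size in bytes is the difference of the addresses.
_NM_LINE = re.compile(r"^([0-9a-fA-F]+)\s+\S\s+(\S+)$")


def parse_nm(nm_output):
    """Map symbol name -> address from the text output of `nm`."""
    symbols = {}
    for line in nm_output.splitlines():
        match = _NM_LINE.match(line.strip())
        if match:  # undefined symbols carry no address and are skipped
            symbols[match.group(2)] = int(match.group(1), 16)
    return symbols


def xfa_byte_lengths(symbols):
    """Return {array name: size in bytes} for every `name`/`name_end` pair.

    This over-matches: any unrelated `foo`/`foo_end` pair is picked up too.
    A real script would have to filter on the XFA linker sections.
    """
    lengths = {}
    for name, end_addr in symbols.items():
        if name.endswith("_end") and name[:-4] in symbols:
            lengths[name[:-4]] = end_addr - symbols[name[:-4]]
    return lengths


def render_header(lengths):
    """Render the lengths as C defines (hypothetical macro prefix)."""
    lines = ["/* generated from nm output, do not edit */"]
    for name in sorted(lengths):
        lines.append("#define XFA_STATIC_BYTES_%s (%dU)" % (name, lengths[name]))
    return "\n".join(lines) + "\n"


def check_unchanged(previous_header, lengths):
    """Fail the build if the lengths differ from what was compiled in."""
    if render_header(lengths) != previous_header:
        raise SystemExit("XFA lengths changed, please run the build again")
```

In a build, the input for `parse_nm` would be the captured output of `nm` on the linked ELF file, `render_header` would write the file that the "use static XFA lengths" flag includes, and `check_unchanged` would run after linking against the header used for that very build.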

If we're courageous, the error could trigger a recursion in make and wouldn't need a "use static XFA lengths" flag any more at the cost of doubling initial build times.
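
The "recursion in make" variant amounts to rebuilding until the extracted lengths match the ones that were compiled in. A minimal sketch of that loop, where `build` is a hypothetical callable standing in for "compile with these lengths, link, extract the lengths again":

```python
def build_until_stable(build, max_rounds=3):
    """Call build(lengths) until the lengths it returns stop changing.

    `build` is a hypothetical callable: it compiles with the given lengths
    (None on the very first round, meaning "no static lengths yet") and
    returns the lengths extracted from the freshly linked binary.
    Returns the settled lengths and the number of builds that were needed.
    """
    known = None
    for round_no in range(1, max_rounds + 1):
        extracted = build(known)
        if extracted == known:
            return known, round_no
        known = extracted
    raise RuntimeError("XFA lengths did not settle after %d builds" % max_rounds)
```

With lengths that do not depend on the optimization, a clean tree settles after two builds, which is the doubled initial build time mentioned above. The `max_rounds` cap is there so that a build whose lengths keep changing fails instead of looping forever.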

Possible extension

This approach would allow loops over zero-length arrays to be optimized out and some simple unrolling (for lengths of 1 or 2), but no inlining (as the items are extern).

If the extraction works well, a second stage could be considered in which not only is XFA_LEN extracted into the preprocessor, but XFA_USE_CONST also expands to the actual content (which is extracted and asserted to be equal across builds). Then (for example in auto_init) the actual functions can be inlined.

This is trickier not only because it involves more than counting and expressing-in-a-header, but also because realizing its potential benefits (an XFA_USE_CONST array may not need to be actually emitted to the ROM) is harder. Thus, it's out of scope for the first iteration.

Useful links

Came up around #16061 (comment)

I didn't find any discussion of something like this in the original XFA PR #15002.

Road map / status

Right now I can't implement this, so I'm letting it sit to collect comments or serve as a starting point if anyone wants this faster than me.

Alternatives

If LTO becomes a lot cleverer, the benefits of this might just vanish.

Metadata

Labels

  • Area: core (Area: RIOT kernel. Handle PRs marked with this with care!)
  • State: stale (State: The issue / PR has no activity for >185 days)
  • Type: enhancement (The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation)
