Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Break up try regions in Compiler::fgMoveColdBlocks, and fix contiguity later #108914

Merged
merged 4 commits into from
Oct 18, 2024

Conversation

amanasifkhalid
Copy link
Member

@amanasifkhalid amanasifkhalid commented Oct 16, 2024

Part of #107749. Compiler::fgMoveColdBlocks currently moves cold try blocks to the end of their innermost regions. This is problematic for our 3-opt layout plans: When identifying a candidate span of blocks to reorder, assuming that all cold blocks are at the end of the method vastly simplifies our implementation. However, if we have EH regions with their own cold sections, fgMoveColdBlocks will interleave hot and cold blocks. To facilitate later layout passes, we can simplify fgMoveColdBlocks to naively move all cold blocks to the end of the method, regardless of EH region, and rely on a "fixup" pass for making EH regions contiguous again.

To start, I've tweaked fgMoveColdBlocks to break up try regions only. When handlers are placed in the funclet section, we don't need to do anything extra to get cold EH blocks out of the main method body's hot section. However, for jitted x86 code, we don't use the funclet model (yet), so cold handler blocks can still litter the main method body, hindering 3-opt's candidate space. I'd rather not expand on this PR's logic to rebuild handler regions if we can do something simpler, such as getting #101613 merged in, and using fgRelocateEHRegions to move all handlers to the end of the method under the assumption that they're cold (i.e. a pseudo-funclet region).

Moving try entry blocks in fgMoveColdBlocks proved painful enough that I think we're better off leaving them as-is. Leaving each try region's entry in-place gives us a nice breadcrumb for reinserting the remaining blocks, and it might be beneficial to leave these entries in the candidate span of blocks for 3-opt, so we can effectively move entire try regions just by moving the entry. For try regions that are entirely cold, I can look into calling fgRelocateEHRegions before fgMoveColdBlocks on all platforms to quickly get these out of the way.

All of this would be unnecessary if we could remove the VM's requirement of contiguous EH regions, and the codegen improvements would likely outweigh the additional VM complexity, though that's a conversation for another day.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 16, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Member Author

/azp run runtime-coreclr outerloop, Fuzzlyn, Antigen

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@amanasifkhalid amanasifkhalid marked this pull request as ready for review October 16, 2024 18:48
@amanasifkhalid
Copy link
Member Author

amanasifkhalid commented Oct 16, 2024

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Fuzzlyn/Antigen runs hit only known failures. Diffs show small changes in cold block placement, and PerfScore diffs look inconsequential. Formalizing cold EH region movement by enabling fgRelocateEHRegions on all platforms would probably net improvements in hot code density -- I can look into this as a follow-up. Thanks!

// If we have such a block that is also part of an exception handler, don't bother moving it.
// Finally, don't move block if it is the beginning of a call-finally pair,
// We only want to move cold blocks.
// Also, don't move block if it is the beginning of a call-finally pair,
// as we want to keep these pairs contiguous
// (if we encounter the end of a pair below, we'll move the whole pair).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated comment; I'll fix this in the next review round.

@BruceForstall
Copy link
Member

This seems worrisome. Don't we have existing IR checking that verifies the nesting requirements of EH regions that will now be ineffective? Also, if EH regions are being broken up, they eventually need to be placed back together in proper nesting format, so are we really gaining anything?

@amanasifkhalid
Copy link
Member Author

Don't we have existing IR checking that verifies the nesting requirements of EH regions that will now be ineffective?

We do: fgVerifyHandlerTab runs as a post-phase check, but since the EH region destruction is contained to optOptimizeLayout, the post-phase verification shouldn't fail as long as fgRebuildEHRegions does its job correctly.

Also, if EH regions are being broken up, they eventually need to be placed back together in proper nesting format, so are we really gaining anything?

This is true, though temporarily breaking up EH regions enables a simpler 3-opt implementation. Before we can start partitioning blocks up, we need to identify a sequence of basic blocks that are worth reordering in the first place (i.e. hot blocks). If we leave EH regions intact before running 3-opt, the search for hot blocks will either bail early when it encounters a cold block in an EH region, or it will need to be insightful enough to skip cold EH blocks. If we do the latter approach, then 3-opt will scramble EH regions, and rebuilding them is awkward if the region's entry is no longer the first block of that region in the lexical ordering. By leaving EH entries in the hot section, we have the added benefit of being able to move entire regions, since they must eventually be contiguous -- in other words, if 3-opt decides fallthrough into a region isn't all that important, it can move the region out of the way without having to think about the region's other blocks, thus treating the entry as a "superblock."

This seems worrisome.

Agreed. I find all of this juggling of EH regions we have to do during layout worrisome, considering the subtler bugs we've hit lately. I think this complexity is a good motivator for eliminating the contiguous EH region constraint altogether (though not during .NET 10), but in the meantime, I think it's worth tolerating these awkward semantics between layout phases if it enables better layout overall, and if we're confident that we'll eventually repair the regions correctly.

@amanasifkhalid amanasifkhalid merged commit 023686e into dotnet:main Oct 18, 2024
108 checks passed
@amanasifkhalid amanasifkhalid deleted the eh-repair-pass branch October 18, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants