Ideas for how to make compile time faster without affecting performance #48
Comments
@siboehm this manual inlining, how would one go about implementing it? There is

    def _populate_forest_func(forest, root_func, tree_funcs, fblocksize):
        """Populate root function IR for forest"""
        assert fblocksize > 0
        # generate the setup-blocks upfront, so each instruction_block can be passed its successor
        instr_blocks = [
            (
                root_func.append_basic_block("instr-block-setup"),
                tree_funcs[i : i + fblocksize],
            )
            for i in range(0, len(tree_funcs), fblocksize)
        ]
        term_block = root_func.append_basic_block("term")
        ir.IRBuilder(term_block).ret_void()
        for i, (setup_block, tree_func_chunk) in enumerate(instr_blocks):
            next_block = instr_blocks[i + 1][0] if i < len(instr_blocks) - 1 else term_block
            eval_objective_func = next_block == term_block
            _populate_instruction_block(
                i,
                forest,
                root_func,
                tree_func_chunk,
                setup_block,
                next_block,
                eval_objective_func,
            )

Could one just "manually inline" I am also asking because with |
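To make the chunking in `_populate_forest_func` easier to follow, here is a minimal sketch that mirrors its control flow with plain strings instead of llvmlite basic blocks. The helper name `plan_instruction_blocks` and the numbered block labels are hypothetical and only for illustration; the real function emits IR and does not return a plan.

```python
def plan_instruction_blocks(tree_names, fblocksize):
    """Mirror the chunking of _populate_forest_func with strings (illustrative only)."""
    assert fblocksize > 0
    # Split the trees into consecutive chunks of at most `fblocksize` entries.
    chunks = [
        tree_names[i : i + fblocksize]
        for i in range(0, len(tree_names), fblocksize)
    ]
    plan = []
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        # Each block jumps to the next setup block; the last one jumps to "term".
        next_label = "term" if is_last else f"instr-block-setup-{i + 1}"
        # `is_last` plays the role of `eval_objective_func` in the original code.
        plan.append((f"instr-block-setup-{i}", chunk, next_label, is_last))
    return plan


if __name__ == "__main__":
    for row in plan_instruction_blocks([f"tree_{n}" for n in range(5)], fblocksize=2):
        print(row)
```

With five trees and `fblocksize=2` this yields three blocks, and only the last one (holding the single leftover tree) is flagged to evaluate the objective function before falling through to `term`.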
Not sure I understand the question. In the current state, the forest is a function ( Actually, now that I'm thinking about it, the inlining may be a red herring. I'm no longer sure how I ran the benchmarks I posted in the top comment. What may actually be happening is not that the inlining itself takes a lot of time, but that it makes the forest function very large, which in turn makes the compiler backend much slower at codegen. The way to test this would be to:
imo if you want the compilation to be faster on your Kubernetes cluster, then implementing more generic arch targeting is going to be much easier and a much more certain win than the inlining. |
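One generic way to check where compile time actually goes (optimization passes versus backend codegen) is to time each phase separately. The sketch below uses only the standard library; `time_phases`, `fake_optimize`, and `fake_codegen` are hypothetical stand-ins, not part of this project, and would need to be replaced by the real optimization and codegen calls.

```python
import time


def time_phases(phases):
    """Run each (label, callable) pair in order and return seconds spent per label."""
    timings = {}
    for label, fn in phases:
        start = time.perf_counter()
        fn()
        timings[label] = time.perf_counter() - start
    return timings


# Stand-in workloads (assumptions): swap in the real optimize / codegen steps.
def fake_optimize():
    sum(i * i for i in range(200_000))


def fake_codegen():
    sorted(range(200_000), reverse=True)


if __name__ == "__main__":
    timings = time_phases([("optimize", fake_optimize), ("codegen", fake_codegen)])
    for label, seconds in timings.items():
        print(f"{label}: {seconds:.4f}s")
```

Comparing the per-phase numbers for different function-block sizes would show whether the large inlined function mostly hurts the optimizer or the backend.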
Thanks a lot for sharing those insights and ideas. Let me try to explain my reasoning behind the question: I was thinking about how one would implement the manual inlining using as much of the existing code as possible, and I was assuming that one would not inline more than I am not familiar with I will definitely look into the generic arch targeting topic, thanks for the pointer. |
I will implement them at some point myself or I'm also accepting PRs:
tree_root function. That function is enormous (after inlining every tree), so it takes a long time.
For mtpl2 (XACT machine):
What's not clear from these timings: How long would the optimization of the inlined function take? E.g. if we already inlined everything in the frontend.
Ways to mitigate: