Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redebug #1583

Open
wants to merge 37 commits into
base: main
Choose a base branch
from
Open

Redebug #1583

wants to merge 37 commits into from

Conversation

Bike
Copy link
Member

@Bike Bike commented May 13, 2024

Refactors debug info in clasp-cleavir (bytecode is basically unaffected). As part of this, refactors how inlining works. Instead of doing the actual procedure integration ourselves, we let LLVM handle it. And as part of this I enabled more LLVM optimizations than we have been doing.

This simplifies our debug info handling and inlining mechanism. It also lets us inline some multiple value calls (e.g. from multiple value bind), which is pretty nice.

Bike and others added 30 commits May 16, 2024 13:25
I'm going to remake it entirely and this is a rare opportunity
where I can start mostly from scratch without breaking
everything. (tests now fail, so some things are still broken)

Bytecode debug info is fine & will be fine since it's fine.
It works well enough to get function names and arguments into
backtraces. Which is really pretty much everything since argument
parsing is hard, so.

It's probably busted w/r/t inlining.
Inlining is where the DWARF generation gets complicated. I also
want to pretty much completely overhaul how inlining works.
the *argument-out* thing is a holdover from bclasp
since it's only loaded very late after clasp-cleavir, unlike before
I want to use the argument processor for inline bodies, so there
may not actually be a closure.
It's not really part of argument parsing, so it doesn't belong in
the "calling convention" code at all.
I don't think it has been able to return false for years, and I'm
not sure why it ever would.
I'm going to have all these return a list of arguments instead, so
returning the number remaining is no good.
and remove the confusing calling-convention stuff. And unify the
code so that the required-and-optionals case works efficiently
without having its own code.

Moving towards being able to process arguments in the middle of a
function (for an inline call).
should make it easier to use for local calls. bonus: no weird
dependence on *datum-values* in xep layouter.
This will be needed to use it for local calls.

I skipped &key because I couldn't bear to make that code even
crazier. Also, &rest arguments are always boxed, which we could
maybe fix at some point.

In the future we should probably force suppliedp parameters to be
boolean rtype, cos what else would they even be.
Unifying this makes translate much less messy with all that
duplicate code, and makes it possible to local call anything,
including functions with &rest and/or &key parameters. Also means
we don't have to allocate a closure for local calls ever - that was
super dumb and I'm disappointed in myself for ever thinking it was
a good idea.

Argument parsing for local calls is moved into the caller in all
cases now. That may be a bit suboptimal for local functions that
are called in more than one place, since the parsing code is
duplicated. I really doubt that will be a big deal except _maybe_
for functions with &key, since key parsing is actually a fair bit
of code. But &key functions are already kind of doomed as far as
optimization goes, and local functions that are called in multiple
places are rare to begin with (only arising from flet/labels,
versus the much more common type checkers and m-v-bind lambdas) so
I'm not too worried. The Right Thing To Do long term is probably
to do as karlosz suggested years ago and represent XEPs explicitly
in the IR. A shorter term fix would be to create pseudo-XEPs for
these functions - "pseudo" in that they would parse arbitrary
arguments, but take the environment as individual arguments instead
of as an allocated closure.
poison seems to be the wave of the future (LLVM docs currently say
to use it instead of undef whenever possible). It's
"more undefined" than undef, so it might help LLVM passes do more
optimization. Though I don't know how much it matters in practice.

I left in a few undefs I didn't immediately understand the context
of, but those can probably be poison too.
When LLVM optimization is on, it will often inline functions, and
in particular will inline core functions into their XEPs. The
DWARF information will reflect this inlining, so the address range
for a given return address may be of the inline function body
rather than that of the actual entry point. And that would break
backtraces.

This fix ignores the address ranges of the return address, instead
just checking the ranges of the object file's literal functions.
That works, but ideally we would use the DWARF information to give
us information about inlined functions. Also, with optimization on
a few tests on backtraces fail, so that also needs fixing.
No idea what this is, but nothing actually uses it
Doing it in a macro is just kind of error prone. COMPILE was not
doing optimization because the macro didn't have an optimizer
listed. And what kind of non-LLVM optimizer function would we even
use to optimize an LLVM module?
As long as the register save area is generated we should be able to
get arguments.
I honestly don't understand why the meat of this change - getting
the function description from the closure - is necessary, since I
would think the fd is available from the entry point. Whatever.
buildPerModuleDefaultPipeline does not operate by side effect.

We should probably wrap all the optimization pipeline/passes stuff,
or at least what we use, so as to experiment with what actually
helps. Right now we just do -O3 depending on a dynamic variable,
which isn't amazing.

I also disabled the inline threshold change. I'm seeing plenty of
inlining even with the default threshold.
It's ugly and I have a better way. Hopefully. Not yet. This means
that inline function source locations are in the wrong place.
you know, for inlining. This fixes source info for inlined calls.
(If LLVM is doing the inlining, anyway.)
arityEntryAddress is something I added while debugging backtraces
and I suppose we may as well keep it around.
having them is probably inefficient but there's no reason we can't
generate the appropriate code to box and unbox.

The ANSI test MISC-105 was failing without this, because with
later inlining an unwind actually remains an unwind. Something to
look into before merge.
Missed a value. Oops
rather than explicitly specifying :source or w/e
Bike added 3 commits May 17, 2024 10:34
Saves a bunch of code to go through file scopes, which are arguably
unnecessary anyway, and definitely an irrelevant implementation
detail.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants