Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 4369 boom #4401

Closed
wants to merge 16 commits into from
Closed

Fix 4369 boom #4401

wants to merge 16 commits into from

Conversation

SeanTAllen
Copy link
Member

Not sure why yet.

@SeanTAllen SeanTAllen requested a review from jemc August 16, 2023 17:12
@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Aug 16, 2023
@@ -1018,7 +1018,7 @@ static void optimise(compile_t* c, bool pony_specific)
if (c->opt->release) {
MPM = PB.buildPerModuleDefaultPipeline(OptimizationLevel::O3);
} else {
MPM = PB.buildO0DefaultPipeline(OptimizationLevel::O0);
MPM = PB.buildPerModuleDefaultPipeline(OptimizationLevel::O1);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This switches us back to "make the boom happen"

@@ -254,6 +254,7 @@ LLVMValueRef gen_localdecl(compile_t* c, ast_t* ast)
// Store the alloca to use when we reference this local.
codegen_setlocal(c, name, alloc);

/*
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting out the debug info in gen_localdecl removes the boom from the 4369 example

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be an interesting next step to try to pare down to remove debug info for specific locals by name and see which specific local variable is breaking it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, modify the generated LLVM ir for the given program?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joe was thinking "check the name and decide if to put" as using command line tools to link might change the situation.

@SeanTAllen
Copy link
Member Author

armhf failure is unrelated. it is #4391.

@SeanTAllen
Copy link
Member Author

Working with the #4369 original example, I found that the "cause" of the issue is the debug metadata on the name "t1". Without it we are good. I'm leaving the IR out for now as I don't see anything that makes me go "o that is clearly wrong" at the IR level (but I could be missing something).

I did assembly for "Good" aka no debug metadata on t1 and "Bad" aka debug metadata on t1 and the meaningful difference I can see is:

good:

.LBB49_1:
.LBB49_2:
.Ltmp5:
	.loc	4 0 29
	movq	56(%rsp), %rdi
	xorl	%esi, %esi
	callq	pony_alloc_small@PLT
	movq	56(%rsp), %rdi
	movq	%rax, 40(%rsp)

bad:

.LBB49_1:
.LBB49_2:
	.loc	4 0 29
	movq	56(%rsp), %rdi
.Ltmp25:
.Ltmp5:
	xorl	%esi, %esi
	callq	pony_alloc_small@PLT
	movq	56(%rsp), %rdi
	movq	%rax, 40(%rsp)

@SeanTAllen
Copy link
Member Author

reminder on the error we get:

* thread #3, name = 'nick-seg-2', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
    frame #0: 0x00005555555835cc nick-seg-2`ponyint_heap_alloc_small(actor=0x0000555555582806, heap=0x0000555555582846, sizeclass=0, track_finalisers_mask=0) at heap.c:579:29
   576    if(chunk != NULL)
   577    {
   578      // Clear and use the first available slot.
-> 579      uint32_t slots = chunk->slots;
   580      uint32_t bit = __pony_ctz(slots);
   581      slots &= ~((uint32_t)1 << bit);
   582
(lldb) bt
* thread #3, name = 'nick-seg-2', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00005555555835cc nick-seg-2`ponyint_heap_alloc_small(actor=0x0000555555582806, heap=0x0000555555582846, sizeclass=0, track_finalisers_mask=0) at heap.c:579:29
    frame #1: 0x000055555557a3ca nick-seg-2`pony_alloc_small(ctx=0x00007fffef4789d0, sizeclass=0) at actor.c:1115:10
    frame #2: 0x000055555556ce85 nick-seg-2`Main_tag_create_oo at list_node.pony:0:29
    frame #3: 0x000055555557966a nick-seg-2`handle_message(ctx=0x00007ffff7c78848, actor=0x00007ffff7c77800, msg=0x00007ffff7c46560) at actor.c:507:7
    frame #4: 0x0000555555578d3a nick-seg-2`ponyint_actor_run(ctx=0x00007ffff7c78848, actor=0x00007ffff7c77800, polling=false) at
 actor.c:592:20
    frame #5: 0x0000555555588143 nick-seg-2`run(sched=0x00007ffff7c78800) at scheduler.c:1075:23
    frame #6: 0x0000555555587630 nick-seg-2`run_thread(arg=0x00007ffff7c78800) at scheduler.c:1127:3
    frame #7: 0x00007ffff7d11b43 libc.so.6`___lldb_unnamed_symbol3481 + 755
    frame #8: 0x00007ffff7da3a00 libc.so.6`___lldb_unnamed_symbol3865 + 11

Note that what is really interesting is that this is our segfault:

  // If there are none in this size class, get a new one.
  if(chunk != NULL)
  {
    // Clear and use the first available slot.
    uint32_t slots = chunk->slots;

So chunk which is not NULL is resulting in signal SIGSEGV: invalid address (fault address: 0x0) which is just bizarro on the face of it.

(lldb) p chunk
(small_chunk_t *) $0 = 0xfff0758b48f87d8b
(lldb) p chunk->slots
error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory

@SeanTAllen
Copy link
Member Author

@jemc...

GCC_except_table49:
.Lexception1:
	.byte	255
	.byte	155
	.uleb128 .Lttbase1-.Lttbaseref1
.Lttbaseref1:
	.byte	1
	.uleb128 .Lcst_end1-.Lcst_begin1
.Lcst_begin1:
	.uleb128 .Lfunc_begin49-.Lfunc_begin49
	.uleb128 .Ltmp3-.Lfunc_begin49
	.byte	0
	.byte	0
	.uleb128 .Ltmp3-.Lfunc_begin49
	.uleb128 .Ltmp4-.Ltmp3
	.uleb128 .Ltmp5-.Lfunc_begin49
	.byte	1
	.uleb128 .Ltmp4-.Lfunc_begin49
	.uleb128 .Ltmp6-.Ltmp4
	.byte	0
	.byte	0
	.uleb128 .Ltmp6-.Lfunc_begin49
	.uleb128 .Ltmp7-.Ltmp6
	.uleb128 .Ltmp8-.Lfunc_begin49
	.byte	1

@SeanTAllen
Copy link
Member Author

OK I officially have 2 identical copies of assembly, one goes boom, one does not. Difference, debug or release runtime.

cc @jemc

@SeanTAllen SeanTAllen removed the discuss during sync Should be discussed during an upcoming sync label Sep 5, 2023
@SeanTAllen SeanTAllen closed this Sep 12, 2023
@SeanTAllen SeanTAllen deleted the force-4369 branch February 19, 2024 03:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants