Issue with fpu scheduler on macos/aarch64 #1870

@m4b

I tested two applications with uhyve, and I think there's an issue with fpu_owner in the kernel's scheduler.

E.g., here is the output of the stdin app:

[    0.001090][0][DEBUG] Open /proc/version, OpenOption(O_RDWR | O_CREAT), AccessPermission(S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH)
[    0.001106][0][DEBUG] Open file /proc/version with OpenOption(O_RDWR | O_CREAT)
[    0.001804][0][DEBUG] Try to initialize fuse filesystem
[    0.001817][0][INFO] Try to initialize uhyve filesystem
[    0.001842][0][INFO] Mounting uhyve filesystem at /root
[    0.001879][0][DEBUG] Mounting /root
[    0.002006][0][DEBUG] Setting argv as: []
[    0.002015][0][DEBUG] Setting envv as: []
[    0.002053][0][DEBUG] Receive interrupt 1
[    0.002075][0][INFO] Jumping into application
[    0.002108][0][WARN] Unable to read entropy! Fallback to a naive implementation!
[    0.002130][0][DEBUG] Switching FPU owner from task 0 to 1
[    0.002261][0][DEBUG] Create EventFd 0, EventFlags(EFD_NONBLOCK | EFD_CLOEXEC)
[    0.002617][0][DEBUG] sys_socket: domain 3, type 18433, protocol 0
Error: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }
[    0.003833][0][DEBUG] Exit program with error code 1!
[    0.003844][0][INFO] Number of interrupts
[    0.003858][0][INFO] [0][Reschedule]: 2

I saw it crash in the scheduler, but I can't seem to reproduce it. The crash was a double borrow at the borrow_mut() call on fpu_owner, so I modified the code to print a message instead:

diff --git a/src/scheduler/mod.rs b/src/scheduler/mod.rs
index 8944058a..38512999 100644
--- a/src/scheduler/mod.rs
+++ b/src/scheduler/mod.rs
@@ -664,7 +664,11 @@ impl PerCoreScheduler {
                                self.current_task.borrow().id
                        );

-                       self.fpu_owner.borrow_mut().last_fpu_state.save();
+                       if let Some(mut owner) = self.fpu_owner.try_borrow_mut().ok() {
+                               owner.last_fpu_state.save();
+                       } else {
+                               info!("Could not borrow fpu_owner");
+                       }
                        self.current_task.borrow().last_fpu_state.restore();
                        self.fpu_owner = self.current_task.clone();
                }
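
For reference, here is a minimal standalone sketch of the failure mode I mean. It assumes fpu_owner behaves like an Rc<RefCell<...>>; the Task type, save_fpu, and try_save below are hypothetical stand-ins, not the kernel's actual definitions. It only shows that try_borrow_mut() returns an error (where borrow_mut() would panic) while another borrow of the same cell is still alive. It does not show where such a borrow would come from in the scheduler.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for the kernel's task type (assumption, not the real struct).
struct Task {
    id: u32,
    fpu_saves: u32,
}

impl Task {
    // Stand-in for `last_fpu_state.save()`; here it only counts calls.
    fn save_fpu(&mut self) {
        self.fpu_saves += 1;
    }
}

// Tries to save the FPU state of `owner`.
// Returns false instead of panicking when the cell is already borrowed.
fn try_save(owner: &Rc<RefCell<Task>>) -> bool {
    match owner.try_borrow_mut() {
        Ok(mut task) => {
            task.save_fpu();
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let owner = Rc::new(RefCell::new(Task { id: 0, fpu_saves: 0 }));

    // No outstanding borrow: the mutable borrow succeeds.
    assert!(try_save(&owner));

    {
        // While a shared borrow is alive, a mutable borrow is refused.
        // A plain `owner.borrow_mut()` would panic at this point.
        let guard = owner.borrow();
        assert!(!try_save(&owner));
        println!("task {} is still borrowed", guard.id);
    }

    // Once the guard is dropped, the mutable borrow works again.
    assert!(try_save(&owner));
    assert_eq!(owner.borrow().fpu_saves, 2);
}
```

Note that when the borrow fails, the save is skipped entirely, so the patch above hides the panic rather than fixing whatever holds the other borrow.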

If I git stash this change, I get the following on the stdin example:

[    0.001857][0][WARN] Unable to read entropy! Fallback to a naive implementation!
[    0.001877][0][DEBUG] Switching FPU owner from task 0 to 1
Error: Kind(Uncategorized)
[    0.002291][0][DEBUG] Exit program with error code 1!
[    0.002305][0][INFO] Number of interrupts
[    0.002316][0][INFO] [0][Reschedule]: 2

hermit-rs: 5e12a62bfc538ae91103e0415065f7062bf8938c
kernel: 53536a5c0db8e1b84d828057b1c6e6ec97bcd78e

Commands I'm running, with uhyve built from git and signed as the instructions note:

HERMIT_LOG_LEVEL_FILTER=Debug cargo +nightly b -Zbuild-std=std,panic_abort --target aarch64-unknown-hermit
RUST_LOG=debug RUST_BACKTRACE=1 uhyve ../../target/aarch64-unknown-hermit/debug/stdin
