Flaky testsuit track list #4193

Open
3 of 10 tasks
soulomoon opened this issue Apr 27, 2024 · 1 comment
Labels
CI Continuous integration component: cli About the pure command line interface of the hls executable flaky test type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc.. type: enhancement New feature or request

Comments

@soulomoon
Collaborator

soulomoon commented Apr 27, 2024

Trackers

Some tests are really flaky. It is important that we look into them and fix them, as this is crucial to HLS stability.
This should be part of #3736.

PS: I am collecting these while doing some migration work in #4173.

Set up a long-running CI to run each flaky test 500 times

We can adopt a standard for verifying that a flaky test is fixed: it must pass 500 consecutive runs.
Since our regular CI is likely to miss this, we can set up a long-running CI elsewhere that tracks our main branch and runs the flaky tests we pick, so that we can see their status.

@jhrcek has already developed a script to run these tests. We can build the long-running CI on top of it:

# Recommended: build the test binary separately, then run it in a loop
# (this avoids invoking cabal test on every iteration).
# Locate the test binary once, before the loop.
TEST_BIN=$(find dist-newstyle -name ghcide-tests -type f | head -n 1)
# Run the selected tests up to 500 times, stopping at the first failure.
for i in {1..500}; do
    echo "Iteration $i"
    LSP_TEST_LOG_MESSAGES=0 LSP_TEST_LOG_STDERR=0 TASTY_PATTERN="Notification Handlers" "$TEST_BIN" || {
        echo "Warning: error at iteration $i"
        break
    }
done
@soulomoon soulomoon added type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc.. status: needs triage labels Apr 27, 2024
@soulomoon soulomoon changed the title Flaky test track list Flaky testsuit track list Apr 27, 2024
@soulomoon
Collaborator Author

#4194 and #4093 are caused by a malformed status in hls-graph. I wrote the following to demonstrate how the problem occurs.

Build System

The build system in HLS is based on hls-graph, a lightweight Shake-like system.
hls-graph currently works session by session.
Whenever something is marked as dirty during a session, the session restarts,
and the dirty keys collected from that session are re-evaluated along with their reverse dependencies.
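
As a rough illustration of "dirty keys along with their reverse dependencies", here is a minimal sketch that computes the set of keys to re-evaluate. The `Key` alias, the `ReverseDeps` map and the `transitiveDirty` function are hypothetical names for this example, and following reverse dependencies transitively is an assumption; none of this is the hls-graph API.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical key type for this sketch; hls-graph has its own Key type.
type Key = String

-- Reverse-dependency edges: each key maps to the keys that depend on it directly.
type ReverseDeps = Map.Map Key (Set.Set Key)

-- The dirty keys plus everything reachable from them through
-- reverse-dependency edges (a simple worklist traversal).
transitiveDirty :: ReverseDeps -> Set.Set Key -> Set.Set Key
transitiveDirty rdeps = go Set.empty . Set.toList
  where
    go seen [] = seen
    go seen (k:ks)
      | k `Set.member` seen = go seen ks
      | otherwise =
          let next = Set.toList (Map.findWithDefault Set.empty k rdeps)
          in go (Set.insert k seen) (next ++ ks)
```

For example, if B depends on A and C depends on B, marking A dirty yields the set A, B, C, while unrelated keys are left alone.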

Registration of key

A key is registered together with its compute function in TheRules.

State of Key

Internally, hls-graph is basically a key-value store.
The value, however, is not a plain value but the status of the key:

data Status
    = Clean !Result
    | Dirty (Maybe Result)
    | Running {
        runningStep   :: !Step,
        runningWait   :: !(IO ()),
        runningResult :: Result,     -- LAZY
        runningPrev   :: !(Maybe Result)
        }

Sessions

Whenever a session is started, all the keys marked dirty and their reverse dependencies are marked as dirty in the key-value store and fired in parallel. Scheduling between the running keys is handled automatically by the RTS, given the Status we have.
Whenever some keys need to be changed to dirty, the session is restarted.
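
The marking step at session start can be pictured with a simplified store. This is only a sketch under assumptions: `Status` here drops the `Running` constructor, `Result` is a stand-in newtype, and `markDirty` is a hypothetical helper, not a function from hls-graph.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for this sketch.
type Key = String
newtype Result = Result Int deriving (Eq, Show)

-- Simplified version of Status without the Running constructor.
data Status
  = Clean !Result
  | Dirty (Maybe Result)
  deriving (Eq, Show)

-- Mark the given keys dirty, keeping the previous result (if any)
-- so that a later run can still compare against it.
markDirty :: [Key] -> Map.Map Key Status -> Map.Map Key Status
markDirty keys db = foldr (Map.adjust toDirty) db keys
  where
    toDirty (Clean r) = Dirty (Just r)
    toDirty s         = s
```

Keys that are absent from the store or already dirty are left unchanged, so marking is idempotent in this sketch.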

Dependencies and shortcuts

Before actually computing a key, hls-graph first fires the key's dependencies.
The outcome determines which of two run modes is used for the actual computation:

data RunMode
    = RunDependenciesSame -- ^ My dependencies have not changed.
    | RunDependenciesChanged -- ^ At least one of my dependencies from last time have changed, or I have no recorded dependencies.
      deriving (Eq,Show)

The key is then computed in the corresponding run mode, possibly using a cache; the exact behavior depends on the key's compute function.
Next, the run result is collected and its build time and changed time are recorded. After that, the key's dependencies and reverse dependencies are collected. Finally, the key is marked as clean.
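
One way to picture how the run mode could be chosen is sketched below. The `Previous` record, its fields and `chooseRunMode` are hypothetical, and the rule used (a dependency counts as changed if its changed step is later than the step at which the key was last built) is an assumption in the spirit of Shake-like systems, not a copy of the hls-graph code.

```haskell
data RunMode
  = RunDependenciesSame
  | RunDependenciesChanged
  deriving (Eq, Show)

-- Hypothetical step counter for this sketch.
type Step = Int

-- Hypothetical record of a key's previous build.
data Previous = Previous
  { prevBuilt       :: Step          -- step at which the key was last built
  , prevDepsChanged :: Maybe [Step]  -- changed step of each recorded dependency
  }

-- No previous build or no recorded dependencies: treat as changed.
-- Otherwise the dependencies are "the same" only if none of them
-- changed after the key was last built.
chooseRunMode :: Maybe Previous -> RunMode
chooseRunMode Nothing = RunDependenciesChanged
chooseRunMode (Just p) =
  case prevDepsChanged p of
    Nothing -> RunDependenciesChanged
    Just changed
      | all (<= prevBuilt p) changed -> RunDependenciesSame
      | otherwise                    -> RunDependenciesChanged
```

With `RunDependenciesSame`, a compute function may be able to reuse its cached value; with `RunDependenciesChanged`, it has to recompute.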

The problem of the build system

The state of a key is maintained in two places: one inside the hls-graph database, and one outside it, used by the actual compute functions (see the defineEarlyCutoff' family of functions). The outside copy holds the key's dirtiness and its cache.

One problem is that when the session restarts, we stop the current session entirely. Sometimes this interrupts the computation of a key midway and leaves the two copies of the key's state inconsistent, and nothing sensible can be done to prevent it.

Another problem is that once a key's dirty state has been added before a session restart, it must not be removed until the next session has started.

The solution

Problem one

To keep the state of a key consistent, the housekeeping of dirty keys and of the cache needs to happen inside the hls-graph database.

Thus I have added a new field to RunResult called runHook, which hls-graph uses for this housekeeping. The runHook is called whenever the key is marked as clean, and it performs the housekeeping for the values in shakeExtra atomically.
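
To make the idea concrete, here is a minimal sketch of a hook that runs in the same transaction that marks a key clean. Only the names RunResult and runHook come from the text above; the `STM ()` type of the hook, the simplified `Status`, and the helpers `markClean`, `clearExternalDirty` and `demo` are assumptions for this example, with a plain set standing in for the state kept in shakeExtra.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical key type for this sketch.
type Key = String

-- Simplified status without the Running constructor.
data Status v = Clean v | Dirty (Maybe v) deriving (Eq, Show)

-- Simplified RunResult: the computed value plus a housekeeping hook.
data RunResult v = RunResult
  { runValue :: v
  , runHook  :: STM ()  -- assumed type; runs when the key is marked clean
  }

-- Mark the key clean and run its hook in one transaction, so the
-- database and the outside state cannot be observed out of sync.
markClean :: TVar (Map.Map Key (Status v)) -> Key -> RunResult v -> STM ()
markClean db key res = do
  m <- readTVar db
  writeTVar db (Map.insert key (Clean (runValue res)) m)
  runHook res

-- Example hook: drop the key from an outside dirty set.
clearExternalDirty :: TVar (Set.Set Key) -> Key -> STM ()
clearExternalDirty dirty key = do
  s <- readTVar dirty
  writeTVar dirty (Set.delete key s)

-- Small usage example: returns the key's status and the outside dirty set.
demo :: IO (Maybe (Status Int), Set.Set Key)
demo = do
  db  <- newTVarIO (Map.fromList [("k", Dirty Nothing)])
  ext <- newTVarIO (Set.fromList ["k", "other"])
  atomically (markClean db "k" (RunResult 42 (clearExternalDirty ext "k")))
  m <- readTVarIO db
  s <- readTVarIO ext
  pure (Map.lookup "k" m, s)
```

Because both updates happen inside a single `atomically` block, an interrupted session either sees the key as clean with its outside state cleaned up, or sees neither change.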

Problem two

We send the dirty keys directly to the session restart, and dirty keys are added to the session only between sessions. As a result, dirty keys are removed from the session only when the session is restarted.

@soulomoon soulomoon added CI Continuous integration component: cli About the pure command line interface of the hls executable type: enhancement New feature or request labels May 1, 2024