Flaky testsuit track list #4193

soulomoon · 2024-04-27T06:53:33Z

Trackers

Some tests are really flaky. It is important that we look into them and fix them, as this is crucial to HLS stability.
It should be part of #3736

Flaky simple-multi-def-test on windows #4270
Flaky test failure result in error of GetLinkable #4093 fixed by Stabilize the build system by correctly house keeping the dirtykeys and rule values [flaky test #4185 #4093] #4190
Cabal Plugin Test Case is flaky #3333 fixed by Stabilize the build system by correctly house keeping the dirtykeys and rule values [flaky test #4185 #4093] #4190
ghcide-tests' addDependentFile test #4194 fixed by Stabilize the build system by correctly house keeping the dirtykeys and rule values [flaky test #4185 #4093] #4190
error-order-test's InternalError over InvalidParams gets stuck https://github.com/haskell/haskell-language-server/actions/runs/8868613651/job/24372650102?pr=4195
ghcide test session-deps-are-picked-up https://github.com/haskell/haskell-language-server/actions/runs/8900448886/job/24442056968?pr=4199
ghcide test IfaceTests: we should remove the interface cache dir before we run the test, otherwise the diagnostic is simply gone #4200
UnitTests's notification handlers run in priority order
ghcide/test/exe/UnitTests.hs:91: expected: [20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1] but got: [20,19]
hls-splice:
TDeclKindError (golden): FAIL (66.44s) Timed out waiting to receive a message from the server. Last message received: { "jsonrpc": "2.0", "method": "$/progress", "params": { "token": "35", "value": { "kind": "end", "message": "Finished indexing 3 files" } } }

ghcide-tests'

 th-linking-test:                                                                                                         FAIL
   Exception: Timed out waiting to receive a message from the server.
   Last message received:
   {
       "jsonrpc": "2.0",
       "method": "$/progress",
       "params": {
           "token": "7",
           "value": {
               "kind": "report",
               "message": "5/5"
           }
       }
   }

PS. I am collecting these while doing some migration in #4173.

Set up a long-running CI to run flaky tests 500 times

We can set a standard for verifying that a flaky test is fixed: it must pass 500 consecutive runs.
Since our regular CI is likely to miss such failures, we can set up a long-running CI elsewhere that tracks our main branch and runs the flaky tests we pick, so we can see their status.

@jhrcek has already developed a script to run these tests. We can build the long-running CI on top of it.

# Recommended: build the test binary separately, then run it in a loop
# (to avoid running `cabal test` on every iteration).
# Locate the test binary once, instead of on every iteration.
test_bin=$(find dist-newstyle -name ghcide-tests -type f | head -n 1)
# Run the selected tests in a loop and stop at the first failure.
for i in {1..500}; do
    echo "Iteration $i"
    LSP_TEST_LOG_MESSAGES=0 LSP_TEST_LOG_STDERR=0 TASTY_PATTERN="Notification Handlers" "$test_bin" \
    || {
        echo "Warning: error at iteration $i"
        break
    }
done


soulomoon · 2024-04-28T14:34:29Z

#4194 and #4093 are caused by a malformed status in hls-graph. I wrote the following to demonstrate how the problem occurs.

Build System

The build system in HLS is based on hls-graph, a lightweight Shake-like system.
hls-graph now works session by session.
Whenever something is marked dirty during a session, the session restarts,
and the dirty keys collected from the session are re-evaluated along with their reverse dependencies.

Registration of key

A key is registered in TheRules together with its compute function.

State of Key

Internally, hls-graph is basically a key-value store.
The value, however, is not a plain value but the status of the key.

data Status
    = Clean !Result
    | Dirty (Maybe Result)
    | Running {
        runningStep   :: !Step,
        runningWait   :: !(IO ()),
        runningResult :: Result,     -- LAZY
        runningPrev   :: !(Maybe Result)
        }

Sessions

Whenever a session starts, all keys marked dirty, together with their reverse dependencies, are marked as dirty in the key-value store and then fired in parallel. Scheduling between the running keys is handled automatically by the RTS, given the Status we store.
Whenever some keys need to be marked dirty, the session is restarted.
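The set of keys to re-evaluate at session start can be illustrated with a small sketch. This is only an illustration under assumptions: `Key`, `ReverseDeps`, and `dirtyClosure` are hypothetical stand-ins, not the actual hls-graph types or functions, and the real database is not a plain `Map`.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-ins for illustration; not the hls-graph types.
type Key = String

-- For each key, the keys that directly depend on it.
type ReverseDeps = Map.Map Key (Set.Set Key)

-- | All keys that must be re-evaluated: the given dirty keys plus
-- everything that transitively depends on them.
dirtyClosure :: ReverseDeps -> [Key] -> Set.Set Key
dirtyClosure rdeps = go Set.empty
  where
    go seen [] = seen
    go seen (k : ks)
      | k `Set.member` seen = go seen ks  -- already visited
      | otherwise =
          let parents = Set.toList (Map.findWithDefault Set.empty k rdeps)
          in go (Set.insert k seen) (parents ++ ks)
```

With reverse dependencies `A -> {B}` and `B -> {C}`, marking `A` dirty yields the set containing `A`, `B`, and `C`; unrelated keys are left alone.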

Dependencies and shortcuts

Before we actually compute a key, hls-graph first tries to fire up the key's dependencies.
The outcome is one of two run modes, which is then used in the actual computation.

data RunMode
    = RunDependenciesSame -- ^ My dependencies have not changed.
    | RunDependenciesChanged -- ^ At least one of my dependencies from last time have changed, or I have no recorded dependencies.
      deriving (Eq,Show)

The key is then computed in the respective run mode, possibly using a cache; the exact behavior depends on the key's computation function.
We then collect the run result and record its build time and changed time. After that, the dependencies and reverse dependencies of the key are collected. Finally, the key is marked as clean.
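How a run mode might be chosen from the recorded build and changed times can be sketched as follows. This is a simplified model under assumptions: `Step`, `PrevRun`, its fields, and `chooseRunMode` are hypothetical names for illustration, and the comparison rule (a dependency changed after this key was last built) is an assumption about the logic, not a quote of the hls-graph implementation.

```haskell
-- Hypothetical logical clock; a larger value means later.
newtype Step = Step Int deriving (Eq, Ord, Show)

data RunMode
  = RunDependenciesSame
  | RunDependenciesChanged
  deriving (Eq, Show)

-- | Simplified record of a key's previous run (hypothetical fields).
data PrevRun = PrevRun
  { prevBuilt      :: Step          -- step at which this key was last built
  , prevDepChanged :: Maybe [Step]  -- changed step of each recorded dependency,
                                    -- or Nothing if no dependencies were recorded
  }

-- | Pick the run mode after the dependencies have been refreshed.
chooseRunMode :: Maybe PrevRun -> RunMode
chooseRunMode Nothing = RunDependenciesChanged  -- never built before
chooseRunMode (Just prev) =
  case prevDepChanged prev of
    Nothing -> RunDependenciesChanged  -- no recorded dependencies
    Just changed
      | any (> prevBuilt prev) changed -> RunDependenciesChanged
      | otherwise                      -> RunDependenciesSame
```

A key built at step 5 whose dependencies last changed at steps 3 and 4 would run in `RunDependenciesSame`; if any dependency changed at step 6, it would run in `RunDependenciesChanged`.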

The problem of the build system

There are two places where we maintain the state of a key: one inside the hls-graph database, and one outside it,
used by the actual computing functions (see the defineEarlyCutoff' family of functions). The outside state contains a copy of the key's dirtiness and its cache.

One problem is that when the session restarts, we stop the current session entirely. Sometimes this interrupts the computation of a key midway and leaves the two states of the key inconsistent, and nothing sensible can be done to prevent it.

Another problem is that once a key's dirty state has been added before a session restart, it should not be removed until the next session has started.

The solution

Problem one

To keep the state of a key consistent, we need to do the dirty-key housekeeping and the cache housekeeping inside the hls-graph database.

Thus I have added a new field to RunResult called runHook, to be used by hls-graph for this housekeeping. runHook is called whenever the key is marked as clean, and it does the housekeeping for the values in shakeExtra atomically.
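The idea of running the hook together with the status update can be sketched roughly as below. Only the names RunResult and runHook come from the text; everything else here is an assumption for illustration: the simplified `Status`, the `Int` value, the `STM ()` type of the hook, and the helpers `markClean` and `clearOuterDirty` are hypothetical and may differ from what #4190 actually does.

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Key = String

-- Simplified status; the real Status also has a Running constructor.
data Status = Clean Int | Dirty (Maybe Int)
  deriving (Eq, Show)

-- | Simplified run result. The hook type is an assumption.
data RunResult = RunResult
  { runValue :: Int
  , runHook  :: STM ()  -- housekeeping for state kept outside the database
  }

-- | Mark a key clean and run its hook in one transaction, so the
-- database and the outside state cannot be observed out of sync.
markClean :: TVar (Map.Map Key Status) -> Key -> RunResult -> IO ()
markClean db key res = atomically $ do
  modifyTVar' db (Map.insert key (Clean (runValue res)))
  runHook res

-- | Example hook: drop the key from a dirty-key set kept outside
-- the database (a stand-in for the values in shakeExtra).
clearOuterDirty :: TVar (Set.Set Key) -> Key -> STM ()
clearOuterDirty outer key = modifyTVar' outer (Set.delete key)
```

Because both updates happen inside a single `atomically` block, an interruption either leaves both states untouched or applies both, which is the consistency property described above.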

Problem two

We send the dirty keys directly to the session restart, and the dirty keys are added to the session only between sessions. So the dirty keys are removed from the session only when the session is restarted.

soulomoon added type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc.. status: needs triage labels Apr 27, 2024

soulomoon changed the title ~~Flaky test track list~~ Flaky testsuit track list Apr 27, 2024

soulomoon added flaky test and removed status: needs triage labels Apr 27, 2024

soulomoon added CI Continuous integration component: cli About the pure command line interface of the hls executable type: enhancement New feature or request labels May 1, 2024
