Commit 1296828

Merge branch 'master' into nonidem
Conflicts:
    haskell/DEVLOG.md
    haskell/lvish/Control/LVish/Sched.hs
    haskell/lvish/Control/LVish/SchedQueue.hs
    haskell/lvish/lvish.cabal
2 parents 79eca81 + 9571a40 commit 1296828

17 files changed, +524 -331 lines changed

.jenkins_script.sh

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+#!/bin/bash
+
+set -e
+set -x
+
+# This is specific to our testing setup at IU:
+source $HOME/rn_jenkins_scripts/acquire_ghc.sh
+
+cd haskell/lvish
+cabal sandbox init
+cabal sandbox hc-pkg list
+
+if [ "$PROF" == "" ] || [ "$PROF" == "0" ]; then
+  CFG="--disable-library-profiling --disable-executable-profiling"
+else
+  CFG="--enable-library-profiling --enable-executable-profiling"
+fi
+# --reinstall --force-reinstalls
+
+cabal install $CFG $CABAL_FLAGS --only-dependencies --enable-tests
+cabal configure $CFG $CABAL_FLAGS --with-ghc=ghc-$JENKINS_GHC
+
+# Avoiding the atomic-primops related bug on linux / GHC 7.6:
+if [ `uname` == "Linux" ]; then
+  cabal install
+else
+  cabal test --show-details=always
+fi

haskell/DEVLOG.md

Lines changed: 71 additions & 0 deletions
@@ -364,6 +364,29 @@ v9f is interesting however... It uses no callbacks or handler pools.
 It simply writes N ivars and then reads them.
 
 
+[2013.12.16] {Working on fixing a scheduler bug}
+--------------------------------------------------------------------------------
+
+The bug is that the final global poll check on getLV doesn't obey the
+GET_ONCE stricture.
+
+This is in the context of the v9f test. Alas... the fix that I tried
+(simply doing the execFlag check), is leading to deadlock or
+blocked-indefinitely-on-mvar exceptions.
+
+It takes about NUMELEMS=3000 to trigger the deadlock in a
+reasonable number of iterations.
+
+Interestingly not MANY of these iterations are blocking at all. The
+writing thread is getting enough ahead that usually no gets block at
+all.
+
+Actually... now, with my "fix", I'm seeing duplication AND deadlock
+with GET_ONCE. Hmm.
+
+
+
+
 [2013.12.26] {Adding a schedule-control facility to the debug-logging one}
 
 The first draft is now running... but it's chewing up a bunch of user
@@ -605,3 +628,51 @@ are timeouts not working?
 
 Ah, ok, I can get failures on AddRemoveSetTests
 
+
+
+[2014.01.24] {More scheduler debugging}
+----------------------------------------
+
+Seeing failures right now on multiple threads on v9f1 v9f2 v9g and (i
+think) mc2.
+
+
+[2014.01.30] {Working on stressTest and the logging framework}
+--------------------------------------------------------------
+
+Right now I'm having trouble getting, e.g., AddRemoveSetTests' v3
+working. It deadlocks with the new WaitNum method.
+
+
+[2014.01.31] {Still some spurious duplication, issue #70}
+---------------------------------------------------------
+
+Here's an example:
+
+|2| wrkr0 waitRemovedSize: about to block.
+|2| wrkr0 PureSet.waitSize: about to (potentially) block:
+|7| wrkr0 [dbg-lvish] getLV: first readIORef , lv 19 on worker 0
+|7| wrkr0 [dbg-lvish] getLV (active): check globalThresh, lv 19 on worker 0
+|8| wrkr2 [dbg-lvish] putLV: initial lvar status read, lv 19 on worker 2
+|8| wrkr2 [dbg-lvish] putLV: setStatus,, lv 19 on worker 2
+|5| wrkr2 [dbg-lvish] putLV: about to mutate lvar, lv 19 on worker 2
+|8| wrkr2 [dbg-lvish] putLV: read final status before unsetting, lv 19 on worker 2
+|8| wrkr0 [dbg-lvish] getLV 20: blocking on LVar, registering listeners...
+|8| wrkr2 [dbg-lvish] putLV: UN-setStatus, lv 19 on worker 2
+|9| wrkr2 [dbg-lvish] putLV: calling each listener's onUpdate, lv 19 on worker 2
+|7| wrkr2 [dbg-lvish] getLV (active): callback: check thresh, lv 19 on worker 2
+|8| wrkr0 [dbg-lvish] getLV (active): second frozen check, lv 19 on worker 0
+|7| wrkr0 [dbg-lvish] getLV (active): second globalThresh check, lv 19 on worker 0
+|7| wrkr0 [dbg-lvish] getLV (active): second globalThresh tripped, remove tok, lv 19 on worker 0
+|8| wrkr2 [dbg-lvish] getLV 20 on worker 2: winner check? True
+|8| wrkr0 [dbg-lvish] getLV 20 on worker 0: winner check? True
+|7| Starting pushWork on worker 2
+|2| wrkr0 waitRemovedSize: unblocked, returning.
+|2| wrkr2 waitRemovedSize: unblocked, returning.
+|5| wrkr0 [dbg-lvish] freezeLV: atomic modify status to Freezing, lv 20 on worker 0
+|5| wrkr2 [dbg-lvish] freezeLV: atomic modify status to Freezing, lv 20 on worker 2
+|7| !cpu 1 woken up
+
+Rather than debug this, it may be better to test whether the non-idem branch does better.
+
+
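
In the trace above, both wrkr0 and wrkr2 report `winner check? True` for the same `getLV 20`, and the continuation then runs twice (both workers return from `waitRemovedSize` and both call `freezeLV`). The property a GET_ONCE-style check is meant to enforce is that exactly one of the racing paths sees `True`. The following is only a minimal sketch of that property using a plain `IORef Bool`; it is not the LVish scheduler's code, and `winnerCheck` and the `flag` layout are hypothetical names chosen for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM, when)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical once-only flag (not LVish's actual execFlag).
-- Atomically sets the flag and returns True only to the first caller.
winnerCheck :: IORef Bool -> IO Bool
winnerCheck flag = atomicModifyIORef' flag (\done -> (True, not done))

main :: IO ()
main = do
  flag  <- newIORef False
  wins  <- newIORef (0 :: Int)
  dones <- replicateM 8 newEmptyMVar
  -- Eight threads race to wake the same (imaginary) blocked continuation.
  forM_ dones $ \d -> forkIO $ do
    won <- winnerCheck flag
    when won $ atomicModifyIORef' wins (\n -> (n + 1, ()))
    putMVar d ()
  -- Wait for every racer to finish before reading the count.
  mapM_ takeMVar dones
  total <- readIORef wins
  print total  -- prints 1: exactly one racer won
```

Because the test and the set happen inside a single `atomicModifyIORef'`, two callers cannot both observe `False`. A duplicated `True`, as in the log, would suggest that the two paths consult different flags or that the read and the write are separate steps, though the trace alone does not say which.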
