@@ -364,6 +364,29 @@ v9f is interesting however... It uses no callbacks or handler pools.
364 364 It simply writes N ivars and then reads them.
365 365
366 366
367+ [ 2013.12.16] {Working on fixing a scheduler bug}
368+ --------------------------------------------------------------------------------
369+
370+ The bug is that the final global poll check on getLV doesn't obey the
371+ GET_ONCE stricture.
372+
373+ This is in the context of the v9f test. Alas... the fix that I tried
374+ (simply doing the execFlag check) is leading to deadlock or
375+ blocked-indefinitely-on-mvar exceptions.
376+
377+ It takes about NUMELEMS=3000 to trigger the deadlock in a
378+ reasonable number of iterations.
379+
380+ Interestingly, not MANY of these iterations are blocking at all. The
381+ writing thread is getting far enough ahead that usually no gets block at
382+ all.
383+
384+ Actually... now, with my "fix", I'm seeing duplication AND deadlock
385+ with GET_ONCE. Hmm.
386+
387+
388+
389+
367 390 [ 2013.12.26] {Adding a schedule-control facility to the debug-logging one}
368 391
369 392 The first draft is now running... but it's chewing up a bunch of user
@@ -605,3 +628,51 @@ are timeouts not working?
605 628
606 629 Ah, ok, I can get failures on AddRemoveSetTests
607 630
631+
632+
633+ [ 2014.01.24] {More scheduler debugging}
634+ ----------------------------------------
635+
636+ Seeing failures right now on multiple threads on v9f1, v9f2, v9g, and (I
637+ think) mc2.
638+
639+
640+ [ 2014.01.30] {Working on stressTest and the logging framework}
641+ --------------------------------------------------------------
642+
643+ Right now I'm having trouble getting, e.g., AddRemoveSetTests' v3
644+ working. It deadlocks with the new WaitNum method.
645+
646+
647+ [ 2014.01.31] {Still some spurious duplication, issue #70 }
648+ ---------------------------------------------------------
649+
650+ Here's an example:
651+
652+ |2| wrkr0 waitRemovedSize: about to block.
653+ |2| wrkr0 PureSet.waitSize: about to (potentially) block:
654+ |7| wrkr0 [dbg-lvish] getLV: first readIORef , lv 19 on worker 0
655+ |7| wrkr0 [dbg-lvish] getLV (active): check globalThresh, lv 19 on worker 0
656+ |8| wrkr2 [dbg-lvish] putLV: initial lvar status read, lv 19 on worker 2
657+ |8| wrkr2 [dbg-lvish] putLV: setStatus,, lv 19 on worker 2
658+ |5| wrkr2 [dbg-lvish] putLV: about to mutate lvar, lv 19 on worker 2
659+ |8| wrkr2 [dbg-lvish] putLV: read final status before unsetting, lv 19 on worker 2
660+ |8| wrkr0 [dbg-lvish] getLV 20: blocking on LVar, registering listeners...
661+ |8| wrkr2 [dbg-lvish] putLV: UN-setStatus, lv 19 on worker 2
662+ |9| wrkr2 [dbg-lvish] putLV: calling each listener's onUpdate, lv 19 on worker 2
663+ |7| wrkr2 [dbg-lvish] getLV (active): callback: check thresh, lv 19 on worker 2
664+ |8| wrkr0 [dbg-lvish] getLV (active): second frozen check, lv 19 on worker 0
665+ |7| wrkr0 [dbg-lvish] getLV (active): second globalThresh check, lv 19 on worker 0
666+ |7| wrkr0 [dbg-lvish] getLV (active): second globalThresh tripped, remove tok, lv 19 on worker 0
667+ |8| wrkr2 [dbg-lvish] getLV 20 on worker 2: winner check? True
668+ |8| wrkr0 [dbg-lvish] getLV 20 on worker 0: winner check? True
669+ |7| Starting pushWork on worker 2
670+ |2| wrkr0 waitRemovedSize: unblocked, returning.
671+ |2| wrkr2 waitRemovedSize: unblocked, returning.
672+ |5| wrkr0 [dbg-lvish] freezeLV: atomic modify status to Freezing, lv 20 on worker 0
673+ |5| wrkr2 [dbg-lvish] freezeLV: atomic modify status to Freezing, lv 20 on worker 2
674+ |7| !cpu 1 woken up
675+
676+ Rather than debug this, it may be better to test whether the non-idem branch does better.
677+
678+