Cancellation is not working properly and integration test is wrong #240

@kpiekara

I believe this is the same issue as #206, which was closed on the grounds that the "integration unittest is passing".

This unit test passes:

[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();
    var timer = new System.Timers.Timer(800);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}

But if we increase the timer interval from 800 ms to 3 s, so that the crawler has actually started working by the time the token is cancelled:

[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();
    var timer = new System.Timers.Timer(3000);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}

The run fails with an unhandled exception, which crashes the application:

Exit code is -532462766 (Output is too long. Showing the last 100 lines:

   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at Abot2.Crawler.WebCrawler.ThrowIfCancellationRequested()
   at Abot2.Crawler.WebCrawler.ProcessPage(PageToCrawl pageToCrawl)
   at Abot2.Crawler.WebCrawler.<CrawlSite>b__64_0()
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

Issue: there is no way to cancel the crawler.
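
The stack trace above passes through `Task.ThrowAsync` and a thread-pool work item, which is the path the runtime takes when an exception escapes an `async void` method; that would explain why the `OperationCanceledException` never reaches `result.ErrorException`. This is an inference from the trace, not something confirmed against the Abot source. As a minimal sketch of the alternative, assuming the per-page work can be expressed as a `Task`-returning method (`ProcessPageAsync` and `CancellationDemo` below are hypothetical stand-ins, not Abot APIs), the exception is stored in the task and can be observed by the caller:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Hypothetical stand-in for the crawler's per-page work (not an Abot method).
    private static async Task ProcessPageAsync(CancellationToken token)
    {
        await Task.Delay(50);                 // simulate some work
        token.ThrowIfCancellationRequested(); // throws once the token is cancelled
    }

    // Returns the exception observed by the caller, or null if none was thrown.
    public static async Task<Exception> RunAsync()
    {
        using var cts = new CancellationTokenSource();
        cts.Cancel();

        try
        {
            // The lambda returns a Task, so Task.Run uses its Func<Task> overload
            // and the exception surfaces at this await. Passing an async lambda
            // where an Action is expected would make it async void instead, and
            // the exception would be rethrown on a thread-pool thread.
            await Task.Run(() => ProcessPageAsync(cts.Token));
            return null;
        }
        catch (OperationCanceledException ex)
        {
            return ex;
        }
    }

    public static async Task Main()
    {
        Exception error = await RunAsync();
        Console.WriteLine(error is OperationCanceledException);
    }
}
```

With this shape, the caller can record the exception (for example in a result object) rather than letting it terminate the process.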
