Unsure on how to test scrapper #3

@cnshing

The scrapper component uses PRAW to fetch the corresponding data for each comment, parent, and submission.
Several issues make testing the scrapper component difficult.

  1. The component is a very thin wrapper around PRAW.
    Because the component itself isn't very complex and PRAW's underlying implementation is already well tested,
    unit testing the scrapper is largely redundant. At best, we can check for expected behavior across
    a variety of input scenarios, which doesn't cover much since most of the fetching is done by PRAW
    anyway.

  2. Test cases are not isolated and depend on factors outside our control.
    Our component retrieves data based on a list of ids or authors. The problem is that when the underlying data changes, the expected behavior can change with it. For example, if a test case assumes a valid list of submission ids and those submissions are later deleted for violating Reddit's post guidelines, the test case is no longer valid.

  3. PRAW's testing integration with "cassettes" - I have no idea what to do with them.
    I assume that PRAW gets around these limitations in its own test suite, but I currently don't know how it manages to do so.
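As far as I understand it, a "cassette" is a recording of request/response pairs that gets replayed in later test runs, which is the idea behind libraries such as Betamax and VCR.py. The sketch below only illustrates that record/replay idea using the standard library. It is not PRAW's actual test harness, and `Cassette` and `fake_live_fetch` are hypothetical names made up for illustration.

```python
import json
import tempfile
from pathlib import Path


class Cassette:
    """Record results of a fetch function on first use, replay them afterwards."""

    def __init__(self, path, live_fetch):
        self.path = Path(path)
        self.live_fetch = live_fetch  # only called when no recording exists
        # Load an existing recording if there is one, otherwise start empty.
        self.tape = json.loads(self.path.read_text()) if self.path.exists() else {}

    def fetch(self, key):
        if key not in self.tape:
            # First run: hit the "network" and save the response to disk.
            self.tape[key] = self.live_fetch(key)
            self.path.write_text(json.dumps(self.tape))
        return self.tape[key]


calls = []


def fake_live_fetch(submission_id):
    # Hypothetical stand-in for a real network call made through PRAW.
    calls.append(submission_id)
    return {"id": submission_id, "title": f"title of {submission_id}"}


def fail_if_called(submission_id):
    raise AssertionError("network should not be hit on replay")


cassette_path = Path(tempfile.mkdtemp()) / "submissions.json"

# First run records the response; second run replays it from disk.
first = Cassette(cassette_path, fake_live_fetch).fetch("abc123")
second = Cassette(cassette_path, fail_if_called).fetch("abc123")
```

Because the second run never touches the live fetch, the test keeps passing even if the underlying submission is later deleted, which would address issue 2 as well.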

The best idea I can currently think of is to manually create a subreddit and populate it with our own data for the scrapper to work on. With this method, our data should be static and therefore more manageable.
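Another way to get static data, without needing a real subreddit, might be to stub out the PRAW client in the tests. The following is a minimal sketch under assumptions: `fetch_submission_titles` is a hypothetical stand-in for the scrapper (the real component's interface may differ), and the fake only mimics `reddit.submission(id=...)` and the `title` attribute.

```python
from unittest.mock import MagicMock


def fetch_submission_titles(reddit, submission_ids):
    """Hypothetical stand-in for the scrapper: map each id to its title."""
    return {sid: reddit.submission(id=sid).title for sid in submission_ids}


def make_fake_reddit(titles_by_id):
    """Build a stub that mimics only reddit.submission(id=...).title."""
    fake = MagicMock()
    # Each call returns a fresh stub submission with fixed, known data.
    fake.submission.side_effect = lambda id: MagicMock(id=id, title=titles_by_id[id])
    return fake


fake_reddit = make_fake_reddit({"abc123": "First post", "def456": "Second post"})
result = fetch_submission_titles(fake_reddit, ["abc123", "def456"])
```

This only verifies our own wrapper logic and says nothing about whether PRAW or Reddit behave as assumed, so it would complement a test subreddit rather than replace it.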
