Button elements are ignored #25

@tom-macneil

While trying autologin against some of the sites in the training data, I found that some sites have changed since the data was collected and no longer work.
Formasaurus ignores 'button' elements. On these sites a button, rather than an input element, is used to submit the form, so it is required to log in.

Examples:

The main problem seems to be that buttons are parsed as Element or HtmlElement objects. Unlike InputElement, these have no .name or .type attributes, so they are filtered out by if getattr(f, 'name', None). If I modify the code so that this check no longer filters them, it blows up later on, where it assumes every field has .name and .type attributes.
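To illustrate the difference described above, here is a minimal sketch using lxml.html directly. The form markup and the helpers field_name and field_type are hypothetical and are not part of Formasaurus; they only show one way to read the name and type from the raw attributes so that buttons and inputs can be handled alike.

```python
import lxml.html

# Hypothetical login form: the submit control is a <button>, not an <input>.
FORM = (
    '<form action="/login" method="post">'
    '<input type="text" name="user">'
    '<input type="password" name="pw">'
    '<button type="submit" name="go">Log in</button>'
    '</form>'
)

def field_name(el):
    """Read the name from the raw attribute; works for any element."""
    return el.get("name")

def field_type(el):
    """Read the type from the raw attribute.

    Assumed defaults: "submit" for <button>, "text" for everything else.
    """
    default = "submit" if el.tag == "button" else "text"
    return el.get("type", default).lower()

form = lxml.html.fromstring(FORM)
fields = form.xpath("//input | //button")
button = fields[-1]

# <input> is an InputElement with .name and .type; <button> is a plain HtmlElement.
is_input = [isinstance(el, lxml.html.InputElement) for el in fields]

# The getattr-based check drops the button because it has no .name attribute.
kept_by_getattr = [el.tag for el in fields if getattr(el, "name", None)]

# Reading the raw attributes keeps it.
kept_by_attrib = [(el.tag, field_name(el), field_type(el)) for el in fields if field_name(el)]
```

With a check like this, the later code would also need to use the same helpers instead of .name and .type.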

As a hacky workaround/proof I modified html.load_html to convert all button elements to input elements:

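    # Inside html.load_html; needs `from lxml import etree` at module level.
    parsed = lxml.html.fromstring(html, base_url=base_url, parser=parser)
    for node in parsed.xpath('//button'):
        # Swap each <button> for an <input> carrying the same attributes.
        new_node = etree.Element("input")
        for key, value in node.items():
            new_node.set(key, value)
        node.getparent().replace(node, new_node)
    return parsed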

With this change, autologin worked on the above sites.
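For anyone who wants to reproduce this outside of load_html, the workaround can be sketched as a standalone function. This is an assumption-laden variant rather than the exact code I ran: the name buttons_to_inputs is hypothetical, it uses lxml.html.Element instead of etree.Element, and it additionally keeps the text that follows the button (its tail). The button's own label text is still dropped.

```python
import lxml.html

def buttons_to_inputs(root):
    """Replace each <button> under root with an <input> that has the same attributes.

    Hypothetical helper for illustration; not part of Formasaurus.
    """
    # Collect first, because the tree is modified while looping.
    for node in list(root.iter("button")):
        parent = node.getparent()
        if parent is None:
            # root itself is a <button>; there is nothing to replace it in.
            continue
        new_node = lxml.html.Element("input")
        for key, value in node.items():
            new_node.set(key, value)
        # Keep any text that follows the button in the document.
        new_node.tail = node.tail
        parent.replace(node, new_node)
    return root

# Hypothetical form for demonstration.
doc = lxml.html.fromstring(
    '<form action="/login">'
    '<input name="user">'
    '<button type="submit" name="go">Log in</button> trailing'
    '</form>'
)
buttons_to_inputs(doc)
inputs = doc.xpath("//input")
```

After the call, the former button shows up as an input element with .name and .type, so the existing getattr(f, 'name', None) check no longer filters it out.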

Thanks,

Tom
