@jpbrodrick89 (Contributor) commented Nov 14, 2025

Description of changes

Support union and optional types (e.g. float | str, str | None, Hobby | list[Hobby]):

Resolution rules:
1. If null is in the union, remove it and set is_optional=True
2. If a single type remains, return that with is_optional
3. If only int/float types remain, resolve to "number"
4. Otherwise resolve to "union" which is parsed by _autocast (tries to convert to int -> float -> string)
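
The four rules above can be sketched as a small standalone function. This is only an illustration, not the code in this PR: the name resolve_union is hypothetical, and it assumes the union members arrive as JSON-Schema-style type names ("null", "integer", "number", "string", ...).

```python
def resolve_union(member_types: list[str]) -> tuple[str, bool]:
    """Hypothetical helper: collapse union members to (ui_type, is_optional)."""
    # Rule 1: a "null" member makes the field optional and is then dropped.
    is_optional = "null" in member_types
    # Deduplicate the remaining members while preserving their order.
    remaining = list(dict.fromkeys(t for t in member_types if t != "null"))
    if not remaining:
        raise ValueError("union contains no non-null types")
    # Rule 2: a single remaining type is returned unchanged.
    if len(remaining) == 1:
        return remaining[0], is_optional
    # Rule 3: a mix of integer and float members collapses to "number".
    if set(remaining) <= {"integer", "number"}:
        return "number", is_optional
    # Rule 4: anything else is a generic union, left to _autocast at input time.
    return "union", is_optional
```

Under these assumptions, str | None resolves to ("string", True), while float | str resolves to ("union", False).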

Added checkboxes for optional arguments; if a checkbox is unselected, None is passed. Unfortunately, removing the input field conditionally is not possible in a Jinja template. Any reason why we're using Jinja templates and not pure Streamlit here?

Errors are now raised for unsupported types.

Testing done

⚠️ Made some changes to test_app that weren't directly related to this PR: I removed the output field dummy, which is not in the OutputSchema, and changed int.hobby.name to str.hobby.name. I have no idea how the tests passed previously...

Added test cases to mock schemas and goodbyeworld.
Added unit tests for parsing functions.
Checked that errors are raised for unsupported types.
Manual testing.

@dionhaefner (Contributor)

Looks really useful, thanks. @jacanchaplais are you taking this one?

@jacanchaplais (Collaborator)

Thanks both @dionhaefner and @jpbrodrick89. Yep, will review this today or tomorrow. :)

@jacanchaplais (Collaborator) left a comment

Hi @jpbrodrick89. Thanks so much for the PR, these features are obviously useful!

I like using a UI fallback of a textbox for union types. However, the key changes to what's rendered in the web UI are that a union type replaces the specialised input with a textbox, and an optional type adds a checkbox to the input container. Given that, I think we can probably avoid type checking the unions, tracking whether a union could be a number, etc.

Tesseract-Core's reliance on Pydantic gives us type checking for free, and we can also just use a few simple if statements on the strings obtained from the textbox inputs. Python strings have a tonne of useful methods for checking the type of what they contain. If you tab complete str.is you'll get:

isalnum, isalpha, isascii, isdecimal, isdigit, isidentifier, islower, isnumeric, isprintable, isspace, istitle, isupper
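
As a quick illustration of how a couple of these predicates behave on typical textbox input, here is a small sketch. The sample strings and the int_accepts helper are made up for the example; the point is that isdigit is not sign-aware and accepts some Unicode digits that int() rejects.

```python
samples = ["42", "-42", "3.14", "²", ""]

# str.isdigit is True only when every character is a digit (no sign, no point).
digit_flags = {s: s.isdigit() for s in samples}

# str.isdecimal is stricter: it excludes characters such as superscript digits.
decimal_flags = {s: s.isdecimal() for s in samples}


def int_accepts(text: str) -> bool:
    """Hypothetical helper: report whether int() can parse the text."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# What int() actually accepts, for comparison with the predicates above.
int_flags = {s: int_accepts(s) for s in samples}
```

Comparing the three dictionaries shows where a predicate and the conversion disagree: "-42" fails isdigit although int() parses it, and "²" passes isdigit although int() rejects it.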

To make sure we can manage code complexity, I'd be keen to implement this feature with a smaller diff, just focusing on detecting if a union type has turned up, and adding a new "type" expected by the Jinja template of "union". Then in the Jinja template, you'd just need to add one extra if statement to create a text_input field if it's a union type, and you could inject a function into the template source that looks something like this (more explicit compared with attempting to parse as JSON and catching exceptions):

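def _autocast(data: str) -> int | float | bool | str:
    # Digits only (e.g. "42"): treat as an integer.
    if data.isdigit():
        return int(data)
    # Digits with at most one decimal point (e.g. "3.14"): treat as a float.
    if data.replace(".", "", 1).isdigit():
        return float(data)
    # Case-insensitive boolean literals.
    if data.lower() == "false":
        return False
    if data.lower() == "true":
        return True
    # Anything else (including signed numbers such as "-1") stays a string.
    return data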
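
If signed numbers need to be recognised as well, one possible variant keeps the same explicit, exception-free style but swaps the str.isdigit checks for anchored regular expressions. This is a sketch only: autocast_signed and the two patterns are hypothetical names, and whitespace stripping is an added assumption about how textbox input should be treated.

```python
import re
from typing import Union

# ASCII-only patterns, so Unicode digits such as "²" are never passed to int().
_INT_RE = re.compile(r"[+-]?[0-9]+")
_FLOAT_RE = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)")


def autocast_signed(data: str) -> Union[int, float, bool, str]:
    """Hypothetical variant of _autocast that also accepts a leading sign."""
    text = data.strip()
    # Optional sign followed by digits only: integer.
    if _INT_RE.fullmatch(text):
        return int(text)
    # Optional sign with digits around exactly one decimal point: float.
    if _FLOAT_RE.fullmatch(text):
        return float(text)
    # Case-insensitive boolean literals.
    if text.lower() == "true":
        return True
    if text.lower() == "false":
        return False
    # Fall back to the original, unstripped string.
    return data
```

The trade-off is two small patterns to maintain in exchange for handling inputs like "-3" and "+2.5"; exponent notation such as "1e3" still falls through to a string here.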

As for the optional type, rather than adding more UI, I think it might be simpler to append "(optional)" to the title of the input field. Then if the user leaves the field blank, it'll be assumed that the value should be None.
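
A minimal sketch of that idea, independent of Streamlit or the Jinja template: the function names field_title and blank_to_none are hypothetical, and treating whitespace-only input as blank is an assumption.

```python
from typing import Optional


def field_title(name: str, is_optional: bool) -> str:
    """Hypothetical helper: label optional fields in the input title."""
    return f"{name} (optional)" if is_optional else name


def blank_to_none(raw: str, is_optional: bool) -> Optional[str]:
    """Hypothetical helper: map a blank optional field to None."""
    # Only optional fields may collapse to None; required ones keep their text.
    if is_optional and raw.strip() == "":
        return None
    return raw
```

A non-None result could then be handed to _autocast when the field is a union. One consequence of this design is that an optional string field can no longer distinguish an empty string from None.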

As for the collections of composite types, I think your JSON parsing idea is reasonable, but I'd rather make that its own PR. I documented it as a limitation of Tesseract Streamlit at the bottom of the README, and I'd like to think about how the UX will be impacted on its own terms.

I hope that makes sense! Thanks again for the contribution, I think if we can add this feature with a smaller footprint on the codebase, it'll add a lot of value and will open us up for some great extensibility moving forward! 💙
