-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support functools.partial
handlers
#6
base: master
Are you sure you want to change the base?
Conversation
Handlers can now be wrapped by `functools.partial`, and take arbitrary arguments.
4becf5a
to
74581ba
Compare
Thank you for the PR! I understand that the usecase is to be able to pass some context from the caller of At a first glance, I feel like using I took the time to think about this (and have improved a few things like typing on a branch of mine). My main concern is that this would be cumbersome for nested handlers. For example, it would mean doing something like this: # before
class Handler:
sub_handler: SubHandler
# after
class Handler:
def __init__(self, context):
self.sub_handler = partial(SubHandler, context) … and if you have a chain of handlers you would have to basically pass that context from one to the other using Now, taking a step back, the usecase of passing some data down the call chain is quite frequent, but the usual solutions may not be the best here. Below is what comes to my mind. 1. Use a global variableOf course this works, but creates concurrency issues if you parse several XML documents in parallel. 2. Use a contextvarJust like 1, but creates concurrency issues if you parse several XML documents in parallel in the same thread. 3. Pass variables as parameters from caller to calleeIn regular code, this is maybe the most obvious choice. def f(*args):
...
g(*args)
... But it is not possible because bigxml is calling the callee itself, not the user. However, bigxml could detect a special parameter in the handler signature by its name (say This could look like this: @xml_handle_element("vehicles", "vehicle")
def handler(node, bigxml_context):
if int(node.attributes["speed"]) > bigxml_context:
yield node.text
with open("vehicles.xml", "rb") as stream:
for vehicule in Parser(stream).iter_from(Handler, bigxml_context=150):
print(vehicule) 4. Have caller and callee been methods of the same class instance (and store the data as an attribute on the instance)This works if your handler is a function (by transforming it to a method), but not if it's a class. class Handler:
def __init__(self, speed_threshold):
self.speed_threshold = speed_threshold
@xml_handle_element("vehicles", "vehicle")
def handler(self, node):
if int(node.attributes["speed"]) > self.speed_threshold:
yield node.text
with open("vehicles.xml", "rb") as stream:
for vehicule in Parser(stream).iter_from(Handler(150)):
print(vehicule) But not being able to use class handlers is quite limiting I think. 5. Define the callee inside the caller so that it can access its scope (and use
|
Another possible answer is to implement some separation of concerns. Instead of trying to do everything at the same time, let's the handler only extract the relevant pieces of data, and have a post-processing step afterwards deal with what is needed. This could look like this: @xml_handle_element("vehicles", "vehicle")
def handler(node):
yield (node.text, int(node.attributes["speed"]))
def postprocess(items, speed_threshold):
for vehicule, speed in items:
if speed > speed_threshold:
yield vehicule
with open("vehicles.xml", "rb") as stream:
for vehicule in postprocess(Parser(stream).iter_from(handler), 150):
print(vehicule) I feel like this separation of concerns design makes the code easier to understand, but there may be a performance cost. |
This PR introduces
functools.partial
usage in order to pass arbitrary arguments to handlers.Every handler type can easily be wrapped.
The usage can be tested as described in the docs