Introduces XSD schema loading and validation wrappers #64
The API's level of abstraction is kept as low as possible without exposing the intermediate structures that must exist somewhere while validation is in progress. The main exposed structure is the Schema, which wraps xmlSchemaPtr. Some of the error handling can still be improved; those improvements will come as this module gets used in a downstream project.
@lweberk Thanks for a very notable upgrade! While I'm reading the code, do you feel up to quickly enumerating the changes as an addition to
Sure thing
Sorry for the delay, I'll take a detailed look over the weekend, and I assume I will merge. Travelling days at the moment.
No worries. Take your time.
Anything on this one?
So far my only reaction is that you may want to run a
src/error.rs (outdated)

    /// Wrapper around xmlErrorPtr
    #[derive(Debug)]
    pub struct XmlStructuredError(*mut bindings::_xmlError);
I am not familiar enough with the approach of "structured errors", which is my only hesitation in merging right away. Are these worth adopting for all errors produced by libxml2, or are they specific to the schema reporting? If the latter, they ought to be under libxml::schema::error, and if the former, I would need to think about which of the other error interfaces should be refactored to use these, and whether they're the right abstraction.
You're also using an FFI-like name with XmlStructuredError, which is something nicely compartmentalized in bindings.rs; all "higher" modules stay away from that naming convention. If these were to be the main error interface of the crate, simply naming the struct Error would be completely acceptable.
They represent a generalized way of reporting errors in libxml2. But to be frank, I myself find the whole error handling in libxml2 a little confusing and difficult to work with. I get the distinct feeling that it also grew organically out of the necessities of each module being implemented along the way.
Perhaps we should dig deeper and come up with the cleanest unified approach possible.
As for changing the name, sure thing. No problem. On the other hand, I'd prefer not to go for "Error" because it is too generic. Would "StructuredError" be ok? Error is more reserved for error handling the Rust way, which is not the case here unless we make it so.
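To make "error handling the Rust way" concrete, here is a minimal sketch of what opting in could look like: an owned error type that implements `std::fmt::Display` and `std::error::Error`. The field names and the `validate` function are hypothetical and exist only for illustration; they are not the crate's API.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical owned snapshot of a libxml2 structured error.
/// The fields are illustrative assumptions, not the crate's real layout.
#[derive(Debug, Clone, PartialEq)]
pub struct StructuredError {
    pub message: String,
    pub line: Option<u32>,
}

impl fmt::Display for StructuredError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.line {
            Some(line) => write!(f, "line {}: {}", line, self.message),
            None => write!(f, "{}", self.message),
        }
    }
}

// Implementing the standard trait lets the type flow through `Result`,
// the `?` operator, and `Box<dyn Error>`.
impl Error for StructuredError {}

/// Hypothetical entry point that reports failure as a Rust error value.
pub fn validate(input: &str) -> Result<(), Box<dyn Error>> {
    if input.is_empty() {
        return Err(Box::new(StructuredError {
            message: "empty document".to_string(),
            line: Some(1),
        }));
    }
    Ok(())
}
```

With something like this in place, the naming question becomes less pressing, since callers mostly interact with the type through `Result` and the `Error` trait rather than by name.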
One more thing: would you be opposed to me opening an issue to discuss the level of abstraction this library wrapper is aiming at? I'm currently having trouble getting a clear distinction between the high and low levels and what they are. It feels a little awkward. Sometimes the wrapping is high level, sometimes low. Perhaps it would be sensible to split the project into two layers?
Indeed, I was just discussing the low vs high levels in #60, and we may indeed benefit from a discussion and a convention. This wrapper is also growing organically, and is starting to try and service several competing use patterns, so it may be a good time to take a step back and discuss.
Also, StructuredError is a less generic name than Error, but it is also a somewhat poor name from the libxml2 folks, more confusing than informative.
It is also available in bindings.rs even in master, so I am unsure what benefit there is to draw from recreating it as part of the wrapper interfaces - might as well unify the error-handling for the whole crate before we move forward here.
> Indeed, I was just discussing the low vs high levels in #60, and we may indeed benefit from a discussion and a convention. This wrapper is also growing organically, and is starting to try and service several competing use patterns, so it may be a good time to take a step back and discuss.
Ok, I'll draft a proposal for discussion in a new issue. Will take some time, since I do not want to jump into discussion without having a clear picture on the topic myself.
> Also, StructuredError is a less generic name than Error, but it is also a somewhat poor name from the libxml2 folks, more confusing than informative.
Agreed.
> It is also available in bindings.rs even in master, so I am unsure what benefit there is to draw from recreating it as part of the wrapper interfaces - might as well unify the error-handling for the whole crate before we move forward here.
You are right. Clarity in the way forward might help to answer the question whether it is a useful approach or not.
Sorry, did not see the rustfmt file. Sure thing. Will do.
to stay away from naming schema used by the bindings::* as per project policy.
Least pleasant feedback last - it also looks like the memory model has to be carefully analyzed and valgrind be made happy. I'll check if I can quickly spot the cause. Here are the couple of minor leaks revealed by running the test:
Oha, looks like pointers are being leaked. Let me have a look.
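The thread does not say which pointers leak, so the following is only a general sketch of the usual guard against this class of bug: tie the C-side free call to `Drop` on the wrapper. `RawSchema`, `raw_schema_new`, and `raw_schema_free` are hypothetical stand-ins for a C allocate/free pair, not libxml2 functions, and the live-allocation counter exists only to make a leak observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a C-side object; not the real libxml2 type.
pub struct RawSchema {
    _id: u32,
}

/// Counts live raw allocations so a leak would be visible.
pub static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a C constructor returning an owned pointer.
pub fn raw_schema_new() -> *mut RawSchema {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawSchema { _id: 1 }))
}

/// Hypothetical stand-in for the matching C free function.
///
/// # Safety
/// `ptr` must come from `raw_schema_new` and must not be freed twice.
pub unsafe fn raw_schema_free(ptr: *mut RawSchema) {
    if !ptr.is_null() {
        unsafe { drop(Box::from_raw(ptr)) };
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Owning wrapper: whoever holds a `Schema` is responsible for the pointer.
pub struct Schema(*mut RawSchema);

impl Schema {
    pub fn new() -> Schema {
        Schema(raw_schema_new())
    }
}

impl Drop for Schema {
    fn drop(&mut self) {
        // Runs exactly once when the wrapper goes out of scope.
        unsafe { raw_schema_free(self.0) }
    }
}

/// Creates and drops a wrapper, then reports how many raw objects remain.
pub fn live_after_scope() -> usize {
    {
        let _schema = Schema::new();
    }
    LIVE.load(Ordering::SeqCst)
}
```

A wrapper that is cloned or converted back into a raw pointer would need extra care; this sketch covers only the single-owner case.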
Hey @lweberk just found a great simplification that also leaks nothing under valgrind. You can actually frame the errors without any of the

    let mut errors: Vec<StructuredError> = Vec::new();
    unsafe {
        bindings::xmlSchemaSetParserStructuredErrors(
            parser,
            Some(common::structured_error_handler),
            errors.as_mut_ptr() as *mut c_void,
        );
    }
    //... and the errlog field is just the minimal:

This reframing keeps things simple in C land - and I don't think you ever need to clone errlog anywhere really. You could also rewrite the handler as:

    pub fn structured_error_handler(ctx: *mut c_void, error: bindings::xmlErrorPtr) {
        let vec_mut = ctx as *mut Vec<StructuredError>;
        if let Some(vec_ptr) = unsafe { vec_mut.as_mut() } {
            let error = StructuredError::from_raw(error);
            vec_ptr.push(error);
        } else {
            panic!("Underlying error log should not have outlived callback registration");
        }
    }

Which avoids any ownership questions, as you only work on the pointer.
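For readers following along, here is a self-contained sketch of the context-pointer pattern the handler relies on, with libxml2 replaced by a stand-in. One detail matters: `Vec::as_mut_ptr()` returns a pointer to the element buffer, while the handler casts the context to `*mut Vec<StructuredError>`. The sketch therefore passes a pointer to the `Vec` itself, which is an assumption about the intent. `RawError`, `fake_validate`, and `collect_error` are hypothetical names.

```rust
use std::os::raw::c_void;

/// Stand-in for the C error struct; not libxml2's real layout.
#[repr(C)]
pub struct RawError {
    pub code: i32,
}

/// Owned Rust-side copy of an error.
#[derive(Debug, PartialEq)]
pub struct StructuredError {
    pub code: i32,
}

/// Shape of a C-style callback taking an opaque context pointer.
pub type Handler = unsafe extern "C" fn(ctx: *mut c_void, error: *const RawError);

/// Hypothetical stand-in for a C routine that reports each error via callback.
pub fn fake_validate(handler: Handler, ctx: *mut c_void, codes: &[i32]) {
    for &code in codes {
        let raw = RawError { code };
        unsafe { handler(ctx, &raw) };
    }
}

/// Callback: reinterprets the context as the Vec it was registered with.
pub unsafe extern "C" fn collect_error(ctx: *mut c_void, error: *const RawError) {
    let log = ctx as *mut Vec<StructuredError>;
    let pair = unsafe { (log.as_mut(), error.as_ref()) };
    if let (Some(log), Some(error)) = pair {
        // Pushing may reallocate the buffer, but the Vec's own address is unchanged.
        log.push(StructuredError { code: error.code });
    }
}

/// Registers the log, runs the fake validation, and returns collected errors.
pub fn collect(codes: &[i32]) -> Vec<StructuredError> {
    let mut errors: Vec<StructuredError> = Vec::new();
    // Pointer to the Vec itself, not `errors.as_mut_ptr()` (the element buffer).
    let ctx = &mut errors as *mut Vec<StructuredError> as *mut c_void;
    fake_validate(collect_error, ctx, codes);
    errors
}
```

This only holds while `errors` stays alive and is not moved for the whole time the callback is registered; in the sketch, registration and use happen inside one function call.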
There is something that worries me about that approach;
Since
Will try to elaborate on your proposal; perhaps there is a close enough way that will allow us to remove all that RefCell machinery.
Could you actually devise a test for this reallocation scenario, so that we have a clear target of the behavior we're guarding against? It sounds a bit like a vague threat at the moment ... And I would always be tempted to patch the memory leak I see against potential undefined behavior I can't emulate. Thanks for looking into the details again!
Unfortunately it's not a potential memory leak, but a segfault due to access after free. Sorry, I have not been able to get to it yet.
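The exact failing scenario is not spelled out in the thread. One common way a registered context pointer ends up dangling is that the struct owning the log is moved after registration. A minimal sketch of one guard, assuming a hypothetical `ErrorLog` owner type, is to keep the `Vec` behind a `Box` so its heap address survives moves of the owner:

```rust
/// Hypothetical owner of the error log. Boxing the Vec gives it a heap
/// address that stays the same when `ErrorLog` itself is moved.
pub struct ErrorLog {
    entries: Box<Vec<String>>,
}

impl ErrorLog {
    pub fn new() -> Self {
        ErrorLog {
            entries: Box::new(Vec::new()),
        }
    }

    /// Pointer suitable for handing to a C callback as its context.
    pub fn ctx_ptr(&mut self) -> *mut Vec<String> {
        &mut *self.entries as *mut Vec<String>
    }

    pub fn push(&mut self, message: &str) {
        self.entries.push(message.to_string());
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// Compares the context address before and after moving the owner.
/// Only addresses are compared; the old pointer is never dereferenced.
pub fn address_survives_move() -> bool {
    let mut log = ErrorLog::new();
    let before = log.ctx_ptr() as usize;
    let mut moved = log; // the owner moves; the boxed Vec does not
    moved.push("after move");
    let after = moved.ctx_ptr() as usize;
    before == after
}
```

This addresses moves only. The pointer still dangles once the `ErrorLog` is dropped, so the callback has to be unregistered before that happens.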
@cbarber Thanks a lot. I learned a lot from your changes. I'm still relatively new to the dark arts of FFI in Rust :D.
Lovely, moving to a review in #67 then, thanks again to everyone for the contributions!