Introduces XSD schema loading and validation wrappers #64

Closed
wants to merge 4 commits into from
3 changes: 3 additions & 0 deletions .gitignore
@@ -11,6 +11,9 @@ Cargo.lock
# Vim swap files
*.swp

# VSCode project folder
.vscode/

# Generated by Cargo
/target/

3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,9 @@ Thanks to @jangernert for the upgrades to `Document` serialization!
### Added
* `Document::to_string_with_options` allowing to customize document serialization
* `Document::SaveOptions` containing the currently supported serialization options, as provided internally by libxml
* `Schema` holding and managing the `xmlSchemaPtr` created by `SchemaParserContext` while parsing
* `SchemaParserContext` holding the source of an XSD and parsing it into a `Schema`, while gathering (and, on failure, returning) the errors that the XSD parser raises across the FFI to libxml
* `SchemaValidationContext` holding the `Schema` resulting from a `SchemaParserContext` parse and offering validation methods for a `Document`, a `Node`, or a file path to XML, while gathering (and, on failure, returning) the validation errors that the XML validator raises across the FFI to libxml

### Changed
* the `Document::to_string()` serialization method is now implemented through `fmt::Display` and no longer takes an optional boolean flag. The default behavior is now unformatted serialization - previously `to_string(false)`, while `to_string(true)` can be realized via
34 changes: 34 additions & 0 deletions examples/schema_example.rs
@@ -0,0 +1,34 @@
//!
//! Example Usage of XSD Schema Validation
//!
use libxml::schemas::SchemaParserContext;
use libxml::schemas::SchemaValidationContext;

use libxml::parser::Parser;

fn main() {
  let xml = Parser::default()
    .parse_file("tests/resources/schema.xml")
    .expect("Expected to be able to parse XML Document from file");

  let mut xsdparser = SchemaParserContext::from_file("tests/resources/schema.xsd");
  let xsd = SchemaValidationContext::from_parser(&mut xsdparser);

  if let Err(errors) = xsd {
    for err in &errors {
      println!("{}", err.message());
    }

    panic!("Failed to parse schema");
  }

  let mut xsd = xsd.unwrap();

  if let Err(errors) = xsd.validate_document(&xml) {
    for err in &errors {
      println!("{}", err.message());
    }

    panic!("Invalid XML according to XSD schema");
  }
}
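
The example handles both failure cases the same way: print every message, then panic. A minimal sketch of folding that into one helper follows. The name `summarize` and its closure parameter are hypothetical and not part of this PR; the helper is generic because `StructuredError` only exposes `message()` in this diff.

```rust
/// Collapse a `Result<T, Vec<E>>` into a `Result<T, String>` whose error is a
/// single multi-line report. `message` extracts the text of one error.
/// (Hypothetical helper, not part of the crate.)
pub fn summarize<T, E, F>(step: &str, result: Result<T, Vec<E>>, message: F) -> Result<T, String>
where
  F: Fn(&E) -> &str,
{
  result.map_err(|errors| {
    let mut report = format!("{} failed with {} error(s)", step, errors.len());
    for err in &errors {
      report.push_str("\n  ");
      // Trim trailing whitespace so messages that end in a newline do not
      // leave blank lines in the report.
      report.push_str(message(err).trim_end());
    }
    report
  })
}
```

With the types from this PR, the call would presumably look like `summarize("schema parse", xsd, |e| e.message())`, which lets `main` return a `Result` instead of panicking.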
35 changes: 35 additions & 0 deletions src/error.rs
@@ -0,0 +1,35 @@
//!
//! Wrapper for xmlError
//!
use super::bindings;

use std::ffi::CStr;

/// Wrapper around xmlErrorPtr
#[derive(Debug)]
pub struct StructuredError(*mut bindings::_xmlError);

impl StructuredError {
  /// Wrap around and own a raw libxml2 error structure
  pub fn from_raw(error: *mut bindings::_xmlError) -> Self {
    Self(error)
  }

  /// Human-readable informative error message
  pub fn message(&self) -> &str {
    let msg = unsafe { CStr::from_ptr((*self.0).message) };

    msg.to_str().unwrap()
  }

  /// Return a raw pointer to the underlying xmlError structure
  pub fn as_ptr(&self) -> *const bindings::_xmlError {
    self.0 // we lose the *mut since we own it
  }
}

impl Drop for StructuredError {
  fn drop(&mut self) {
    unsafe { bindings::xmlResetError(self.0) }
  }
}
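
`message()` above unwraps the UTF-8 conversion and dereferences the `message` field without a null check. A minimal sketch of a non-panicking conversion follows, using only the standard library. The function name `message_from_ptr` is hypothetical, and whether libxml2 can actually hand back a null or non-UTF-8 message here is an assumption this sketch guards against rather than a confirmed behavior.

```rust
use std::borrow::Cow;
use std::ffi::CStr;
use std::os::raw::c_char;

/// Convert a C string pointer into text without panicking.
///
/// Returns `None` for a null pointer. Invalid UTF-8 sequences are replaced
/// with U+FFFD instead of causing a panic.
///
/// # Safety
/// `ptr` must be null or point to a NUL-terminated string that stays alive
/// and unmodified for the lifetime `'a`.
pub unsafe fn message_from_ptr<'a>(ptr: *const c_char) -> Option<Cow<'a, str>> {
  if ptr.is_null() {
    return None;
  }
  // SAFETY: the caller guarantees a valid NUL-terminated string.
  let c_str: &'a CStr = unsafe { CStr::from_ptr(ptr) };
  Some(c_str.to_string_lossy())
}
```

Returning `Cow<str>` keeps the common valid-UTF-8 case allocation-free, at the cost of a slightly different return type than the `&str` used in this PR.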
8 changes: 8 additions & 0 deletions src/lib.rs
@@ -12,10 +12,18 @@ mod c_helpers;

/// XML and HTML parsing
pub mod parser;

/// Manipulations on the DOM representation
pub mod tree;

/// XML Global Error Structures and Handling
pub mod error;

/// `XPath` module for global lookup in the DOM
pub mod xpath;

/// Schema Validation
pub mod schemas;

/// Read-only parallel primitives
pub mod readonly;
26 changes: 26 additions & 0 deletions src/schemas/common.rs
@@ -0,0 +1,26 @@
//!
//! Common Utilities
//!
use crate::bindings;

use crate::error::StructuredError;

use std::cell::RefCell;
use std::ffi::c_void;
use std::rc::Weak;

/// Provides a callback to the C side of things to accumulate xmlErrors to be
/// handled back on the Rust side.
pub fn structured_error_handler(ctx: *mut c_void, error: bindings::xmlErrorPtr) {
  let errlog = unsafe { Box::from_raw(ctx as *mut Weak<RefCell<Vec<StructuredError>>>) };

  let error = StructuredError::from_raw(error);

  if let Some(errors) = errlog.upgrade() {
    errors.borrow_mut().push(error);
  } else {
    panic!("Underlying error log should not have outlived callback registration");
  }

  Box::leak(errlog);
}
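
The handler above rebuilds a `Box` from the context pointer and then leaks it again so the allocation survives the call. A minimal sketch of an alternative follows: borrow the `Weak` through the raw pointer, and keep the pointer in the owning struct so it can be freed on drop. As far as this diff shows, the boxed `Weak` created at registration is never reclaimed, so the sketch also covers that. All names here (`record_error`, `Registration`, `ErrorLog`) are hypothetical, and plain `String` entries stand in for `StructuredError`.

```rust
use std::cell::RefCell;
use std::ffi::c_void;
use std::rc::{Rc, Weak};

type ErrorLog = RefCell<Vec<String>>;

/// Callback with the shape a C library would invoke: opaque context plus payload.
/// The context is only borrowed, so nothing is freed when the callback returns.
pub extern "C" fn record_error(ctx: *mut c_void, code: i32) {
  if ctx.is_null() {
    return;
  }
  // SAFETY: `ctx` was produced by `Registration::new` and is still alive.
  let weak = unsafe { &*(ctx as *const Weak<ErrorLog>) };
  // If the log is already gone, drop the error silently rather than
  // panicking inside a callback that C code invokes.
  if let Some(log) = weak.upgrade() {
    log.borrow_mut().push(format!("error code {}", code));
  }
}

/// Owns the error log and the heap-allocated `Weak` handed out as user data.
pub struct Registration {
  log: Rc<ErrorLog>,
  ctx: *mut Weak<ErrorLog>,
}

impl Registration {
  pub fn new() -> Self {
    let log = Rc::new(RefCell::new(Vec::new()));
    let ctx = Box::into_raw(Box::new(Rc::downgrade(&log)));
    Self { log, ctx }
  }

  /// Pointer to pass as the callback's user data.
  pub fn ctx(&self) -> *mut c_void {
    self.ctx as *mut c_void
  }

  /// Take all errors recorded so far.
  pub fn drain(&self) -> Vec<String> {
    self.log.borrow_mut().drain(..).collect()
  }
}

impl Drop for Registration {
  fn drop(&mut self) {
    // SAFETY: `ctx` came from `Box::into_raw` in `new` and is freed exactly once.
    unsafe { drop(Box::from_raw(self.ctx)) };
  }
}
```

In the real wrapper, the callback would have to be unregistered (or the C context freed) before the `Registration` is dropped, so that C never sees a dangling user-data pointer.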
18 changes: 18 additions & 0 deletions src/schemas/mod.rs
@@ -0,0 +1,18 @@
//!
//! Schema Validation Support (XSD)
//!
//! This module wraps xmlschemas in libxml2. See the original documentation or
//! the example at examples/schema_example.rs for usage.
//!
//! WARNING: This module has not been tested in a multithreaded or multiprocessing
//! environment.
//!
mod common;
mod parser;
mod schema;
mod validation;

use schema::Schema; // internally handled by SchemaValidationContext

pub use parser::SchemaParserContext;
pub use validation::SchemaValidationContext;
97 changes: 97 additions & 0 deletions src/schemas/parser.rs
@@ -0,0 +1,97 @@
//!
//! Wrapping of the Parser Context (xmlSchemaParserCtxt)
//!
use super::common;

use crate::bindings;
use crate::error::StructuredError;
use crate::tree::document::Document;

use std::cell::RefCell;
use std::ffi::CString;
use std::os::raw::c_char;
use std::rc::Rc;

/// Wrapper on xmlSchemaParserCtxt
pub struct SchemaParserContext {
  inner: *mut bindings::_xmlSchemaParserCtxt,
  errlog: Rc<RefCell<Vec<StructuredError>>>,
}

impl SchemaParserContext {
  /// Create a schema parsing context from a Document object
  pub fn from_document(doc: &Document) -> Self {
    let parser = unsafe { bindings::xmlSchemaNewDocParserCtxt(doc.doc_ptr()) };

    if parser.is_null() {
      panic!("Failed to create schema parser context from XmlDocument"); // TODO error handling
    }

    Self::from_raw(parser)
  }

  /// Create a schema parsing context from a buffer in memory
  pub fn from_buffer<Bytes: AsRef<[u8]>>(buff: Bytes) -> Self {
    let buff_bytes = buff.as_ref();
    let buff_ptr = buff_bytes.as_ptr() as *const c_char;
    let buff_len = buff_bytes.len() as i32;

    let parser = unsafe { bindings::xmlSchemaNewMemParserCtxt(buff_ptr, buff_len) };

    if parser.is_null() {
      panic!("Failed to create schema parser context from buffer"); // TODO error handling
    }

    Self::from_raw(parser)
  }

  /// Create a schema parsing context from a URL
  pub fn from_file(path: &str) -> Self {
    let path = CString::new(path).unwrap(); // TODO error handling for \0 containing strings
    // `CString::as_ptr` yields `*const c_char`, which is not `i8` on every target
    let path_ptr = path.as_ptr();

    let parser = unsafe { bindings::xmlSchemaNewParserCtxt(path_ptr) };

    if parser.is_null() {
      panic!("Failed to create schema parser context from path"); // TODO error handling
    }

    Self::from_raw(parser)
  }

  /// Drains the errors that might have accumulated in the error log while parsing the schema
  pub fn drain_errors(&mut self) -> Vec<StructuredError> {
    self.errlog.borrow_mut().drain(..).collect()
  }

  /// Return a raw pointer to the underlying xmlSchemaParserCtxt structure
  pub fn as_ptr(&self) -> *mut bindings::_xmlSchemaParserCtxt {
    self.inner
  }
}

/// Private Interface
impl SchemaParserContext {
  fn from_raw(parser: *mut bindings::_xmlSchemaParserCtxt) -> Self {
    let errors = Rc::new(RefCell::new(Vec::new()));

    unsafe {
      bindings::xmlSchemaSetParserStructuredErrors(
        parser,
        Some(common::structured_error_handler),
        Box::into_raw(Box::new(Rc::downgrade(&errors))) as *mut _,
      );
    }

    Self {
      inner: parser,
      errlog: errors,
    }
  }
}

impl Drop for SchemaParserContext {
  fn drop(&mut self) {
    unsafe { bindings::xmlSchemaFreeParserCtxt(self.inner) }
  }
}
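
The constructors above leave several `// TODO error handling` markers: an interior `\0` in the path panics via `unwrap()`, and the buffer length is cast to `i32` with `as`, which silently wraps for very large inputs. A minimal sketch of fallible input preparation follows, using only the standard library. The error type `SchemaSourceError` and both helper names are hypothetical and not part of this PR.

```rust
use std::convert::TryFrom;
use std::ffi::{CString, NulError};
use std::fmt;
use std::num::TryFromIntError;
use std::os::raw::c_int;

/// Reasons a schema source cannot be handed to the C API (hypothetical type).
#[derive(Debug)]
pub enum SchemaSourceError {
  /// The path contains an interior NUL byte.
  InteriorNul(NulError),
  /// The buffer length does not fit into a C `int`.
  BufferTooLarge(TryFromIntError),
}

impl fmt::Display for SchemaSourceError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    match self {
      SchemaSourceError::InteriorNul(e) => write!(f, "invalid schema path: {}", e),
      SchemaSourceError::BufferTooLarge(e) => write!(f, "schema buffer too large: {}", e),
    }
  }
}

impl std::error::Error for SchemaSourceError {}

/// Convert a path into a C string, reporting interior NUL bytes as an error.
pub fn path_to_c_string(path: &str) -> Result<CString, SchemaSourceError> {
  CString::new(path).map_err(SchemaSourceError::InteriorNul)
}

/// Convert a buffer length into a C `int` with an explicit range check.
pub fn buffer_len(buff: &[u8]) -> Result<c_int, SchemaSourceError> {
  c_int::try_from(buff.len()).map_err(SchemaSourceError::BufferTooLarge)
}
```

With helpers like these, `from_file` and `from_buffer` could return `Result<Self, _>` and reserve the null-pointer check for a third error variant.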
35 changes: 35 additions & 0 deletions src/schemas/schema.rs
@@ -0,0 +1,35 @@
//!
//! Wrapping of the Schema (xmlSchema)
//!
use super::SchemaParserContext;

use crate::bindings;

use crate::error::StructuredError;

/// Wrapper on xmlSchema
pub struct Schema(*mut bindings::_xmlSchema);

impl Schema {
  /// Create schema by having a SchemaParserContext do the actual parsing of the schema it was provided
  pub fn from_parser(parser: &mut SchemaParserContext) -> Result<Self, Vec<StructuredError>> {
    let raw = unsafe { bindings::xmlSchemaParse(parser.as_ptr()) };

    if raw.is_null() {
      Err(parser.drain_errors())
    } else {
      Ok(Self(raw))
    }
  }

  /// Return a raw pointer to the underlying xmlSchema structure
  pub fn as_ptr(&self) -> *mut bindings::_xmlSchema {
    self.0
  }
}

impl Drop for Schema {
  fn drop(&mut self) {
    unsafe { bindings::xmlSchemaFree(self.0) }
  }
}
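
`Schema` follows the same ownership pattern as the other wrappers in this PR: a newtype over a raw pointer, a null check at construction, and a `Drop` impl that calls the matching C free function. A minimal, self-contained sketch of that pattern follows. `RawThing`, `raw_new`, and `raw_free` are hypothetical stand-ins for a C constructor/destructor pair, and the counter exists only to make the single free observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `raw_free`, for demonstration only.
static FREED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an opaque C structure (hypothetical).
pub struct RawThing {
  pub value: u32,
}

/// Stand-in for a C constructor that returns an owned pointer.
pub fn raw_new(value: u32) -> *mut RawThing {
  Box::into_raw(Box::new(RawThing { value }))
}

/// Stand-in for the matching C destructor.
///
/// # Safety
/// `ptr` must come from `raw_new` and must not be freed twice.
pub unsafe fn raw_free(ptr: *mut RawThing) {
  unsafe { drop(Box::from_raw(ptr)) };
  FREED.fetch_add(1, Ordering::SeqCst);
}

/// Number of times `raw_free` has run so far.
pub fn freed_count() -> usize {
  FREED.load(Ordering::SeqCst)
}

/// Owning wrapper: frees the pointer exactly once, when dropped.
pub struct Owned(*mut RawThing);

impl Owned {
  /// Take ownership of `ptr`; a null pointer is rejected up front.
  pub fn from_raw(ptr: *mut RawThing) -> Option<Self> {
    if ptr.is_null() {
      None
    } else {
      Some(Self(ptr))
    }
  }

  /// Raw pointer for passing back to the C side; ownership stays here.
  pub fn as_ptr(&self) -> *mut RawThing {
    self.0
  }
}

impl Drop for Owned {
  fn drop(&mut self) {
    // SAFETY: `self.0` is non-null and owned by this wrapper.
    unsafe { raw_free(self.0) }
  }
}
```

Because the wrapper does not implement `Clone` or `Copy`, the pointer cannot be duplicated through safe code, which is what makes the single free in `Drop` sound.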