Unfolding should be done first (before tokenizing) #27

@Fuerst2718

If you parse a vCard string in which a folded line (a CRLF or LF followed by a space or tab) continues with a ":" (as one example), the parser fails. If you unfold the string before parsing, parsing succeeds. Note that the string serialized from the successfully parsed vCard also contains a colon directly after a fold.
I think unfolding should be done before the string is tokenized. Below is a small test program and its output.

fn main() {
    let vcard_strings = vec![
        // the closing '"' after 'U.S.A.' is at col 75;
        // the ':' is on the next line at col 2, after NL + ' '
        r#"
BEGIN:VCARD
VERSION:4.0
FN:John Doe
ADR;VALUE=text;GEO="geo:12.3457,78.910";LABEL="Mr. John Q. Public, Esq.\n
 Mail Drop: TNE-- QB\n123 Main Street\nAny Other Town, CA  91921-123\nU.S.A."
 :;;123 Main Street;Any Other Town;CA;91921-123;USA.
END:VCARD"#,
        //
        // folded 'NOTE' property with leading space after NL
        r#"
BEGIN:VCARD
VERSION:4.0
FN:Jane Smith
NO
  TE
 :Bad example but should be parsed correctly.
END:VCARD"#,
        //
        // folded 'NOTE' property with leading tab after NL
        r#"
BEG
 IN:
 VCARD
VERSION:4.0
FN:Jane Smith
NO
 TE
 :Very bad example but should be parsed correctly.
END:VCARD"#,
    ];

    for (i, example) in vcard_strings.iter().enumerate() {
        let n = i + 1;
        // remove leading whitespace from the example
        let folded = example.trim_start().to_string();
        // naively unfold the folded lines
        let pre_unfolded = folded
            .replace("\r\n ", "")
            .replace("\n ", "")
            .replace("\r\n\t", "")
            .replace("\n\t", "");
        for (j, vcardstr) in [folded, pre_unfolded].iter().enumerate() {
            let txt = if j == 0 {
                " Folded version"
            } else {
                " Unfolded version"
            };
            println!("--- VCard example #{n} {txt}");
            println!("{vcardstr}");
            println!("---");
            let result = vcard4::parse(vcardstr);
            match result {
                Ok(vcards) => {
                    for vcard in vcards {
                        print!("Parsed vCard(s) as string:...\n{}", vcard.to_string());
                    }
                }
                Err(e) => {
                    println!(">> Failed to parse VCard example {n} {txt}: {e}");
                }
            }
            println!("=== End of example #{n} {txt}\n");
        }
    }
}

The output of the program:

--- VCard example #1  Folded version
BEGIN:VCARD
VERSION:4.0
FN:John Doe
ADR;VALUE=text;GEO="geo:12.3457,78.910";LABEL="Mr. John Q. Public, Esq.\n
 Mail Drop: TNE-- QB\n123 Main Street\nAny Other Town, CA  91921-123\nU.S.A."
 :;;123 Main Street;Any Other Town;CA;91921-123;USA.
END:VCARD
---
>> Failed to parse VCard example 1  Folded version: property or parameter delimiter expected
=== End of example #1  Folded version

--- VCard example #1  Unfolded version
BEGIN:VCARD
VERSION:4.0
FN:John Doe
ADR;VALUE=text;GEO="geo:12.3457,78.910";LABEL="Mr. John Q. Public, Esq.\nMail Drop: TNE-- QB\n123 Main Street\nAny Other Town, CA  91921-123\nU.S.A.":;;123 Main Street;Any Other Town;CA;91921-123;USA.
END:VCARD
---
Parsed vCard(s) as string:...
BEGIN:VCARD
VERSION:4.0
FN:John Doe
ADR;VALUE=text;GEO="geo:12.3457,78.910";LABEL="Mr. John Q. Public, Esq.\nM
 ail Drop: TNE-- QB\n123 Main Street\nAny Other Town, CA  91921-123\nU.S.A."
 :;;123 Main Street;Any Other Town;CA;91921-123;USA.
END:VCARD
=== End of example #1  Unfolded version

--- VCard example #2  Folded version
BEGIN:VCARD
VERSION:4.0
FN:Jane Smith
NO
  TE
 :Bad example but should be parsed correctly.
END:VCARD
---
>> Failed to parse VCard example 2  Folded version: property or parameter delimiter expected
=== End of example #2  Folded version

--- VCard example #2  Unfolded version
BEGIN:VCARD
VERSION:4.0
FN:Jane Smith
NO TE:Bad example but should be parsed correctly.
END:VCARD
---
>> Failed to parse VCard example 2  Unfolded version: property or parameter delimiter expected
=== End of example #2  Unfolded version

--- VCard example #3  Folded version
BEG
 IN:
 VCARD
VERSION:4.0
FN:Jane Smith
NO
 TE
 :Very bad example but should be parsed correctly.
END:VCARD
---
>> Failed to parse VCard example 3  Folded version: input token 'Text' was incorrect
=== End of example #3  Folded version

--- VCard example #3  Unfolded version
BEGIN:VCARD
VERSION:4.0
FN:Jane Smith
NOTE:Very bad example but should be parsed correctly.
END:VCARD
---
Parsed vCard(s) as string:...
BEGIN:VCARD
VERSION:4.0
FN:Jane Smith
NOTE:Very bad example but should be parsed correctly.
END:VCARD
=== End of example #3  Unfolded version
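
For illustration, the pre-unfolding step could be a single pass over the input instead of the chained `replace` calls used in the test program. This is only a minimal sketch: the function name `unfold` is hypothetical and not part of the `vcard4` API, and accepting a bare LF as a line break (in addition to CRLF) is an assumption that mirrors the test program above. Following RFC 6350 section 3.2, it removes a line break together with exactly one following space or tab, which is also why example #2, with two leading spaces, unfolds to `NO TE` and still fails.

```rust
/// Hypothetical helper (not part of vcard4): removes every line break that is
/// immediately followed by a single space or tab, as described for unfolding
/// in RFC 6350 section 3.2. Bare LF is accepted as well as CRLF (assumption).
fn unfold(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        // Detect a line break starting at `c` and remember how it was written.
        let line_break = match c {
            '\r' if chars.peek() == Some(&'\n') => {
                chars.next(); // consume the '\n' of the CRLF pair
                Some("\r\n")
            }
            '\n' => Some("\n"),
            _ => None,
        };
        match line_break {
            Some(nl) => {
                if matches!(chars.peek().copied(), Some(' ') | Some('\t')) {
                    // Fold: drop the line break and exactly one whitespace char.
                    chars.next();
                } else {
                    // Ordinary line break: keep it unchanged.
                    out.push_str(nl);
                }
            }
            None => out.push(c),
        }
    }
    out
}
```

With such a helper, the call in the test program would become something like `vcard4::parse(&unfold(&folded))`. Whether the unfolding lives in a separate pass like this or inside the tokenizer is a design decision for the library.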
