Post content didn't import properly due to parsing issue in get_tag #82

Open
vabc3 opened this issue Oct 16, 2020 · 0 comments

vabc3 commented Oct 16, 2020

XML exported from WordPress 5.1 is not imported properly.

The imported post content always ends with a stray ]>.

The XML looks like this:

<content:encoded>
\t\t<![CDATA[some stuff]]>
\t\t</content:encoded>

This is valid XML, but it breaks the current parsing logic:

function get_tag( $string, $tag ) {
    preg_match( "|<$tag.*?>(.*?)</$tag>|is", $string, $return );
    if ( isset( $return[1] ) ) {
        if ( substr( $return[1], 0, 9 ) == '<![CDATA[' ) {

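Because of the whitespace after the opening tag, '<![CDATA[' is not at the start of the captured content, so the substr() check fails. The content inside the tag needs to be trimmed first.

A quick fix is to add \s* on both sides of (.*?) in the preg_match pattern.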
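
A minimal sketch of that quick fix, for illustration only. The function name get_tag_trimmed is hypothetical, and the CDATA-unwrapping step is an assumption, since only the first lines of get_tag are quoted above. The only change to the quoted pattern is the \s* on each side of the capture group.

```php
<?php
// Hypothetical variant of get_tag(); not the project's actual code.
function get_tag_trimmed( $string, $tag ) {
    // \s* before and after the lazy capture group swallows the newline
    // and tabs that surround the CDATA section in the exported XML.
    if ( ! preg_match( "|<$tag.*?>\s*(.*?)\s*</$tag>|is", $string, $m ) ) {
        return '';
    }
    $content = $m[1];
    // With the whitespace gone, the CDATA marker really is at offset 0.
    if ( substr( $content, 0, 9 ) === '<![CDATA[' ) {
        // Assumption: unwrap a single CDATA section; the original
        // function's handling past this point is not shown in the issue.
        $content = preg_replace( '|^<!\[CDATA\[(.*)\]\]>$|s', '$1', $content );
    }
    return $content;
}

// Same shape as the XML in the report: newline and tabs around the CDATA.
$xml = "<content:encoded>\n\t\t<![CDATA[some stuff]]>\n\t\t</content:encoded>";
echo get_tag_trimmed( $xml, 'content:encoded' ), "\n"; // prints: some stuff
```

Note that this also trims leading and trailing whitespace from non-CDATA tag content, which may or may not be wanted; calling trim() on the captured string only before the CDATA check would be an alternative.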

vabc3 changed the title from "Post content didn't import properly due to parsing issue of get_tag" to "Post content didn't import properly due to parsing issue in get_tag" on Oct 16, 2020