Text parsing buffer length check #38

Open
joejohndoe opened this issue May 19, 2015 · 4 comments

@joejohndoe

I'm getting "Max buffer length exceeded" errors while parsing large text nodes. Looking at the code, this line seems to indicate that the parser should be able to handle strings larger than MAX_BUFFER_LENGTH. In practice, that branch is never taken because the buffer is named "textNode" rather than "text" (code), and the closeText() function it would call doesn't exist anyway.
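
The check in question looks roughly like this (a paraphrased sketch, not the actual source; the function and variable names are approximations):

```js
// Paraphrased sketch of the overflow check (names approximate).
var buffers = ["textNode", "numberNode"]; // the buffers actually kept on the parser

function checkBufferLength(parser) {
  var maxAllowed = Math.max(MAX_BUFFER_LENGTH, 10);
  for (var i = 0; i < buffers.length; i++) {
    if (parser[buffers[i]].length > maxAllowed) {
      switch (buffers[i]) {
        case "text":          // never matches: the buffer is named "textNode"
          closeText(parser);  // and closeText() is not defined anywhere
          break;
        default:
          error(parser, "Max buffer length exceeded: " + buffers[i]);
      }
    }
  }
}
```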

I can work around the issue by increasing MAX_BUFFER_LENGTH, but being able to deal with arbitrary string sizes would be nicer, of course.
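
For reference, the workaround I'm using looks something like this (assuming MAX_BUFFER_LENGTH is exposed as a mutable module property, as in sax-js; the 1 MiB value is arbitrary):

```js
var clarinet = require("clarinet");

// Raise the limit before creating the parser; 1 MiB is an arbitrary choice.
clarinet.MAX_BUFFER_LENGTH = 1024 * 1024;

var parser = clarinet.parser();
parser.onvalue = function (value) {
  console.log("got string value of length", value.length);
};
parser.write('{"blob":"' + "x".repeat(200000) + '"}');
parser.close();
```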

@hrdwdmrbl

I have the same (or at least a similar) question. I even changed the code to use "textNode", but the closeText function it calls doesn't even exist...

@dfahlander

dfahlander commented Jan 28, 2020

I have two users of dexie-export-import who run into this issue as well. Could you give some guidance on how to resolve it? Is the only solution to increase MAX_BUFFER_LENGTH beyond 64k, or could the value be handled in a streaming fashion? I suppose we might need to support hundreds of megabytes in our case, since people may store large blobs this way.

@corno

corno commented Feb 4, 2020

@dfahlander If the blobs are stored in a string, then I think that with the current API your only option is to increase MAX_BUFFER_LENGTH. The reason is that the API exposes an 'onvalue' callback that receives the whole value as a single string argument, containing all the data at once.
If that is not an option, you might want to change the API to allow streaming values.
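
Purely as a sketch of what that could look like (onvaluechunk and onvalueend are made-up names; nothing like this exists in clarinet today):

```js
var clarinet = require("clarinet");
var fs = require("fs");

var parser = clarinet.parser();
var out = fs.createWriteStream("large-value.bin");

// Hypothetical callbacks: instead of buffering the whole string and firing a
// single onvalue, the parser would flush its text buffer in pieces.
parser.onvaluechunk = function (chunk) { out.write(chunk); }; // partial string data
parser.onvalueend = function () { out.end(); };               // current value complete
```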

@dfahlander

Thanks! I did increase it to 10MB for now. I see the problem. In our code we also assume that each single row always fits into memory, so we wouldn't benefit from streaming within a row internally. I might increase the limit even more if needed. We're using a fork of clarinet right now; I could open a PR if there's interest in incorporating the extended limit into the main package.
