Make streaming of files possible #24

Open
blancadesal opened this issue Jul 4, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@blancadesal
Member

See #5 for details

@blancadesal added the enhancement label Jul 4, 2022
@blancadesal
Member Author

Copying from previous issue:

Streaming gzipped files using requests is indeed possible. Roughly the first 400 kB seems to be enough to get all the metadata plus at least the first INSERT INTO statement, even for large files with many fields, such as the enwiki page table.

Example code:

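import requests

URL = 'https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-page.sql.gz'

# Stream the response so the rest of the file is never downloaded.
with requests.get(URL, stream=True) as r:
    r.raise_for_status()
    with open("filename.sql.gz", 'wb') as outfile:
        # Write the first non-empty chunk (up to 400,000 bytes), then stop.
        for chunk in r.iter_content(chunk_size=400000):
            if chunk:
                outfile.write(chunk)
                break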

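This partial file can then be opened as usual using gzip.open(). Because the gzip stream is truncated, reading past the downloaded data raises EOFError, so read only as much as is needed (for example, line by line) and treat that error as the end of the available data.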
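
A minimal sketch of reading such a truncated file, assuming the dump decodes as UTF-8 text. The helper name read_partial_lines and its max_lines parameter are made up for illustration and are not part of this project. It collects whatever lines can be decompressed and stops quietly when the gzip stream ends early. The last line returned may be cut off mid-statement.

```python
import gzip


def read_partial_lines(path, max_lines=None, encoding="utf-8"):
    """Return the lines recoverable from a possibly truncated gzip file.

    Hypothetical helper for illustration only.
    """
    lines = []
    try:
        # errors="replace" avoids a decode failure if the truncation
        # falls in the middle of a multi-byte character.
        with gzip.open(path, "rt", encoding=encoding, errors="replace") as f:
            for line in f:
                lines.append(line)
                if max_lines is not None and len(lines) >= max_lines:
                    break
    except EOFError:
        # The compressed stream ended before its end-of-stream marker:
        # expected for a partial download, so keep what was read.
        pass
    return lines
```

For example, read_partial_lines("filename.sql.gz", max_lines=100) would return up to the first 100 lines of the partial dump, which is where the table metadata is expected to be.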
