
Innodb Page Compression #21

Open
artfiedler opened this issue Jan 15, 2021 · 9 comments

@artfiedler

It seems one of the .ibd files I'm looking to recover (I altered the table thinking I was doing a "create like", so some columns got dropped) has a mix of compressed and uncompressed pages. I'm able to extract a number of rows that I can visibly see in the .ibd file, which I believe predate my enabling page compression a year ago. However, everything since then is not being pulled out, and I believe it's because of this page compression: I see a lot of compressed-looking data (zlib, I believe) in the 36 MB file, but the extracted data only comes to about 5.5 MB.

Does this tool support decompressing pages? Will it?

@artfiedler

Here is some information on the page compression, https://mariadb.com/kb/en/innodb-page-compression/

@akuzminsky

stream_parser cannot handle compressed pages. It doesn't understand that format, and the size of a compressed page is different (less than 16k).
You may want to look at https://bazaar.launchpad.net/~akuzminsky/percona-data-recovery-tool-for-innodb/decompress/view/head:/page_parser.c .
It's an experimental branch. AFAIR, if page_parser sees a compressed page it will uncompress it and save it as a separate file.
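For orientation, detecting such pages comes down to reading the 2-byte, big-endian page type at offset 24 of the FIL header. The offsets and type values below are assumptions based on the MariaDB/MySQL headers (fil0fil.h), not code taken from stream_parser, so treat this as a minimal sketch and double-check the constants against your server version:

```c
#include <stdint.h>
#include <stdio.h>

#define FIL_PAGE_TYPE_OFFSET     24     /* 2-byte page type, big-endian            */
#define FIL_PAGE_PAGE_COMPRESSED 34354  /* MariaDB page_compressed (assumed)       */
#define FIL_PAGE_COMPRESSED      14     /* MySQL transparent compression (assumed) */

static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Classify a raw page buffer so compressed pages are not silently ignored. */
void classify_page(const unsigned char *page)
{
    uint16_t type = read_be16(page + FIL_PAGE_TYPE_OFFSET);

    if (type == FIL_PAGE_PAGE_COMPRESSED)
        puts("MariaDB page-compressed page: decompress before parsing");
    else if (type == FIL_PAGE_COMPRESSED)
        puts("MySQL compressed page: different format");
    else
        puts("regular page");
}
```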

@artfiedler

artfiedler commented Jan 16, 2021

Well, I modified (hacked it with an axe) your stream_parser and it now seems to support MariaDB's InnoDB page compression. Previously, out of a 36 MB file, 372 pages were uncompressed and resulted in about 5.5 MB of data extracted... now, with this page compression support added, I'm able to get another 1345 pages extracted, resulting in 27 MB of data extracted... there also appear to be some other "mysql" compressed pages, which are skipped until I find some information on that format.
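The decompression step itself boils down to handing the page body to zlib. The sketch below is an illustration, not the code from the modified files; it assumes the zlib stream starts right after the 38-byte FIL header and that the logical page size is 16 KiB (uncompress() stops at the end of the deflate stream, so passing the rest of the physical page as the source length is good enough here):

```c
#include <string.h>
#include <zlib.h>   /* link with -lz */

#define FIL_HEADER_SIZE 38
#define UNIV_PAGE_SIZE  16384

/* Rebuild a full 16 KiB page from a page-compressed one.
 * Returns 0 on success, -1 on failure. */
int inflate_page(const unsigned char *raw, size_t raw_len,
                 unsigned char out[UNIV_PAGE_SIZE])
{
    uLongf dest_len = UNIV_PAGE_SIZE - FIL_HEADER_SIZE;

    if (raw_len <= FIL_HEADER_SIZE)
        return -1;

    /* Keep the uncompressed FIL header, inflate the body that follows it. */
    memcpy(out, raw, FIL_HEADER_SIZE);

    if (uncompress(out + FIL_HEADER_SIZE, &dest_len,
                   raw + FIL_HEADER_SIZE,
                   (uLong)(raw_len - FIL_HEADER_SIZE)) != Z_OK)
        return -1;

    return 0;
}
```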

However, I then ran into an issue where c_parser errored on sql_parser.y line 149 (that one turned out to be a wrong field name for the primary key in the table create script). Now I'm getting a segmentation fault, which I think means it's hitting pages whose records only have 3 fields versus the 9 fields of the original table (the ALTER TABLE dropped some columns). Hopefully I'll be able to whack this in a branch and either get c_parser to output two different schemas, or skip pages that don't match the current schema and just run it twice.

@artfiedler

Score! I was able to extract the data I needed and only lost about 0.001% (a handful of rows), but those won't be worth my time recovering... they are probably still there, just in that MySQL compression format instead of the MariaDB page compression format.

A few problems I ran into generating the data with c_parser:

  • DATETIME in MariaDB has optional microsecond precision; it seems that if you leave the precision out of the create script, more bytes get consumed from the page for that column, so everything after it was offset by about 3-4 bytes. Setting the column to datetime(0) in the create script made it parse correctly (see the sketch after this list).
  • the column character set for DB_TRX_ID (or the other internal field) was producing a segmentation fault, but it seems it was only in the debug printing, so I just commented out that line. This was before correcting the datetime(0), so maybe this issue would have gone away by itself; not sure.
  • the SQL tab-separated output format would not load, so I added an option -s to generate INSERT ... VALUES () lines instead, which worked fine
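On the first point, one plausible explanation for the roughly 3-byte shift (this is an illustration of the arithmetic, not c_parser code) is the 5.6+ temporal format, where a DATETIME column occupies 5 bytes plus ceil(precision/2) bytes for the fractional seconds, so a definition read as datetime(6) consumes 3 bytes more per value than a datetime(0) column actually takes up on the page:

```c
#include <stdio.h>

/* Stored size of a DATETIME(fsp) value in the 5.6+ temporal format:
 * 5 bytes for the date/time part plus ceil(fsp / 2) fractional bytes. */
static int datetime_stored_bytes(int fsp)
{
    return 5 + (fsp + 1) / 2;
}

int main(void)
{
    printf("datetime(0): %d bytes\n", datetime_stored_bytes(0)); /* 5 */
    printf("datetime(6): %d bytes\n", datetime_stored_bytes(6)); /* 8 */
    return 0;
}
```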

I'll remove the debugging junk I threw in and attach the updated files here... you may want to organize the code a little differently; I was all about getting it done as fast as possible.

@artfiedler

artfiedler commented Jan 16, 2021

Attached are the modified files; they need zlib. It should be easy to add support for other compression algorithms: just add the references and the call to the library's decompress function. I rarely write C/C++, so a data type here or there may need fixing, but it works!
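For instance, plugging in another library could look roughly like this. This is a sketch, not code from the attached modified.zip; it assumes the caller already knows which algorithm the page uses and the exact compressed payload length (link with -lz and -llz4):

```c
#include <zlib.h>
#include <lz4.h>

enum page_comp { COMP_ZLIB, COMP_LZ4 };

/* Returns the number of bytes written to dst, or -1 on failure. */
static long decompress_payload(enum page_comp alg,
                               const unsigned char *src, unsigned src_len,
                               unsigned char *dst, unsigned dst_cap)
{
    switch (alg) {
    case COMP_ZLIB: {
        uLongf out_len = dst_cap;
        if (uncompress(dst, &out_len, src, src_len) != Z_OK)
            return -1;
        return (long)out_len;
    }
    case COMP_LZ4: {
        int n = LZ4_decompress_safe((const char *)src, (char *)dst,
                                    (int)src_len, (int)dst_cap);
        return n < 0 ? -1 : n;
    }
    }
    return -1;
}
```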

After I removed my forced debug output from c_parser, it no longer segfaults on the debug print... anyway, see attached; merge if you would like.
modified.zip

@akuzminsky

Thank you for your contribution!

@bmakan

bmakan commented Feb 24, 2021

@artfiedler I'm trying to compile your modified code, but it's failing with fatal error: zlib/zlib.h: No such file or directory even after I installed the zlib-devel library (CentOS). Do I need to do something else to compile this besides running make?

Edit:
Sorry for bothering you, I managed to do it. I had to replace the quoted include with the angle-bracket one.
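For anyone hitting the same error: with zlib-devel installed, the system header lives at /usr/include/zlib.h, so the change presumably amounts to something like the following (the exact path used in the modified sources may differ), plus linking the binary with -lz:

```c
/* before: expects a bundled copy of zlib in a zlib/ subdirectory */
#include "zlib/zlib.h"

/* after: use the system header from zlib-devel */
#include <zlib.h>
```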

Turns out it didn't help my case. The parsed data is still missing InnoDB pages, and even the parsed rows have weird values for some columns (usually the first few are fine).
The log always shows a lot of ignored pages:

Stream contained 0 blob, 5 innodb, 0 mysql compressed, 0 mariadb compressed and 1567 ignored page read attempts

I suppose my data is corrupted beyond recovery.

@artfiedler

artfiedler commented Feb 24, 2021 via email

@xinxinfly

Hey buddy, I tried the attached modified files, but my compressed table still cannot be parsed.

voltageek added a commit to voltageek/undrop-for-innodb that referenced this issue Jul 27, 2024
- Handles compressed pages (based on twindb#21)
- Additional logging