tools: support zstd's dict compression/decompression #22

neverchanje · 2018-11-20T11:45:45Z

As the two pictures above show, dict-based compression significantly improves the compression ratio (+20%), and a larger dictionary buffer increases the compression ratio as well (by 10%).

qinzuoyan · 2018-11-21T07:25:53Z

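How is a user supposed to use this? It doesn't seem very easy to use. For example:

If a user writes data with Spark and multiple tasks write concurrently, who does the training? How is the dict built?
When a user reads data from a table, how do they know whether a dict is needed, and which dict to use?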

neverchanje · 2018-11-21T08:14:14Z

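In zstd's design, training is something done at test time. Once the dict is trained, it is stored in Pegasus; when the user starts Spark they fetch the dict and then use it for both compression and decompression.

There can be one dict per Spark task, or the dict can be made a thread-safe singleton. Right now it is not provided as a singleton, so the user has to fetch one dict per task. This can be changed.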
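The flow described above (fetch a trained dict once, share it across tasks, use the same dict to compress and decompress) could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the preset-dictionary support in Python's standard `zlib` module as a stand-in for zstd's dictionary API, and `DictHolder` and `load_dict_from_store` are hypothetical names, not part of this PR or of Pegasus.

```python
import threading
import zlib


class DictHolder:
    """Hypothetical process-wide holder so all tasks share one dictionary."""

    _lock = threading.Lock()
    _dict = None

    @classmethod
    def get(cls, loader):
        # Double-checked locking: the loader runs at most once per process.
        if cls._dict is None:
            with cls._lock:
                if cls._dict is None:
                    cls._dict = loader()
        return cls._dict


def load_dict_from_store() -> bytes:
    # Hypothetical stand-in for fetching a previously trained dict from Pegasus.
    return b'{"user_id": 0, "name": "", "status": "active"}'


def compress(value: bytes, zdict: bytes) -> bytes:
    # zlib preset dictionary, used here in place of a zstd dictionary.
    c = zlib.compressobj(zdict=zdict)
    return c.compress(value) + c.flush()


def decompress(blob: bytes, zdict: bytes) -> bytes:
    # The reader must supply exactly the dictionary the writer used.
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()
```

A task would call `DictHolder.get(load_dict_from_store)` once and pass the result to `compress` and `decompress`. Decompression fails if a different dictionary is supplied, so every reader has to obtain the same dict as the writer.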

qinzuoyan · 2018-11-21T08:37:06Z

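If there is one per Spark job, there will be many dicts. If another business wants to read the data, which dict should it choose? If the data being read was written by different Spark jobs, which dict should be used? The whole usage scenario should be thought through so that it is simple and unambiguous for users.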

neverchanje requested review from acelyc111, hycdong, mentoswang, qinzuoyan, vagetablechicken, zhangyifan27 and shengofsun November 20, 2018 11:45

neverchanje force-pushed the thrift-0.11.0-inlined branch from 3af031b to 9c467ac Compare January 17, 2019 03:57

bugfix: set error to ERR_SESSION_RESET to trigger meta query while se…

5b507cd

…ssion not responding.

neverchanje force-pushed the thrift-0.11.0-inlined branch from 969cf67 to 5b507cd Compare September 20, 2019 10:54
