
Is it necessary to configure large write buffers for good performance of X-Engine? #18

iyupeng opened this issue Apr 24, 2022 · 4 comments


@iyupeng

iyupeng commented Apr 24, 2022

Hi galaxyengine developers,

I found some large configurations for write buffers of X-Engine in:

storage/xengine/core/pt_beilou.sh
storage/xengine/tools/sysbench_benchmark/rds_my.cnf

Like:

xengine_write_buffer_size=512M
xengine_max_write_buffer_number=1000
xengine_max_write_buffer_number_to_maintain=1000
xengine_db_write_buffer_size=100G
xengine_db_total_write_buffer_size=100G

These options allow a lot of memory to be used by memtables, which seems much more than in RocksDB's benchmarks.
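
For rough context (assuming the usual RocksDB semantics, where write_buffer_size is the size of a single memtable and max_write_buffer_number caps how many memtables a subtable/column family may keep), the nominal ceiling would be:

512 MB per memtable x 1000 memtables ≈ 512 GB per subtable,
bounded across the whole instance by xengine_db_total_write_buffer_size = 100G.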

It is said that larger memtables result in longer recovery times, and that write throughput is ultimately determined by compaction.

Could you tell us more about the considerations behind these buffer configurations? Is it necessary to configure large write buffers for good X-Engine performance?

Thanks.

@xtcyclist

In the end, it's up to the users to configure these numbers according to their workloads. For the commercial version of X-Engine, there are recommended parameters available in the Alibaba Cloud Database Console, which are selected to fit the environments in Alibaba Cloud.

Using small memtables does not prevent your X-Engine (or RocksDB) from being bounded by compactions if your workload is very write-intensive over a long period of time. A very large main memory may delay flushes. But, in the end, all new data needs to be flushed and compacted.

If you want to compare X-Engine with RocksDB using a benchmark, you should consider setting their parameters to the same values for a fair comparison.
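
For illustration only (a hedged sketch with made-up benchmark settings, not a configuration we have validated), matching the RocksDB side through db_bench could mirror the xengine_* values from the original post:

db_bench --benchmarks=fillrandom \
    --write_buffer_size=536870912 \
    --max_write_buffer_number=1000 \
    --db_write_buffer_size=107374182400

Here 536870912 bytes corresponds to 512M and 107374182400 bytes to 100G, matching xengine_write_buffer_size and xengine_db_write_buffer_size respectively.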

A rich literature on configuring LSM-tree systems can be found here: https://disc.bu.edu/publications.

Recovery time is only an issue when the system actually crashes. X-Engine has parallel recovery in place to accelerate this process. If you need fast recovery, you could consider configuring more aggressive flushes to reduce the amount of volatile state in main memory. With X-Engine, we also have a persistent memory system that achieves very fast recovery (paper: http://www.vldb.org/pvldb/vol14/p1872-yan.pdf).
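
As a hedged illustration (the values below are hypothetical, not an official recommendation), a more flush-aggressive setup could simply shrink the buffers discussed above:

xengine_write_buffer_size=64M
xengine_max_write_buffer_number=4
xengine_db_total_write_buffer_size=2G

Smaller and fewer memtables mean less volatile state to replay after a crash, at the cost of more frequent flushes and compaction work.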

@luckywhu
Collaborator

luckywhu commented Apr 25, 2022

This document will tell you how to run X-Engine with a correct configuration:


[X-Engine Configuration](https://github.com/ApsaraDB/galaxyengine/wiki/2.1-MySQL-X-Engine%E5%BC%95%E6%93%8E%E5%8F%82%E6%95%B0%E9%85%8D%E7%BD%AE%E5%BB%BA%E8%AE%AE#x-engine%E5%86%85%E5%AD%98%E5%8F%8A%E7%BA%BF%E7%A8%8B%E5%8F%82%E6%95%B0)

I usually run pt_beilou.sh on a machine with 96 CPUs / 768 GB of memory, so it may not be a suitable configuration for you.
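
Purely as a hypothetical scaling example (the numbers below are not taken from the wiki above), on a much smaller host you would shrink the instance-wide memtable budget roughly in proportion to its RAM, e.g.:

xengine_db_write_buffer_size=8G
xengine_db_total_write_buffer_size=8G

instead of the 100G used on the 768 GB benchmark machine.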

@iyupeng
Author

iyupeng commented Apr 25, 2022

Thanks a lot @chengxuntao-ntu, @luckywhu. Your information is really helpful!

I studied your paper before. My work at Intel is related to Persistent Memory too.

It's a great idea to use Persistent Memory in X-Engine to achieve better performance, faster recovery, and lower DRAM cost.

@xtcyclist

Glad to know we have readers out there! With PM buffering or caching most of the main-memory data, recovery could be made lightning fast, faster than a high-availability switchover (promoting the hot backup to be the new master when the old master crashes). This could potentially reduce the number of hot backup nodes in database clusters by up to 50%. But this kind of new design is a bit too aggressive; we are not expecting any real deployment soon.

Regarding memtable sizes, ideally one could dynamically adjust them in response to changing workload pressures and types. We have observed in real production environments in the cloud that there are only a few hours within a day during which database clusters have to process significant transaction (write) volumes. Most of the time, there aren't many writes, so caches would be more useful than memtables.

xiewajueji pushed a commit that referenced this issue May 5, 2024
Read partition statistics directly into buffers provided by the caller
without clobbering handler::stats and other values reserved for
table statistics.

The function is used by I_S.partitions queries and is tested in, for
example, ndb_partition_range.test

Change-Id: I1ee4471ec7015983c73208b83ba910410f76447f