Performance comparison between VictoriaTraces and VictoriaLogs #46

@bailegebai

Description

Is your feature request related to a problem? Please describe

Our application monitoring scenario uses VictoriaLogs (hereinafter referred to as vlog) to store trace data for the past two weeks. However, we found that querying by trace_id was slow. After learning that VictoriaTraces (hereinafter referred to as vtrace) had made improvements in this area (#594), we started using vtrace and compared the two.

  1. Data format (sensitive values such as IP addresses and app_name have been masked).
    The vlog record format is:
{
    "_time": "2025-08-25T02:58:19.042Z",
    "_stream_id": "0000000a00000000050b35a441085b72c690cf4305694343",
    "_stream": "{app_name=\"demo\",method_name=\"getRemind(java.lang.String,com.demo.fin.std.gold.schedule.domain.po.UserPriceRemind)\",service_name=\"com.demo.fin.std.gold.schedule.service.impl.RemindServiceImpl\",tenant=\"jdjr\"}",
    "_msg": "v6;demo;com.demo.fin.std.gold.schedule.service.impl.RemindServiceImpl;getRemind(java.lang.String,com.demo.fin.std.gold.schedule.domain.po.UserPriceRemind);127.0.0.1;consumer_gold_remind_partition_schedule_0_0_2_1756090670731;0;;;1756090699042;64;1857972;4490954132420954139;4490954132420954140;4490962172599732232;;;;;;injvm;gson:1:22;;1089;m6;;group=group-product-m6;jdjr;;;;",
    "app_name": "demo",
    "ip": "127.0.0.1",
    "line_tracing": "0",
    "method_name": "getRemind(java.lang.String,com.demo.fin.std.gold.schedule.domain.po.UserPriceRemind)",
    "protocol": "injvm",
    "sampling": "0",
    "service_name": "com.demo.fin.std.gold.schedule.service.impl.RemindServiceImpl",
    "success": "1",
    "tenant": "jdjr",
    "testing": "0",
    "duration": "1",
    "trace_id": "4490954132420954139"
}

The data format of vtrace is:

{
    "_time": "2025-08-24T20:07:10.009444528Z",
    "_stream_id": "0000000a000000001c3a3604e527859d9dccb13f6e64d312",
    "_stream": "{name=\"-\",resource_attr:service.name=\"demo\",resource_attr:service.namespace=\"jdjr\",span_attr:method=\"config(long,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,javax.servlet.http.HttpServletResponse)\",span_attr:service=\"com.demo.jr.sgm.server.controller.AgentController\"}",
    "_msg": "-",
    "dropped_attributes_count": "0",
    "dropped_events_count": "0",
    "dropped_links_count": "0",
    "flags": "0",
    "kind": "0",
    "name": "-",
    "resource_attr:service.name": "demo",
    "resource_attr:service.namespace": "jdjr",
    "scope_name": "sgm-trace",
    "scope_version": "1.0.0",
    "span_attr:line_tracing": "0",
    "span_attr:local_port": "0",
    "span_attr:method": "config(long,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,javax.servlet.http.HttpServletResponse)",
    "span_attr:protocol": "injvm",
    "span_attr:remote_port": "0",
    "span_attr:service": "com.demo.fin.std.gold.schedule.service.impl.RemindServiceImpl",
    "span_attr:testing": "0",
    "span_attr:timeout": "0",
    "span_attr:zone": "hc",
    "status_code": "1",
    "duration": "7444528",
    "end_time_unix_nano": "1756066030009444528",
    "parent_span_id": "3b62436021ea1801",
    "span_attr:child": "hikari:1:2",
    "span_attr:group": "server-hc",
    "span_attr:ip": "127.0.0.1",
    "span_attr:process": "16707",
    "span_attr:sampling": "0",
    "span_attr:thread": "http-nio-8080-exec-160",
    "span_id": "3b62436021ea1802",
    "start_time_unix_nano": "1756066030002000000",
    "trace_id": "3b62436021ea1801"
}
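For reference, our trace_id lookups go through the LogsQL query endpoint on both systems. A minimal sketch of building such a request (the base URL and port are assumptions for illustration; the endpoint path and the `trace_id:=...` exact-match filter follow the VictoriaLogs querying documentation):

```python
from urllib.parse import urlencode

def trace_query_url(base_url: str, trace_id: str, limit: int = 1000) -> str:
    """Build a VictoriaLogs /select/logsql/query URL that filters by an
    exact trace_id match (`field:=value` is LogsQL's exact-match filter)."""
    params = {
        "query": f"trace_id:={trace_id}",
        "limit": str(limit),
    }
    return f"{base_url}/select/logsql/query?{urlencode(params)}"

# Assumed address of a vlselect node; adjust for your deployment.
url = trace_query_url("http://localhost:9471", "4490954132420954139")
print(url)
```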
  2. The write performance comparison between the two is as follows:

Specifications of each component node:
vtrace:
insert 4c8g * 24
storage 26c56g10TB * 9
select 8c16g * 3

vlog:
insert 4c8g * 6
storage 26c56g10TB * 9
select 8c16g * 3

Data sampled at 11:00 on a Monday, for two consecutive weeks:

| Metric | vtrace | vlog |
| --- | --- | --- |
| insert ingestion rate | 11.72 MiB/s × 24 = 281 MiB/s | 74.96 MiB/s × 6 = 450 MiB/s |
| insert CPU usage | 4c × 27.2% × 24 = 26c | 4c × 27.47% × 6 = 6.6c |
| insert memory usage | 8 GB × 49.91% × 24 = 95.8 GB | 8 GB × 5.41% × 6 = 2.6 GB |
| storage ingestion rate | 63.82 MiB/s × 9 = 574 MiB/s | 69.28 MiB/s × 9 = 624 MiB/s |
| storage CPU usage | 26c × 29.96% × 9 = 70c | 26c × 31.31% × 9 = 73c |
| storage memory usage | 56 GB × 6.77% × 9 = 34 GB | 56 GB × 4.36% × 9 = 22 GB |
| disk space used (12 hours) | (2.26 TB − 2.17 TB) × 9 = 829 GB | (1.05 TB − 985 GB) × 9 = 812 GB |

The comparison above shows that, taking vlog as the baseline, vtrace consumed roughly 4× the insert-side CPU (26c vs 6.6c, ≈394%), while disk usage over the same 12 hours was roughly comparable (829 GB vs 812 GB).
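The aggregate figures can be reproduced directly from the per-node numbers quoted above (a small sanity-check script; minor rounding differences against the rounded figures in the text are expected):

```python
# Per-node figures from the comparison above.
vtrace_insert_cpu = 4 * 0.272 * 24    # cores: 4c nodes at 27.2% CPU, 24 nodes
vlog_insert_cpu = 4 * 0.2747 * 6      # cores: 4c nodes at 27.47% CPU, 6 nodes

# Disk growth over 12 hours, per storage node, times 9 nodes (1 TB = 1024 GB).
vtrace_disk_gb = (2.26 - 2.17) * 1024 * 9
vlog_disk_gb = (1.05 * 1024 - 985) * 9

print(f"insert CPU: vtrace {vtrace_insert_cpu:.1f}c vs vlog {vlog_insert_cpu:.1f}c")
print(f"disk (12h): vtrace {vtrace_disk_gb:.0f} GB vs vlog {vlog_disk_gb:.0f} GB")
```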

  3. Query Performance Comparison
    To improve query performance for trace_id lookups, we deployed a second vlog cluster. Based on the existing vlog data, it stores an additional copy of each entry indexed by trace_id (trace_id modulo 10000). The format is as follows (screenshot omitted):

This solution stores an extra copy of trace_id index data, which significantly improves query performance but also significantly increases resource usage (by approximately 60%).
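The secondary-index scheme can be sketched as follows (the field names below, such as `trace_bucket`, are hypothetical; the real record layout was shown in a screenshot in the original issue):

```python
def index_record(entry: dict, partitions: int = 10000) -> dict:
    """Build the extra trace_id index entry: the numeric trace_id modulo
    `partitions` becomes a low-cardinality stream label, so a lookup first
    narrows to one of 10000 streams before scanning rows."""
    trace_id = entry["trace_id"]
    return {
        "_time": entry["_time"],
        "trace_bucket": str(int(trace_id) % partitions),  # stream label
        "trace_id": trace_id,
        "_msg": entry.get("_msg", "-"),
    }

rec = index_record({"_time": "2025-08-25T02:58:19.042Z",
                    "trace_id": "4490954132420954139",
                    "_msg": "v6;demo;..."})
print(rec["trace_bucket"])
```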

Furthermore, application monitoring also involves querying by fields other than trace_id, such as user ID. The same need exists in logging systems, for example fetching all related logs by trace_id or by user ID.

Describe the solution you'd like

Based on the comparison and analysis above, neither vlog nor vtrace fully covers our monitoring use cases on its own. Is there a solution that can support the following requirements?

vtrace:

  1. Improve vtrace's write performance, making it as close to vlog's performance as possible.
  2. Improve the efficiency of vtrace's query by trace_id, and support adding indexes for fields other than trace_id.

vlog:

  1. Support adding indexes for specific fields in vlog.

Some of my thoughts:
To improve write performance,

  1. Reduce the amount of data written: keep only the fields that need to be queried or filtered as labels, and pack the remaining data into _msg as a single row. This requires support for setting _msg explicitly. The general storage format would be:
{
    "_msg": "custom_value1;custom_value2;custom_value3;custom_value4;custom_value5;custom_value6;custom_value7;custom_value8;custom_value9;custom_value10;",
    "_time": "2025-08-24T20:07:10.009444528Z",
    "_stream_id": "0000000a000000001c3a3604e527859d9dccb13f6e64d312",
    "_stream": "{name=\"-\",resource_attr:service.name=\"demo\",resource_attr:service.namespace=\"jdjr\",span_attr:method=\"config(long,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,javax.servlet.http.HttpServletResponse)\",span_attr:service=\"com.demo.jr.sgm.server.controller.AgentController\"}",
    "name": "-",
    "span_attr:custom_label1": "custom_value1",
    "span_attr:custom_label2": "custom_value2",
    "span_attr:custom_label3": "custom_value3"

    ...

}
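The packing scheme in item 1 is easy to sketch: queryable fields stay as labels, everything else is joined into _msg with a fixed delimiter. A minimal illustration (field values are hypothetical; positions in _msg are fixed by convention, as in our existing vlog format):

```python
DELIM = ";"

def pack_msg(values: list[str]) -> str:
    # Fields that never need to be filtered on are flattened into one row.
    # Note: values must not themselves contain the delimiter.
    return DELIM.join(values)

def unpack_msg(msg: str) -> list[str]:
    # Positions are fixed by convention, so no per-field keys are stored.
    return msg.split(DELIM)

packed = pack_msg(["custom_value1", "custom_value2", "custom_value3"])
print(packed)
```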
  2. We write data to vlog via the Loki API (/insert/loki/api/v1/push). Could vtrace also support this HTTP data ingestion API, to improve vtrace's write performance?
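For context, the bodies we send to the Loki push endpoint follow the standard Loki JSON push format: streams with a label set and a list of [timestamp_ns, line] pairs. A sketch with illustrative labels and values:

```python
import json

def loki_push_body(labels: dict, line: str, ts_ns: int) -> str:
    """Build a JSON body for POSTing to /insert/loki/api/v1/push."""
    body = {
        "streams": [{
            "stream": labels,                 # becomes the log stream labels
            "values": [[str(ts_ns), line]],   # [[unix_ns_string, log_line], ...]
        }]
    }
    return json.dumps(body)

body = loki_push_body({"app_name": "demo", "tenant": "jdjr"},
                      "v6;demo;...", 1756090699042000000)
print(body)
```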

Regarding query performance,

  1. Could vtrace improve query performance by increasing the PartitionCount for trace_id, or by allowing users to set the PartitionCount themselves?
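The PartitionCount idea amounts to sharding trace_id lookups across more buckets, so each lookup scans less data. A generic sketch of such partitioning (not vtrace's actual implementation; the hash choice is an assumption for illustration):

```python
import hashlib

def partition_for(trace_id: str, partition_count: int) -> int:
    """Map a trace_id to a stable partition. With a larger partition_count,
    each partition holds fewer trace_ids, narrowing the data scanned per
    lookup; ingestion and queries must agree on the same mapping."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

p = partition_for("4490954132420954139", 1024)
print(p)
```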

Describe alternatives you've considered

No response

Additional information

No response

Metadata

Labels: enhancement (New feature or request)