@JaySon-Huang JaySon-Huang commented Dec 30, 2025

What problem does this PR solve?

Issue Number: close #10633, ref #10557

Problem Summary:
Currently, TiFlash Compute’s FileCache downloads data from S3 into local block storage as follows: it obtains an std::istream object via S3’s GetObject API and requests permission from the RateLimiter based on the total size of the entire S3 object before writing (implemented by #10555). Once the rate limiter approves the request, the full object is written to the local block storage in one burst.
Although this approach may keep the average write throughput below the quota threshold—making it appear compliant—the instantaneous write rate during the actual download can temporarily exceed the quota. This burstiness can trigger throttling by the underlying block storage, leading to elevated I/O latency.

What is changed and how it works?

disagg: Fine-grained rate limiting for S3 cache downloads
  * Use a 128KiB buffer for S3 cache downloads and rate limiting, avoiding the burst I/O that causes block storage throttling

Cloud vendors usually count 16KiB as 1 IOPS.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
1. deploy a next-gen cluster
2. load chbenchmark-1500 dataset and add tiflash replica
go-tpc tpcc prepare --warehouses 1500 -T 30 -D chbenchmark -H 172.31.7.2 -P 4000
go-tpc ch prepare -H 172.31.7.2 -P 4000 -D chbenchmark1k5_clustered 
# deploy 2 tiflash-compute with limited resources and ebs bandwidth
# m7a.2xlarge (8c32G), IOPS 6000 Throughput 125
# patch 1 tiflash-compute with the master
tiup cluster patch ng ./tiflash-patch.965cc8.tar.gz -N 172.31.10.1:9000 -y
# patch 1 tiflash-compute with this branch
tiup cluster patch ng ./tiflash-patch.8096c4.tar.gz -N 172.31.10.2:9000 -y
# shutdown 172.31.10.2 and run chbenchmark on the master-branch tiflash-compute
go-tpc ch run -H 172.31.7.2 -P4000 --warehouses 1500 -D chbenchmark1k5_clustered -T 50 -t 2 --time 15m
# shutdown 172.31.10.1 and start 172.31.10.2 to run chbenchmark on this branch
go-tpc ch run -H 172.31.7.2 -P4000 --warehouses 1500 -D chbenchmark1k5_clustered -T 50 -t 2 --time 15m
# check the metrics

Performance changes

[image] On the master branch, the write latency increased to 21ms.

[image] After this PR, the write latency and write throughput are smoother: the max write latency is 10ms, and it stays around 2ms most of the time.

The read latency is also slightly reduced.

# master branch
tpmC: 4218.7, tpmTotal: 9365.9, efficiency: 21.9%
[Summary] Q1     - Count: 133, Sum(ms): 4574862.2, Avg(ms): 34396.9
# this PR
tpmC: 4486.4, tpmTotal: 9974.4, efficiency: 23.3%
[Summary] Q1     - Count: 133, Sum(ms): 4547837.2, Avg(ms): 34194.5

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Summary by CodeRabbit

  • Bug Fixes

    • Safer S3 download flow using temp-file-then-rename to avoid partial/corrupt files and preserve capacity accounting.
  • Improvements

    • Consolidated download streaming into a single write-limited local path, improved timing precision, INFO-level logs, and updated download metrics.
    • Added debug-mode size sanity checks and clearer segment state transitions on failure/success.
  • Tests

    • Expanded eviction and background-download tests with additional scenarios, assertions, and instrumentation.
  • Public API

    • Renamed segment completion setter and added a status getter for clearer segment state handling.
  • Chores

    • Updated .gitignore to ignore a new secret file.


Signed-off-by: JaySon-Huang <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 30, 2025
@JaySon-Huang
Contributor Author

@coderabbitai

@JaySon-Huang
Contributor Author

@coderabbitai review

Signed-off-by: JaySon-Huang <[email protected]>
@kennedy8312

@coderabbitai full review

@coderabbitai

coderabbitai bot commented Jan 4, 2026

✅ Actions performed

Full review triggered.

@coderabbitai

coderabbitai bot commented Jan 4, 2026

📝 Walkthrough

Walkthrough

Replaced direct OFStream writes with a rate-limited, buffered download helper downloadToLocal(Aws::IOStream&, const String&, Int64, const WriteLimiterPtr&); FileCache now downloads to a temp file via buffered read/write, fsyncs and renames atomically. Adjusted FileSegment API (setComplete, getStatus) and updated tests and logging around eviction/restore flows.
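The temp-file-then-rename flow described above can be sketched as follows. This is an illustrative simplification built on plain standard-library streams, not TiFlash's actual downloadToLocal helper (which wraps the stream in ReadBufferFromIStream, routes writes through a WriteLimiter, and fsyncs the file descriptor before renaming):

```cpp
#include <filesystem>
#include <fstream>
#include <istream>
#include <string>

// Download the stream body into "<final_path>.tmp", then atomically publish
// it under the final name. Readers either see the previous state of the
// cache or the complete file, never a partial download.
inline void downloadToLocalSketch(std::istream & body, const std::string & final_path)
{
    const std::string temp_path = final_path + ".tmp";
    {
        std::ofstream out(temp_path, std::ios::binary);
        out << body.rdbuf(); // buffered copy of the whole stream
        out.flush();         // the real helper fsyncs the fd here
    }
    // rename(2) within one directory is atomic on POSIX filesystems.
    std::filesystem::rename(temp_path, final_path);
}
```

If the process crashes mid-download, only the .tmp file is left behind, and (as the walkthrough notes) the restore path cleans such leftovers up.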

Changes

Cohort / File(s) Summary
S3 FileCache core
dbms/src/Storages/S3/FileCache.cpp
Added downloadToLocal(Aws::IOStream&, const String&, Int64, const WriteLimiterPtr&). downloadImpl() delegates streaming to this helper using ReadBufferFromIStream + WriteBufferFromWritableFile + copyData, applies write-rate limiting, fsyncs temp file, updates metrics/status, and atomically renames to final path. Removed prior Aws::OFStream write path and added required includes.
FileSegment API
dbms/src/Storages/S3/FileCache.h
Renamed setSize(UInt64) → setComplete(UInt64) (sets size and marks status Complete, notifies readiness). Added Status getStatus() const and removed a duplicate declaration.
Tests & logging
dbms/src/Storages/S3/tests/gtest_filecache.cpp
Adjusted log levels/formatting and timing precision; added calculateCacheCapacity(StorageRemoteCacheConfig&, UInt64) helper; reworked eviction/restore test flow, waits, assertions and instrumentation to reflect background, temp-file + rate-limited download and eviction interactions.
Misc
tests/docker/next-gen-utils/.gitignore
Added ignore pattern master_key.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant FC as FileCache
    participant S3 as S3 (GetObject)
    participant Stream as Aws::IOStream
    participant ReadBuf as ReadBufferFromIStream
    participant WriteLim as PosixWritableFile + WriteLimiter
    participant Temp as Temp File
    participant Final as Final Cache File

    FC->>S3: GetObject() -> Aws::IOStream
    S3-->>FC: Aws::IOStream

    FC->>ReadBuf: wrap Aws::IOStream
    loop chunked transfer
      ReadBuf->>WriteLim: read chunk
      WriteLim->>Temp: write chunk (rate-limited)
    end

    WriteLim->>Temp: flush / fsync / close
    FC->>Temp: rename(temp -> final)
    FC-->>FC: setComplete / update metrics

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble bytes from S3 with care,

paced hops and buffered air,
I write to temp, then fsync tight,
rename snug — the cache sleeps right,
whiskers twitch: the download’s fair.

🚥 Pre-merge checks | ✅ 2 | ❌ 3
❌ Failed checks (1 warning, 2 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Linked Issues check ❓ Inconclusive The code changes implement fine-grained rate limiting during S3 downloads through a new downloadToLocal function that uses streaming buffers, but there is a discrepancy: the PR body states 16KiB buffer intent while the commit message indicates 128KiB was used. Clarify the actual buffer size used in the implementation and ensure it aligns with the 16KiB per-IOPS design mentioned in the PR description and linked issues.
Out of Scope Changes check ❓ Inconclusive Most changes focus on rate-limited S3 downloads, but modifications to gtest_filecache.cpp and the .gitignore file are tangential. The renaming of setSize to setComplete in FileSegment may be scope-related but lacks justification. Verify that FileSegment API changes and test modifications are essential to implementing fine-grained rate limiting, or separate them into a follow-up PR.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'disagg: Fine-grained rate limiting for S3 cache downloads' directly describes the main change: implementing fine-grained rate limiting for S3 cache downloads to avoid burst I/O.
Description check ✅ Passed The pull request description follows the template structure with all required sections completed including problem statement, changes explanation, commit message, and checklist items.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 965cc8c and 2bc6ff5.

📒 Files selected for processing (1)
  • dbms/src/Storages/S3/FileCache.cpp
🧰 Additional context used
🪛 Clang (14.0.6)
dbms/src/Storages/S3/FileCache.cpp

[error] 989-989: parameter 'istr' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 990-990: 3 adjacent parameters of 'downloadToLocal' of similar type are easily swapped by mistake

(bugprone-easily-swappable-parameters,-warnings-as-errors)


[error] 990-990: parameter 'fname' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 991-991: parameter 'content_length' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 992-992: parameter 'write_limiter' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 996-996: variable 'ofile' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 998-998: statement should be inside braces

(readability-braces-around-statements,-warnings-as-errors)


[error] 1002-1002: variable 'total_written' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1004-1004: variable 'rbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1007-1007: variable 'count' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1008-1008: if with identical then and else branches

(bugprone-branch-clone,-warnings-as-errors)


[error] 1008-1008: variable 'write_res' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)

🔇 Additional comments (4)
dbms/src/Storages/S3/FileCache.cpp (4)

24-24: LGTM! New includes support the buffered download implementation.

The added headers provide the necessary functionality for implementing rate-limited, buffered writes to local files during S3 downloads.

Also applies to: 26-26, 35-35, 38-38, 42-42


1003-1004: Verify: Is the 16KB buffer size intentional for rate limiting granularity?

The buffer size of 16KB seems smaller than typical S3 download buffers (which are often 64KB-1MB). However, given the PR's objective to implement fine-grained rate limiting to prevent I/O bursts, this smaller buffer size may be intentional to provide better rate-limiting granularity.

Could you confirm whether:

  1. The 16KB size is a deliberate choice for finer-grained rate limiting control?
  2. Performance testing shows this size provides adequate throughput while preventing bursts?

1052-1055: LGTM! Proper temp-file-then-rename pattern for atomic downloads.

Downloading to a temporary file first and then renaming to the final destination is a good practice that ensures:

  • Atomic file replacement (on most filesystems)
  • Incomplete files are never visible to readers
  • Proper cleanup on failure (temp files are cleaned up during restore)

1065-1071: LGTM! Appropriate log level for download completion.

Changing the log level from DEBUG to INFO for successful download completion is appropriate, as this represents a significant operation that should be visible in standard operational logs for monitoring and troubleshooting.

Signed-off-by: JaySon-Huang <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 5, 2026
@JaySon-Huang JaySon-Huang changed the title [WIP] disagg: better cache downloading [WIP] disagg: Fine-grained rate limiting for S3 cache downloads Jan 5, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
dbms/src/Storages/S3/FileCache.cpp (2)

989-1008: Good refactor using buffered I/O with rate limiting.

The new downloadToLocal helper correctly uses copyData which handles partial writes and buffering internally, addressing the previous partial write concern. The rate limiting is properly applied through PosixWritableFile.

Two minor observations:

  1. When content_length == 0, the file is created but sync() is not called. For empty files this is benign, but for consistency you may want to sync in all cases.

  2. Consider adding braces around the early return for style consistency (per static analysis hint).

Optional: Add braces for style consistency
     // simply create an empty file
-    if (unlikely(content_length <= 0))
+    if (unlikely(content_length <= 0)) {
         return;
+    }

1041-1049: Consider removing redundant file size verification.

fsize is read from the filesystem immediately after rename, then verified against content_length which was already used to copy the exact number of bytes. Since copyData(rbuf, wbuf, content_length) copies exactly content_length bytes and wbuf.sync() ensures it's flushed, this filesystem read is redundant overhead.

Optional: Use content_length directly
     downloadToLocal(result.GetBody(), temp_fname, content_length, write_limiter);
     std::filesystem::rename(temp_fname, local_fname);
-    auto fsize = std::filesystem::file_size(local_fname);
 
-    capacity_metrics->addUsedSize(local_fname, fsize);
-    RUNTIME_CHECK_MSG(
-        fsize == static_cast<UInt64>(content_length),
-        "local_fname={}, file_size={}, content_length={}",
-        local_fname,
-        fsize,
-        content_length);
+    capacity_metrics->addUsedSize(local_fname, content_length);
     file_seg->setStatus(FileSegment::Status::Complete);

Alternatively, if the sanity check is intentionally defensive (e.g., against filesystem corruption), keeping it is reasonable—just note it adds I/O overhead.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bc6ff5 and ee3f153.

📒 Files selected for processing (2)
  • dbms/src/Storages/S3/FileCache.cpp
  • dbms/src/Storages/S3/tests/gtest_filecache.cpp
🧰 Additional context used
🪛 Clang (14.0.6)
dbms/src/Storages/S3/FileCache.cpp

[error] 990-990: parameter 'istr' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 991-991: 3 adjacent parameters of 'downloadToLocal' of similar type are easily swapped by mistake

(bugprone-easily-swappable-parameters,-warnings-as-errors)


[error] 991-991: parameter 'fname' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 992-992: parameter 'content_length' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 993-993: parameter 'write_limiter' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 999-999: statement should be inside braces

(readability-braces-around-statements,-warnings-as-errors)


[error] 1004-1004: variable 'rbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1005-1005: variable 'wbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)

🔇 Additional comments (4)
dbms/src/Storages/S3/tests/gtest_filecache.cpp (2)

210-230: LGTM! Well-designed helper for precise cache capacity calculation.

The iterative adjustment with the forward && backward assertion is a good defensive approach to prevent infinite oscillation when the target capacity is unreachable due to rounding. This ensures deterministic test behavior.


303-326: LGTM! Clear separation of evict-fail vs evict-success test paths.

The reworked test flow with explicit waitForBgDownload calls and clearer logging makes the eviction scenarios easier to follow and debug. Holding file_seg pointers to mock in-progress reads is a valid approach to prevent eviction.

dbms/src/Storages/S3/FileCache.cpp (2)

1036-1041: Good use of temp file + rename pattern for atomic file creation.

This approach ensures that if a download is interrupted (crash, exception), the cache won't contain a partial/corrupted file. The existing restore logic already handles cleanup of .tmp files (lines 1243-1246).


1051-1057: LGTM! Log level change to INFO is appropriate for download completion events.

Making download completion visible at INFO level aligns with the PR objective of better observability for cache downloading behavior.

Signed-off-by: JaySon-Huang <[email protected]>
@JaySon-Huang
Contributor Author

@coderabbitai full review

@coderabbitai

coderabbitai bot commented Jan 5, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
dbms/src/Storages/S3/FileCache.cpp (1)

989-1008: Rate-limited streaming download looks correct; minor style suggestion.

The downloadToLocal function correctly implements fine-grained rate limiting by:

  1. Creating PosixWritableFile with the write_limiter
  2. Using buffered I/O via ReadBufferFromIStream and WriteBufferFromWritableFile
  3. Calling copyData which respects the limiter during writes
  4. Calling sync() before returning

This achieves the PR objective of controlling instantaneous write throughput during streaming.

One minor observation: the static analysis warnings about "unused parameters" appear to be false positives since all parameters are clearly used.

Optional: Add braces around single-statement if body for consistency
     if (unlikely(content_length <= 0))
+    {
         return;
+    }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 965cc8c and a4ce045.

📒 Files selected for processing (3)
  • dbms/src/Storages/S3/FileCache.cpp
  • dbms/src/Storages/S3/FileCache.h
  • dbms/src/Storages/S3/tests/gtest_filecache.cpp
🧰 Additional context used
🪛 Clang (14.0.6)
dbms/src/Storages/S3/FileCache.cpp

[error] 990-990: parameter 'istr' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 991-991: 3 adjacent parameters of 'downloadToLocal' of similar type are easily swapped by mistake

(bugprone-easily-swappable-parameters,-warnings-as-errors)


[error] 991-991: parameter 'fname' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 992-992: parameter 'content_length' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 993-993: parameter 'write_limiter' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 999-999: statement should be inside braces

(readability-braces-around-statements,-warnings-as-errors)


[error] 1004-1004: variable 'rbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1005-1005: variable 'wbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)

🔇 Additional comments (8)
dbms/src/Storages/S3/FileCache.h (2)

92-98: LGTM! Atomic completion signaling is well-designed.

The setComplete method correctly combines size update, status transition, and notification in a single locked section. This prevents race conditions where a waiter could observe a stale status after the size is updated.


108-112: LGTM!

The getStatus() getter properly acquires the lock before returning the status, ensuring thread-safe reads.

dbms/src/Storages/S3/FileCache.cpp (3)

1030-1037: LGTM!

The failure handling correctly logs the issue and sets the segment status to Failed, which will notify any waiting threads via cv_ready.notify_all() in setStatus().


1041-1060: Atomic download pattern is correct; consider exception safety for rename.

The temp-file-then-rename approach ensures atomicity—readers never see partial files. The sequence (download → fsync → rename → update metrics → mark complete) is correct.

However, if std::filesystem::rename throws (e.g., cross-filesystem move, permission issue), the temp file remains orphaned and capacity metrics are not updated, but the reserved space is still consumed. This is a minor edge case since rename within the same directory rarely fails.

Consider whether a try-catch around rename with cleanup is warranted for robustness:

try {
    std::filesystem::rename(temp_fname, local_fname);
} catch (...) {
    std::filesystem::remove(temp_fname); // cleanup orphan
    throw;
}

1061-1067: LGTM!

Good observability with cost metrics logged at INFO level for successful downloads.

dbms/src/Storages/S3/tests/gtest_filecache.cpp (3)

210-230: LGTM! Useful test helper with proper oscillation guard.

The calculateCacheCapacity helper iteratively adjusts config.capacity to achieve the exact target dt_size. The ASSERT_FALSE(forward && backward) correctly detects if the algorithm oscillates (which would indicate a bug in the delta_rate calculation logic).


303-357: LGTM! Well-structured eviction tests.

The test cases clearly separate:

  1. Evict-fail path (lines 303-327): Holds file_seg references to simulate in-use files, preventing eviction
  2. Evict-success path (lines 329-357): Allows eviction and verifies new files are cached

The added waitForBgDownload calls ensure background operations complete before assertions, preventing flaky tests.


116-116: LGTM!

Improved log format with elapsed seconds using .3f precision for better readability.

Contributor

Copilot AI left a comment


Pull request overview

This WIP pull request implements fine-grained rate limiting for S3 cache downloads in TiFlash. The main goal is to provide a safer download flow using a temp-file-then-rename pattern to reduce corrupted or partial files while maintaining proper capacity accounting.

Key Changes:

  • Refactored S3 download logic to use temp files with atomic rename operations
  • Introduced downloadToLocal helper function with write rate limiting support
  • Updated logging from DEBUG to INFO level with improved timing precision
  • Enhanced test coverage for cache eviction scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
dbms/src/Storages/S3/FileCache.cpp Implements new downloadToLocal function with rate limiting, refactors downloadImpl to use temp-file-then-rename pattern, improves logging and metrics
dbms/src/Storages/S3/FileCache.h Renames setSize to setComplete to combine size setting and status update, relocates getStatus method for better organization
dbms/src/Storages/S3/tests/gtest_filecache.cpp Refactors eviction test cases to properly test failure and success scenarios, improves logging detail and precision


Signed-off-by: JaySon-Huang <[email protected]>
@JaySon-Huang JaySon-Huang changed the title [WIP] disagg: Fine-grained rate limiting for S3 cache downloads disagg: Fine-grained rate limiting for S3 cache downloads Jan 5, 2026
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 5, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @dbms/src/Storages/S3/FileCache.cpp:
- Around line 989-1008: The downloadToLocal function uses a 128 KiB
MAX_BUFFER_SIZE and relies on WriteLimiter refill logic to control write rate;
update the function by adding a concise comment explaining that buffer size sets
I/O chunk granularity while WriteLimiter (with its refill period) enforces
overall throughput and prevents bursts, and note why 128 KiB was chosen
(tradeoff: system call overhead vs chunk size); validate rate-limiting through
unit/integration tests or metrics (reference GET_METRIC, WriteLimiter, copyData)
and document the validation approach; optionally reorder the downloadToLocal
parameters to group IO-related args (Aws::IOStream & istr and Int64
content_length together) to address the "easily swappable parameters" warning,
and add a code annotation or pragma to suppress the false-positive "parameter
unused" static analysis message for parameters used indirectly by constructors
(PosixWritableFile, ReadBufferFromIStream, WriteBufferFromWritableFile).
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8096c47 and 96aa3fd.

📒 Files selected for processing (2)
  • dbms/src/Storages/S3/FileCache.cpp
  • tests/docker/next-gen-utils/.gitignore
✅ Files skipped from review due to trivial changes (1)
  • tests/docker/next-gen-utils/.gitignore
🧰 Additional context used
🪛 Clang (14.0.6)
dbms/src/Storages/S3/FileCache.cpp

[error] 990-990: parameter 'istr' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 991-991: 3 adjacent parameters of 'downloadToLocal' of similar type are easily swapped by mistake

(bugprone-easily-swappable-parameters,-warnings-as-errors)


[error] 991-991: parameter 'fname' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 992-992: parameter 'content_length' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 993-993: parameter 'write_limiter' is unused

(misc-unused-parameters,-warnings-as-errors)


[error] 999-999: statement should be inside braces

(readability-braces-around-statements,-warnings-as-errors)


[error] 1004-1004: variable 'rbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 1005-1005: variable 'wbuf' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)

🔇 Additional comments (5)
dbms/src/Storages/S3/FileCache.cpp (5)

24-24: LGTM! Includes support the new buffered streaming approach.

The new includes are appropriate for implementing fine-grained rate limiting during S3 downloads using buffered I/O.

Also applies to: 26-27, 29-29, 37-37, 40-40


1028-1037: LGTM! Improved error handling for reservation failures.

The change to log at INFO level and set the file segment status to Failed provides graceful degradation when space reservation fails. This is consistent with the error handling pattern used elsewhere in the download flow.


1041-1045: LGTM! Atomic download pattern prevents partial file corruption.

The download-to-temp-then-rename pattern ensures that the cache never contains partially downloaded or corrupted files if the download fails mid-stream. This is a good practice for maintaining cache consistency.


1047-1056: LGTM! Debug-mode sanity check verifies download integrity.

The file size validation in debug builds helps catch issues during development without impacting production performance.


1058-1067: LGTM! Proper finalization of successful downloads.

The code correctly updates metrics, sets the file segment to complete with the actual content length, and logs the successful download with timing information.

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Jan 9, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Jan 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CalvinNeo, JinheLin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jan 9, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jan 9, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-01-09 02:29:26.933129139 +0000 UTC m=+928522.751437561: ☑️ agreed by CalvinNeo.
  • 2026-01-09 02:32:37.945411507 +0000 UTC m=+928713.763719939: ☑️ agreed by JinheLin.

@ti-chi-bot ti-chi-bot bot merged commit 1a39c7f into pingcap:master Jan 9, 2026
8 checks passed
@JaySon-Huang JaySon-Huang deleted the better_cache_downloading branch January 9, 2026 02:38
@JaySon-Huang
Contributor Author

/cherry-pick release-nextgen-20251011

@ti-chi-bot
Member

@JaySon-Huang: new pull request created to branch release-nextgen-20251011: #10657.

Details

In response to this:

/cherry-pick release-nextgen-20251011

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
