Skip to content

Conversation

@siamskoi
Copy link

@siamskoi siamskoi commented Jan 28, 2025

Reason for This PR

Static file compression functionality is required. Fileserver is not suitable for this, as it requires nginx or apache for its use. With nginx or apache, it makes no sense to use the RR functionality for working with static files.

Description of Changes

The functionality of compressing static files using gzip has been added, the corresponding header is added to the response, a .gz file is created to reduce the load on the CPU for re-compressing files. Added a gzip setting to enable this functionality.

curl -I -X GET --header "Accept-Encoding: gzip" http://127.0.0.1:8080/style.css

Answer for small files:

HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Encoding: gzip
Content-Type: text/css; charset=utf-8
Last-Modified: Tue, 28 Jan 2025 11:10:13 GMT
Vary: Accept-Encoding
Date: Tue, 28 Jan 2025 17:07:22 GMT
Content-Length: 46

And answer for big files:

HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Encoding: gzip
Content-Type: text/css; charset=utf-8
Last-Modified: Thu, 16 Jan 2025 20:09:56 GMT
Vary: Accept-Encoding
Date: Tue, 28 Jan 2025 17:32:31 GMT
Transfer-Encoding: chunked

License Acceptance

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

PR Checklist

[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]

  • All commits in this PR are signed (git commit -s).
  • The reason for this PR is clearly provided (issue no. or explanation).
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • Any user-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.

Summary by CodeRabbit

  • New Features

    • Added support for optional gzip compression when serving static files
    • Implemented file compression mechanism for improved content delivery efficiency
  • Bug Fixes

    • Enhanced error handling during file compression and serving processes
  • Performance

    • Introduced thread-safe compression caching to optimize file serving

@coderabbitai
Copy link

coderabbitai bot commented Jan 28, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The pull request introduces gzip compression support for static file serving in a plugin system. A new Gzip configuration option is added to enable compression, and corresponding methods are implemented in the plugin to handle gzip file generation and serving. The changes allow dynamic compression of static files based on client's Accept-Encoding header, with thread-safe file caching mechanisms.

Changes

File Changes
config.go Added Gzip bool field to Config struct with mapstructure:"gzip" tag
plugin.go - Added gzip bool field to Plugin struct
- Added cacheMutex sync.Mutex for thread-safe operations
- Implemented createCompressedFile method
- Implemented compressContent method for gzip compression

Sequence Diagram

sequenceDiagram
    participant Client
    participant Plugin
    participant FileSystem
    participant Compressor

    Client->>Plugin: Request static file
    Plugin->>Client: Check Accept-Encoding header
    alt Gzip supported
        Plugin->>FileSystem: Check for existing gzip file
        alt Gzip file not exists
            Plugin->>Compressor: Compress file
            Compressor->>FileSystem: Save compressed file
        end
        Plugin->>Client: Serve gzip file
    else Gzip not supported
        Plugin->>Client: Serve original file
    end
Loading

Poem

🐰 Gzip, oh gzip, a compression delight,
Bytes dancing swiftly, taking flight!
Files shrinking with magical ease,
Serving content with rabbit-like breeze
Performance leaps, bandwidth so light! 🚀


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
config.go (1)

34-35: Add documentation for the Gzip field.

Please add a comment explaining the purpose and behavior of the Gzip configuration option.

+	// Gzip enables compression of static files using gzip.
+	// When enabled, the plugin will create and serve .gz files for supported content.
 	Gzip bool `mapstructure:"gzip"`
plugin.go (1)

290-307: Add modification time check and disk space handling.

The compressed file cache should be invalidated when the original file changes, and disk space issues should be handled gracefully.

 func (s *Plugin) compressContent(originalFilePath string, originalFile http.File) (http.File, error) {
     compressedFilePath := originalFilePath + ".gz"
 
-    if _, err := os.Stat(compressedFilePath); os.IsNotExist(err) {
+    // Check if compressed file exists and is up to date
+    compressedInfo, err := os.Stat(s.cfg.Dir + compressedFilePath)
+    originalInfo, err := originalFile.Stat()
+    if err != nil || os.IsNotExist(err) || compressedInfo.ModTime().Before(originalInfo.ModTime()) {
+        // Check available disk space
+        var stat unix.Statfs_t
+        if err := unix.Statfs(s.cfg.Dir, &stat); err == nil {
+            // Require at least 100MB free
+            if stat.Bavail*uint64(stat.Bsize) < 100*1024*1024 {
+                s.log.Error("insufficient disk space for compression")
+                return originalFile, nil
+            }
+        }
+
         if err := s.createCompressedFile(originalFile, compressedFilePath); err != nil {
             s.log.Debug("Error creating compressed file: ", zap.Error(err))
-            return nil, err
+            // Fall back to serving uncompressed on error
+            return originalFile, nil
         }
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce54095 and 3297bdf.

📒 Files selected for processing (2)
  • config.go (1 hunks)
  • plugin.go (5 hunks)
🔇 Additional comments (1)
plugin.go (1)

58-58: LGTM!

The gzip field is properly initialized from the configuration.

Also applies to: 90-90

plugin.go Outdated
RootPluginName = "http"
)

var cacheMutex sync.Mutex
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Move cacheMutex to Plugin struct.

The global mutex could cause issues with testing and multiple plugin instances. Consider moving it to the Plugin struct.

-var cacheMutex sync.Mutex

 type Plugin struct {
     // ... other fields ...
+    cacheMutex sync.Mutex
     gzip bool
 }

Committable suggestion skipped: line range outside the PR's diff.

plugin.go Outdated
Comment on lines 237 to 252
if s.gzip && r.Header.Get("Accept-Encoding") == "gzip" {
s.log.Debug("gzip compression requested")
cf, err := s.compressContent(fp, f)
if err != nil {
s.log.Error("failed to compress content", zap.Error(err))
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
return
}
err = f.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
f = cf
// No new defer to close needed for cf. f is already closed, previous defer exist
w.Header().Set("Content-Encoding", "gzip")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve gzip compression handling.

Two issues need to be addressed:

  1. The Accept-Encoding check is too strict. Browsers can send multiple encodings.
  2. Missing check for already compressed content types (e.g., images, videos).

Apply this diff to fix the issues:

-        if s.gzip && r.Header.Get("Accept-Encoding") == "gzip" {
+        if s.gzip && strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
+            // Skip compression for already compressed formats
+            contentType := http.DetectContentType(make([]byte, 512))
+            if strings.Contains(contentType, "image/") ||
+               strings.Contains(contentType, "video/") ||
+               strings.Contains(contentType, "audio/") {
+                s.log.Debug("skipping compression for already compressed content", 
+                    zap.String("type", contentType))
+                http.ServeContent(w, r, finfo.Name(), finfo.ModTime(), f)
+                return
+            }

Committable suggestion skipped: line range outside the PR's diff.

plugin.go Outdated
Comment on lines 259 to 288
func (s *Plugin) createCompressedFile(originalFile http.File, compressedPath string) error {
cacheMutex.Lock()
defer cacheMutex.Unlock()

// We check again to avoid a race between threads.
if _, err := os.Stat(compressedPath); err == nil {
return nil
}
compressedFile, err := os.Create(s.cfg.Dir + compressedPath)
if err != nil {
return err
}
defer func() {
err := compressedFile.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
}()

gz := gzip.NewWriter(compressedFile)
defer func() {
err := gz.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
}()

_, err = io.Copy(gz, originalFile)
return err
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Address security and resource management concerns.

Several issues need attention:

  1. No cleanup of partially written files on error
  2. Missing maximum file size check
  3. Potential path traversal vulnerability

Apply these improvements:

 func (s *Plugin) createCompressedFile(originalFile http.File, compressedPath string) error {
+    // Prevent path traversal
+    if !strings.HasPrefix(compressedPath, "/") {
+        compressedPath = "/" + compressedPath
+    }
+    cleanPath := path.Clean(compressedPath)
+    if !strings.HasPrefix(cleanPath, "/") {
+        return fmt.Errorf("invalid path")
+    }
+
     cacheMutex.Lock()
     defer cacheMutex.Unlock()
 
-    compressedFile, err := os.Create(s.cfg.Dir + compressedPath)
+    tempFile, err := os.CreateTemp(s.cfg.Dir, "gzip-*.tmp")
     if err != nil {
         return err
     }
+    tempName := tempFile.Name()
     defer func() {
-        err := compressedFile.Close()
+        err := tempFile.Close()
         if err != nil {
             s.log.Error("failed to close file", zap.Error(err))
         }
+        if err != nil {
+            _ = os.Remove(tempName)
+        }
     }()
 
-    gz := gzip.NewWriter(compressedFile)
+    // Limit the size of compressed files
+    maxSize := int64(10 * 1024 * 1024) // 10MB
+    limitedReader := io.LimitReader(originalFile, maxSize)
+
+    gz := gzip.NewWriter(tempFile)
     defer func() {
         err := gz.Close()
         if err != nil {
             s.log.Error("failed to close file", zap.Error(err))
         }
     }()
 
-    _, err = io.Copy(gz, originalFile)
-    return err
+    _, err = io.Copy(gz, limitedReader)
+    if err != nil {
+        return err
+    }
+
+    // Atomically rename the temp file to the final compressed file
+    return os.Rename(tempName, s.cfg.Dir+cleanPath)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (s *Plugin) createCompressedFile(originalFile http.File, compressedPath string) error {
cacheMutex.Lock()
defer cacheMutex.Unlock()
// We check again to avoid a race between threads.
if _, err := os.Stat(compressedPath); err == nil {
return nil
}
compressedFile, err := os.Create(s.cfg.Dir + compressedPath)
if err != nil {
return err
}
defer func() {
err := compressedFile.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
}()
gz := gzip.NewWriter(compressedFile)
defer func() {
err := gz.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
}()
_, err = io.Copy(gz, originalFile)
return err
}
func (s *Plugin) createCompressedFile(originalFile http.File, compressedPath string) error {
// Prevent path traversal
if !strings.HasPrefix(compressedPath, "/") {
compressedPath = "/" + compressedPath
}
cleanPath := path.Clean(compressedPath)
if !strings.HasPrefix(cleanPath, "/") {
return fmt.Errorf("invalid path")
}
cacheMutex.Lock()
defer cacheMutex.Unlock()
// We check again to avoid a race between threads.
if _, err := os.Stat(compressedPath); err == nil {
return nil
}
tempFile, err := os.CreateTemp(s.cfg.Dir, "gzip-*.tmp")
if err != nil {
return err
}
tempName := tempFile.Name()
defer func() {
err := tempFile.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
if err != nil {
_ = os.Remove(tempName)
}
}()
// Limit the size of compressed files
maxSize := int64(10 * 1024 * 1024) // 10MB
limitedReader := io.LimitReader(originalFile, maxSize)
gz := gzip.NewWriter(tempFile)
defer func() {
err := gz.Close()
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
}()
_, err = io.Copy(gz, limitedReader)
if err != nil {
return err
}
// Atomically rename the temp file to the final compressed file
return os.Rename(tempName, s.cfg.Dir+cleanPath)
}

@rustatian rustatian changed the title Add gzip support for static files feature: add gzip support for static files Jan 28, 2025
plugin.go Outdated
Comment on lines 30 to 31
var cacheMutex sync.Mutex

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No global variables (this is a bad practice in Go) allowed

plugin.go Outdated
if err != nil {
s.log.Error("failed to close file", zap.Error(err))
}
f = cf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is that. You're closing a file and then reassign another file to the closed one to be closed by the initial defer?

plugin.go Outdated
}

func (s *Plugin) createCompressedFile(originalFile http.File, compressedPath string) error {
cacheMutex.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No globals

plugin.go Outdated
// opentelemetry
prop propagation.TextMapPropagator

gzip bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

option without explanation comment. Also I think this is a bad name for the option, because what a gzip=true should do in the gzip middleware?

plugin.go Outdated
if err != nil {
return err
}
defer func() {
Copy link
Member

@rustatian rustatian Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too many unnecessarry defers. There're no errors returning by another function calls, to overload a stack with 2 defers in a row.

plugin.go Outdated
}()

gz := gzip.NewWriter(compressedFile)
defer func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

second unnecessary defer

@rustatian
Copy link
Member

Hey @siamskoi 👋🏻
Thank you for the PR 👍🏻

A few additional notes:

  1. Tests are missing.
  2. PR with docs to the roadrunner-server/docs is also missing.

@rustatian rustatian added the enhancement New feature or request label Jan 29, 2025
@rustatian rustatian marked this pull request as draft January 29, 2025 05:41
Introduce `gzipEnabled` and `gzipMaxFileSize` for fine-grained gzip compression settings. Added logic to skip gzip for already compressed content and enforce file size limits. Improved thread safety and path validation in file compression.
Introduce `gzipEnabled` and `gzipMaxFileSize` for fine-grained gzip compression settings. Added logic to skip gzip for already compressed content and enforce file size limits. Improved thread safety and path validation in file compression.
Replaced custom gzip logic with `github.com/klauspost/compress` to simplify compression handling and improve maintainability. Removed unused attributes and methods related to custom gzip implementation, such as max file size and custom file compression logic. This change streamlines the code and leverages the capabilities of an external, well-maintained library.
Response map[string]string `mapstructure:"response"`

// GzipEnabled determines if gzip compression is enabled for serving static files.
GzipEnabled bool `mapstructure:"gzip_enabled"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a good name for the option. From my POV option name should be self-explanatory.
I think that we may remove that option at all and check the gzip header only.

if s.gzipEnabled && strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
s.log.Debug("gzip compression requested")
// Skip compression for already compressed formats
contentType := http.DetectContentType(make([]byte, 512))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you detecting the content-type for the empty slice of bytes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants