Skip to content

Conversation

@itsvyle
Copy link
Contributor

@itsvyle itsvyle commented Jan 6, 2026

Which issue does this PR resolve?

Resolves #1548 (the original fix only resolved the extract plugin, and not the archive plugin)

Rationale of this PR

The built-in preview of .tar.* files just shows a .tar file inside - this is because 7zip just decompresses the first compression layer, without extracting the tar

I acknowledge #1440, but I feel like until a proper way to walk through archives is implemented, it is nice to have a way to preview .tar.* archive contents.

The bulk of my changes is that if the file name resembles a tar file, I use either tar (if supported, on unix systems, and if the actual program exists) or 7z/7zz to preview the archive contents (I do so by extracting the .tar, and directly piping it back through 7-zip to list the contents). The reason I prefer to use "native" tar if possible is that tar supports extra formats, such as .tar.zst files, which 7-zip still doesn't support.

I also refactored the archive.lua file, to re-use code for the new tar support, as well as give more options for external plugins which do archive preview to use some of the functions in this plugin, such as the final display as a list of files in the preview window.

PS: I am quite new to Lua, and welcome any comments on how to make my code more idiomatic.

@sxyazi
Copy link
Owner

sxyazi commented Jan 7, 2026

Thank you for your patch!

This PR is quite large, 212 lines added and 15 lines deleted, while the entire archive plugin is only 173 lines, which makes it pretty hard to review.

Could you minimize the changes to only the necessary parts:

  1. Revert the original style at line 30.
  2. Revert the original style at line 43.
  3. Remove comment blocks like ========= to keep coding style consistent - there are no such comments elsewhere in the codebase.
  4. Revert the relocation of the M:seek method at line 49.
  5. Remove the tar-related code so this PR can focus on enhancing the existing 7zip functionality. Adding tar would require it to be listed as an optional dependency for Yazi, which increases the dependency surface and I'm not confident about its cross-platform support and I don't want to take on maintaining another tool in addition to 7zip (I want to maintain Yazi long-term and that would increase maintenance cost). That said, tar can be made as a good community plugin that interested users can use freely.

@itsvyle
Copy link
Contributor Author

itsvyle commented Jan 7, 2026

Hi,

I have implemented the changes you requested.

For the tar functionality - the plugin would be less than 100 lines (thanks to using functionality which is available in the API of archive); do I publish the plugin myself? Or should I make a PR to https://github.com/yazi-rs/plugins?

@sxyazi
Copy link
Owner

sxyazi commented Jan 7, 2026

Thank you for the change!

do I publish the plugin myself? Or should I make a PR to https://github.com/yazi-rs/plugins?

Either way is fine with me :)

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the built-in archive plugin to support previewing compressed tar archives (e.g., .tar.gz, .tar.xz, .tar.zst). When the plugin detects an archive containing only a .tar file, it automatically decompresses and displays the tarball's contents instead of just showing the intermediate tar file.

Key changes:

  • Added logic to detect and decompress single-file tar archives using piped 7z commands
  • Refactored the 7z output parsing into reusable functions
  • Added packed_size metadata tracking for size-based filtering (1GB limit)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

local files, bound, err = self.list_archive({ "-p", tostring(job.file.path) }, job.skip, limit)

local first = (#files == 1 and files[1]) or (#files == 0 and M.list_if_only_one(job.file.path))
if first and M.should_decompress_tar(first) then
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic uses M.list_if_only_one(job.file.path) as a fallback when files array is empty or has one element, but list_if_only_one returns false (a boolean) when the condition isn't met. The first variable could be assigned false, and then passed to should_decompress_tar, which expects a table with packed_size and path fields. This will cause an error when trying to access false.packed_size. Add a nil/boolean check before calling should_decompress_tar.

Suggested change
if first and M.should_decompress_tar(first) then
if type(first) == "table" and M.should_decompress_tar(first) then

Copilot uses AI. Check for mistakes.
function M.is_encrypted(s) return s:find(" Wrong password", 1, true) end

function M.is_tar(path) return M.list_meta { "-p", tostring(path) } == "tar" end

Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function should_decompress_tar lacks documentation explaining the purpose of the 1GB limit on packed_size and why this threshold was chosen. This magic number should be documented to help future maintainers understand the performance/resource trade-off.

Suggested change
---Heuristic that decides whether to treat a single-file archive as a compressed tarball.
---We only attempt to decompress members whose packed size is at most 1 GiB (1024 * 1024 * 1024 bytes)
---to avoid spending excessive CPU time and memory on very large archives during preview. This value
---is a practical upper bound chosen to balance responsiveness and resource usage and can be adjusted
---if performance characteristics or typical archive sizes change.

Copilot uses AI. Check for mistakes.
goto continue
end

key, value = next:match("^(%u[%a ]+) = (.-)[\r\n]+")
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern has been updated from ^(%u%l+) to ^(%u[%a ]+) to support multi-word keys like "Packed Size". However, the new pattern %u[%a ]+ only ensures the first character is uppercase and allows any subsequent alphabetic characters or spaces. This could match malformed keys. Consider using a more precise pattern or add a comment explaining the expected key format from 7z output.

Suggested change
key, value = next:match("^(%u[%a ]+) = (.-)[\r\n]+")
-- 7z key lines look like "Path = ...", "Packed Size = ...":
-- one or more capitalized words (e.g. "Packed Size"), each %u%l*, separated by spaces.
local key_head, key_tail, val = next:match("^(%u%l*(%s%u%l*)*) = (.-)[\r\n]+")
if key_head then
key = key_head .. (key_tail or "")
value = val
else
key, value = nil, nil
end

Copilot uses AI. Check for mistakes.
while #files < 2 do
local next, event = child:read_line()
if event == 0 then
local attr, size, packed_size, path = next:sub(20):match("([^ ]+) +(%d+) +(%d+) +([^\r\n]+)")
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern next:sub(20):match("([^ ]+) +(%d+) +(%d+) +([^\r\n]+)") assumes a fixed column width format starting at character 20. This is fragile and could break if the output format changes slightly or with different locales. Consider using the -slt flag here as well for structured parsing, or add error handling to verify the parsed data is valid before adding it to the files table.

Suggested change
local attr, size, packed_size, path = next:sub(20):match("([^ ]+) +(%d+) +(%d+) +([^\r\n]+)")
-- Parse attribute, size, packed size, and path using flexible whitespace, without assuming fixed columns
local attr, size, packed_size, path = next:match("^([^%s]+)%s+(%d+)%s+(%d+)%s+(.+)$")

Copilot uses AI. Check for mistakes.
{ "x", "-so", table.unpack(args) },
{ "l", "-ba", "-slt", "-ttar", "-sccUTF-8", "-si" }
)
if not dst then
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When spawn_7z_piped returns an error (when dst is falsy), the code checks if not dst and returns. However, spawn_7z_piped may return src as non-nil even when dst is nil, potentially leaving the src process running if the error path is taken. The function should either return nil for both when there's an error, or the caller should handle cleanup of src explicitly.

Suggested change
if not dst then
if not dst then
if src then
src:start_kill()
end

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +75
-- Spawn a 7z instance which pipes a "7z {src_args}" into a "7z {dst_args}"
-- Used for previewing compressed tarballs, by doing "7z x -so .. | 7z l -si .."
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function comment says "Spawn a 7z instance which pipes a '7z {src_args}' into a '7z {dst_args}'", but the actual signature and usage pattern isn't clearly documented. The return values are not documented in the comment block, and it's unclear from the comment what happens in error cases. Consider adding proper documentation for parameters and return values, especially since this function has complex error handling behavior.

Suggested change
-- Spawn a 7z instance which pipes a "7z {src_args}" into a "7z {dst_args}"
-- Used for previewing compressed tarballs, by doing "7z x -so .. | 7z l -si .."
-- Spawn two 7z instances and pipe the stdout of "7z {src_args}" into the stdin of "7z {dst_args}".
-- Used for previewing compressed tarballs, e.g. "7z x -so .. | 7z l -si ..".
--
-- Parameters:
-- src_args: arguments passed to the first 7z invocation (typically a table accepted by Command:arg()).
-- dst_args: arguments passed to the second 7z invocation (typically a table accepted by Command:arg()).
--
-- Return values:
-- src: the child process handle for the source 7z instance (or nil on failure).
-- dst: the child process handle for the destination 7z instance whose stdout is used for reading
-- the final output (or nil on failure).
-- last_err: the last error message returned by the underlying Command:spawn() call, or nil if both
-- processes were spawned successfully.
--
-- Error handling:
-- The function tries to use "7zz" first and falls back to "7z". If it fails to spawn the required
-- processes with both binaries, it calls ya.err("Failed to start either `7zz` or `7z`, error: ...")
-- and returns the result of ya.err, with no valid child processes running.

Copilot uses AI. Check for mistakes.
files[#files] = nil
return files, bound, err
end

Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function list_if_only_one lacks documentation. It's not clear from the name alone what "only one" refers to, what the return value represents, or when it should be used. Add a docstring explaining that this function checks if an archive contains only one file and returns that file's metadata, or false otherwise.

Suggested change
---Check whether an archive contains exactly one file.
---This is a workaround for certain compressed tarballs (e.g. `.tar.xz`) where
---7-zip omits the inner `.tar` file when `-slt` is used. In such cases, this
---function lists the archive without `-slt` and returns the single file if
---and only if there is exactly one entry.
---@param path string Archive file path.
---@return table|false file A table containing the single file's metadata
---if the archive has exactly one file; `false` otherwise.

Copilot uses AI. Check for mistakes.
Comment on lines +84 to +88
local dst, err =
Command(name):arg(dst_args):stdin(src:take_stdout()):stdout(Command.PIPED):stderr(Command.PIPED):spawn()
if not dst then
last_err = err
end
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The src child process is not properly cleaned up when dst creation fails. In the error path (lines 86-88), if dst fails to spawn, the src process is left running. Consider adding src:start_kill() before returning the error.

Copilot uses AI. Check for mistakes.
src, dst = try("7z")
end
if not dst then
return ya.err("Failed to start either `7zz` or `7z`, error: " .. last_err)
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function returns only src when dst is not created (line 97), but the return signature at line 99 indicates it should return src, dst, last_err. This inconsistency could cause issues for callers expecting three return values. Ensure the error path returns consistent values.

Suggested change
return ya.err("Failed to start either `7zz` or `7z`, error: " .. last_err)
return nil, nil, ya.err("Failed to start either `7zz` or `7z`, error: " .. last_err)

Copilot uses AI. Check for mistakes.
function M.is_tar(path) return M.list_meta { "-p", tostring(path) } == "tar" end

function M.should_decompress_tar(file)
return file.packed_size <= 1024 * 1024 * 1024 and file.path:find(".+%.tar$") ~= nil
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern .+%.tar$ requires at least one character before .tar, which is correct. However, this only matches files ending with exactly .tar and won't match filenames that are case-insensitive variants like .TAR or .Tar. Consider using a case-insensitive pattern if the filesystem or archive format allows mixed-case extensions.

Suggested change
return file.packed_size <= 1024 * 1024 * 1024 and file.path:find(".+%.tar$") ~= nil
return file.packed_size <= 1024 * 1024 * 1024 and file.path:lower():find(".+%.tar$") ~= nil

Copilot uses AI. Check for mistakes.
@sxyazi sxyazi changed the title feat: add .tar.* support to built-in archive feat: support compressed tarballs (.tar.gz, .tar.bz2, etc.) in the preset archive previewer Jan 8, 2026
Copy link
Owner

@sxyazi sxyazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@sxyazi sxyazi merged commit 83674f1 into sxyazi:main Jan 8, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make the builtin extract and archive plugins handle *.tar formats better

2 participants