@WorkingRobot
Member

Summary

If merged, this pull request adds support for a wide range of compression types and archive formats. It is a best-effort stopgap until we decide whether to support any of these formats manually, where that is needed to extract metadata or version info (e.g. the RPM files that @matthewkelley22 is working on).

Proposed changes

Removes the existing .rar decompression code and replaces it entirely with extractcode. Adds an additional feature, extractcode-full, that pulls in the 7zip and libarchive binaries as dependencies instead of relying on them existing on the PATH. Ideally, we'd want this hook to run last, but I don't think pluggy supports that. I also injected .wim support into extractcode just to see what Surfactant can do with a Windows 11 ISO.
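
To illustrate the PATH dependency mentioned above, here is a minimal sketch of how one might check whether external extraction tools are available before attempting extraction. The helper name `missing_tools` and the tool names `7z` and `bsdtar` are assumptions for illustration only; this is not how extractcode or Surfactant actually locates its binaries.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tool names that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Hypothetical tool names; adjust to whatever the platform provides.
    absent = missing_tools(["7z", "bsdtar"])
    if absent:
        print("Not found on PATH:", ", ".join(absent))
    else:
        print("All external extraction tools found.")
```

A check like this only matters when the bundled binaries are not installed; with them present there is nothing to look up on the PATH.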

Future Work

Judging by how extractcode identifies binaries, it uses typecode, which internally uses the same library as the Linux file command. This may be a better way to identify files than our current approach of manually checking header bytes (though maybe only for more specific or lesser-known file types). Depending on how extractcode behaves, we may also want to look at removing some of the tar/gz/zip/bz file extraction and letting extractcode handle it (unless we have a reason to look at the file metadata directly).
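
For reference, manual header-byte checking can be sketched roughly as follows. The function name `candidate_types` and the table layout are hypothetical and not Surfactant's actual implementation; the signatures themselves are the standard magic numbers for these formats.

```python
from typing import List

# Well-known leading magic bytes for a few archive/compression formats.
MAGIC_SIGNATURES = {
    "zip": b"PK\x03\x04",
    "gzip": b"\x1f\x8b",
    "bzip2": b"BZh",
    "xz": b"\xfd7zXZ\x00",
    "7z": b"7z\xbc\xaf\x27\x1c",
    "rar": b"Rar!\x1a\x07",
    "wim": b"MSWIM\x00\x00\x00",
    "rpm": b"\xed\xab\xee\xdb",
}


def candidate_types(header: bytes) -> List[str]:
    """Return every format whose signature matches the start of `header`."""
    return [
        name
        for name, magic in MAGIC_SIGNATURES.items()
        if header.startswith(magic)
    ]


def candidate_types_for_file(path: str, read_size: int = 16) -> List[str]:
    """Read the first few bytes of a file and match them against the table."""
    with open(path, "rb") as f:
        return candidate_types(f.read(read_size))
```

A table like this has to be extended by hand for every new format, which is the maintenance cost a libmagic-based identifier would avoid.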

@WorkingRobot marked this pull request as draft on July 11, 2025 23:46
Collaborator

@nightlark left a comment

I think this PR mostly needs its merge conflicts resolved; they come from the file type ID change that added support for a list of potential types.

From what I recall, we'd also talked about making a change to the upstream extractcode project?

@nightlark added the enhancement (New feature or request) label on Sep 26, 2025
