UniExtract - Universal Extractor Python API for Linux

Universal Extractor

A Python attempt at a universal extractor for Linux, similar to UniExtract2 on Windows.

The goal is to support the archive formats that UniExtract2 supports, but in a Linux environment and through a Python API.

Configuration Format

{
    "name": "ACB",
    "comment": "A DOS compression utility by George Buyanovsky - http://fileformats.archiveteam.org/wiki/ACB_(compressed_archive)",
    "extensions": ["acb"],
    "install": {
        "method": "apt",
        "packages": ["dosbox"],
        "tool": "ACB.EXE",
        "container": "dos/acb/acb_200c.zip"
    },
    "pack": {
        "exe": "dosbox",
        "cmdline": "$tool -noconsole -c \"mount d $tools\" -c \"mount e $(pwd)\" -c \"e:\" -c \"d:acb b $file.acb $file\" -c \"exit\"",
        "type": "dosbox"
    },
    "unpack": {
        "exe": "dosbox",
        "cmdline": "$tool -noconsole -c \"mount d $tools\" -c \"mount e $destdir\" -c \"mount f $arcloc\" -c \"d:acb r f:$shortname e:\" -c \"exit\"",
        "extension": "acb",
        "force_extension": true
    },
    "test": {
        "blob": "H4sIAACF0WEAA+Nq4GDosSx+OvP57Ps8j/mCX3MBABFWmRcSAAAA",
        "file": "0",
        "content": "ACB",
        "padbyte": "0x41",
        "padlen": 260,
        "delete": true
    },
    "identification": {
        "file": "None",
        "trid": "None",
        "idarc": "ACB"
    }
}
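The cmdline strings in the example use $-prefixed placeholders such as $tool, $tools, $file, $destdir, $arcloc and $shortname. As a rough illustration of how such a template could be filled in, here is a sketch built on Python's string.Template. This is an assumption about the mechanism, not the project's actual code: the function name expand_cmdline and the sample paths are made up for illustration. safe_substitute is used so that shell constructs like $(pwd) pass through untouched.

```python
from string import Template


def expand_cmdline(cmdline: str, values: dict) -> str:
    """Fill $name placeholders; leave anything unknown (e.g. $(pwd)) as-is."""
    return Template(cmdline).safe_substitute(values)


# Hypothetical values; the real tool resolves these from its own state.
values = {
    "tool": "/opt/uniextract/tools/ACB.EXE",
    "tools": "/opt/uniextract/tools",
    "file": "0",
}

# Shortened version of the ACB "pack" cmdline from the example config.
template = (
    '$tool -noconsole -c "mount d $tools" -c "mount e $(pwd)" '
    '-c "d:acb b $file.acb $file" -c "exit"'
)
print(expand_cmdline(template, values))
```

Note that `$file.acb` expands cleanly because a placeholder name ends at the first character that is not a letter, digit or underscore.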
  • Install methods supported:
    • apt (sudo apt install <packages[0]> <packages[1]> ...)
    • pip (sudo pip install <packages[0]> <packages[1]> ...)
    • source (git clone <repo>)
      • Adds more fields:
        • repo - the git repository to clone (currently only git is supported)
        • build - the build script
        • exist_check - lets you skip cloning and installing if the tool is already on your system
        • Example:
          "install": {
              "method": "source",
              "repo": "https://github.com/kubo/snzip",
              "build": "apt install libsnappy-dev && ./autogen.sh && ./configure --with-static-snappy && make && cp snzip $tools",
              "exist_check": ["snzip -h", "Usage: snzip"]
          },
  • Pack field types:
    • archiver - a packer exists that we can run natively to generate a testable archive file
    • dosbox - a packer exists that we can run via DOSBox to generate a testable archive file
    • wine - a packer exists that we can run via Wine to generate a testable archive file
    • blob - no packer we can run exists, but we have an extractor that we can test against a base64 blob of archive data
      • Adds another field:
        • blob - the base64-encoded archive
        • Example:
          "pack": {
              "type": "blob",
              "blob": "RUdHQQABpuPSNQAAAAAiguII45CFCgAAAAADAA=="
          },
  • Representative Samples:
    • archiver type:
      • zip
        • Pretty standard Linux CLI tool that keeps a record of the original filename and can archive many files in one archive.
      • gzip
        • Pretty standard Linux CLI tool that has no notion of the pre-compression file metadata.
      • cru
        • A CP/M archiver that is supported using deark. Deark is an amazing extractor, but its default output filenames are odd: even if the archive stores the filename 0, the output is named something like 0.000.0, or some incremental number up from there. There is a flag (-nonames) meant to turn this off, but I haven't experimented with it yet.
    • dosbox type:
      • gca
        • A standard DOS archiver; the config was autogenerated using dosarc.py.
    • wine type:
      • 777
        • A standard Win32 CLI-driven archiver; the configuration was autogenerated using winearc.py and tweaked slightly to use the -o. flag.
    • blob type:
      • egg
        • A Windows 32/64-bit archiver exists, but nothing we can drive from the command line, so a blob of an archive was created on Windows. Linux extractors exist.
      • squeeze2
        • A CP/M-era archiver. CP/M binary executors such as tnylpo exist on Linux, but they did not work for this use case, so blobs it is.
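
For the blob pack type, the archive only exists as a base64 string inside the config. The sketch below decodes the egg blob from the example above into a temporary file that an extractor could then be pointed at. The helper name write_blob_archive is hypothetical, and the way the project itself materialises blobs may differ.

```python
import base64
import os
import tempfile


def write_blob_archive(blob_b64: str, extension: str) -> str:
    """Decode a base64 blob, write it to a temp file, and return the path."""
    data = base64.b64decode(blob_b64)
    fd, path = tempfile.mkstemp(suffix="." + extension)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return path


# The "blob" value from the egg pack example.
egg_blob = "RUdHQQABpuPSNQAAAAAiguII45CFCgAAAAADAA=="
path = write_blob_archive(egg_blob, "egg")
with open(path, "rb") as fh:
    print(fh.read(4))  # b'EGGA', the first four bytes of the decoded archive
os.remove(path)
```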

Configuration Schema

See schema.py for how configurations are validated with jsonschema at runtime.
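
To give a feel for what such a check catches, here is a deliberately simplified, standard-library-only stand-in. It is not jsonschema and not the project's schema: the list of required keys is guessed from the ACB example in this README, and check_config is a hypothetical name.

```python
# Keys guessed from the ACB example; the real schema lives in schema.py
# and may require more or fewer fields.
ASSUMED_REQUIRED = {
    "name": str,
    "extensions": list,
    "install": dict,
    "pack": dict,
    "unpack": dict,
}


def check_config(config: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key, expected_type in ASSUMED_REQUIRED.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    return problems


# "extensions" is a string here instead of a list, and three sections are absent.
print(check_config({"name": "ACB", "extensions": "acb"}))
```

A real jsonschema-based validator would additionally check nested fields and allowed values, which this sketch does not attempt.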

How To Use

  1. Clone the repository and install the Python requirements.

    $ git clone https://github.com/sourcekris/uniextract
    $ cd uniextract
    $ pip install -r requirements.txt
    
  2. If you want any of the Wine-based archivers to work, make sure you have a working Wine installation. On my machine I needed to run, as root:

    dpkg --add-architecture i386 && apt-get update && apt-get install wine32:i386
    
  3. Install and build the prerequisite archiver tools.

    $ ./prereqs.py
    
  4. Extract whatever you want.

    $ ./extract.py -e <archive> -d <destination_folder>
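
If you want to drive the extractor from another Python script, one option is to shell out to extract.py with the -e and -d flags shown above. This is only a sketch: the helper names and the sample file names are hypothetical, and the project may expose a more direct Python API that this README does not document.

```python
import subprocess
import sys


def build_extract_command(archive: str, destination: str,
                          script: str = "./extract.py") -> list:
    """Build the argv list for extract.py using its -e and -d flags."""
    return [sys.executable, script, "-e", archive, "-d", destination]


def extract(archive: str, destination: str) -> int:
    """Run extract.py from the repository root and return its exit code."""
    return subprocess.run(build_extract_command(archive, destination)).returncode


# Only prints the command; call extract(...) from the repo root to run it.
print(build_extract_command("sample.zip", "out"))
```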
    

Supported Formats

Archive type                    Common file extension(s)
7-zip                           7z
777                             777
ACB                             acb
ACE                             ace
ADF                             adf
AFIO                            afio, af
AIN                             ain
ALZip                           alz
AMG                             amg, oop
AR                              a, ar
AR7                             ar7
ARC                             arc, ark
ARG                             arg
ARHANGEL                        lg
ARJ                             arj
ARJSoftwareJAR                  j
ARQ                             arq
ARX                             arx
ASD                             asd
AppleSingle                     as
AppleiWork                      iwa, snappy
BIX                             bix
BLINK                           bli
BSA                             bsa
CAB                             cab
CAR                             car
CAZIPXP                         caz
CPIO                            cpio
CPShrink                        cpz
CrLZH                           yyy
Crunch                          zzz
Crush                           cru
DGCA                            dgc, dgca
DWC                             dwc
EGG                             egg
ERI                             eri
ESP                             esp
FacebookZstandard               zst
FacebookZstandardLz4            lz4
GCA                             gca
GSARCPAK                        pak, arc
Gzip                            gz, tgz
HA                              ha
HAP                             hap
HDF                             hdf
HYP                             hyp
Hex                             hex, b16, base16
IMP                             imp
IntelHex                        ihex
JARCS                           jar
JRchive                         jrc
KGB                             kgb, kge
LArc                            lzs
LBR                             lbr
LHARK                           lzh
LIMIT                           lim
LZA                             lza
LZOP                            lzo, lzop
LZWCOM                          lzw
LZX                             lzx
LZZ                             lzz
LotusCMZ                        cmz
MAr                             mar
MDCD                            md
MPC                             mp3
MSI                             msi
MSXiE                           xie
MWSqueeze                       mw
MacBinary                       bin
MicrognosisCompressionArchiver  mar
NSK                             nsk
PACKER                          pak
PMArc                           pma
PSA                             psa
PUT                             put
Quantum                         q
RAR                             rar, r00
RAX                             rax
RK                              rk
RKV                             rkv
SAR                             sar
SITX                            sitx, sit, sit5
SKY                             sky
SLIM                            fb
SQX                             sqx
SnappyFraming                   sz, snappy
SnappyHadoop                    snappy
SnappyJava                      snappy
SnappyRaw                       raw, snappy
SnappySnzip                     snz, snappy
SquashARH                       arh
Squeeze                         qqq, sq, sqz
Squeeze2                        sq2, qqq, zsq
SqueezeIt                       sqz
TSCOMP                          tsc
Tar                             tar, ctar
UFA                             ufa
UltraCompressor                 uc
X1                              x
XPACK                           xpa
XXD                             xxd
YAC                             yac
YZ1                             yz1
ZAR                             zar
ZET                             zet
ZIP                             zip, jar, xpi, wz, exe, imz, apk, docx, docm, maff
ZPAQ                            zpaq
ZPK                             zpk
Zoo                             zoo
base64                          b64, mme, mime
bzip2                           bz2
compress                        Z
lrzip                           lrz
lzh                             lzh, lha
lzip                            lz
lzma                            lzma
oPAQue                          paq
packARC                         pja
rzip                            rz
uuencode                        uue, uu
xxencode                        xxe, xxenc, xx
xz                              xz
yEnc                            yenc

FAQ

  • Q: prereqs.py fails to install tools, for example:
    $ ./prereqs.py -s 777 
    trying to install archiver: 777: error opening unarchived test data: [Errno 2] No such file or directory: '/tmp/tmpw71rxjc8/0'
    archiver test file content mismatch:
    got: b''
    want: b'777'
    Failed
    
  • A: Check that Wine itself works. If running wine reports that wine32 is missing, install it as suggested and try again:
    $ wine
    it looks like wine32 is missing, you should install it.
    multiarch needs to be enabled first.  as root, please
    execute "dpkg --add-architecture i386 && apt-get update && apt-get install wine32:i386"
    $ dpkg --add-architecture i386 && apt-get update && apt-get install wine32:i386
    ...
    $ wine
    Usage: wine PROGRAM [ARGUMENTS...]   Run the specified program
       wine --help                   Display this help and exit
       wine --version                Output version information and exit
    
  • Q: Wine is installed but the unarchive command fails with exit status 53:
    $ ./prereqs.py -s 777
    trying to install archiver: 777: error running unarchive of test data: Command 'cd /tmp/tmp2fr_bm3b && wine /root/uniextract/tools//777.exe e -o. z:/tmp/tmpcfq8vzqe.777' returned non-zero exit status 53.
    cmdline: cd /tmp/tmp2fr_bm3b && wine /root/uniextract/tools//777.exe e -o. z:/tmp/tmpcfq8vzqe.777
    Failed
    
  • A: Rebuild your ~/.wine profile by moving the old one out of the way; Wine creates a fresh one on the next run:
    $ mv ~/.wine ~/.wineold
    $ ./prereqs.py -s 777
    trying to install archiver: 777: OK
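
Since both FAQ entries come down to Wine not being usable, a quick pre-flight check before running prereqs.py can save some confusion. The sketch below calls wine --version (one of the options listed in the usage output above) and is only a rough test: a version string does not guarantee that 32-bit archivers will run. The function name wine_version is hypothetical and not part of the project.

```python
import shutil
import subprocess


def wine_version(binary: str = "wine"):
    """Return the output of `<binary> --version`, or None if it is unusable."""
    if shutil.which(binary) is None:
        return None  # not on PATH at all
    try:
        result = subprocess.run(
            [binary, "--version"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


print(wine_version() or "wine does not appear to be usable")
```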
    

Author

  • Kris Hunt (@CTFKris)
