The purpose of this library is to calculate the entropy of a file or a byte sequence. It can work as a detector of encrypted files, since they have the highest entropy. Entropy is calculated according to Shannon's definition and measured in bits per byte, where 0.0 is perfect order and 8.0 is complete chaos.

yuchdev/entropy_calculator

Calculate Shannon's entropy

  • A C++ application that accepts either a file or a distribution with user-provided parameters and calculates its entropy according to Shannon's definition
  • We take the probability of encountering each byte value in the file, multiply it by its base-2 logarithm, and accumulate the negated products over all possible byte values (yes, 0x00 to 0xFF)
  • See the Wiki for a more detailed explanation
  • For a file, the entropy value can be used to make an estimate about its format
  • Uniform and normal distributions are supported. For the normal distribution, the mean and standard deviation can be specified
  • The application was made for research purposes
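The calculation described above is H = -Σ p(b) · log2 p(b), summed over the 256 byte values. The following is a minimal sketch of it, not the project's actual code: the function name `shannon_entropy` and the choice to return 0.0 for empty input are assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the project's API).
// Returns Shannon entropy in bits per byte, in the range [0.0, 8.0].
double shannon_entropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) {
        return 0.0;  // convention assumed for this sketch
    }

    // Count how often each of the 256 possible byte values occurs.
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t b : data) {
        ++counts[b];
    }

    const double total = static_cast<double>(data.size());
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) {
            continue;  // p * log2(p) tends to 0 as p tends to 0
        }
        const double p = static_cast<double>(c) / total;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

A buffer holding a single repeated byte yields 0.0, while a buffer in which every byte value occurs equally often yields 8.0.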

Explanation

Building and some specifics

  • The application is CMake-based and can be compiled on any platform that has CMake 3.0+ installed
  • Create a build directory in the project directory (mkdir build), enter it (cd build), and run cmake ..
  • Boost is required for compilation: we use its console progress bar to make the entropy calculation look pretty, Boost Test for unit testing, etc.
  • I had to provide a std::codecvt<std::uint8_t> specialization for binary file streams. The MS compiler provides one, but GCC does not (and doesn't have to, as the C++ Standard does not require it)
  • No special installation is required; the application is portable
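For comparison, here is a sketch of a different technique that avoids the std::codecvt<std::uint8_t> question entirely: read the file through an ordinary char-based std::ifstream in binary mode and convert each char to std::uint8_t afterwards. This is not how the project reads files. The function name `read_bytes` is hypothetical.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not the project's API).
// Reads a whole file as raw bytes using a plain char stream, so no
// codecvt specialization for std::uint8_t is needed.
std::vector<std::uint8_t> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open file: " + path);
    }

    std::vector<std::uint8_t> bytes;
    // istreambuf_iterator reads unformatted characters, so whitespace
    // and NUL bytes are preserved.
    for (std::istreambuf_iterator<char> it(in), end; it != end; ++it) {
        bytes.push_back(static_cast<std::uint8_t>(*it));
    }
    return bytes;
}
```

The trade-off is one cast per byte in exchange for relying only on stream specializations that the C++ Standard guarantees.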

Applications

  • Can be used to detect whether a file has been encrypted or archived (the value should be close to 8.0)
  • Can be used to study the correlation between a data format and its entropy (the dataset should be big enough)
  • Can be used to estimate the relationship between distribution properties and entropy
  • The application is intentionally as simple as possible
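To illustrate the relationship between distribution properties and entropy, the sketch below samples bytes from a uniform and a normal distribution using the standard <random> facilities and measures their entropy. It is an illustration under assumptions, not the application's implementation. All function names are hypothetical, and rounding and clamping normal samples to the 0..255 range is a choice made for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: Shannon entropy in bits per byte.
double entropy_bits(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t b : data) ++counts[b];
    double h = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / static_cast<double>(data.size());
        h -= p * std::log2(p);
    }
    return h;
}

// Hypothetical helper: n bytes drawn uniformly from 0..255.
std::vector<std::uint8_t> uniform_bytes(std::size_t n, std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<std::uint8_t> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<std::uint8_t>(dist(gen)));
    }
    return out;
}

// Hypothetical helper: n bytes drawn from a normal distribution,
// rounded to the nearest integer and clamped to 0..255 (an assumption).
std::vector<std::uint8_t> normal_bytes(std::size_t n, double mean,
                                       double stddev, std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(mean, stddev);
    std::vector<std::uint8_t> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double v = std::round(dist(gen));
        out.push_back(static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, v))));
    }
    return out;
}
```

With a large enough sample, `entropy_bits(uniform_bytes(...))` should come out close to 8.0, while for `normal_bytes` the entropy drops as the standard deviation shrinks, because the samples concentrate on fewer byte values.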
