Generate malicious files using the recently published homoglyph-attack vulnerability, which has been found in at least C, C++, C#, Go, Python, Rust, JS, ...
Quoted from cve.mitre.org:
An issue was discovered in the character definitions of the Unicode Specification through 14.0. The specification allows an adversary to produce source code identifiers such as function names using homoglyphs that render visually identical to a target identifier. Adversaries can leverage this to inject code via adversarial identifier definitions in upstream software dependencies invoked deceptively in downstream software.
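To make the quoted description concrete, here is a minimal sketch of two identifiers that look the same but are different strings. The names `latin` and `lookalike` are illustrative and not part of this project.

```python
import unicodedata

latin = "say_hello"
lookalike = "s\u0430y_hello"  # second letter is CYRILLIC SMALL LETTER A, not Latin "a"

# The two names may render identically, yet they are different strings,
# so a compiler or interpreter treats them as different identifiers.
print(latin == lookalike)

# NFKC normalization (which Python applies to identifiers) does not merge them.
print(unicodedata.normalize("NFKC", lookalike) == latin)

# List the non-ASCII characters hiding in the lookalike name.
for ch in lookalike:
    if ord(ch) > 127:
        print(repr(ch), unicodedata.name(ch))
```

Both comparisons evaluate to False, which is why a deceptive definition in an upstream dependency can shadow the identifier a reviewer believes is being called.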
The raw homoglyph data (homoglyphs.txt) was taken from here and cleaned to remove italic and dissimilar characters.
See the original paper from the University of Cambridge:
https://www.trojansource.codes/trojan-source.pdf
python3 codegen.py [-h] [-i INFILE] [-o OUTFILE] [-r] [-a]
arg | long arg | param | description |
---|---|---|---|
-h | --help | none | show this help message and exit |
-i | --infile | INFILE | Input file containing homoglyph placeholders |
-o | --outfile | OUTFILE | Output file to store the final code |
-r | --random | none | Set this flag to choose a random homoglyph; the first one is used if not set |
-a | --about | none | Print about text |
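The options above map naturally onto `argparse`. The following is only a sketch of how such a parser could be declared, assuming the flags behave as the table describes; it is not taken from codegen.py, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options table; not the actual codegen.py code."""
    parser = argparse.ArgumentParser(prog="codegen.py")
    parser.add_argument("-i", "--infile",
                        help="Input file containing homoglyph placeholders")
    parser.add_argument("-o", "--outfile",
                        help="Output file to store the final code")
    parser.add_argument("-r", "--random", action="store_true",
                        help="Choose a random homoglyph; take the first one if not set")
    parser.add_argument("-a", "--about", action="store_true",
                        help="Print about text")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-i", "infile.xyz", "-o", "outfile.xyz", "-r"])
print(args.infile, args.outfile, args.random, args.about)
```

`-h`/`--help` is added by `argparse` automatically, so it needs no explicit declaration.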
The examples were created by me or taken from the referenced PDF. To run them, execute codegen.py with the required arguments:
python3 codegen.py -i infile.xyz -o outfile.xyz
and then run or compile outfile.xyz.
Currently, only digits [0-9] and lower- and uppercase letters [a-zA-Z] are supported. To replace a supported character in your template with a (random) homoglyph, enclose it in dollar signs (for example, $a$). See the examples for a first impression of what a template can look like.