Trigram database written in C++, suited for malware indexing

CERT-Polska/ursadb

UrsaDB

A 3gram search engine for querying terabytes of data in milliseconds. Optimized for working with binary files (for example, malware dumps).

Created at CERT.PL. Originally written by Jarosław Jedynak (tailcall.net), extended and improved by Michał Leszczyński.

This repository contains only the UrsaDB project (the n-gram database). For a more user-friendly UI, see CERT-Polska/mquery.

Installation

See installation instructions

Quickstart

  1. Create new database:
     mkdir /opt/ursadb
     ursadb_new /opt/ursadb/db.ursa
  2. Run UrsaDB server:
     ursadb /opt/ursadb/db.ursa
  3. Connect with UrsaCLI:
     $ ursacli
     [2020-04-13 18:16:36.511] [info] Connected to UrsaDB v1.3.0 (connection id: 006B8B4571)
     ursadb>
  4. Index some files:
     ursadb> index "/opt/samples" with [gram3, text4, wide8, hash4];
  5. Now you can perform queries. For example, match all files with three null bytes:
     ursadb> select {00 00 00};

Read the syntax documentation to learn more about available commands.
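
To build intuition for what the `index` and `select` steps above do, here is a toy in-memory trigram index in Python. It is only a sketch of the general 3-gram technique, not UrsaDB's actual implementation or on-disk format; the function names (`trigrams`, `build_index`, `select`) and the sample file names are made up for illustration. It also shows why a trigram index can only narrow the search down to candidate files: a file can contain every trigram of a pattern without containing the pattern itself.

```python
from collections import defaultdict


def trigrams(data: bytes) -> set[bytes]:
    """Return the set of all 3-byte substrings of data."""
    return {data[i:i + 3] for i in range(len(data) - 2)}


def build_index(files: dict[str, bytes]) -> dict[bytes, set[str]]:
    """Map every trigram to the set of file names that contain it."""
    index = defaultdict(set)
    for name, content in files.items():
        for gram in trigrams(content):
            index[gram].add(name)
    return index


def select(index: dict[bytes, set[str]], files: dict[str, bytes],
           pattern: bytes) -> set[str]:
    """Return candidate files: those containing every trigram of pattern."""
    candidates = set(files)  # a pattern under 3 bytes cannot narrow this down
    for gram in trigrams(pattern):
        candidates &= index.get(gram, set())
    return candidates


# Hypothetical sample data standing in for files under /opt/samples.
files = {
    "a.bin": b"\x00\x00\x00MZ",
    "b.bin": b"MZ\x90\x00\x03",
    "c.bin": b"abcd",
    "d.bin": b"abc-bcd",
}
index = build_index(files)

# Same idea as `select {00 00 00};` in the quickstart.
print(select(index, files, b"\x00\x00\x00"))

# "d.bin" contains the trigrams "abc" and "bcd" but not the string "abcd",
# so it shows up as a false-positive candidate.
print(sorted(select(index, files, b"abcd")))
```

In this toy version, candidates would still need to be checked against the real file contents to drop false positives such as `d.bin`.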

Learn more

More documentation can be found in the docs directory.

You can also read the hosted version here: cert-polska.github.io/ursadb.

Contact

If you have any problems, bugs or feature requests related to UrsaDB, you're encouraged to create a GitHub issue.

Funding acknowledgement

Co-financed by the Connecting Europe Facility of the European Union