Credential Digger - Supporting Keras Model

Credential Digger is a GitHub scanning tool that identifies hardcoded credentials (passwords, API keys, secret keys, tokens, personal information, etc.). Compared with other GitHub scanners, its main advantage is a reduction of false positives in the scan reports. Credential Digger uses two machine learning models to identify false positives, especially in password identification:

Path Model: identifies the portions of code that contain fake credentials used for testing and example purposes (e.g., unit tests).
Snippet Model: identifies the portions of code used to authenticate with passwords, and distinguishes between real and fake passwords.

Architecture

Credential Digger finds credentials hardcoded in a repository. The tool is composed of:

Postgres database
Python client
User interface

Database

The database is structured in the following way (arrows point to foreign keys).

Project structure

The project includes 3 components: a db (sql folder), a client (credentialdigger folder), and a user interface (ui folder).

`sql`

create_table.sql defines the db schema.

Note that, given the file_name and commit_hash of a discovery, both the commit and the file can be accessed at the following addresses:

REPO_URL/commit/COMMIT_HASH
REPO_URL/blob/COMMIT_HASH/file_name
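
As a minimal sketch of the URL scheme above, the helper below builds both addresses from a repository URL, a commit hash, and a file name. The function name `discovery_urls` and the example values are hypothetical; they are not part of the credentialdigger client.

```python
def discovery_urls(repo_url, commit_hash, file_name):
    """Return (commit_url, file_url) for a discovery.

    Hypothetical helper: it only mirrors the two URL patterns
    REPO_URL/commit/COMMIT_HASH and REPO_URL/blob/COMMIT_HASH/file_name.
    """
    # Drop a trailing slash so the joined URLs do not contain '//'.
    base = repo_url.rstrip('/')
    commit_url = f'{base}/commit/{commit_hash}'
    file_url = f'{base}/blob/{commit_hash}/{file_name}'
    return commit_url, file_url


# Example with placeholder values.
commit_url, file_url = discovery_urls(
    'https://github.com/user/repo', 'abc123', 'src/config.py')
print(commit_url)
print(file_url)
```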

`credentialdigger`

This client can be used to easily interact with the db. It offers a scanner for git repositories, based on Hyperscan (others can be implemented).

Please note that the database must be up and running.

`ui`

The user interface can be used to easily perform scans and flag the discoveries.

Install

Prepare the .env file and edit it with the correct data:

    cp .env.sample .env
    vim .env  # Insert real credentials

Run the db using docker-compose:
```
sudo docker-compose up --build postgres
```
Consider not exposing the db port in production.

Install the dependencies for the client:

    sudo apt install libhyperscan-dev libpq-dev

Install the Python requirements from the requirements.txt file.
```
pip install -r requirements.txt
```
Set which models you want to use in ui/server.py:

    MODELS = ['SnippetModel', 'PathModel']

Run the ui:

    python3 -m ui.server

The ui is available at http://localhost:5000/

Warning: To use the Keras models, make sure the credentialdigger PyPI package is NOT installed.

Run the db on a different machine

If the db and the client run on different machines, clone this repository on both of them.

Then, on the machine running the db, execute the first two steps of the installation section above (prepare the .env file and run the db with docker-compose), and execute the remaining steps on the machine running the client.

In this setup, the port of the db must be exposed so that the client/ui can reach it.
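
Because the client has to reach the exposed db port, a quick connectivity check from the client machine can help before starting a scan. The following is a minimal sketch using only the Python standard library; the function name `db_port_reachable` is hypothetical, and `localhost` stands in for the address of the machine running the db. It only verifies that a TCP connection can be opened, not that the credentials are valid.

```python
import socket


def db_port_reachable(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. refused, timed out)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 'localhost' is a placeholder for the machine running the db.
print(db_port_reachable('localhost', 5432))
```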

Use machine learning models

Currently, no pretrained Keras models are provided.

If available, the models and their respective tokenizers are expected to be found in the models_data directory, each in its own subdirectory. Model hyperparameters can be found in the models/keras_support folder.
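
To check what is actually present before enabling a model, the sketch below lists each subdirectory of models_data together with the files it contains. The function name `list_model_dirs` is hypothetical, and the sketch makes no assumption about how the subdirectories or files are named.

```python
from pathlib import Path


def list_model_dirs(root='models_data'):
    """Map each subdirectory of root to the sorted names of its files.

    Returns an empty dict when root does not exist.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    return {
        sub.name: sorted(p.name for p in sub.iterdir() if p.is_file())
        for sub in sorted(root_path.iterdir())
        if sub.is_dir()
    }


# Print one line per model subdirectory found (nothing if none exist).
for name, files in list_model_dirs().items():
    print(name, files)
```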

Note that snippet_extractor is still a fastText model.

File Path Model

The File Path Model classifies a discovery as a false positive when its file path indicates that the code is used for tests or examples. A pre-trained Path Model is available here.
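
To illustrate the idea of path-based filtering, the sketch below flags paths that contain a test-like or example-like component. This is a hand-written keyword heuristic used only as a stand-in: the actual Path Model is a trained classifier, and the keyword list and the function name `looks_like_test_path` are assumptions made for this example.

```python
import re

# Hypothetical keyword list; the real Path Model is learned, not rule-based.
# A keyword must be delimited by '/', '_', '.', '-' or the string boundaries,
# so that words such as 'contest' or 'latest' are not matched.
TEST_PATH_HINTS = re.compile(
    r'(^|[/_.\-])(tests?|examples?|samples?|mocks?|fixtures?)([/_.\-]|$)',
    re.IGNORECASE,
)


def looks_like_test_path(file_path):
    """Return True if the path contains a test-like or example-like part."""
    return TEST_PATH_HINTS.search(file_path) is not None


for path in ['src/test/config.py', 'docs/examples/app.yml',
             'src/main/config.py']:
    print(path, looks_like_test_path(path))
```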

Code Snippet Model

The Code Snippet Model identifies password-based authentication in code and differentiates between real and fake passwords.

WARNING: This model is pre-trained with synthetic data in order to protect privacy. It helps to reduce the false positives related to password recognition, but with a lower precision than a model pre-trained with real data.

Usage (client)

    from credentialdigger.cli import Client

    # Connect to the running database (replace the placeholders
    # with your own database settings)
    c = Client(dbname='MYDB', dbuser='MYUSER', dbpassword='*****',
               dbhost='localhost', dbport=5432)
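
To avoid hardcoding the database password in a script, the connection settings can be read from environment variables instead. The following is a minimal sketch: the function name `db_settings_from_env` and the variable names (DBNAME, DBUSER, DBPASSWORD, DBHOST, DBPORT) are hypothetical, so adapt them to the names your own .env file defines.

```python
import os


def db_settings_from_env(env=None):
    """Build the keyword arguments for Client from environment variables.

    The variable names used here are hypothetical placeholders.
    Raises KeyError if a required variable is missing.
    """
    env = os.environ if env is None else env
    return {
        'dbname': env['DBNAME'],
        'dbuser': env['DBUSER'],
        'dbpassword': env['DBPASSWORD'],
        # Host and port fall back to the values used in the example above.
        'dbhost': env.get('DBHOST', 'localhost'),
        'dbport': int(env.get('DBPORT', '5432')),
    }


# Usage (requires the variables to be set and the db to be running):
# from credentialdigger.cli import Client
# c = Client(**db_settings_from_env())
```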

Wiki

Refer to the Wiki for further information.

News

Credential Digger announcement
