SE

File-based storage engine implemented entirely in Bash. This project demonstrates how to store data in files, build indexes, run queries, manage transactions, and create backups

Features
Getting Started
Directory Structure
Usage
Data Types
Known Limitations

Features

Stores data in plain text files, one file per record, in a data/<collection_name>/record_<record_id> format.
Maintains simple indexes to speed lookups by record ID or data type.
Provides both synchronous and asynchronous query modes.
Backup and restore of data directories.
Includes a minimal transaction manager, letting you stage changes and commit/rollback if desired.
Validates data types (e.g., string, integer, email, booleans).

Getting Started

Clone git clone https://github.com/umegbewe/se.git
Ensure that you have Bash ≥4.0 installed and sed, awk, grep, find are on your PATH.

Directory Structure

• storage-engine.sh:
Main entry point. Parses CLI arguments (e.g., --create-record, --backup, --restore, etc.) and calls the appropriate functions. Also loads all sub-scripts.

• record_operations.sh:
Create, read, update, and delete (CRUD) functions. Each record is a file record_<id> in data/<collection>/.

• query-engine.sh & query_parser.sh:
Parse queries (“SELECT * FROM … WHERE … ORDER BY … LIMIT …”) and evaluate them.

• index_operations.sh:
Functions to maintain the “index” files under data/<collection>/indexes/.

• backup_restore.sh:
Backs up or restores the “data” directory.

• transaction_manager.sh:
Minimal staging for commits/rollbacks.

• type-system/:
Type definitions, regex validation, and utility functions.

Usage

The storage engine is invoked via storage-engine.sh with various flags. For example:

bash storage-engine.sh --create-record <collection> <data> <data_type> [transaction_id]

Create a New Record

Use --create-record to insert data:

• collection: The name of the collection (directory) to store the record.
• data: The actual string data you want to store.
• data_type: A recognized type (e.g., string, integer, boolean, email, etc.).
• transaction_id: (optional) if you want to include this in a transaction.

Example:

bash storage-engine.sh --create-record users "John Doe" string

Output:

New record created with ID: 1734946832N_5678

You can confirm that something that looks like data/users/record_1734946832N_5678 file is created with contents John Doe|string.

Query Data

Use --query to perform a simple query.
• You can pass a WHERE field = value, ORDER BY field ASC, LIMIT n, etc. in your actual query string, which the query parser can interpret.
• By default, if the query matches multiple records, results will be printed line by line.

Example:

bash storage-engine.sh --query "SELECT FROM users WHERE name = 'John Doe'"

Output:

John Doe|string

You can optionally provide a page number (second argument) and limit (third argument). For instance:

bash storage-engine.sh --query "SELECT FROM users WHERE type = 'string'" 1 2

Output:

John Doe|string
Jane Doe|string

That will fetch the first page of up to 2 matching records.

Async Query

Useful for longer-running queries.

bash storage-engine.sh --async-query "SELECT FROM users WHERE age >= 21"

Output:

Async query submitted successfully: <query_id>

where query_id is an internal identifier tracking the background job

Check and Retrieve Async Results

bash storage-engine.sh --check-async-query <query_id>

Output:

Async query status: RUNNING

Or

Async query status: COMPLETE

Once it’s complete, fetch the results:

Async query result:
John Doe|string
Alice Johnson|string

Backup and Restore

Backup the entire data directory:

bash storage-engine.sh --backup my_backup

Creates a “backup/my_backup.tar.gz” (or similar)

Restore from a backup:

bash storage-engine.sh --restore my_backup

Transactions

Some record operations can be rolled into transactions. For instance, the “create_record” function can take a “transaction_id” param. If provided, the engine will stage the change in data/transactions/<transaction_id>.

You can then commit or rollback the transaction:

# create a transaction ID
TXN_ID="abc123"

# insert a record into “users”
bash storage-engine.sh --create-record users "Alice" string "$TXN_ID"

# Then at some point, commit or rollback.
bash storage-engine.sh --commit-transaction "$TXN_ID"

bash storage-engine.sh --rollback-transaction "$TXN_ID"

By default, if no transaction_id is given, the record is created immediately.

Data Types

Within type-system/type_definitions.sh, you’ll find definitions like:

string: ^.$
integer: ^[0-9]+$
boolean: ^(true|false)$
email: ^[a-zA-Z0-9.%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}$
url: ^(https?://)?... etc.

When you do create_record users '[email protected]' email, the engine checks if [email protected] matches the email regex. If not, you are in error.

Limitations

This is not optimized for high performance. Looping over files in Bash can be slow with large data sets.
The query syntax is minimal and does not fully support complex joins or subqueries.
Transaction support is rudimentary; concurrent edits can lead to partial conflicts.
Because it’s file-based, you may want to migrate to a real database if your data grows large or concurrency becomes an issue.

Do not use this for anything serious. It's a learning exercise and a proof of concept.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
tests		tests
type-system		type-system
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
async_query.sh		async_query.sh
backup_restore.sh		backup_restore.sh
cache_manager.sh		cache_manager.sh
errors.sh		errors.sh
index_operations.sh		index_operations.sh
logging.sh		logging.sh
query-engine.sh		query-engine.sh
query_parser.sh		query_parser.sh
record_operations.sh		record_operations.sh
storage-engine.sh		storage-engine.sh
test-query.sh		test-query.sh
test.sh		test.sh
transaction_manager.sh		transaction_manager.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SE

Features

Getting Started

Directory Structure

Usage

Create a New Record

Query Data

Async Query

Check and Retrieve Async Results

Backup and Restore

Transactions

Data Types

Limitations

About

Releases

Packages

Languages

License

umegbewe/se

Folders and files

Latest commit

History

Repository files navigation

SE

Features

Getting Started

Directory Structure

Usage

Create a New Record

Query Data

Async Query

Check and Retrieve Async Results

Backup and Restore

Transactions

Data Types

Limitations

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages