feat(protect): searchable encryption and protect schemas#96

Merged
calvinbrewer merged 32 commits into main from next
Mar 6, 2025

Contributor

@calvinbrewer calvinbrewer commented Mar 3, 2025

This next version contains the following features and improvements. This will be a breaking change when released.

  • Added a schema strategy that's deeply embedded in the developer experience.
  • Implemented searchable encryption indexes

Check out the README for the usability changes.

The schema flow looks like this:

import { protect, csColumn, csTable } from '@cipherstash/protect'

export const users = csTable('users', {
  email_encrypted: csColumn('email_encrypted')
    .equality()
    .orderAndSort()
    .freeTextSearch(),
})

export const protectClient = await protect(users)

const encryptedResult = await protectClient.encrypt(email, {
  column: users.email_encrypted,
  table: users,
})
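
To illustrate how chained calls such as `.equality().orderAndSort().freeTextSearch()` can accumulate index configuration, here is a minimal, self-contained sketch. It is not the `@cipherstash/protect` implementation: `sketchColumn`, `SketchColumn`, `IndexKind`, and `build` are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for csColumn; an assumption, not the library's code.
type IndexKind = 'equality' | 'orderAndSort' | 'freeTextSearch'

interface SketchColumn {
  equality(): SketchColumn
  orderAndSort(): SketchColumn
  freeTextSearch(): SketchColumn
  build(): { name: string; indexes: IndexKind[] }
}

function sketchColumn(name: string): SketchColumn {
  const indexes: IndexKind[] = []

  // Record an index kind once, then return the builder so calls can be chained.
  const enable = (kind: IndexKind): SketchColumn => {
    if (indexes.indexOf(kind) === -1) indexes.push(kind)
    return builder
  }

  const builder: SketchColumn = {
    equality: () => enable('equality'),
    orderAndSort: () => enable('orderAndSort'),
    freeTextSearch: () => enable('freeTextSearch'),
    // Return a copy so later chaining cannot mutate an earlier snapshot.
    build: () => ({ name, indexes: indexes.slice() }),
  }
  return builder
}

const emailConfig = sketchColumn('email_encrypted')
  .equality()
  .orderAndSort()
  .freeTextSearch()
  .build()
```

Because every method returns the same builder, the index methods can be combined in any order.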

@calvinbrewer calvinbrewer marked this pull request as ready for review March 5, 2025 19:32
@calvinbrewer calvinbrewer changed the title from "feat(protect): add initial searchable encryption functionality" to "feat(protect): searchable encryption and protect schemas" Mar 5, 2025

- Follow the prompts to indicate the type of version bump (patch, minor, major).
- The [GitHub Actions](./.github/workflows/) (or other CI pipeline) will handle the **publish** step to npm once your PR is merged and the changeset is committed to `main`.

## Pre release process
Contributor

Awesome, thanks for documenting this (and for wiring up pre releases)

README.md Outdated

> [!IMPORTANT]
> Searching, sorting, and filtering on encrypted data is only supported in PostgreSQL at the moment.
> Read more about [searching encrypted data](./docs/searchable-encryption.md).
Contributor

I think that changing the phrasing around Postgres support makes it less clear which databases we do actually support. Previously it sounded like Postgres was the only database supported, but now it's ambiguous. Not blocking for this PR, but I think we'll want to clarify which databases we support/plan on supporting and for which features.

// return sql`cs_ore_64_8_v1(${left}) < cs_ore_64_8_v1(${bindIfParam(right, left)})`;
// };

const csMatch: BinaryOperator = (left: SQLWrapper, right: unknown): SQL => {
Contributor

I think that these functions can be extracted into a package for a Drizzle integration. Using functions like csEq or cs.Eq would feel pretty similar to the typical Drizzle API and would mean that users wouldn't have to be concerned about calling EQL functions themselves most of the time.

Probably already on your list, but wanted to mention it just in case.
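
A rough sketch of what such a helper could look like outside of Drizzle. It assumes a tiny hand-rolled `SqlFragment` type in place of Drizzle's `SQL` object and reuses the `cs_ore_64_8_v1` function name from the commented-out code above. `csLt` and `SqlFragment` are hypothetical names, not part of any published package.

```typescript
// Hypothetical minimal fragment type standing in for Drizzle's SQL object.
interface SqlFragment {
  text: string
  params: unknown[]
}

// Wrap both sides in the EQL ORE function and bind the right-hand value
// as a positional parameter instead of interpolating it into the SQL text.
function csLt(column: string, encryptedValue: unknown, paramIndex = 1): SqlFragment {
  // Only allow simple, optionally table-qualified identifiers in the SQL text.
  if (!/^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$/.test(column)) {
    throw new Error(`Unsafe column identifier: ${column}`)
  }
  return {
    text: `cs_ore_64_8_v1(${column}) < cs_ore_64_8_v1($${paramIndex})`,
    params: [encryptedValue],
  }
}

// The payload here is a placeholder, not a real encrypted value.
const fragment = csLt('users.email_encrypted', { c: 'ciphertext-placeholder' })
```

A wrapper like this keeps the EQL function call in one place, so callers only pass a column and an already-encrypted value.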

</br>

[![Tests](https://github.com/cipherstash/protectjs/actions/workflows/tests.yml/badge.svg)](https://github.com/cipherstash/protectjs/actions/workflows/tests.yml)
[![Built by CipherStash](https://raw.githubusercontent.com/cipherstash/meta/refs/heads/main/csbadge.svg)](https://cipherstash.com)
Contributor

Why remove this? I've been putting this on all our repos. Looks better than the non-transparent logo IMHO.

Contributor Author

Added it above the fold!

```ts
import { csTable, csColumn } from '@cipherstash/protect'

export const users = csTable('users', {
Contributor

As I mentioned in our call, I think the term csTable is confusing because (to me at least) it feels like this is defining the table. But what it's actually doing is protecting it.
Given the name of the library is Protect.js, why don't we call this protect?

export const users = protect.table('users', {
  email: protect.column('email'),
})

Contributor

Although we've already used protect to load the Protect client, so how about this:

import { protected } from '@cipherstash/protect'

export const users = protected.table('users', {
  email: protected.column('email'),
})

Contributor Author

I see your point! Here are my thoughts:

  • Using csTable and csColumn brings some CipherStash branding into the picture, tying Protect.js and CipherStash together.
  • It replicates building out your database schema, which in a way is kind of what we are doing. I actually really love how it replicates the Drizzle experience.
  • csTable is a more idiomatic JavaScript way of expressing this than protected.table.

Contributor Author

@calvinbrewer calvinbrewer Mar 6, 2025

Another reason I really like the csTable and csColumn approach is that it is more closely tied to the database than to the core of Protect.js: it bridges the gap between EQL, CipherStash, and the interface.

Another point in its favour shows up when we build out ORM-specific examples, e.g.

This is how you do equality with Drizzle today:

import { eq } from "drizzle-orm";
import { integer, pgTable, varchar } from "drizzle-orm/pg-core";

export const users = pgTable('users', {
  id: integer(),
  email: varchar('email')
})

db.select().from(users).where(eq(users.email, "test@example.com"));

To stay consistent with Drizzle, TypeORM, and any other ORM we interface with, it would look like this, which would be an awesome developer experience:

import { csEq } from "@cipherstash/drizzle-orm"
import { csTable, csColumn } from "@cipherstash/protect"

export const protectedUsers = csTable('users', {
  email: csColumn('email').equality()
})

const searchTerm = await protectClient.encrypt("test@example.com", {
  table: protectedUsers,
  column: protectedUsers.email
})

db.select().from(table).where(csEq(protectedUsers.email, searchTerm.data));

Comment on lines +214 to +215
column: users.email,
table: users,
Contributor

Given the schema already knows what column and table are being protected, can we set the EQL identity (i.e. column and table) automatically for the user? It's a bit of a pain having to provide these values every time.

Contributor Author

I think being declarative in this instance is a good idea

Contributor Author

There is definitely room for improvement here, but it would be difficult to do without major refactoring.
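
One way the schema could supply the identity automatically, sketched under the assumption that each column object keeps a back-reference to the table it was declared in. `sketchTable`, `ColumnRef`, and `identityOf` are hypothetical names. This is not how Protect.js currently works, and the refactoring cost mentioned above still applies.

```typescript
interface ColumnRef {
  tableName: string
  columnName: string
}

// Hypothetical table helper: stamps every column with the table it belongs to.
function sketchTable<K extends string>(
  tableName: string,
  columns: Record<K, string>,
): Record<K, ColumnRef> {
  const out = {} as Record<K, ColumnRef>
  for (const key of Object.keys(columns) as K[]) {
    out[key] = { tableName, columnName: columns[key] }
  }
  return out
}

// With the back-reference in place, the column alone is enough to
// recover both halves of the identity.
function identityOf(column: ColumnRef): { table: string; column: string } {
  return { table: column.tableName, column: column.columnName }
}

const users = sketchTable('users', { email: 'email' })
const identity = identityOf(users.email)
```

The trade-off is the one raised in the replies: passing `table` and `column` explicitly is more declarative, while inferring them from the column removes repetition at each call site.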

}
}

const csEq: BinaryOperator = (left: SQLWrapper, right: unknown): SQL => {
Contributor

Naming things is hard, but I wonder if the cs prefix will seem confusing to folks.
We all know that it stands for CipherStash, but people are using the Protect library, and the company that makes it might feel a bit abstract to them.

Another way to think about this: what is the difference in behaviour (or outcome) that this version of eq delivers over the standard one?
IMHO it's safety/security.

Perhaps instead of csEq we call this secureEq or safeEq?

Contributor

Or even secure.eq

Contributor Author

We can think about a Drizzle interface for Protect.js in a follow-up feature 😎

docs/schema.md Outdated

You can chain these methods to your column to configure them in any combination.

## Initializing the EQL client
Contributor

Should this say "Protect client"?

Comment on lines +21 to +30
type AtLeastOneCsTable<T> = [T, ...T[]]
export const protect = async (
  ...tables: AtLeastOneCsTable<ProtectTable<ProtectTableColumn>>
): Promise<ProtectClient> => {
  if (!tables.length) {
    throw new Error(
      '[protect]: At least one csTable must be provided to initialize the protect client',
    )
  }

Contributor

Nice
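
For readers unfamiliar with the pattern in the excerpt above, here is a small standalone sketch of how a non-empty tuple type and a runtime guard complement each other. `AtLeastOne`, `SketchTable`, and `initSketch` are hypothetical stand-ins for the real `AtLeastOneCsTable`, `ProtectTable`, and `protect`.

```typescript
// Non-empty tuple type: the first element is required, any number may follow.
type AtLeastOne<T> = [T, ...T[]]

interface SketchTable {
  name: string
}

function initSketch(...tables: AtLeastOne<SketchTable>): string[] {
  // Runtime guard for plain JavaScript callers, which bypass the type check.
  if (!tables.length) {
    throw new Error('At least one table must be provided')
  }
  return tables.map((t) => t.name)
}

const names = initSketch({ name: 'users' }, { name: 'orders' })

// initSketch() // rejected by the type checker: the tuple needs one element

// Simulate an untyped caller to exercise the runtime guard.
const untyped = initSketch as unknown as () => string[]
let runtimeError = ''
try {
  untyped()
} catch (err) {
  runtimeError = (err as Error).message
}
```

The type catches the mistake for TypeScript users at compile time, while the `length` check covers callers that never run the type checker.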

Contributor

@coderdan coderdan left a comment

This is a massive step forward. Well done.

I've left a number of comments that I think we should address quickly but I don't want them to block the merge. Noting that addressing the changes could well result in breaking changes.

@calvinbrewer calvinbrewer merged commit 2176cff into main Mar 6, 2025
1 check passed
@calvinbrewer calvinbrewer deleted the next branch March 6, 2025 15:55