
Benchmarking the design #126

Open

shaddeykatri opened this issue Aug 14, 2023 · 1 comment

shaddeykatri commented Aug 14, 2023

Hi, I have a few designs to implement the same functionality in my code.

One involves using a tuple and storing everything in the same directory, like:

Employees
("/employees/IT/", "employee1"): {employeeDetails}
("/employees/Marketing/", "employee2"): {employeeDetails}

Where '/employees/IT/' and '/employees/Marketing/' are part of the tuple.

Another involves using different directories for different departments.

Like
Employees
--IT
-----employee1:{employeeDetails}
--Marketing
-----employee2:{employeeDetails}

Now each design has its own pros and cons, and there is a certain set of requirements they must support, like:

Get an employee
Get all employees
Get employees from one department
Get all employees from X departments
and some more

I want to benchmark which design is the better one.

So far I have gone through the benchmark on the FDB official website.

I have also checked out benchmarking using YCSB.

I have also recorded the network trace while running the test code, but I feel there is too much information for me to comprehend.

Please point me in the direction of how to benchmark certain design patterns in FoundationDB, and I'll be more than grateful.

KrzysFR commented Aug 14, 2023

First, I think you should focus on a simpler "schema" for your keys. There is no point in trying to optimize a "bad" schema (except to realize that it is a bad schema!). My experience is to always start with a simple version, focusing on the simplest way to encode keys that gives me the ability to perform range reads as much as possible, and then to parallelize reads by batch as much as possible. This usually gets you 80% of the way. Then, if you have either a lot of data or a lot of reads/writes per second, you can "break" this nice and simple schema into something more involved to achieve your target performance.

Talking about target performance, I don't think that your requirement is to be able to insert a million employees per second, or to guarantee that fetching the list of employees of a single department takes less than 5 milliseconds. If your requirements are more "relaxed", like for a typical internal web portal where you would be happy with response times under 200 ms, you usually don't need to do more work!

--

I think that you should not use sub-directories for each employee, unless you need to be able to store a lot of data for each employee (like a mini database with multiple "tables", specific to each person), or need very strict "tenant" isolation.

Also, it depends on the number of employees and keys per employee, combined with the expected number of requests per second. Since there is a finite number of humans on the planet, and probably even fewer working for the same company, I don't think that this dataset will see high traffic: probably mostly reading employee details, and maybe adding/updating a handful of entries per day?

If this is the case, then I think you can achieve all of your requirements using a "main" table that stores the employee details in a single directory, keyed by the employee's unique ID (depends on your model: maybe an integer? a string? a GUID?):

(..., 0, <EMPLOYEE_ID>) = <employee details as JSON or any other format>

Here the '...' in the tuple refers to the prefix of the main directory that you open, so if you have opened it as subspace, you would use subspace.Encode(0, employee.Id) to generate the full key. This subspace would be the result of calling Resolve(...) on the employees directory location, which would be a parameter of your application. The "path" could be something like ("Dev", "Sandbox", "MyAppName", "Employees") when you are working on your machine, ("Test", "MyAppName", Environment.MachineName) for your CI, and of course ("Production", "MyAppName", "1.0.3.4567", "Employees") in production. It doesn't matter for the rest.

This gives you your first requirement of being able to get an employee knowing its ID (just read back the key).
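
For illustration, here is a minimal sketch of what writing and reading one employee could look like. The 'db', 'location', 'ct' and 'employeeJson' variables, Slice.FromString, and the ReadWriteAsync/ReadAsync retry-loop helpers are assumptions about your application setup, not something this issue prescribes.

// sketch only: 'db' is your database provider, 'location' the employees directory location
await db.ReadWriteAsync(async tr =>
{
    var subspace = await location.Resolve(tr); // open the "Employees" directory
    // (..., 0, <EMPLOYEE_ID>) = <employee details as JSON or any other format>
    tr.Set(subspace.Encode(0, employee.Id), Slice.FromString(employeeJson));
}, ct);

var details = await db.ReadAsync(async tr =>
{
    var subspace = await location.Resolve(tr);
    // requirement #1: get one employee knowing its ID (a single point read)
    return await tr.GetValueAsync(subspace.Encode(0, employee.Id));
}, ct);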

Also, the '0' as the first part is here to prepare the way for adding indexes. I usually use 0 for the 'main' content (here the details for each person), and then 1, 2, etc. for the rest. Note that the integer 0 is encoded as a single byte (0x14) with the tuple encoding, while 1, 2, 3 use two bytes (0x15 ##), so if you know that one "subspace" will hold a lot more data than the others, you could use 0 for that one and >= 1 for the others; but other than that, it's only a convention. In my code I usually use constants to encode these prefixes, in order to give them a nice name:

private const int EMPLOYEES_BY_ID = 0; // main table with employee details
private const int EMPLOYEES_BY_DEPARTMENT = 1; // index by department id (non-unique)
private const int EMPLOYEES_BY_EMAIL = 2; // index by email (unique)

That way the code to generate keys looks like subspace.Encode(EMPLOYEES_BY_ID, employee.Id) which is a lot more maintainable.

So now with this, all the employee details are stored in a single contiguous range, which has the benefit of being already sorted by employee ID. So if you need to scan through all employees, you only need a single range read under (..., 0).
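
A sketch of that scan, with the same assumed 'db'/'location' setup as above, and assuming that EncodeRange with a single element and the async LINQ operators (ToListAsync) of the range query behave as in the other snippets of this thread:

var allEmployees = await db.ReadAsync(async tr =>
{
    var subspace = await location.Resolve(tr);
    // single contiguous range read under (..., 0), already sorted by employee ID
    return await tr
        .GetRange(subspace.EncodeRange(EMPLOYEES_BY_ID))
        .Select(kv => (Id: subspace.DecodeLast<string>(kv.Key), Details: kv.Value))
        .ToListAsync();
}, ct);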

--

To handle departments, you need an additional index; that's where we use the integer prefix constants introduced above. Here we have a non-unique index, meaning multiple employees can belong to the same department. The typical pattern for this is to encode both the indexed value and the document id in the key, and leave the value empty:

(..., <index_prefix>, <indexed_value>, <document_id>) = ''

In this case, the indexed value is the unique id of the department (could be anything: an integer, string, Guid, but I would try not to use the actual name of the entity: it is prone to typos, casing issues, and also a department could be renamed!).

The document id here will be the unique id of the employee, the same one that is used in the "main" table (prefix 0). When you range read under (..., <index_prefix>, <indexed_value>), you will get all the keys that have this value, and decoding only the last item of each tuple gives you the id of the corresponding employee.

To "add" an employee to a department, you simply create the key via tr.Set(subspace.Encode(EMPLOYEES_BY_DEPARTMENT, employee.DepartmentId, employee.Id), Slice.Empty).

To "remove" the employee from a department, you Clear the same key.

DO NOT forget to properly update the indexes whenever the employee is updated! Especially if the employee changes department: you have to clear the old key (with the previous dept id) and create a new one (with the new dept id), otherwise the index will be corrupted! Ideally, you should always update the employee details and the indexes in the same transaction, never split across multiple transactions!
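
As a sketch (same assumptions about 'db', 'location' and 'ct' as before; oldDepartmentId, newDepartmentId and updatedEmployeeJson are hypothetical variables), moving an employee to another department in a single transaction could look like this:

await db.ReadWriteAsync(async tr =>
{
    var subspace = await location.Resolve(tr);
    // clear the index entry for the previous department...
    tr.Clear(subspace.Encode(EMPLOYEES_BY_DEPARTMENT, oldDepartmentId, employee.Id));
    // ...create the entry for the new department...
    tr.Set(subspace.Encode(EMPLOYEES_BY_DEPARTMENT, newDepartmentId, employee.Id), Slice.Empty);
    // ...and rewrite the employee details, all in the SAME transaction
    tr.Set(subspace.Encode(EMPLOYEES_BY_ID, employee.Id), Slice.FromString(updatedEmployeeJson));
}, ct);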

Once you have indexed all the employees, you will have a first range under (..., 0) with all the employee details, and next to it a second range under (..., 1) with the index. This index is itself sub-divided into small ranges, one per department, containing all the employees of that department.

So if you want to know all the employees of a single department, you just need tr.GetRange(subspace.EncodeRange(EMPLOYEES_BY_DEPARTMENT, departmentId)).Select(kv => subspace.DecodeLast<string>(kv.Key)).ToList() and you will get the list of ids of the employees (I'm assuming the type of the id is string, but it could be int or Guid, doesn't matter).

If you only need the list of IDs you are done, but if you want the details, you need to do a second set of reads, fetching the key under (..., 0, <id>) for each id in the list. Please don't do a single await tr.GetValueAsync(...) for each one, because then you will fall into the "1+N" pattern and it will be very slow!

The best practice is to use GetValuesAsync(...), which takes an enumeration of keys to read and reads them in parallel, using the natural batching properties of the FDB client, which sends requests by batch to the storage nodes! GetValues takes a list of keys and returns a list of values in the same order, so you can easily combine it with a previous range read on an index to fetch the corresponding employee details in 2 rounds, instead of 1+N.
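
Putting the two rounds together, a sketch (same assumptions as in the previous snippets) could look like this:

var employeesOfDept = await db.ReadAsync(async tr =>
{
    var subspace = await location.Resolve(tr);

    // round 1: read the index to get the ids of all employees of this department
    var ids = await tr
        .GetRange(subspace.EncodeRange(EMPLOYEES_BY_DEPARTMENT, departmentId))
        .Select(kv => subspace.DecodeLast<string>(kv.Key))
        .ToListAsync();

    // round 2: read all the corresponding detail keys in parallel (batched by the client)
    var values = await tr.GetValuesAsync(ids.Select(id => subspace.Encode(EMPLOYEES_BY_ID, id)).ToArray());

    return ids.Zip(values, (id, details) => (Id: id, Details: details)).ToList();
}, ct);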

If you need to get all employees from a list of one or more departments, then you simply need to do the same thing for each of them and combine the results. If the number of depts is always a handful, you can do this sequentially (read the index to fetch the employee ids for the first, then the second, then the third, ...), merge all the ids into a HashSet<...>, and then use it to call GetValues to fetch the details. Or you can do this in parallel, using Task.WhenAll(....) to perform each "index query" concurrently and merging the results after that.
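
A sketch of the parallel version (departmentIds is a hypothetical list of department ids; same assumptions as above otherwise):

var detailsOfDepts = await db.ReadAsync(async tr =>
{
    var subspace = await location.Resolve(tr);

    // one "index query" per department, all running in parallel inside the same transaction
    var perDept = await Task.WhenAll(departmentIds.Select(deptId =>
        tr.GetRange(subspace.EncodeRange(EMPLOYEES_BY_DEPARTMENT, deptId))
          .Select(kv => subspace.DecodeLast<string>(kv.Key))
          .ToListAsync()));

    // merge and de-duplicate the employee ids...
    var ids = new HashSet<string>(perDept.SelectMany(list => list));

    // ...then fetch all the details in one parallel batch
    return await tr.GetValuesAsync(ids.Select(id => subspace.Encode(EMPLOYEES_BY_ID, id)).ToArray());
}, ct);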

--

You can add more indexes, like for example by email, by manager id, by date of hiring, by role, etc... simply add new prefixes (2, 3, 4, ...).

For 'unique' indexes (meaning for each indexed value there is only one entity), you can simplify the key format by storing the document id in the value instead of in the key:

(..., <index_prefix>, <indexed_value>) = <document_id>

This is a little bit more compact, but using the generic "non-unique" key format would work as well.
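
For example, inside a transaction handler, a unique index by email could be maintained and queried like this (sketch; Slice.FromString is just an assumption about how you encode the id in the value):

// maintain the unique index: the employee id goes in the VALUE, not in the key
tr.Set(subspace.Encode(EMPLOYEES_BY_EMAIL, employee.Email), Slice.FromString(employee.Id));

// lookup by email: a single point read returns the id (or nothing if the email is unknown)
var employeeId = await tr.GetValueAsync(subspace.Encode(EMPLOYEES_BY_EMAIL, email));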

--

Please remember that FoundationDB is very slow if you do everything sequentially (reading a key, awaiting the result, and from that getting the name of another key to read, etc.), but very fast if you can combine range reads (reading keys that are contiguous in the key space) and parallelized reads (via GetValues, or by spawning a lot of tasks in C# and merging with Task.WhenAll).

But when you have something working that handles most cases, you may want to be able to see precisely all the queries that are performed, and to be able to measure them.

That's where the embedded profiler comes in. You can easily enable the capture of all requests, and produce a log for each transaction that you can either dump in the console, write to a file, or send to some other service.

On each transaction, you can set a "log handler", which is a callback that is invoked with an FdbTransactionLog object once the transaction has completed, and which contains all the recorded reads and writes. If you want to profile all transactions, you can also set a default log handler on the database object that will be applied to all new transactions. The simplest way is to set the DefaultLogHandler property on the FdbDatabaseProviderOptions during setup, see

public Action<FdbTransactionLog>? DefaultLogHandler { get; set; }

To simplify your life, this log object has a convenient method GetTimingsReport(true) that produces a text-friendly dump of the transaction profile, which you can easily write to the console or to a file. I use this all the time when working on a layer.
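
For example, during development you could dump the report of every transaction to the console (sketch, assuming 'options' is the FdbDatabaseProviderOptions instance you configure at startup):

// log the profile of every transaction once it has completed
options.DefaultLogHandler = log => Console.WriteLine(log.GetTimingsReport(true));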

This is also very useful in unit tests, where you can dump the exact queries that failed to the test output (see for example https://github.com/Doxense/foundationdb-dotnet-client/blob/master/FoundationDB.Tests/Layers/DirectoryFacts.cs#L60 )

This method produces outputs like this:

Transaction #190055 (read/write, 5 operations, '#' = 0.5 ms, started 12:43:28.6451670Z, ended 12:43:28.6471734Z)
┌  oper. ┬──────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │      │ T+  0.002                        │     -     - │ // WaitForNextCommands(SRV000, 10)
│:0   op │      │ T+  0.002                        │     -     - │ SetOption PrioritySystemImmediate
│:0   R °│ ###: │ T+  0.005 ~   1.808 (  1,804 µs) │    60     0 │ Snapshot.GetRange fGT{(58, 10, "server_commands", "SRV000").<00>} <= k < fGE{(58, 10, "server_commands", "SRV000").<FF>}, limit(10), Exact => 0 result(s)
│ 1   W  │ ___` │ T+  1.810                        │     -     - │ Watch (58, 10, "server_commands", "SRV000")
│:1   Co │ ___X │ T+  1.819 ~   2.227 (    409 µs) │             │ Commit
└────────┴──────┴──────────────────────────────────┴─────────────┘
> Completed in 2.231 ms and 1 attempt(s)

Transaction #190087 (read/write, 7 operations, '#' = 0.5 ms, started 12:43:28.6602071Z, ended 12:43:28.6682276Z)
┌  oper. ┬──────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │                  │ T+  0.003                        │     -     - │ // WaitForNextCommands(SRV000, 10)
│:0   op │                  │ T+  0.003                        │     -     - │ SetOption PrioritySystemImmediate
│:0   R °│ ##&              │ T+  0.005 ~   1.551 (  1,546 µs) │    60   489 │ Snapshot.GetRange fGT{(58, 10, "server_commands", "SRV000").<00>} <= k < fGE{(58, 10, "server_commands", "SRV000").<FF>}, limit(10), Exact => 3 result(s)
│ 1   c  │ __`              │ T+  1.553 ~   1.555 (      2 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401665-0#1)
│ 2   c  │ __`              │ T+  1.555 ~   1.555 (      1 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401753-1#1)
│ 3   c  │ __`              │ T+  1.555 ~   1.556 (      1 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401753-2#1)
│ 4   Co°│ __.############# │ T+  1.558 ~   8.449 (  6,892 µs) │             │ Commit
└────────┴──────────────────┴──────────────────────────────────┴─────────────┘
> Read 489 bytes and Committed 339 bytes in 8.455 ms and 1 attempt(s)

You can add "comments" or "labels" to the trace at any point by calling tr.Annotate("hello, there!") at various point of your transaction handler, to add comments to the trace, which will show up as "// hello, there!" in the log. This can help you break down a complex handler into multiple steps.
