This repository has been archived by the owner on May 2, 2023. It is now read-only.

Cassandra Database Preparation

Gonçalo Tomás edited this page Dec 3, 2018 · 18 revisions

FMKe configuration: {target_database, cassandra}.
Dependencies: erlcass

Be sure to follow the build instructions for the Cassandra C++ Driver to verify that you have installed all the development tools required to build it.

FMKe supports Cassandra and other databases that use the same protocol, such as Rocksandra. This page assumes that you know how to set up a Cassandra cluster. Plenty of documentation is available, but we recommend the official single-datacenter and multiple-datacenter documentation from DataStax. Once the cluster is deployed and configured with FMKe's keyspace, you can spawn one or more application servers.

Keyspace

FMKe does not try to micromanage its entities across different keyspaces; it assumes that a single keyspace named fmke is available. You can configure the keyspace as you see fit. Experimenting with different keyspace configurations will most likely give the best insight into how well Cassandra performs in your test environment. Here is a common keyspace definition for FMKe when testing in a single datacenter:

CREATE KEYSPACE IF NOT EXISTS fmke
WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
};

For multiple-datacenter deployments, specify the 'NetworkTopologyStrategy' class and provide a replication factor for each datacenter. Refer to the official documentation to set up a keyspace for multiple datacenters properly.
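
As a minimal sketch of such a definition, the statement below replicates the fmke keyspace across two datacenters. The datacenter names 'dc1' and 'dc2' and the replication factor of 3 are assumptions made for illustration; the names must match the datacenter names your cluster's snitch reports (for example, as shown by nodetool status).

```cql
-- Sketch only: 'dc1' and 'dc2' are placeholder datacenter names and
-- 3 is an example replication factor. Replace both with the values
-- that apply to your own cluster.
CREATE KEYSPACE IF NOT EXISTS fmke
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3,
    'dc2': 3
};
```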

Tables

In Cassandra, tables must be created with a reference to the keyspace they belong to. Here is an example of how you could create the FMKe tables:

CREATE TABLE IF NOT EXISTS fmke.patients (
    ID int PRIMARY KEY,
    Name text,
    Address text
);

CREATE TABLE IF NOT EXISTS fmke.pharmacies (
    ID int PRIMARY KEY,
    Name text,
    Address text
);

CREATE TABLE IF NOT EXISTS fmke.medical_staff (
    ID int PRIMARY KEY,
    Name text,
    Address text,
    Speciality text
);

CREATE TABLE IF NOT EXISTS fmke.treatment_facilities (
    ID int PRIMARY KEY,
    Name text,
    Address text,
    Type text
);

CREATE TABLE IF NOT EXISTS fmke.prescriptions (
    ID int,
    PatID int,
    DocID int,
    PharmID int,
    DatePrescribed timestamp,
    DateProcessed timestamp,
    PRIMARY KEY (ID)
);

CREATE TABLE IF NOT EXISTS fmke.patient_prescriptions (
    PatientID int,
    PrescriptionID int,
    PRIMARY KEY (PatientID, PrescriptionID)
);

CREATE TABLE IF NOT EXISTS fmke.pharmacy_prescriptions (
    PharmacyID int,
    PrescriptionID int,
    PRIMARY KEY (PharmacyID, PrescriptionID)
);

CREATE TABLE IF NOT EXISTS fmke.staff_prescriptions (
    StaffID int,
    PrescriptionID int,
    PRIMARY KEY (StaffID, PrescriptionID)
);

CREATE TABLE IF NOT EXISTS fmke.prescription_drugs (
    PrescriptionID int,
    Drug text,
    PRIMARY KEY (PrescriptionID, Drug)
);
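
Once the tables exist, a quick smoke test can confirm that they accept reads and writes. The statements below are only a sketch: the values are made up, and these are not the queries FMKe itself issues. They show a lookup by primary key on an entity table and a lookup by partition key on one of the prescription index tables.

```cql
-- Smoke test with made-up values; these are not FMKe's own queries.
INSERT INTO fmke.patients (ID, Name, Address)
VALUES (1, 'Test Patient', '1 Example Street');

INSERT INTO fmke.patient_prescriptions (PatientID, PrescriptionID)
VALUES (1, 100);

-- Look up the patient, then the prescription IDs stored for that patient.
SELECT ID, Name, Address FROM fmke.patients WHERE ID = 1;
SELECT PrescriptionID FROM fmke.patient_prescriptions WHERE PatientID = 1;

-- Clean up so that a benchmark run starts from empty tables.
DELETE FROM fmke.patients WHERE ID = 1;
DELETE FROM fmke.patient_prescriptions WHERE PatientID = 1;
```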

Executing Commands in Docker

Assume that you are using Docker to spawn a single instance of Cassandra:

docker run -d --name cassandra -p "9042:9042" rinscy/cassandra

To add the keyspace and create the required tables, first open a shell inside the container:

docker exec -it cassandra /bin/bash

This gives you a root shell from which you can start the cqlsh utility and paste in the statements above:

root@84dedb6f4117:/# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>

Once you see the cqlsh> prompt, you can run the statements that define the keyspace and tables.
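
To check that the schema was created, you can query Cassandra's schema tables from the same prompt. This is a sketch that assumes Cassandra 3.0 or later (the banner above shows 3.11.0), where table metadata is exposed through the system_schema keyspace.

```cql
-- List the tables that exist in the fmke keyspace.
-- Assumes Cassandra 3.0 or later, where system_schema.tables is available.
SELECT table_name FROM system_schema.tables WHERE keyspace_name = 'fmke';
```

If all of the statements ran successfully, the result should contain the nine tables defined in the Tables section.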

Final notes

Note that different cluster configurations (e.g. read/write quorum size) may significantly impact cluster performance. To get a relevant performance reading, use a configuration that closely resembles the one you intend to deploy. Compare the metrics you obtain with those of other supported data storage solutions under similar circumstances to determine which storage solution best fits your needs.
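
As one illustration of such a configuration change, the replication factor of the keyspace determines the quorum size: a quorum is floor(RF / 2) + 1 replicas, so raising the replication factor from 1 to 3 makes a quorum 2 replicas. The statement below is a minimal sketch that assumes the single-datacenter keyspace defined earlier and a cluster with at least three nodes; the value 3 is only an example.

```cql
-- Illustrative only: raise the replication factor of the fmke keyspace to 3.
-- Assumes a single-datacenter cluster with at least three nodes.
ALTER KEYSPACE fmke
WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
};
```

After increasing the replication factor, run a repair on the keyspace (for example with nodetool repair) so that existing data is copied to the new replicas before you start measuring.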