diff --git a/content/en/apps/guides/data/analytics/_index.html b/content/en/apps/guides/data/analytics/_index.html index c7fbfa05a..171b36838 100644 --- a/content/en/apps/guides/data/analytics/_index.html +++ b/content/en/apps/guides/data/analytics/_index.html @@ -2,6 +2,6 @@ title: Data Synchronization and Analytics weight: 150 description: > - Using CHT Sync and CHT Pipeline for data synchronization and analytics + Using CHT Sync for data synchronization and analytics --- diff --git a/content/en/apps/guides/data/analytics/environment-variables.md b/content/en/apps/guides/data/analytics/environment-variables.md index 814e98c38..f8568c479 100644 --- a/content/en/apps/guides/data/analytics/environment-variables.md +++ b/content/en/apps/guides/data/analytics/environment-variables.md @@ -22,7 +22,7 @@ All the variables in the `.env` file: | `POSTGRES_TABLE` | `couchdb` | PostgreSQL table where the CouchDB data is copied | | `POSTGRES_HOST` | `postgres` | PostgreSQL instance | | `POSTGRES_PORT` | `5432` | PostgreSQL port | -| `CHT_PIPELINE_BRANCH_URL` | `"https://github.com/medic/cht-pipeline.git#main"` | CHT Pipeline branch containing the `DBT` models | +| `CHT_PIPELINE_BRANCH_URL` | `"https://github.com/medic/cht-pipeline.git#main"` | cht-pipeline branch containing the dbt models | | `DATAEMON_INTERVAL` | `5` | Interval (in minutes) for looking for new changes in the CouchDB data | | `COUCHDB_USER` | `medic` | Username of the CouchDB instance | | `COUCHDB_PASSWORD` | `password` | Password of the CouchDB instance | diff --git a/content/en/apps/guides/data/analytics/introduction.md b/content/en/apps/guides/data/analytics/introduction.md index 2b08b67f3..dc7feee75 100644 --- a/content/en/apps/guides/data/analytics/introduction.md +++ b/content/en/apps/guides/data/analytics/introduction.md @@ -13,7 +13,7 @@ relatedContent: > The pages in this section apply to both CHT 3.x (beyond 3.12) and CHT 4.x. 
{{% /pageinfo %}} -Most CHT deployments require some sort of analytics so that stakeholders can make data driven decisions. CouchDB, which is the database used by the CHT, is not designed for analytics. It is a document database, which means that it is optimized for storing and retrieving documents, and not for aggregating data. For example, if you wanted to know how many patients were registered in a particular area, you would have to query the database for all the patients in that area, and then count them. This is not a very efficient process. It is much more efficient to store the number of patients in a particular area in a separate database, and update that number whenever a patient is registered or unregistered. This is what CHT Sync paired with CHT Pipeline is designed to do. +Most CHT deployments require some sort of analytics so that stakeholders can make data-driven decisions. CouchDB, which is the database used by the CHT, is not designed for analytics. It is a document database, which means that it is optimized for storing and retrieving documents, and not for aggregating data. For example, if you wanted to know how many patients were registered in a particular area, you would have to query the database for all the patients in that area, and then count them. This is not a very efficient process. It is much more efficient to store the number of patients in a particular area in a separate database, and update that number whenever a patient is registered or unregistered. This is what CHT Sync is designed to do. ## CHT Sync Introduction diff --git a/content/en/apps/guides/data/analytics/production.md b/content/en/apps/guides/data/analytics/production.md index a3b8b1f60..1b08211ac 100644 --- a/content/en/apps/guides/data/analytics/production.md +++ b/content/en/apps/guides/data/analytics/production.md @@ -69,7 +69,7 @@ couchdbs: password: "password3" ``` -- Set the CHT Pipeline Branch URL in the `values.yaml` file. 
+- Set the cht-pipeline branch URL in the `values.yaml` file. ```yaml cht_pipeline_branch_url: "https://github.com/medic/cht-pipeline.git#main" ``` diff --git a/content/en/apps/guides/data/analytics/testing-dbt-models.md b/content/en/apps/guides/data/analytics/testing-dbt-models.md index ee9a3fdf8..1fbfe156a 100644 --- a/content/en/apps/guides/data/analytics/testing-dbt-models.md +++ b/content/en/apps/guides/data/analytics/testing-dbt-models.md @@ -3,7 +3,7 @@ title: "Testing dbt models" linkTitle: "Testing dbt models" weight: 6 description: > - Guide for testing dbt models in CHT Pipeline + Guide for testing dbt models --- ## Overview @@ -32,8 +32,8 @@ Unit tests are essential for validating complex SQL logic and transformations in For more details on formatting unit tests, refer to the [official dbt documentation](https://docs.getdbt.com/reference/resource-properties/unit-tests). -## Guidelines for CHT Pipeline tests -To ensure data integrity and the reliability of the dbt models in the CHT Pipeline, it is essential to follow these testing guidelines: +## Guidelines for dbt tests +To ensure data integrity and the reliability of the dbt models in the [cht-pipeline](https://github.com/medic/cht-pipeline), it is essential to follow these testing guidelines: - **Basic generic tests** for all models: Every model should have generic tests to enforce critical constraints and relationships. Use the generic tests provided in dbt core. @@ -50,10 +50,10 @@ Unit tests are not strictly required but are highly recommended, especially for - Custom calculations: When creating functions or applying unique data processing logic. - Edge cases: To handle scenarios that are not typically found in actual data but may arise unexpectedly. 
-## Writing CHT Pipeline tests +## Writing dbt tests -[CHT Pipeline](https://github.com/medic/cht-pipeline) contains a `/models` directory containing SQL files and YAML files for generic tests and a `/test` directory with folders for fixtures and singular tests. +cht-pipeline contains a `/models` directory containing SQL files and YAML files for generic tests and a `/test` directory with folders for fixtures and singular tests. ``` ./ diff --git a/content/en/contribute/code/hall-of-fame.md b/content/en/contribute/code/hall-of-fame.md index a80f0d405..3bb1bd8c4 100644 --- a/content/en/contribute/code/hall-of-fame.md +++ b/content/en/contribute/code/hall-of-fame.md @@ -10,14 +10,14 @@ description: > Thank you to everyone who has contributed to the CHT codebase over the years! To see the full list, visit each repo on GitHub. -- [CHT Android](https://github.com/medic/cht-android/graphs/contributors) -- [CHT Conf](https://github.com/medic/cht-conf/graphs/contributors) -- [CHT Core](https://github.com/medic/cht-core/graphs/contributors) -- [CHT Docs](https://github.com/medic/cht-docs/graphs/contributors) -- [CHT Interoperability](https://github.com/medic/cht-interoperability/graphs/contributors) -- [CHT Pipeline](https://github.com/medic/cht-pipeline/graphs/contributors) -- [CHT Sync](https://github.com/medic/cht-sync/graphs/contributors) -- [CHT Watchdog](https://github.com/medic/cht-watchdog/graphs/contributors) +- [cht-android](https://github.com/medic/cht-android/graphs/contributors) +- [cht-conf](https://github.com/medic/cht-conf/graphs/contributors) +- [cht-core](https://github.com/medic/cht-core/graphs/contributors) +- [cht-docs](https://github.com/medic/cht-docs/graphs/contributors) +- [cht-interoperability](https://github.com/medic/cht-interoperability/graphs/contributors) +- [cht-pipeline](https://github.com/medic/cht-pipeline/graphs/contributors) +- [cht-sync](https://github.com/medic/cht-sync/graphs/contributors) +- 
[cht-watchdog](https://github.com/medic/cht-watchdog/graphs/contributors) #### Security diff --git a/content/en/core/overview/cht-sync.md b/content/en/core/overview/cht-sync.md index a4402a03d..c152931a4 100644 --- a/content/en/core/overview/cht-sync.md +++ b/content/en/core/overview/cht-sync.md @@ -1,5 +1,5 @@ --- -title: "CHT Sync and CHT Pipeline" +title: "CHT Sync" linkTitle: "CHT Sync" weight: 2 description: > @@ -21,10 +21,10 @@ Read more about setting up [CHT Sync]({{< relref "apps/guides/data/analytics/set {{< figure src="cht-sync.png" link="cht-sync.png" class=" center col-8 col-lg-6" >}} -[CHT Sync](https://github.com/medic/cht-sync) uses `couch2pg` to replicate data from CouchDB to PostgreSQL in a near real-time manner. It listens to changes in the CHT database, and updates the analytics database accordingly. +[CHT Sync](https://github.com/medic/cht-sync) replicates data from CouchDB to PostgreSQL in a near real-time manner. It listens to changes in the CHT database, and updates the analytics database accordingly. It is not designed to be accessed by users, and it does not have a user interface. It is designed to be run on the same server as the CHT, but it can be run on a separate server if necessary. -As CHT Sync puts all new data into a PostgreSQL database into a single table that has a `jsonb` column, this is not very useful for analytics. [CHT Pipeline](https://github.com/medic/cht-pipeline) is a set of SQL queries that transform the data in the `jsonb` column into a more useful format. It uses [dbt](https://www.getdbt.com/) to define the models that are translated into PostgreSQL tables or views, which makes it easier to query the data in the analytics platform of choice. +As CHT Sync puts all new data into a PostgreSQL database into a single table that has a `jsonb` column, this is not very useful for analytics. 
[cht-pipeline](https://github.com/medic/cht-pipeline) contains a set of SQL queries that transform the data in the `jsonb` column into a more useful format. It uses [dbt](https://www.getdbt.com/) to define the models that are translated into PostgreSQL tables or views, which makes it easier to query the data in the analytics platform of choice. #### couch2pg @@ -36,7 +36,7 @@ A free and open source SQL database used for analytics queries. See more at the #### dbt -Once the data is synchronized and stored in PostgreSQL, it undergoes transformation using predefined [dbt](https://www.getdbt.com/) models from the [CHT Pipeline](https://github.com/medic/cht-pipeline). dbt is used to ingest raw JSON data from the PosgtreSQL database (`jsonb` column) and normalize it into a relational schema to make it easier to query. A daemon runs CHT Pipeline, and it updates the database whenever the data in the `jsonb` column changes. +Once the data is synchronized and stored in PostgreSQL, it undergoes transformation using predefined [dbt](https://www.getdbt.com/) models from the [cht-pipeline](https://github.com/medic/cht-pipeline). dbt is used to ingest raw JSON data from the PostgreSQL database (`jsonb` column) and normalize it into a relational schema to make it easier to query. A daemon runs the dbt models, and it updates the database whenever the data in the `jsonb` column changes. #### Data Visualization