Commit

chore(#1521): remove CHT Pipeline mentions (#1522)
Co-authored-by: Gareth Bowen <[email protected]>
andrablaj and garethbowen authored Sep 4, 2024
1 parent 1afbf39 commit e02b485
Showing 7 changed files with 21 additions and 21 deletions.
2 changes: 1 addition & 1 deletion content/en/apps/guides/data/analytics/_index.html
@@ -2,6 +2,6 @@
title: Data Synchronization and Analytics
weight: 150
description: >
Using CHT Sync and CHT Pipeline for data synchronization and analytics
Using CHT Sync for data synchronization and analytics
---

@@ -22,7 +22,7 @@ All the variables in the `.env` file:
| `POSTGRES_TABLE` | `couchdb` | PostgreSQL table where the CouchDB data is copied |
| `POSTGRES_HOST` | `postgres` | PostgreSQL instance |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
| `CHT_PIPELINE_BRANCH_URL` | `"https://github.com/medic/cht-pipeline.git#main"` | CHT Pipeline branch containing the `DBT` models |
| `CHT_PIPELINE_BRANCH_URL` | `"https://github.com/medic/cht-pipeline.git#main"` | cht-pipeline branch containing the dbt models |
| `DATAEMON_INTERVAL` | `5` | Interval (in minutes) for looking for new changes in the CouchDB data |
| `COUCHDB_USER` | `medic` | Username of the CouchDB instance |
| `COUCHDB_PASSWORD` | `password` | Password of the CouchDB instance |
2 changes: 1 addition & 1 deletion content/en/apps/guides/data/analytics/introduction.md
@@ -13,7 +13,7 @@ relatedContent: >
The pages in this section apply to both CHT 3.x (beyond 3.12) and CHT 4.x.
{{% /pageinfo %}}

Most CHT deployments require some sort of analytics so that stakeholders can make data-driven decisions. CouchDB, which is the database used by the CHT, is not designed for analytics. It is a document database, which means that it is optimized for storing and retrieving documents, and not for aggregating data. For example, if you wanted to know how many patients were registered in a particular area, you would have to query the database for all the patients in that area, and then count them. This is not a very efficient process. It is much more efficient to store the number of patients in a particular area in a separate database, and update that number whenever a patient is registered or unregistered. This is what CHT Sync paired with CHT Pipeline is designed to do.
Most CHT deployments require some sort of analytics so that stakeholders can make data-driven decisions. CouchDB, which is the database used by the CHT, is not designed for analytics. It is a document database, which means that it is optimized for storing and retrieving documents, and not for aggregating data. For example, if you wanted to know how many patients were registered in a particular area, you would have to query the database for all the patients in that area, and then count them. This is not a very efficient process. It is much more efficient to store the number of patients in a particular area in a separate database, and update that number whenever a patient is registered or unregistered. This is what CHT Sync is designed to do.
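
The trade-off described in this paragraph can be sketched in a few lines of Python. This is only an illustration and not how CHT Sync is implemented: the document shape, the `area` field, and the names `count_by_scan` and `AreaCounts` are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical patient documents; the "type" and "area" fields are assumed for illustration.
patients = [
    {"_id": "p1", "type": "patient", "area": "north"},
    {"_id": "p2", "type": "patient", "area": "south"},
    {"_id": "p3", "type": "patient", "area": "north"},
]


def count_by_scan(docs, area):
    """Count patients by scanning every document; the cost grows with the number of documents."""
    return sum(
        1 for doc in docs
        if doc.get("type") == "patient" and doc.get("area") == area
    )


class AreaCounts:
    """Keep a running count per area, updated on each registration or unregistration."""

    def __init__(self):
        self._counts = Counter()

    def register(self, doc):
        self._counts[doc["area"]] += 1

    def unregister(self, doc):
        self._counts[doc["area"]] -= 1

    def count(self, area):
        # A single lookup, regardless of how many patients exist.
        return self._counts[area]


counts = AreaCounts()
for doc in patients:
    counts.register(doc)
```

Both approaches return the same number for a given area; the difference is that the second one answers from a pre-aggregated value rather than re-reading every document.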

## CHT Sync Introduction

2 changes: 1 addition & 1 deletion content/en/apps/guides/data/analytics/production.md
@@ -69,7 +69,7 @@ couchdbs:
password: "password3"
```

- Set the CHT Pipeline Branch URL in the `values.yaml` file.
- Set the cht-pipeline branch URL in the `values.yaml` file.
```yaml
cht_pipeline_branch_url: "https://github.com/medic/cht-pipeline.git#main"
```
10 changes: 5 additions & 5 deletions content/en/apps/guides/data/analytics/testing-dbt-models.md
@@ -3,7 +3,7 @@ title: "Testing dbt models"
linkTitle: "Testing dbt models"
weight: 6
description: >
Guide for testing dbt models in CHT Pipeline
Guide for testing dbt models
---

## Overview
@@ -32,8 +32,8 @@ Unit tests are essential for validating complex SQL logic and transformations in

For more details on formatting unit tests, refer to the [official dbt documentation](https://docs.getdbt.com/reference/resource-properties/unit-tests).

## Guidelines for CHT Pipeline tests
To ensure data integrity and the reliability of the dbt models in the CHT Pipeline, it is essential to follow these testing guidelines:
## Guidelines for dbt tests
To ensure data integrity and the reliability of the dbt models in the [cht-pipeline](https://github.com/medic/cht-pipeline), it is essential to follow these testing guidelines:

- **Basic generic tests** for all models:
Every model should have generic tests to enforce critical constraints and relationships. Use the generic tests provided in dbt core.
@@ -50,10 +50,10 @@ Unit tests are not strictly required but are highly recommended, especially for
- Custom calculations: When creating functions or applying unique data processing logic.
- Edge cases: To handle scenarios that are not typically found in actual data but may arise unexpectedly.

## Writing CHT Pipeline tests
## Writing dbt tests


[CHT Pipeline](https://github.com/medic/cht-pipeline) contains a `/models` directory containing SQL files and YAML files for generic tests and a `/test` directory with folders for fixtures and singular tests.
cht-pipeline contains a `/models` directory containing SQL files and YAML files for generic tests and a `/test` directory with folders for fixtures and singular tests.

```
./
16 changes: 8 additions & 8 deletions content/en/contribute/code/hall-of-fame.md
@@ -10,14 +10,14 @@ description: >

Thank you to everyone who has contributed to the CHT codebase over the years! To see the full list, visit each repo on GitHub.

- [CHT Android](https://github.com/medic/cht-android/graphs/contributors)
- [CHT Conf](https://github.com/medic/cht-conf/graphs/contributors)
- [CHT Core](https://github.com/medic/cht-core/graphs/contributors)
- [CHT Docs](https://github.com/medic/cht-docs/graphs/contributors)
- [CHT Interoperability](https://github.com/medic/cht-interoperability/graphs/contributors)
- [CHT Pipeline](https://github.com/medic/cht-pipeline/graphs/contributors)
- [CHT Sync](https://github.com/medic/cht-sync/graphs/contributors)
- [CHT Watchdog](https://github.com/medic/cht-watchdog/graphs/contributors)
- [cht-android](https://github.com/medic/cht-android/graphs/contributors)
- [cht-conf](https://github.com/medic/cht-conf/graphs/contributors)
- [cht-core](https://github.com/medic/cht-core/graphs/contributors)
- [cht-docs](https://github.com/medic/cht-docs/graphs/contributors)
- [cht-interoperability](https://github.com/medic/cht-interoperability/graphs/contributors)
- [cht-pipeline](https://github.com/medic/cht-pipeline/graphs/contributors)
- [cht-sync](https://github.com/medic/cht-sync/graphs/contributors)
- [cht-watchdog](https://github.com/medic/cht-watchdog/graphs/contributors)

#### Security

8 changes: 4 additions & 4 deletions content/en/core/overview/cht-sync.md
@@ -1,5 +1,5 @@
---
title: "CHT Sync and CHT Pipeline"
title: "CHT Sync"
linkTitle: "CHT Sync"
weight: 2
description: >
@@ -21,10 +21,10 @@ Read more about setting up [CHT Sync]({{< relref "apps/guides/data/analytics/set
<!-- https://docs.google.com/presentation/d/1j4jPsi-gHbiaLBfgYOyru1g_YV98PkBrx2zs7bwhoEQ/ -->
{{< figure src="cht-sync.png" link="cht-sync.png" class=" center col-8 col-lg-6" >}}

[CHT Sync](https://github.com/medic/cht-sync) uses `couch2pg` to replicate data from CouchDB to PostgreSQL in a near real-time manner. It listens to changes in the CHT database, and updates the analytics database accordingly.
[CHT Sync](https://github.com/medic/cht-sync) replicates data from CouchDB to PostgreSQL in a near real-time manner. It listens to changes in the CHT database, and updates the analytics database accordingly.
It is not designed to be accessed by users, and it does not have a user interface. It is designed to be run on the same server as the CHT, but it can be run on a separate server if necessary.

As CHT Sync puts all new data into a PostgreSQL database into a single table that has a `jsonb` column, this is not very useful for analytics. [CHT Pipeline](https://github.com/medic/cht-pipeline) is a set of SQL queries that transform the data in the `jsonb` column into a more useful format. It uses [dbt](https://www.getdbt.com/) to define the models that are translated into PostgreSQL tables or views, which makes it easier to query the data in the analytics platform of choice.
As CHT Sync puts all new data into a PostgreSQL database into a single table that has a `jsonb` column, this is not very useful for analytics. [cht-pipeline](https://github.com/medic/cht-pipeline) contains a set of SQL queries that transform the data in the `jsonb` column into a more useful format. It uses [dbt](https://www.getdbt.com/) to define the models that are translated into PostgreSQL tables or views, which makes it easier to query the data in the analytics platform of choice.

#### couch2pg

@@ -36,7 +36,7 @@ A free and open source SQL database used for analytics queries. See more at the

#### dbt

Once the data is synchronized and stored in PostgreSQL, it undergoes transformation using predefined [dbt](https://www.getdbt.com/) models from the [CHT Pipeline](https://github.com/medic/cht-pipeline). dbt is used to ingest raw JSON data from the PostgreSQL database (`jsonb` column) and normalize it into a relational schema to make it easier to query. A daemon runs CHT Pipeline, and it updates the database whenever the data in the `jsonb` column changes.
Once the data is synchronized and stored in PostgreSQL, it undergoes transformation using predefined [dbt](https://www.getdbt.com/) models from the [cht-pipeline](https://github.com/medic/cht-pipeline). dbt is used to ingest raw JSON data from the PostgreSQL database (`jsonb` column) and normalize it into a relational schema to make it easier to query. A daemon runs the dbt models, and it updates the database whenever the data in the `jsonb` column changes.
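
As a rough illustration of what "normalizing raw JSON into a relational schema" means, the Python sketch below flattens JSON documents into flat rows. It is not taken from cht-pipeline, whose models are written as dbt SQL; the document shape, the `pregnancy` form, the field names, and the function `normalize_pregnancies` are all hypothetical.

```python
import json

# Hypothetical raw values, standing in for the contents of the `jsonb` column.
raw_rows = [
    '{"_id": "r1", "type": "data_record", "form": "pregnancy", '
    '"fields": {"patient_id": "p1", "lmp_date": "2024-01-15"}}',
    '{"_id": "p1", "type": "person", "name": "Jane"}',
]


def normalize_pregnancies(rows):
    """Turn nested JSON documents into flat rows with fixed columns."""
    normalized = []
    for raw in rows:
        doc = json.loads(raw)
        # Keep only the documents this (hypothetical) model is about.
        if doc.get("type") != "data_record" or doc.get("form") != "pregnancy":
            continue
        fields = doc.get("fields", {})
        normalized.append({
            "uuid": doc["_id"],
            "patient_id": fields.get("patient_id"),
            "lmp_date": fields.get("lmp_date"),
        })
    return normalized


pregnancy_rows = normalize_pregnancies(raw_rows)
```

Each resulting row has the same set of columns, which is what lets an analytics tool query it like an ordinary table instead of digging through nested JSON.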

#### Data Visualization

