| 1 | +--- |
| 2 | +title: Memgraph in mission-critical workloads |
| 3 | +description: Suggestions on how to bring your Memgraph to production in mission-critical and high-availability workloads. |
| 4 | +--- |
| 5 | + |
| 6 | +import { Callout } from 'nextra/components' |
| 7 | +import { CommunityLinks } from '/components/social-card/CommunityLinks' |
| 8 | + |
| 9 | +# Memgraph in mission-critical workloads |
| 10 | + |
| 11 | +<Callout type="info"> |
| 12 | +Before diving into this guide, we recommend starting with the [Best |
| 13 | +practices](/deployment/best-practices) page. |
| 14 | +It provides **foundational, use-case-agnostic |
| 15 | +advice** for deploying |
| 16 | +Memgraph in production. |
| 17 | + |
| 18 | +This guide builds on that foundation, offering **additional recommendations |
| 19 | +tailored to critical and high-availability workloads**. In cases where guidance |
| 20 | +overlaps, consider the information here as **complementary or overriding**, |
| 21 | +depending on the unique needs of your use case. |
| 22 | +</Callout> |
| 23 | + |
| 24 | +## Is this guide for you? |
| 25 | + |
| 26 | +This guide is for you if you're building **mission-critical systems** where |
| 27 | +uptime, data consistency, and fault tolerance are essential. You’ll benefit |
| 28 | +from this content if: |
| 29 | + |
| 30 | +- You require **high availability** and **automatic failover** for your |
| 31 | + application. |
| 32 | +- You need **strong consistency guarantees** even under heavy loads. |
| 33 | +- You must **recover gracefully** from unexpected failures without data loss. |
| 34 | +- You need to **support multi-tenant environments** securely across multiple |
| 35 | + projects or customers. |
| 36 | + |
| 37 | + |
| 38 | +If this matches your needs, this guide will help you configure and operate |
| 39 | +Memgraph to meet the demands of **always-on production environments**. |
| 40 | + |
| 41 | +## Why choose Memgraph for mission-critical use cases? |
| 42 | + |
| 43 | +When stability, consistency, and resilience matter most, Memgraph is built to |
| 44 | +deliver. Here's why Memgraph is a great fit for mission-critical workloads: |
| 45 | + |
| 46 | +- **In-memory storage engine with persistence** Memgraph keeps the working set |
| 47 | + fully in memory for fast access, while ensuring |
| 48 | + [durability](/fundamentals/data-durability) through **periodic snapshots** and |
| 49 | + **write-ahead logging (WALs)** in transactional mode. |
| 50 | + |
| 51 | + Because the data stays in memory, queries do not wait on disk reads, |
| 52 | + which helps your critical service stay responsive during peak |
| 53 | + load. |
| 54 | + |
| 55 | +- **High availability with automatic failover** Memgraph supports full [high |
| 56 | + availability clustering](/clustering/high-availability), allowing you to |
| 57 | + deploy **multiple instances** with automatic **leader election** and |
| 58 | + **failover** when needed. |
| 59 | + |
| 60 | + With high availability in place, the cluster keeps serving requests when an |
| 61 | + instance fails, so a single failure does not take your services down. |
| 62 | + |
| 63 | +- **Multi-version concurrency control (MVCC)** Built on **MVCC**, Memgraph |
| 64 | + allows **non-blocking reads and writes**, ensuring that your system remains |
| 65 | + **responsive** even under concurrent access. Writes do not block reads, |
| 66 | + and reads do not block writes. |
| 67 | + |
| 68 | +- **Snapshot isolation by default** Memgraph defaults to **snapshot isolation** |
| 69 | + rather than **read-committed** isolation, so each transaction works on a |
| 70 | + **consistent view** of the graph, free of dirty and non-repeatable reads. |
| 71 | + |
| 72 | +- **Replication for read scaling and redundancy** Memgraph supports |
| 73 | + **asynchronous replication**, enabling you to **scale read workloads** |
| 74 | + independently while ensuring **failover readiness**. For a more consistent |
| 75 | + view of the data, it also supports **synchronous replication**, which |
| 76 | + prioritizes consistency over scalability. |
| 77 | + |
| 78 | +- **Fine-grained access control and security** Secure your system with |
| 79 | + [**role-based access |
| 80 | + control**](/database-management/authentication-and-authorization/role-based-access-control) |
| 81 | + and [**label-based access |
| 82 | + control**](/database-management/authentication-and-authorization/role-based-access-control#label-based-access-control) |
| 83 | + to ensure only the right users see and manipulate data. |
| 84 | + |
| 85 | +## What is covered? |
| 86 | + |
| 87 | +The suggestions for mission-critical workloads **complement** several key |
| 88 | +sections in the [general suggestions guide](/deployment/best-practices), with |
| 89 | +additional best practices to ensure uptime and data protection: |
| 90 | + |
| 91 | +- [Choosing the right Memgraph flag set](#choose-the-right-memgraph-flag-set) |
| 92 | + Memgraph offers flags to enhance recovery, snapshot management, and failover |
| 93 | + capabilities. |
| 94 | + |
| 95 | +- [Choosing the right Memgraph storage |
| 96 | + mode](#choose-the-right-memgraph-storage-mode) |
| 97 | + Guidance on selecting the **safest** and **most consistent** storage |
| 98 | + configurations. |
| 99 | + |
| 100 | +- [Enterprise features you might |
| 101 | + require](#enterprise-features-you-might-require) |
| 102 | + Overview of **replication**, **multi-tenancy**, and **automatic failover** |
| 103 | + tools that are critical in production. |
| 104 | + |
| 105 | +- [Backup and recovery mechanisms](#backup-and-recovery-mechanisms) |
| 106 | + Best practices to protect your data through snapshots, WALs, and external |
| 107 | + backup strategies. |
| 108 | + |
| 109 | +- [Queries that best suit your workload](#queries-that-best-suit-your-workload) |
| 110 | + Designing queries that maintain consistent, safe, and predictable behavior in |
| 111 | + high-availability systems. |
| 112 | + |
| 113 | +## Choose the right Memgraph flag set |
| 114 | + |
| 115 | + |
| 116 | +For mission-critical setups, you should configure Memgraph to optimize for **durability, fast recovery**, and **stability**. Some important flags include: |
| 117 | + |
| 118 | +- `--storage-snapshot-interval-sec=x` |
| 119 | + Set how often snapshots are created. In mission-critical systems, you may want |
| 120 | + **frequent snapshots** to minimize recovery time. |
| 121 | + |
| 122 | +- `--storage-wal-enabled=true` |
| 123 | + Ensure **WALs (write-ahead logs)** are enabled to protect all transactions |
| 124 | + between snapshots. |
| 125 | + |
| 126 | +- `--storage-parallel-schema-recovery=true` and |
| 127 | + `--storage-recovery-thread-count=x` |
| 128 | + Enable **parallel recovery** to speed up startup time after a crash by using |
| 129 | + multiple cores. |
| 130 | + |
| 131 | +- `--query-execution-timeout-sec=x` |
| 132 | + Set reasonable query timeouts to **avoid stuck queries** and prevent resource |
| 133 | + exhaustion. |
| 134 | + |
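To confirm that a running instance picked up the flags above, you can inspect its configuration from any client session. This is a minimal sketch; it assumes your Memgraph version supports the `SHOW CONFIG` query.

```cypher
// List the configuration flags of the running instance, so you can check
// the snapshot, WAL, recovery, and timeout settings discussed above.
SHOW CONFIG;
```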
| 135 | +## Choose the right Memgraph storage mode |
| 136 | + |
| 137 | + |
| 138 | +For mission-critical deployments: |
| 139 | + |
| 140 | +- Always use `IN_MEMORY_TRANSACTIONAL` mode. |
| 141 | +- This mode provides **full ACID guarantees**, **WAL support**, and **snapshot |
| 142 | + consistency**. |
| 143 | + |
| 144 | +<Callout type="info"> |
| 145 | +`IN_MEMORY_ANALYTICAL` is optimized for high-speed ingestion but does **not |
| 146 | +provide transactional durability**. It is not recommended for mission-critical |
| 147 | +workloads. |
| 148 | +</Callout> |
| 149 | + |
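A minimal sketch of selecting and then verifying the storage mode from a client session. It assumes a session with sufficient privileges and is meant as an illustration, not a complete setup procedure.

```cypher
// Switch the instance to the transactional storage mode.
STORAGE MODE IN_MEMORY_TRANSACTIONAL;

// Inspect storage details, including the active storage mode.
SHOW STORAGE INFO;
```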
| 150 | +## Importing mechanisms |
| 151 | + |
| 152 | +Importing mechanisms are best described in the [guide for high-throughput |
| 153 | +workloads](/deployment/workloads/memgraph-in-high-throughput-workloads). As a |
| 154 | +rule of thumb, always set up your drivers to retry transactions when you run a |
| 155 | +heavy volume of writes, so that transient conflicts are retried rather than surfaced as errors. The |
| 156 | +high-throughput guide also explains the need for idempotent queries, which keep |
| 157 | +data consistent when a write fails and is retried. |
| 158 | + |
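As an illustration of a write that is safe to retry, the sketch below upserts a batch of rows in one query. The `Customer` label and the `$rows` parameter (a list of maps with `id` and `props` keys) are hypothetical names chosen for this example.

```cypher
// Upsert each row: MERGE matches an existing node by id or creates it,
// so running the same batch twice leaves the graph in the same state.
UNWIND $rows AS row
MERGE (c:Customer {id: row.id})
SET c += row.props;
```

Because the query produces the same end state on every run, a driver can resend it after a failed attempt without creating duplicates.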
| 159 | +## Enterprise features you might require |
| 160 | + |
| 161 | +For robust production environments, consider enabling: |
| 162 | + |
| 163 | +- [High availability clustering](/clustering/high-availability): Deploy multiple |
| 164 | + Memgraph instances with automatic leader election and failover. |
| 165 | + |
| 166 | +- [Replication for resilience](/clustering/replication): Distribute replicas |
| 167 | + geographically or across availability zones to minimize the risk of localized |
| 168 | + outages. |
| 169 | + |
| 170 | +- [Role-based and label-based access |
| 171 | + control](/database-management/authentication-and-authorization/role-based-access-control): |
| 172 | + Protect sensitive graph data and ensure only authorized operations are |
| 173 | + performed. |
| 174 | + |
| 175 | +- [Multi-tenancy](/database-management/multi-tenancy): Securely isolate data and |
| 176 | + permissions between different teams, projects, or customers. |
| 177 | + |
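The sketch below shows what manually managed replication and a per-tenant database could look like in queries. All instance names, addresses, ports, user names, and the password are hypothetical. It also assumes an Enterprise license for multi-tenancy, and that replication is configured by hand; in a high availability cluster, replicas are typically managed through the coordinators instead.

```cypher
// On the instance that should become a replica (port 10000 is an assumption):
SET REPLICATION ROLE TO REPLICA WITH PORT 10000;

// On the main instance: register one synchronous and one asynchronous replica.
REGISTER REPLICA replica_sync SYNC TO "10.0.0.2:10000";
REGISTER REPLICA replica_async ASYNC TO "10.0.1.2:10000";
SHOW REPLICAS;

// Multi-tenancy: give a tenant its own database and a user bound to it.
CREATE DATABASE tenant_a;
CREATE USER tenant_a_app IDENTIFIED BY 'change-me';
GRANT DATABASE tenant_a TO tenant_a_app;
SET MAIN DATABASE tenant_a FOR tenant_a_app;
```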
| 178 | +## Backup and recovery mechanisms |
| 179 | + |
| 180 | +Data durability is essential in mission-critical environments. Memgraph supports: |
| 181 | + |
| 182 | +- [Snapshots](/fundamentals/data-durability#snapshots) |
| 183 | + Automatically or manually triggered full-database snapshots. |
| 184 | + |
| 185 | +- [Write-ahead logging |
| 186 | + (WALs)](/fundamentals/data-durability#write-ahead-logging) |
| 187 | + Transaction logs that enable you to **replay changes** made after the last |
| 188 | + snapshot. |
| 189 | + |
| 190 | +- **Manual backup and offloading** |
| 191 | + Use external tools (like [`rclone`](https://rclone.org/)) to back up snapshots |
| 192 | + and WALs to **cloud storage** or **remote servers** for additional redundancy. |
| 193 | + |
| 194 | + |
| 195 | +<Callout type="info"> |
| 196 | +Memgraph currently does not automate backing up data to third-party locations, so |
| 197 | +integrating a backup process into your system is highly recommended. |
| 198 | +</Callout> |
| 199 | + |
| 200 | +<Callout type="info"> |
| 201 | +Learn more about backup and restore [on our backup and restore documentation |
| 202 | +page](/database-management/backup-and-restore). |
| 203 | +</Callout> |
| 204 | + |
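A manual backup pass could be sketched as follows. The copy step itself happens outside Memgraph with whatever tool you use, so it appears only as a comment; treat the sequence as an outline under those assumptions rather than a complete procedure.

```cypher
// Force a snapshot so the files on disk reflect the latest committed state.
CREATE SNAPSHOT;

// Prevent Memgraph from deleting durability files while they are copied.
LOCK DATA DIRECTORY;

// ... copy the snapshot and WAL files with your external tool here ...

// Let Memgraph manage and clean up the files again.
UNLOCK DATA DIRECTORY;
```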
| 205 | +## Queries that best suit your workload |
| 206 | + |
| 207 | +In mission-critical workloads: |
| 208 | + |
| 209 | +- Prefer **idempotent writes** (`MERGE`) to avoid inconsistent state during |
| 210 | + retries. |
| 211 | +- Optimize long-running queries and [profile](/querying/clauses/profile) them |
| 212 | + regularly. |
| 213 | + |
| 214 | +- Avoid complex, unpredictable queries inside critical transactional paths. |
| 215 | +- Use **schema constraints** to enforce data integrity and **indexes** to speed up |
| 216 | + lookups, but create only the ones you need, since each adds write overhead. |
| 217 | + |
| 218 | +Example of safe, idempotent data ingestion: |
| 219 | + |
| 220 | +```cypher |
| 221 | +MERGE (n:Customer {id: $id}) |
| 222 | +SET n += $props; |
| 223 | +``` |
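The advice on constraints, indexes, and profiling can be sketched in the same spirit. The `Customer` label and `id` property follow the ingestion example above; the `$id` parameter is a placeholder.

```cypher
// Reject duplicate customers at the database level.
CREATE CONSTRAINT ON (c:Customer) ASSERT c.id IS UNIQUE;

// Index the lookup property so lookups by id avoid a full label scan.
CREATE INDEX ON :Customer(id);

// Inspect the execution plan and per-operator cost of a critical lookup.
PROFILE MATCH (c:Customer {id: $id}) RETURN c;
```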
| 224 | +<CommunityLinks/> |