Concurrency control is critical in any system where accuracy and trust in the record needs to be maintained. It is the way of ensuring that correct results are generated, even when multiple concurrent transactions are taking place. This page also includes a number of demos that you can try out for yourself, with more information and the code available at QLDB Concurrency Demo
ACID Properties and Isolation Levels
QLDB is fully ACID-compliant when it comes to database transactions. These four properties cover the following:
- Atomic - either all operations in a transaction are committed, or they are rolled back.
- Consistent - all data written to the database must be valid according to defined rules such as constraints
- Isolated - transactions cannot interfere with each other
- Durable - successfully committed transactions are persisted and can survive a system failure
Ensuring the correct isolation is the main goal of concurrency control. In ANSI SQL there are four standard isolation levels:
- Read Uncommitted - one transaction may read uncommitted changes from another transaction
- Read Committed - briefly acquires locks during reads, whilst maintaining write locks until transaction is committed
- Repeatable Read - acquires and holds both read and write locks until transaction is committed
- Serializable - uses range-locks to prevent new rows being added if they match a where clause to prevent phantom reads
The higher the level of isolation, the worse the performance is due to lock contention. QLDB provides a form of serializable isolation level, but uses Optimistic Concurrency Control (OCC), which operates on the principle that multiple transactions can frequently compete without interfering with each other. Using OCC, transactions in QLDB don’t acquire locks on database resources and operate with full serializable isolation. QLDB executes concurrent transactions in a serial manner, such that it produces the same effect as if those transactions were executed serially.
Before committing, the QLDB Driver performs a validation check to ensure that no other committed transaction has modified the data being accessed within the transaction. If the check reveals a conflict, the QLDB driver automatically retries the entire transaction again for a configurable number of attempts. If this is exceeded, the QLDB driver returns an exception back to the client.
When a transaction writes to QLDB, the validation checks of the OCC model are implemented by QLDB itself. If a transaction can’t be written to the journal due to a failure in the verification phase of OCC, QLDB returns an OccConflictException to the application layer. The application software is responsible for ensuring that the transaction is restarted. The application should abort the rejected transaction and then retry the whole transaction from the start.
So why might data have changed between when it is first read to when an update is applied? QLDB is a journal-first database. Transactions are first written to the journal, and then ‘indexed storage’ is updated to reflect these changes. All writes go to the journal-first, whereas reads go from indexed storage. In the examples that follow, when a query is first made, it is querying indexed storage, which may or may not yet reflect the result of a write. In practice, the QLDB indexed storage is usually instantly up to date. This means as a developer, there is no reason to take a lock or check rows updated with a condition. Concurrency in QLDB is simple - write your code as if you're the only actor in the system and ensure the operations are committed.
Executing a transaction against QLDB
The primary method of executing a transaction against a QLDB ledger is the
executeLambda method on the QLDB driver.
All database operations within this method are atomic, and will only be committed once all the code is executed. If there is a failure, the driver will retry the entire execution block. It is possible to specify the maximum number of retries, as well as backoff strategies. For a great article on these concepts, read Timeouts, retries, and backoff with jitter from the Amazon Builders Library.
Custom Retry and Backoff config
There are 3 options for providing custom retry and backoff config. The QLDB driver will use the following precedence when selecting the Retry Config for a given transaction:
- Use custom Retry Config provided in the
This allows you to specify unique config for each function as shown below.
- Use custom Retry Config provided during the
This allows you to specify behaviour that will apply where #1 is not set
- Use the default Retry Config provided by the driver
The current default value of
retryLimit is 4. This will be used when #1 and #2 are not set
QLDB Concurrency Control Examples
Now with this understanding, we get to the fun part of executing operations against QLDB to understand the behaviour.
Demo 1 - Create Duplicate Customer Record
In this first demo, we create a table called Customer that has no indexes on it. We want to make sure that an email address is unique, so when we create a new customer record, we check to see if that email address already exists, and if not we create the customer record. In addition, we also add the unique id of the document as a guid attribute to the document. This results in 3 separate database operations. After checking that the email is unique, we pause execution for 10 seconds.
During this period, we kick off another transaction that inserts this record. The result of this
We can show this visually in the flow diagram below:
What we can see from this is that the original commit fails, as the QLDB driver recognises the snapshot of data its accessing has been changed, and so retries the entire transaction block. This time around, the first select operation identifies the email is not unique, and so no updates are made.
Note that if we explicitly change the RetryConfig to 0, to instruct the QLDB driver not to retry the transaction, we will receive back an
OccConflictException that the client needs to handle:
Demo 2 - Create Customer Record
The second demo, is the same as the first demo, apart from the second transaction is creating a different customer record. This results in the following output:
In QLDB, each transaction is committed as a block to the journal. This journal block contains transaction metadata, along with entries for each document revision made, and the corresponding PartiQL statements that committed them. You can query the committed view of a document to find out its
sequenceNo in the block, and then use these details to call the QLDB
get-block CLI command to retrieve the contents of that block. The following is a subset of the information contained in the block record for this operation:
You can see that all PartiQL statements are shown. You can also see the 10 second different in the statement start times as a result of the sleep.
The biggest issue is the fact the QLDB driver retried the whole transaction as a result of another update to the same table taking place at the same time, even though it was for a different record. This is important to bear in mind when searching on an unindexed attribute. Other transactions can carry out select statements on the same table which will not cause a retry. However, if another transaction has created a new record or updated an existing record in the same table that this transaction has searched against, it will cause a retry to take place. Now lets see what happens if we add an index to the table.
Demo 3 - Create Customer Record with Index
The third demo, is the same as the second demo, apart from the fact an index is first created on the
Customer table on the
This time, with an index on the Customer table, the QLDB driver does not attempt to retry the transaction. This reinforces the critical recommendation to always use
select queries that make use of indexes. Without this, it will result in a full table scan which will result in slow queries. Futhermore, as we saw in demo 2, it may result in a higher number of OCC Exceptions.
Demo 4 - Timeout
The fourth demo is similar to any of the first 3, but the delay is set to a minimum of 30 seconds. This will result in the following:
A transaction can run for up to 30 seconds before being committed. After this timeout, any work done on the transaction is rejected, and QLDB discards the session. This limit protects the client from leaking sessions by starting transactions and not committing or cancelling them.
So there we have it. Hopefully you have learnt more about how QLDB handles concurrency control. The key takeaways are:
- You can configure custom retry and backoff behaviour for all transactions when instantiating the QLDB driver, which can be overridden for each individual transaction in the
executeLambdafunction. Otherwise, the default settings are used
- All database operations in the QLDB driver
executeLambdafunction are committed or rolled back atomically
- The QLDB driver will perform a validation check before committing the transaction. If not valid, the driver will retry all database operations in the transaction again up to the retry limit
- Always use an index when selecting a record from a table. Otherwise, not only will a full table scan take place adding latency, but it will result in more OCC Exceptions as all records in the table will be in the snapshot of data validated by the driver
- A transaction can be active for up to 30 seconds, after which time it will be rejected. Make sure to limit the number of long running database operations within a transaction to prevent this.