libmongocrypt is a C library meant to assist drivers in supporting client side encryption. libmongocrypt acts as a state machine and the driver is responsible for I/O between mongod, mongocryptd, crypt_shared, and KMS.
There are two major parts to integrating libmongocrypt into your driver:
- Writing a language-specific binding to libmongocrypt
- Using the binding in your driver to support client side encryption
The library interface is intended to be used with multiple languages.
The API tries to be minimal. Most structs are opaque. Global initialization is lazy.
Much of the API passes and returns BSON since all drivers can produce and parse BSON.
libmongocrypt deliberately does not do I/O to avoid poor behavior with some language runtimes. Example: in Go a blocking C call may block an OS thread, rather than a goroutine.
The binding is the glue between your driver's native language and libmongocrypt.
The binding uses the native language's foreign function interface to C. For example, Java can accomplish this with JNA, CPython with extensions, Node.js with add-ons, etc.
The libmongocrypt library files (.so/.dll) are pre-built on its Evergreen project. Click the variant's "built-and-test-and-upload" tasks to download the attached files.
libmongocrypt describes all API that needs to be called from your driver in the main public header mongocrypt.h.
There are many types and functions in mongocrypt.h to bind. As a first
step, consider binding only mongocrypt_version.
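As a rough illustration of that first step, the sketch below loads the shared library at run time and calls mongocrypt_version through a function pointer, which is roughly what a foreign function interface does under the hood. It assumes a POSIX platform with dlopen; the helper name read_libmongocrypt_version and the caller-supplied library path are hypothetical, and only the mongocrypt_version signature comes from mongocrypt.h.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Signature of mongocrypt_version as declared in mongocrypt.h. */
typedef const char *(*mongocrypt_version_fn)(uint32_t *len);

/* Hypothetical helper: loads the library at lib_path, calls
 * mongocrypt_version, and copies the version string into out.
 * Returns 0 on success, -1 if the library or symbol cannot be loaded. */
int read_libmongocrypt_version(const char *lib_path, char *out, size_t out_len)
{
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* POSIX allows converting the result of dlsym to a function pointer. */
    mongocrypt_version_fn version_fn =
        (mongocrypt_version_fn)dlsym(handle, "mongocrypt_version");
    if (version_fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    uint32_t len = 0;
    const char *version = version_fn(&len);
    /* Copy before dlclose: the string lives inside the loaded library. */
    snprintf(out, out_len, "%.*s", (int)len, version);

    dlclose(handle);
    return 0;
}
```

A caller would pass the path of the downloaded library file and a small buffer. On older glibc versions the program may need to be linked with -ldl.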
Once you have that working, proceed to write bindings for the remaining
API. Here are a few things to keep in mind:
- "ctx" is short for context, and is a generic term indicating that the object stores state.
- By C convention, functions are named like: mongocrypt_<type>_<method>. For example, mongocrypt_ctx_id can be thought of as a class method "id" on the class "ctx".
- mongocrypt_binary_t is a non-owning view of data. Calling mongocrypt_binary_destroy frees the view, but does nothing to the underlying data. When a mongocrypt_binary_t is returned (e.g. mongocrypt_ctx_mongo_op), the lifetime of the data is tied to the type that returned it (so the data returned will be freed when the mongocrypt_ctx_t is freed).
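The ownership rule for views can be shown without the library. The sketch below uses a hypothetical byte_view type as a stand-in for a non-owning view such as mongocrypt_binary_t; it is not the library's definition, and the function names are invented for illustration. The point is only that destroying the view leaves the viewed bytes alone.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a non-owning view (not the library's type). */
typedef struct {
    const uint8_t *data; /* borrowed: owned by whoever created the bytes */
    uint32_t len;
} byte_view;

/* Allocates a view over caller-owned bytes. No copy is made. */
byte_view *byte_view_new(const uint8_t *data, uint32_t len)
{
    byte_view *view = malloc(sizeof *view);
    if (view != NULL) {
        view->data = data;
        view->len = len;
    }
    return view;
}

/* Frees the view only. The bytes it pointed at are untouched. */
void byte_view_destroy(byte_view *view)
{
    free(view);
}
```

A binding in a garbage-collected language typically has to keep the owning object (or the original buffer) alive for as long as any view of it is in use.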
Once you have full bindings for the API, it's time to do a sanity
check. The crux of libmongocrypt's API is the state machine represented
by mongocrypt_ctx_t. This state machine is exercised in the
example-state-machine
executable included with libmongocrypt. It uses mock responses from
mongod, mongocryptd, and KMS. Reimplement the state machine loop
(_run_state_machine) in example-state-machine with your binding.
Seek help in the Slack channel #dbx-encryption.
After you have a binding, integrate libmongocrypt in your driver to support client side encryption.
See the driver spec for a reference of the user-facing API. libmongocrypt is needed for:
- Automatic encryption/decryption (enabled with AutoEncryptionOpts)
- ClientEncryption (explicit encryption/decryption + key management)
It is recommended to start by integrating libmongocrypt to support automatic encryption/decryption. Then reuse the implementation to implement ClientEncryption.
A MongoClient enabled with client side encryption MUST have one shared
mongocrypt_t handle (important because keys + JSON Schemas are cached
in this handle). Each ClientEncryption also has its own mongocrypt_t.
Any encryption or decryption operation is done by creating a
mongocrypt_ctx_t and initializing it for the appropriate operation.
mongocrypt_ctx_t is a state machine, and each state requires the
driver to perform some action. This may be performing I/O on one of the
following:
- the encrypted MongoClient on which the operation is occurring (for auto encrypt).
- the key vault MongoClient (which may be the same as the encrypted MongoClient).
- KMS (via a TLS socket).
- the MongoClient to the local mongocryptd process.
Call one of the following on a mongocrypt_ctx_t:
- auto encrypt (mongocrypt_ctx_encrypt_init)
- auto decrypt (mongocrypt_ctx_decrypt_init)
- explicit encrypt (mongocrypt_ctx_explicit_encrypt_init)
- explicit decrypt (mongocrypt_ctx_explicit_decrypt_init)
- create data key (mongocrypt_ctx_datakey_init)
- rewrap data key (mongocrypt_ctx_rewrap_many_datakey_init)
Below is a list of the various states a mongocrypt ctx can be in. For each state, there is a description of what the driver is expected to do to advance the state machine. Not all states will be entered for all types of contexts. But one state machine runner can be used for all types of contexts.
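The idea of a single runner shared by all context types can be sketched as a loop that asks for the current state and dispatches on it. The code below is a minimal sketch and does not call libmongocrypt: the runner_state values are stand-ins for the states in mongocrypt.h, and runner_ops, scripted_ctx, and the scripted_* functions are hypothetical names used so the loop can be exercised with a mock. In a real binding, the state callback would wrap mongocrypt_ctx_state and the handle callback would perform the I/O described for each state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the context states; a real binding would use the
 * constants from mongocrypt.h instead. */
typedef enum {
    ST_ERROR,
    ST_NEED_MONGO_COLLINFO,
    ST_NEED_MONGO_MARKINGS,
    ST_NEED_MONGO_KEYS,
    ST_NEED_KMS,
    ST_NEED_KMS_CREDENTIALS,
    ST_READY,
    ST_DONE
} runner_state;

/* Hypothetical callbacks supplied by the binding. */
typedef struct {
    runner_state (*state)(void *ctx);              /* report current state */
    bool (*handle)(void *ctx, runner_state state); /* do the work for it */
} runner_ops;

/* Drives a context until it reports ST_DONE (returns true) or until it
 * reports ST_ERROR or a handler fails (returns false). */
bool run_state_machine(const runner_ops *ops, void *ctx)
{
    for (;;) {
        runner_state state = ops->state(ctx);
        switch (state) {
        case ST_DONE:
            return true;
        case ST_ERROR:
            return false;
        default:
            /* Any NEED_* state or READY: perform the required action. */
            if (!ops->handle(ctx, state)) {
                return false;
            }
        }
    }
}

/* Mock context that replays a fixed list of states, one per handled step. */
typedef struct {
    const runner_state *script;
    size_t pos;
    size_t handled;
} scripted_ctx;

runner_state scripted_state(void *ctx)
{
    scripted_ctx *c = ctx;
    return c->script[c->pos];
}

bool scripted_handle(void *ctx, runner_state state)
{
    scripted_ctx *c = ctx;
    (void)state; /* the mock treats every state the same way */
    c->pos++;
    c->handled++;
    return true;
}
```

Because the runner never assumes which states will occur, the same loop serves a context that only needs keys as well as one that also visits the KMS states.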
State: MONGOCRYPT_CTX_ERROR
Driver needs to...
Throw an exception based on the status from mongocrypt_ctx_status.
Applies to...
All contexts.
State: MONGOCRYPT_CTX_NEED_MONGO_COLLINFO
Important
Multi-collection commands: prior to 1.13.0, drivers were expected to pass at most one result from listCollections. In 1.13.0, drivers are expected to pass all results from listCollections to support multi-collection commands (e.g. aggregate with $lookup).
Drivers must call mongocrypt_setopt_enable_multiple_collinfo to indicate the new behavior is implemented and opt in to support for multi-collection commands. This opt-in is to prevent the following bug scenario:
A driver upgrades to 1.13.0, but does not update prior behavior which passes at most one result of a multi-collection command. A multi-collection command requests schemas for both db.c1 and db.c2. The driver only passes the result for db.c1 even though db.c2 also has a result. Therefore, libmongocrypt incorrectly believes db.c2 has no schema.
libmongocrypt needs...
A result from a listCollections cursor.
Driver needs to...
- Run listCollections on the encrypted MongoClient with the filter provided by mongocrypt_ctx_mongo_op.
- Pass all results (if any) with calls to mongocrypt_ctx_mongo_feed or proceed to the next step if nothing was returned. Results may be passed in any order.
- Call mongocrypt_ctx_mongo_done.
Applies to...
auto encrypt
See note about multi-collection commands.
State: MONGOCRYPT_CTX_NEED_MONGO_COLLINFO_WITH_DB
libmongocrypt needs...
Results from a listCollections cursor from a specified database.
Driver needs to...
- Run listCollections on the encrypted MongoClient with the filter provided by mongocrypt_ctx_mongo_op on the database provided by mongocrypt_ctx_mongo_db.
- Pass all results (if any) with calls to mongocrypt_ctx_mongo_feed or proceed to the next step if nothing was returned. Results may be passed in any order.
- Call mongocrypt_ctx_mongo_done.
Applies to...
A context initialized with mongocrypt_ctx_encrypt_init for automatic encryption. This state is only entered when mongocrypt_setopt_use_need_mongo_collinfo_with_db_state is called to opt in.
State: MONGOCRYPT_CTX_NEED_MONGO_MARKINGS
libmongocrypt needs...
A reply from mongocryptd indicating which values in a command need to be encrypted.
Driver needs to...
- Use db.runCommand to run the command provided by mongocrypt_ctx_mongo_op on the MongoClient connected to mongocryptd.
- Feed the reply back with mongocrypt_ctx_mongo_feed.
- Call mongocrypt_ctx_mongo_done.
Applies to...
auto encrypt
State: MONGOCRYPT_CTX_NEED_MONGO_KEYS
libmongocrypt needs...
Documents from the key vault collection.
Driver needs to...
- Use MongoCollection.find on the key vault MongoClient (which may be the same as the encrypted MongoClient). Use the filter provided by mongocrypt_ctx_mongo_op.
- Feed all resulting documents back (if any) with repeated calls to mongocrypt_ctx_mongo_feed.
- Call mongocrypt_ctx_mongo_done.
Applies to...
All contexts except for create data key.
State: MONGOCRYPT_CTX_NEED_KMS
libmongocrypt needs...
The responses from one or more messages to KMS.
Ensure mongocrypt_setopt_retry_kms is called on the mongocrypt_t to enable retry.
Driver needs to...
- For each context returned by mongocrypt_ctx_next_kms_ctx:
  a. Delay the message by the time in microseconds indicated by mongocrypt_kms_ctx_usleep if returned value is greater than 0.
  b. Create/reuse a TLS socket connected to the endpoint indicated by mongocrypt_kms_ctx_endpoint. The endpoint string is a host name with a port number separated by a colon. E.g. "kms.us-east-1.amazonaws.com:443". A port number will always be included. Drivers may assume the host name is not an IP address or IP literal.
  c. Write the message from mongocrypt_kms_ctx_message to the socket.
  d. Feed the reply back with mongocrypt_kms_ctx_feed or mongocrypt_kms_ctx_feed_with_retry. Repeat until mongocrypt_kms_ctx_bytes_needed returns 0. If the should_retry outparam returns true, the request may be retried by feeding the new response into the same context.
  If any step encounters a network error, call mongocrypt_kms_ctx_fail. If mongocrypt_kms_ctx_fail returns true, retry the request by continuing to the next KMS context or by feeding the new response into the same context. If mongocrypt_kms_ctx_fail returns false, abort and report an error. Consider wrapping the error reported in mongocrypt_kms_ctx_status to include the last network error.
- When done feeding all replies, call mongocrypt_ctx_kms_done.
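Before opening the TLS socket, a driver usually has to split the endpoint string into host and port. The helper below is a minimal sketch of that step; the name split_kms_endpoint is hypothetical and not part of libmongocrypt. It relies on the guarantees stated above: a port is always present and the host is never an IP literal, so the last colon separates the two parts.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "host:port" into host and port.
 * Returns false if the input has no host, no valid port, or if the
 * host does not fit into the caller's buffer. */
bool split_kms_endpoint(const char *endpoint, char *host, size_t host_len,
                        unsigned *port)
{
    const char *colon = strrchr(endpoint, ':');
    if (colon == NULL || colon == endpoint) {
        return false; /* no separator, or empty host */
    }

    size_t host_chars = (size_t)(colon - endpoint);
    if (host_chars >= host_len) {
        return false; /* host plus terminator would not fit */
    }

    /* Require the port to start with a digit so that strtoul does not
     * accept leading whitespace or a sign. */
    if (!isdigit((unsigned char)colon[1])) {
        return false;
    }
    char *end = NULL;
    unsigned long value = strtoul(colon + 1, &end, 10);
    if (*end != '\0' || value == 0 || value > 65535) {
        return false;
    }

    memcpy(host, endpoint, host_chars);
    host[host_chars] = '\0';
    *port = (unsigned)value;
    return true;
}
```

The resulting host name is also what a driver would typically use for TLS server name verification.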
Call mongocrypt_setopt_retry_kms to enable retry behavior.
There are two options for retry:
- Lazy retry: After processing KMS contexts, iterate again by calling mongocrypt_ctx_next_kms_ctx. KMS contexts needing a retry will be returned.
- In-place retry: If a KMS context indicates retry, retry the KMS request and feed the new response to the same KMS context. Use mongocrypt_kms_ctx_feed_with_retry and check the return of mongocrypt_kms_ctx_fail to check if a retry is indicated.
The driver MAY fan out KMS requests in parallel. It is not safe to iterate KMS contexts (i.e. call
mongocrypt_ctx_next_kms_ctx) while operating on KMS contexts (e.g. calling mongocrypt_kms_ctx_feed). It is
recommended that drivers do an in-place retry on KMS requests.
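The control flow of an in-place retry can be sketched as a loop over one KMS context. The code below is only an outline under stated assumptions and does not call libmongocrypt: kms_attempt_result, kms_request_ops, scripted_kms, and the scripted_* functions are hypothetical. In a real binding, the attempt callback would delay, write the message, and feed the reply (reporting whether should_retry was set), and the fail callback would wrap mongocrypt_kms_ctx_fail. The max_attempts cap is an extra safety net added by this sketch, not something the library requires.

```c
#include <stdbool.h>

/* Hypothetical outcome of one request/response attempt against KMS. */
typedef enum {
    KMS_ATTEMPT_COMPLETE,      /* reply fully fed, no more bytes needed */
    KMS_ATTEMPT_SHOULD_RETRY,  /* feed succeeded but a retry was indicated */
    KMS_ATTEMPT_NETWORK_ERROR, /* connect, write, or read failed */
    KMS_ATTEMPT_FEED_ERROR     /* feeding failed; treated as not retryable */
} kms_attempt_result;

/* Hypothetical callbacks a binding would implement for one KMS context. */
typedef struct {
    kms_attempt_result (*attempt)(void *kms); /* delay, send, feed reply */
    bool (*fail)(void *kms); /* true if the request may be retried */
} kms_request_ops;

/* Retries on the same KMS context until the request completes, a
 * non-retryable failure occurs, or max_attempts is exhausted. */
bool kms_request_in_place(const kms_request_ops *ops, void *kms,
                          int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        switch (ops->attempt(kms)) {
        case KMS_ATTEMPT_COMPLETE:
            return true;
        case KMS_ATTEMPT_SHOULD_RETRY:
            break; /* try again on the same context */
        case KMS_ATTEMPT_NETWORK_ERROR:
            if (!ops->fail(kms)) {
                return false; /* no retries left: abort */
            }
            break;
        case KMS_ATTEMPT_FEED_ERROR:
            return false;
        }
    }
    return false;
}

/* Mock KMS context that replays fixed outcomes and a retry budget. */
typedef struct {
    const kms_attempt_result *outcomes;
    int next;
    int retries_left;
} scripted_kms;

kms_attempt_result scripted_attempt(void *kms)
{
    scripted_kms *s = kms;
    return s->outcomes[s->next++];
}

bool scripted_fail(void *kms)
{
    scripted_kms *s = kms;
    if (s->retries_left == 0) {
        return false;
    }
    s->retries_left--;
    return true;
}
```

Because each loop touches only its own KMS context, several such loops can run in parallel, as long as all contexts were collected before any of them is operated on.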
Applies to...
All contexts.
State: MONGOCRYPT_CTX_NEED_KMS_CREDENTIALS
MONGOCRYPT_CTX_NEED_KMS_CREDENTIALS was added in libmongocrypt 1.4.0 as part of MONGOCRYPT-382.
MONGOCRYPT_CTX_NEED_KMS_CREDENTIALS can only be entered if mongocrypt_setopt_use_need_kms_credentials_state is called. This prevents breaking drivers that do not handle the MONGOCRYPT_CTX_NEED_KMS_CREDENTIALS state.
If a KMS provider is configured with an empty document (e.g. { "aws": {} }), the MONGOCRYPT_CTX_NEED_KMS_CREDENTIALS state is entered before KMS requests are made.
libmongocrypt needs...
Credentials for one or more KMS providers.
Driver needs to...
Fetch credentials for supported KMS providers. See the Client Side Encryption specification for details.
Pass credentials to libmongocrypt using mongocrypt_ctx_provide_kms_providers.
Applies to...
All contexts.
State: MONGOCRYPT_CTX_READY
Driver needs to...
Call mongocrypt_ctx_finalize to perform the encryption/decryption and
get the final result.
Applies to...
All contexts except for create data key.
State: MONGOCRYPT_CTX_DONE
Driver needs to...
Exit the state machine loop.
Applies to...
All contexts.