Vault
Seal/Unseal
When a Vault server is started, it starts in a sealed state. In this state, Vault is configured to know where and how to access the physical storage, but doesn't know how to decrypt any of it.
Unsealing is the process of obtaining the plaintext root key necessary to read the decryption key to decrypt the data, allowing access to the Vault.
Prior to unsealing, almost no operations are possible with Vault. For example authentication, managing the mount tables, etc. are all not possible. The only possible operations are to unseal the Vault and check the status of the seal.
Why?
The data stored by Vault is encrypted. Vault needs the encryption key in order to decrypt the data. The encryption key is also stored with the data (in the keyring), but encrypted with another encryption key known as the root key.
Therefore, to decrypt the data, Vault must decrypt the encryption key which requires the root key. Unsealing is the process of getting access to this root key. The root key is stored alongside all other Vault data, but is encrypted by yet another mechanism: the unseal key.
To recap: most Vault data is encrypted using the encryption key in the keyring; the keyring is encrypted by the root key; and the root key is encrypted by the unseal key.
Shamir seals
The default Vault config uses a Shamir seal. Instead of distributing the unseal key as a single key to an operator, Vault uses an algorithm known as Shamir's Secret Sharing to split the key into shares. A certain threshold of shares is required to reconstruct the unseal key, which is then used to decrypt the root key.
This is the unseal process: the shares are added one at a time (in any order) until enough shares are present to reconstruct the key and decrypt the root key.
Unsealing
The unseal process is done by running vault operator unseal
or via the API.
This process is stateful: each key can be entered via multiple mechanisms from
multiple client machines and it will work. This allows each shares of the root
key to be on a distinct client machine for better security.
Note that when using the Shamir seal with multiple nodes, each node must be unsealed with the required threshold of shares. Partial unsealing of each node is not distributed across the cluster.
Once a Vault node is unsealed, it remains unsealed until one of these things happens:
It is resealed via the API (see below).
The server is restarted.
Vault's storage layer encounters an unrecoverable error.
Note: Unsealing makes the process of automating a Vault install difficult. Automated tools can easily install, configure, and start Vault, but unsealing it using Shamir is a very manual process. For most users Auto Unseal will provide a better experience.
Sealing
There is also an API to seal the Vault. This will throw away the root key in memory and require another unseal process to restore it. Sealing only requires a single operator with root privileges.
This way, if there is a detected intrusion, the Vault data can be locked quickly to try to minimize damages. It can't be accessed again without access to the root key shares.
Auto unseal
Auto unseal was developed to aid in reducing the operational complexity of keeping the unseal key secure. This feature delegates the responsibility of securing the unseal key from users to a trusted device or service. At startup Vault will connect to the device or service implementing the seal and ask it to decrypt the root key Vault read from storage.
There are certain operations in Vault besides unsealing that require a quorum of users to perform, e.g. generating a root token. When using a Shamir seal the unseal keys must be provided to authorize these operations. When using Auto Unseal these operations require recovery keys instead.
Just as the initialization process with a Shamir seal yields unseal keys, initializing with an Auto Unseal yields recovery keys.
It is still possible to seal a Vault node using the API. In this case Vault will remain sealed until restarted, or the unseal API is used, which with Auto Unseal requires the recovery key fragments instead of the unseal key fragments that would be provided with Shamir. The process remains the same.
For a list of examples and supported providers, please see the seal documentation.
When DR replication is enabled in Vault Enterprise, Performance Standby nodes on the DR cluster will seal themselves, so they must be restarted to be unsealed.
Recovery keys cannot decrypt the root key
Recovery keys cannot decrypt the root key and thus are not sufficient to unseal Vault if the auto unseal mechanism isn't working. They are purely an authorization mechanism. Using auto unseal creates a strict Vault lifecycle dependency on the underlying seal mechanism. This means that if the seal mechanism (such as the Cloud KMS key) becomes unavailable, or deleted before the seal is migrated, then there is no ability to recover access to the Vault cluster until the mechanism is available again. If the seal mechanism or its keys are permanently deleted, then the Vault cluster cannot be recovered, even from backups. To mitigate this risk, we recommend careful controls around management of the seal mechanism, for example using AWS Service Control Policies or similar. With Vault Enterprise secondary clusters (disaster or performance) can have a seal configured independently of the primary, and when properly configured guards against some of this risk. Unreplicated items such as local mounts could still be lost.
Recovery key
When Vault is initialized while using an HSM or KMS, rather than unseal keys being returned to the operator, recovery keys are returned. These are generated from an internal recovery key that is split via Shamir's Secret Sharing, similar to Vault's treatment of unseal keys when running without an HSM or KMS.
Details about initialization and rekeying follow. When performing an operation
that uses recovery keys, such as generate-root
, selection of the recovery
keys for this purpose, rather than the barrier unseal keys, is automatic.
Initialization
When initializing, the split is performed according to the following CLI flags and their API equivalents in the /sys/init endpoint:
recovery-shares
: The number of shares into which to split the recovery key. This value is equivalent to therecovery_shares
value in the API endpoint.recovery-threshold
: The threshold of shares required to reconstruct the recovery key. This value is equivalent to therecovery_threshold
value in the API endpoint.recovery-pgp-keys
: The PGP keys to use to encrypt the returned recovery key shares. This value is equivalent to therecovery_pgp_keys
value in the API endpoint, although as withpgp_keys
the object in the API endpoint is an array, not a string.
Additionally, Vault will refuse to initialize if the option has not been set to generate a key, and no key is found. See Configuration for more details.
Rekeying
Unseal key
Vault's unseal key can be rekeyed using a normal vault operator rekey
operation from the CLI or the matching API calls. The rekey operation is
authorized by meeting the threshold of recovery keys. After rekeying, the new
barrier key is wrapped by the HSM or KMS and stored like the previous key; it is not
returned to the users that submitted their recovery keys.
Enterprise
Seal wrapping requires Vault Enterprise
Recovery key
The recovery key can be rekeyed to change the number of shares/threshold or to
target different key holders via different PGP keys. When using the Vault CLI,
this is performed by using the -target=recovery
flag to vault operator rekey
.
Via the API, the rekey operation is performed with the same parameters as the
normal /sys/rekey
endpoint; however, the
API prefix for this operation is at /sys/rekey-recovery-key
rather than
/sys/rekey
.
Seal migration
The seal migration process cannot be performed without downtime, and due to the technical underpinnings of the seal implementations, the process requires that you briefly take the whole cluster down. While experiencing some downtime may be unavoidable, we believe that switching seals is a rare event and that the inconvenience of the downtime is an acceptable trade-off.
NOTE: A backup should be taken before starting seal migration in case something goes wrong.
NOTE: Seal migration operation will require both old and new seals to be available during the migration. For example, migration from auto unseal to Shamir seal will require that the service backing the auto unseal is accessible during the migration.
NOTE: Seal migration from auto unseal to auto unseal of the same type is supported since Vault 1.6.0. However, there is a current limitation that prevents migrating from AWSKMS to AWSKMS; all other seal migrations of the same type are supported. Seal migration from one auto unseal type (AWS KMS) to different auto unseal type (HSM, Azure KMS, etc.) is also supported on older versions as well.
Migration post Vault 1.16.0 via Seal HA for Auto Seals (Enterprise)
Beta feature
Beta functionality is stable but possibly incomplete and subject to change. We strongly discourage using beta features in production deployments of Vault
With Seal HA, migration between auto-unseal types (not including any Shamir seals) can be done fully online using Seal High Availability (Seal HA) without any downtime.
- Edit the Vault configuration, and add the new, target seal configuration.
- Send the Vault process the SIGHUP signal, triggering a configuration reload.
- Monitor the
sys/sealwrap/rewrap
endpoints, to see that rewrap is running, and/orsys/seal-backend-status
, endpoints, waiting forfully_wrapped
to be true, indicating all seal wrapped values are now wrapped by the new seal. The logs also contain information about the rewrap progress. - Edit the Vault configuration, removing the old seal configuration.
- Send the Vault process the SIGHUP signal, again allowing re-wrapping to complete.
Migration post Vault 1.5.1
These steps are common for seal migrations between any supported kinds and for any storage backend.
Take a standby node down and update the seal configuration.
- If the migration is from Shamir seal to Auto seal, add the desired new Auto seal block to the configuration.
- If the migration is from Auto seal to Shamir seal, add
disabled = "true"
to the old seal block. - If the migration is from Auto seal to another Auto seal, add
disabled = "true"
to the old seal block and add the desired new Auto seal block.
Now, bring the standby node back up and run the unseal command on each key, by supplying the
-migrate
flag.- Supply Shamir unseal keys if the old seal was Shamir, which will be migrated as the recovery keys for the Auto seal.
- Supply recovery keys if the old seal is one of Auto seals, which will be migrated as the recovery keys of the new Auto seal, or as Shamir unseal keys if the new seal is Shamir.
Perform step 1 for all the standby nodes, one at a time. It is necessary to bring back the downed standby node before moving on to the other standby nodes, specifically when Integrated Storage is in use for it helps to retain the quorum.
Step down the active node. One of the standby nodes will become the new active node. When using Integrated Storage, ensure that quorum is reached and a leader is elected.
The new active node will perform the migration. Monitor the server log in the active node to witness the completion of the seal migration process. Wait for a little while for the migration information to replicate to all the nodes in case of Integrated Storage. In enterprise Vault, switching an Auto seal implies that the seal wrapped storage entries get re-wrapped. Monitor the log and wait until this process is complete (look for
seal re-wrap completed
).
Warning
Any change to the seal
stanza in your Vault configuration invokes seal-rewrap,
even "migrations" from the same auto-unseal type like pkcs11
to pkcs11
.
Seal migration is now completed. Take down the old active node, update its configuration to use the new seal blocks (completely unaware of the old seal type) ,and bring it back up. It will be auto-unsealed if the new seal is one of the auto seals, or will require unseal keys if the new seal is Shamir.
At this point, configuration files of all the nodes can be updated to only have the new seal information. Standby nodes can be restarted right away and the active node can be restarted upon a leadership change.
Migration pre 1.5.1
Migration from shamir to auto unseal
To migrate from Shamir keys to Auto Unseal, take your server cluster offline and
update the seal configuration with the appropriate
seal configuration. Bring your server back up and leave the rest of the nodes
offline if using multi-server mode, then run the unseal process with the
-migrate
flag and bring the rest of the cluster online.
All unseal commands must specify the -migrate
flag. Once the required
threshold of unseal keys are entered, unseal keys will be migrated to recovery
keys.
$ vault operator unseal -migrate
Migration from auto unseal to shamir
To migrate from auto unseal to Shamir keys, take your server cluster offline
and update the seal configuration and add disabled
= "true"
to the seal block. This allows the migration to use this information
to decrypt the key but will not unseal Vault. When you bring your server back
up, run the unseal process with the -migrate
flag and use the Recovery Keys
to perform the migration. All unseal commands must specify the -migrate
flag.
Once the required threshold of recovery keys are entered, the recovery keys
will be migrated to be used as unseal keys.
Migration from auto unseal to auto unseal
NOTE: Migration between same Auto Unseal types is supported in Vault 1.6.0 and higher. For these pre-1.5.1 steps, it is only possible to migrate from one type of auto unseal to a different type (ie Transit -> AWSKMS).
To migrate from auto unseal to a different auto unseal configuration, take your
server cluster offline and update the existing seal
configuration and add disabled = "true"
to the seal
block. Then add another seal block to describe the new seal.
When you bring your server back up, run the unseal process with the -migrate
flag and use the Recovery Keys to perform the migration. All unseal commands
must specify the -migrate
flag. Once the required threshold of recovery keys
are entered, the recovery keys will be kept and used as recovery keys in the new
seal.
Migration with integrated storage
Integrated Storage uses the Raft protocol underneath, which requires a quorum of servers to be online before the cluster is functional. Therefore, bringing the cluster back up one node at a time with the seal configuration updated, will not work in this case. Follow the same steps for each kind of migration described above with the exception that after the cluster is taken offline, update the seal configurations of all the nodes appropriately and bring them all back up. When the quorum of nodes are back up, Raft will elect a leader and the leader node that will perform the migration. The migrated information will be replicated to all other cluster peers and when the peers eventually become the leader, migration will not happen again on the peer nodes.
Seal high availability Enterprise
Beta feature
Beta functionality is stable but possibly incomplete and subject to change. We strongly discourage using beta features in production deployments of Vault
Seal high availability (Seal HA) allows the configuration of more than one auto seal mechanism such that Vault can tolerate the temporary loss of a seal service
or device for a time. With Seal HA configured with at least two and no more than three auto seals, Vault can also start up and unseal if one of the configured seals is still available (though Vault will remain in a degraded mode in this case). While seals are unavailable, seal wrapping and entropy augmentation can still occur using the remaining seals, and values produced while a seal is down will be re-wrapped with all the seals when all seals become healthy again.
An operator should choose two seals that are unlikely to become unavailable at the same time. For example, they may choose KMS keys in two cloud regions, from two different providers; or a mix of HSM, KMS, or Transit seals.
When an operator configures an additional seal or removes a seal (one at a time)
and restarts Vault, Vault will automatically detect that it needs to re-wrap
CSPs and seal wrapped values, and will start the process. Seal re-wrapping can
be monitored via the logs or via the sys/seal-status
endpoint. While a
re-wrap is in progress (or could not complete successfully), changes to the
seal configuration are not allowed.
In additional to high availability, seal HA can be used to migrate between two auto seals in a simplified manner. To migrate in this way:
In additional to high availability, Seal HA can be used to migrate between two auto seals in a simplified manner.
Note that Shamir seals are not auto seals and cannot be included in a Seal HA setup. This is because auto seals support seal wrap while Shamir seals do not, so the loss of the auto seal does not necessarily leave Vault in a fully available state.
Use and Configuration
Refer to the configuration section for details on configuring Seal HA.
Seal Re-Wrapping
Whenever seal configuration changes, Vault must re-wrap all CSPs and seal wrapped values, to ensure each value has an entry encrypted by all configured seals. Vault detects these configuration changes automatically, and triggers a re-wrap. Re-wraps can take some time, depending on the number of seal wrapped values. While re-wrapping is in progress, no configuration changes to the seals can be made.
Progress of the re-wrap can be monitored using
the sys/sealwrap/rewrap
endpoint.
Limitations and Known Issues
In order to limit complexity and increase safety, there are some limitations to the use and configuration of Seal HA:
- Vault must be configured for a single seal at the time of initialization. Extra seals can then be added.
- Seals must be added or removed one at a time.
- Only auto seals can be used in HA configurations. Shamir and auto cannot be mixed.
- A maximum of three seals can be configured.
- As seal wrapped values must be wrapped by all configured seals, it is possible that large values may fail to persist as the size of the entry is multiplied by the number of seals causing it to exceed the storage entry size limit. An example would be storing a large document in KVv2 with seal wrapping enabled.
- It is not possible to rotate the data encryption key nor the recovery keys while unless all seals are healthy.