Vault Secure Introduction Client Security

This page discusses the various security-sensitive aspects of the Vault Secure Introduction Client along with advice for secure usage.

General Advice

Use A Dedicated User

We recommend running the VSI client as a dedicated user that does not run other services. This reduces the chance that an attack on another service leads to access to sensitive information (for instance, the client's stored nonce, if the nonce storage option is used). It also ensures that tokens written with the file serving type are not accessible to processes run by other users.

Write Tokens To Ephemeral Storage With A Dedicated Group

When using the file serving type to write tokens to paths on the filesystem, the directories containing those paths should only be accessible to a group or groups of services that require access to the Vault token. The tokens are always written with permissions 0640.
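The layout described above can be sketched as follows. This is a minimal illustration, not the client's own code: the function name write_token and the directory and file names are hypothetical. The directory is limited to its owner and group (0750) and the token file gets permissions 0640. Assigning the directory to the dedicated group (for example with chgrp) is left out because it needs elevated privileges.

```python
import os


def write_token(directory: str, filename: str, token: str) -> str:
    """Write a token into a group-restricted directory (illustrative only)."""
    # Owner: rwx, group: r-x, others: no access.
    os.makedirs(directory, exist_ok=True)
    os.chmod(directory, 0o750)

    path = os.path.join(directory, filename)
    # Request 0640 at creation time so the file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o640)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    # The mode passed to os.open is filtered by the umask and ignored for
    # files that already exist, so set the final mode explicitly.
    os.chmod(path, 0o640)
    return path
```

Services that need the token would then be added to the group that owns the directory; everything else on the host cannot traverse into it.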

In addition, tokens should be written to a location that does not persist across reboots, such as a ramdisk. After a reboot, the client fetches a new token, so there is no need to keep the old one.

Firewall Instance Metadata

If possible (depending on the operating system and/or distribution), use the OS firewall to restrict access to the instance metadata served from the http://169.254.169.254 endpoint (specifically the signed pkcs7 document) to only the users/groups that need it.
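As one possible shape for such a rule, the sketch below builds (but does not run) an iptables command. It assumes a Linux host where iptables and its owner match module are available; the function name and the account name passed to it are hypothetical. Other firewalls will need a different rule, and this rule also blocks root unless root is the allowed user.

```python
from typing import List

METADATA_ADDRESS = "169.254.169.254"


def metadata_firewall_rule(allowed_user: str) -> List[str]:
    """Build an iptables command that rejects outbound traffic to the
    metadata address from every local user except `allowed_user`."""
    return [
        "iptables", "-A", "OUTPUT",
        "-d", METADATA_ADDRESS,
        # Match packets whose owning socket does NOT belong to allowed_user.
        "-m", "owner", "!", "--uid-owner", allowed_user,
        "-j", "REJECT",
    ]
```

The resulting list could be reviewed and then passed to subprocess.run by a privileged provisioning step.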

Nonces

The AWS Authentication Backend operates on a Trust On First Use (TOFU) principle, using nonces generated by a backend client to identify repeat authentication requests by the same client.

A benefit of this approach is that if a bad actor acquires the instance metadata and authenticates before the VSI client does, the client nonce mismatch errors in the VSI client's logs can be used to trigger an alarm.
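The TOFU behavior can be illustrated with a toy model. This is not the backend's implementation; the class and method names are hypothetical, and the real backend also verifies the instance metadata, which is omitted here.

```python
import hmac
from typing import Dict


class NonceWhitelist:
    """Toy model of trust on first use, keyed by instance ID."""

    def __init__(self) -> None:
        self._nonces: Dict[str, str] = {}

    def login(self, instance_id: str, nonce: str) -> bool:
        known = self._nonces.get(instance_id)
        if known is None:
            # First login for this instance: trust and remember the nonce.
            self._nonces[instance_id] = nonce
            return True
        # Repeat login: only the originally recorded nonce is accepted.
        return hmac.compare_digest(known.encode(), nonce.encode())
```

If an attacker logs in first, the legitimate client's later login with its own nonce returns False; that failure is the mismatch signal that can drive an alarm.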

The drawback is that reboot survivability is impacted: a client that restarts with a newly generated nonce no longer matches the nonce recorded for its instance. However, combinations of options on the AWS Authentication Backend and the VSI client provide flexible methods for managing this problem, allowing the security policy of nearly any organization to be accommodated.

The following are the available strategies for nonce management.

Immutable Instances

If your EC2 instances are running in Auto Scaling Groups (ASGs), one strategy is to enable the disallow_reauthentication option on a configured AMI in the AWS Authentication Backend (or an associated Role Tag). This allows only a single token to be granted to any particular instance ID (unless cleared via the whitelist/identity endpoint in the backend), regardless of nonce. As a result, rather than reboot an instance running in an ASG, the instance can simply be terminated; when the ASG brings up a new instance, the instance ID will be different and the new instance will be allowed to authenticate.
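The effect of disallow_reauthentication can be modeled roughly as below. This is a simplified sketch under the description above, not the backend's code; the class and method names are hypothetical.

```python
from typing import Set


class SingleLoginWhitelist:
    """Toy model: one successful login per instance ID until cleared."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def login(self, instance_id: str, nonce: str) -> bool:
        # The nonce is deliberately ignored: a known instance ID is always
        # refused, whichever nonce it presents.
        if instance_id in self._seen:
            return False
        self._seen.add(instance_id)
        return True

    def clear(self, instance_id: str) -> None:
        # Stands in for removing the entry via the whitelist/identity endpoint.
        self._seen.discard(instance_id)
```

A replacement instance launched by the ASG arrives with a new instance ID, so its first login succeeds without any operator action.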

Manual/Automated Whitelist Management

This approach relies on either manual or automated intervention, perhaps triggered by reboot notifications or by notifications generated from parsing the VSI client's error log. Knowing that an instance has rebooted gives an operator assurance that they can clear the instance and its nonce from the backend's whitelist via the whitelist/identity endpoint, allowing the client to authenticate with its newly generated nonce.
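An automated cleanup step might look roughly like the sketch below, which only constructs the HTTP request and does not send it. The exact URL layout is an assumption: the mount path (for example "aws-ec2") and the position of whitelist/identity under it should be checked against the API documentation for your Vault version. The function name is hypothetical.

```python
import urllib.parse
import urllib.request


def build_clear_request(vault_addr: str, mount_path: str,
                        instance_id: str, token: str) -> urllib.request.Request:
    """Build a DELETE request for one instance's whitelist entry.

    The path below is an assumed layout, not a confirmed API contract.
    """
    url = "{}/v1/auth/{}/whitelist/identity/{}".format(
        vault_addr.rstrip("/"),
        mount_path.strip("/"),
        urllib.parse.quote(instance_id, safe=""),
    )
    # Vault reads the caller's token from the X-Vault-Token header.
    return urllib.request.Request(
        url, method="DELETE", headers={"X-Vault-Token": token}
    )
```

The request would be sent with urllib.request.urlopen only after the reboot has been confirmed through a trusted channel; clearing entries on unverified signals would defeat the purpose of the nonce.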

Instance Migration

If your EC2 instances do not rely on ephemeral storage across reboots, one approach is to stop/start the instance rather than reboot it, in conjunction with enabling the allow_instance_migration option on a configured AMI in the AWS Authentication Backend (or an associated Role Tag).

Stopping and starting an instance causes it to be placed anew in the AWS infrastructure, which results in an updated value of pendingTime in the instance metadata document. When the allow_instance_migration option is turned on, a client is allowed to authenticate for the same instance ID with a new nonce if the value of pendingTime is later than the previously seen value.
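The pendingTime comparison can be sketched as follows. This is an illustration of the rule as described, not the backend's code. The function name is hypothetical, and the timestamp format (UTC, second precision, trailing "Z") is an assumption about how pendingTime appears in the metadata document.

```python
from datetime import datetime

# Assumed format of pendingTime, e.g. "2016-04-14T01:33:39Z".
PENDING_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def migration_allows_new_nonce(previous_pending_time: str,
                               current_pending_time: str) -> bool:
    """Return True if a new nonce may be accepted for the same instance ID
    because pendingTime has moved strictly forward."""
    previous = datetime.strptime(previous_pending_time, PENDING_TIME_FORMAT)
    current = datetime.strptime(current_pending_time, PENDING_TIME_FORMAT)
    return current > previous
```

A plain reboot leaves pendingTime unchanged, so under this rule the comparison fails and the new nonce is refused; only a stop/start produces a later value.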

Nonce Storage

A final option for managing reboot survivability is to use the client's option to store its nonce on the instance's filesystem and read this nonce the next time it starts up.

Although this option provides the best automated reboot survivability guarantees, it does require storing the nonce in persistent storage. If using this option, filesystem permissions should be used to ensure that only the user running the client has access to the directory where the nonce will be stored (the nonce will always be stored with permissions 0600).
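The storage pattern can be sketched as follows. This is not the client's implementation: the function name, the file name, and the random hex string standing in for the nonce are all placeholders. It shows a directory limited to its owner (0700) and a nonce file created with permissions 0600.

```python
import os
import secrets


def load_or_create_nonce(directory: str, filename: str = "nonce") -> str:
    """Return the stored nonce, creating one on first use (illustrative)."""
    # Only the owning user may list or enter the directory.
    os.makedirs(directory, mode=0o700, exist_ok=True)
    os.chmod(directory, 0o700)

    path = os.path.join(directory, filename)
    try:
        with open(path) as handle:
            return handle.read().strip()
    except FileNotFoundError:
        pass

    # Placeholder value; the real client's nonce format is not assumed here.
    nonce = secrets.token_hex(16)
    # O_EXCL fails rather than overwriting a file created in the meantime.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(nonce)
    return nonce
```

Calling the function again after a restart returns the same value, which is what lets the client present a matching nonce after a reboot.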

This is a very security-sensitive option. So long as the nonce and instance remain valid, disclosure of the nonce can allow any user or service with access to the instance metadata to authenticate as the machine and gain access to all Vault policies associated with it. For this reason, if the client stores its nonce, ensure that the nonce is not duplicated across instances (for instance, by baking it into an AMI). Otherwise, any user or service that learns the nonce and has access to the instance metadata of any machine sharing it could authenticate as that instance.