When Hashicorp Vault, Systemd, and SELinux collide

Some time ago, I found myself struggling to get the Vault agent to work as I expected it to after installing it from the HashiCorp repository. What came next was a journey into systemd sandboxing behaviour, Linux capabilities, and my personal favourite - SELinux.

TLDR - Useful Takeaways

I learned a lot through this journey, but below are three really useful commands that I never knew existed.

  1. systemctl cat vault.service shows both the default job file and any configured overrides.
  2. ausearch -c 'vault' checks the audit logs and returns a match based on the provided search term.
  3. audit2allow -M vault generates an SELinux ruleset as a module that can be loaded into the SELinux policy.

With that, let’s get into the detail.

Background

There are two things I wanted to achieve:

  1. Starting Vault as an agent, not as a server.
  2. Restarting a different systemd job when the secret (certificate) I’m watching rotates. This means I need to figure out how to give child processes of the Vault agent the ability to use sudo.

Below is the default systemd job before modifications:

[Unit]
Description="HashiCorp Vault - A tool for managing secrets"
Documentation=https://www.vaultproject.io/docs/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/vault.d/vault.hcl
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
User=vault
Group=vault
ProtectSystem=full
ProtectHome=read-only
PrivateTmp=yes
PrivateDevices=yes
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=yes
ExecStart=/usr/bin/vault server -config=/etc/vault.d/vault.hcl
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGINT
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
StartLimitInterval=60
StartLimitBurst=3
LimitNOFILE=65536
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target

Let’s dig in and figure this out!

Overriding Systemd Job Defaults

To make changes to the job file, I run systemctl edit vault.service. This command opens a blank file in a simple text editor, and drops it into the right location so I don’t need to think too hard about it. For what it’s worth, the location is /etc/systemd/system/vault.service.d/override.conf. The settings in this file will be combined with the default job.

Starting Vault as an Agent

When a setting in systemd accepts a list of values, an override adds to the original values rather than replacing them. To get past that, we first assign an empty value, which resets the list. That is what the first, empty ExecStart= line is doing below.

[Service]
ExecStart=
ExecStart=/usr/bin/vault agent -config=/etc/vault.d/vault.hcl

That’s the first goal taken care of. If the second is this easy it’s going to be a short blog post!
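
For repeatable builds, the same drop-in can be written without the interactive editor. The following is a minimal sketch rather than anything the HashiCorp package ships: write_vault_override is a hypothetical helper name, and its optional directory argument is an assumption that defaults to the drop-in location mentioned above.

```shell
# Sketch: write the drop-in that `systemctl edit vault.service` would create.
# write_vault_override is a hypothetical helper; the directory argument
# defaults to the drop-in location mentioned above.
write_vault_override() {
  dir=${1:-/etc/systemd/system/vault.service.d}
  mkdir -p "$dir" || return 1
  # The empty ExecStart= clears the packaged command; the next line sets the new one.
  cat > "$dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/vault agent -config=/etc/vault.d/vault.hcl
EOF
}

# On a real host (as root), follow up with:
#   write_vault_override && systemctl daemon-reload && systemctl restart vault.service
#   systemctl cat vault.service    # shows the packaged unit plus the drop-in
```

Unlike systemctl edit, writing the file by hand does not reload systemd, which is why the daemon-reload step appears in the usage comment.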

Managing Privileges with Capabilities

The next thing I had to deal with was how to give the Vault agent the ability to restart another service running under systemd. Maybe you’re thinking that doesn’t sound all that hard - the Vault agent supports issuing a command, so a quick sudo systemctl restart nginx in the config file should take care of that, right? As I learned, the answer to that is a resounding no. There are many controls in systemd to sandbox services. This is a great security measure and removing them altogether would make me uncomfortable. Instead, I’m going to make the minimum set of required changes to do what I need.

Here is a snippet of the job file again, containing the settings that need some work.

ProtectSystem=full
ProtectHome=read-only
PrivateTmp=yes
PrivateDevices=yes
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=yes

Looking at that list, there is one that stands out as a clear blocker. How can I use sudo when NoNewPrivileges is enabled?
Setting this value to no is a good start, but it turns out that some of the other sandboxing settings imply NoNewPrivileges=yes when they are enabled, overriding this value. PrivateDevices is one of them, so PrivateDevices=no must also be set.
On top of these changes, there are a few Linux capabilities that are required to execute sudo. I’m not an expert in this area, but combining my stubbornness, the logs, and the Linux capabilities man page I was able to find the right combination of capabilities required to get the job done.

Here are a few snippets from the log as I did a bit of debugging, and the associated capability that I had to set to get past each one.

Jan 27 21:35:04 ip-10-0-103-5 vault[17012]: sudo: unable to change to root gid: Operation not permitted
Jan 27 21:35:04 ip-10-0-103-5 vault[17012]: sudo: unable to initialize policy plugin

Required capability: CAP_SETGID

Jan 27 21:37:13 ip-10-0-103-5 vault[12309]: sudo: PERM_SUDOERS: setresuid(-1, 1, -1): Operation not permitted
Jan 27 21:37:13 ip-10-0-103-5 vault[12309]: sudo: no valid sudoers sources found, quitting
Jan 27 21:37:13 ip-10-0-103-5 vault[12309]: sudo: setresuid() [0, 0, 0] -> [997, -1, -1]: Operation not permitted

Required capability: CAP_SETUID

Jan 27 21:39:10 ip-10-0-103-5 vault[13081]: sudo: unable to send audit message
Jan 27 21:39:10 ip-10-0-103-5 vault[13081]: sudo: pam_open_session: System error
Jan 27 21:39:10 ip-10-0-103-5 vault[13081]: sudo: policy plugin failed session initialization

Required capability: CAP_AUDIT_WRITE

With those three capabilities in hand, I was ready to go back and edit the CapabilityBoundingSet in my systemd override file.

<snip>
[Service]
ExecStart=
ExecStart=/usr/bin/vault agent -config=/etc/vault.d/vault.hcl
CapabilityBoundingSet=CAP_SETGID CAP_SETUID CAP_AUDIT_WRITE
NoNewPrivileges=no
PrivateDevices=no
ReadWritePaths=/etc/ssl/certs/nginx /etc/ssl/private/nginx
</snip>

The final thing in the snippet above that I haven’t spoken about yet is ReadWritePaths. I am setting this because of where the certificate and key are rendered in my Vault configuration file. Since ProtectSystem=full is set, /etc and all of its sub-directories (among others) are mounted read-only. ReadWritePaths provides an exception list for that.
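
To confirm that the capabilities actually reached the running agent, the CapBnd line in /proc/<pid>/status can be checked bit by bit. The sketch below assumes the override's CapabilityBoundingSet is merged with the packaged one, which is how the systemd documentation describes repeated lines, giving five capabilities in total. has_cap is a hypothetical helper, and the mask shown is the value computed for exactly those five bits, not one captured from a real host.

```shell
# Sketch: test whether a capability bit is set in a CapBnd mask taken from
# /proc/<pid>/status. has_cap is a hypothetical helper, not a system tool.
has_cap() {
  mask=$1   # hex mask without the 0x prefix, e.g. 00000004200040c0
  bit=$2    # capability number from linux/capability.h
  [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

# Capability numbers: CAP_SETGID=6, CAP_SETUID=7, CAP_IPC_LOCK=14,
# CAP_AUDIT_WRITE=29, CAP_SYSLOG=34.
# 00000004200040c0 is the mask with exactly those five bits set.
for bit in 6 7 29; do
  has_cap 00000004200040c0 "$bit" && echo "bit $bit present"
done

# On a real host the mask would come from the running agent, for example:
#   pid=$(systemctl show -p MainPID --value vault.service)
#   grep -E 'CapBnd|NoNewPrivs' "/proc/$pid/status"
```

The NoNewPrivs line in the same file is a quick way to see whether NoNewPrivileges=no took effect or was overridden by another sandboxing setting.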

Modifying sudoers

Blanket sudo is a bad thing, so we also configure a sudoers file to constrain that behaviour.
sudo visudo -f /etc/sudoers.d/vault

In this case, we are going to allow the Vault agent (running as the vault user) to issue a restart on the nginx systemd job, without needing to pass in a password.
vault ALL=(root) NOPASSWD: /bin/systemctl restart nginx.service
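
When this step is scripted rather than typed into visudo, it helps to validate the rule before it lands in /etc/sudoers.d, since a broken sudoers file can lock sudo out entirely. Here is a minimal sketch under that assumption; install_vault_sudoers is a hypothetical helper name and the optional target argument exists only so the function can be tried against a scratch path.

```shell
# Sketch: stage the sudoers rule, validate it, then move it into place.
# install_vault_sudoers is a hypothetical helper; the target path defaults
# to the file edited above.
install_vault_sudoers() {
  target=${1:-/etc/sudoers.d/vault}
  staged=$(mktemp) || return 1
  printf '%s\n' 'vault ALL=(root) NOPASSWD: /bin/systemctl restart nginx.service' > "$staged"
  chmod 0440 "$staged"
  # visudo -c only checks syntax; the check is skipped if visudo is not installed.
  if command -v visudo >/dev/null 2>&1; then
    visudo -cf "$staged" >/dev/null || { rm -f "$staged"; return 1; }
  fi
  mv -f "$staged" "$target"
}

# On a real host (as root):
#   install_vault_sudoers
#   sudo -l -U vault     # lists what the vault user is allowed to run
```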

By this point in time, I was pretty happy with myself. The Vault agent was running as expected on Ubuntu. My joy turned to tears when my buddy Mark got hold of this and told me it didn’t work on CentOS 8.4.

SELinux Enters the Fray

Once I had deployed a CentOS box I began seeing the same errors as Mark.

Jan 31 00:13:53 ip-10-0-101-228 vault[1283]: 2022-01-31T00:13:53.032Z [INFO] (child) spawning: sh -c sudo /bin/systemctl restart nginx.service
Jan 31 00:13:53 ip-10-0-101-228 vault[1283]: sudo: PAM account management error: Authentication service cannot retrieve authentication info

The audit log showed some more information to point the way, but I couldn’t really decipher it. I’ve dropped the raw logs below for the masochists out there.

type=USER_ACCT msg=audit(1643595658.356:27879): pid=31345 uid=990 auid=4294967295 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 msg='op=PAM:accounting grantors=? acct="vault" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=failed'UID="vault" AUID="unset"
type=USER_CMD msg=audit(1643595658.356:27880): pid=31345 uid=990 auid=4294967295 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 msg='cwd="/" cmd=2F62696E2F73797374656D63746C2072657374617274206E67696E782E73657276696365 exe="/usr/bin/sudo" terminal=? res=failed'UID="vault" AUID="unset"
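
One part of those records is easy to decipher: the cmd= field is just the command line, hex-encoded. The sketch below decodes it with plain shell. decode_audit_hex is a hypothetical helper written for illustration; on a host with the audit tools installed, ausearch -i (interpret) should give a similar result without any scripting.

```shell
# Sketch: decode a hex-encoded audit field such as cmd=2F62696E...
# decode_audit_hex is a hypothetical helper, not part of the audit tools.
decode_audit_hex() {
  hex=$1
  while [ -n "$hex" ]; do
    rest=${hex#??}            # everything after the first two characters
    # An odd number of hex digits cannot be decoded byte by byte.
    [ "$rest" = "$hex" ] && return 1
    pair=${hex%"$rest"}       # the first two characters
    # printf understands octal escapes, so convert the hex byte to octal first.
    printf "\\$(printf '%03o' "0x$pair")"
    hex=$rest
  done
  printf '\n'
}

decode_audit_hex 2F62696E2F73797374656D63746C2072657374617274206E67696E782E73657276696365
# prints: /bin/systemctl restart nginx.service
```

So the USER_CMD record is sudo reporting that the restart command itself failed, which matches the PAM error in the journal.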

My research (aka Googling) turned up a tool called ausearch, which allows you to search audit logs.

# sudo ausearch -c 'vault' --raw
----
time->Mon Jan 31 00:13:35 2022
type=AVC msg=audit(1643588015.991:131): avc:  denied  { mounton } for  pid=1283 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0
----
time->Mon Jan 31 00:23:23 2022
type=AVC msg=audit(1643588603.662:3482): avc:  denied  { mounton } for  pid=4664 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0
----
time->Mon Jan 31 00:51:05 2022
type=AVC msg=audit(1643590265.406:5710): avc:  denied  { mounton } for  pid=6983 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0
----
time->Mon Jan 31 01:18:48 2022
type=AVC msg=audit(1643591928.851:6506): avc:  denied  { mounton } for  pid=8181 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0
----
time->Mon Jan 31 01:49:50 2022
type=AVC msg=audit(1643593790.448:16942): avc:  denied  { mounton } for  pid=18666 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

A bit easier to read, but still not amazing. Further reading suggested that I could use audit2why to make more sense of things.

# sudo ausearch -c 'vault' --raw | audit2why
type=AVC msg=audit(1643588015.991:131): avc:  denied  { mounton } for  pid=1283 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.

type=AVC msg=audit(1643588603.662:3482): avc:  denied  { mounton } for  pid=4664 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.

type=AVC msg=audit(1643590265.406:5710): avc:  denied  { mounton } for  pid=6983 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.

type=AVC msg=audit(1643591928.851:6506): avc:  denied  { mounton } for  pid=8181 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.

type=AVC msg=audit(1643593790.448:16942): avc:  denied  { mounton } for  pid=18666 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.

The last line really caught my eye - auto generate a module? Yes!

sudo ausearch -c 'vault' --raw | sudo audit2allow -M vault
sudo semodule -i vault.pp
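
Before loading a generated module, it is worth reading what it will allow; audit2allow -M vault also writes a human-readable vault.te file next to vault.pp. As a rough illustration of how a denial maps to a rule, here is a minimal sketch that derives the allow rule from one AVC line. avc_to_rule is a hypothetical helper and not part of the SELinux tooling; the real vault.te will contain more than this one line (module header, require block), so treat the output as the gist only.

```shell
# Sketch: derive the type enforcement rule an AVC denial implies.
# avc_to_rule is a hypothetical helper written for illustration; it is not
# part of the SELinux tooling and handles only simple, single-line denials.
avc_to_rule() {
  line=$1
  perms=$(printf '%s\n' "$line" | sed -n 's/.*denied  *{ \(.*\) } for.*/\1/p')
  src=$(printf '%s\n' "$line" | sed -n 's/.*scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
  tgt=$(printf '%s\n' "$line" | sed -n 's/.*tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
  class=$(printf '%s\n' "$line" | sed -n 's/.*tclass=\([^ ]*\).*/\1/p')
  # Several permissions are grouped in braces, a single one stands alone.
  case $perms in
    *" "*) perms="{ $perms }" ;;
  esac
  printf 'allow %s %s:%s %s;\n' "$src" "$tgt" "$class" "$perms"
}

avc_to_rule 'type=AVC msg=audit(1643588015.991:131): avc:  denied  { mounton } for  pid=1283 comm="(vault)" path="/run/systemd/unit-root/etc/pki/tls/certs/nginx" dev="xvda2" ino=8698630 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:cert_t:s0 tclass=dir permissive=0'
# prints: allow init_t cert_t:dir mounton;
```

Read that way, the denial is systemd (init_t) being refused permission to bind-mount the certificate directory (cert_t) while setting up the ReadWritePaths exception for the service.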

A quick restart of the agent to test things out and…

Jan 31 04:13:50 ip-10-0-101-228.us-west-2.compute.internal vault[898]: 2022-01-31T04:13:50.019Z [INFO] (runner) executing command "sudo /bin/systemctl restart nginx.service" from "(dynamic)" => "/etc/ssl/certs/nginx/nginx.crt"
Jan 31 04:13:50 ip-10-0-101-228.us-west-2.compute.internal vault[898]: 2022-01-31T04:13:50.019Z [INFO] (child) spawning: sh -c sudo /bin/systemctl restart nginx.service
Jan 31 04:13:50 ip-10-0-101-228.us-west-2.compute.internal sudo[1402]:    vault : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/systemctl restart nginx.service
Jan 31 04:13:50 ip-10-0-101-228.us-west-2.compute.internal vault[898]: 2022-01-31T04:13:50.110Z [DEBUG] (runner) watching 1 dependencies
Jan 31 04:13:50 ip-10-0-101-228.us-west-2.compute.internal vault[898]: 2022-01-31T04:13:50.110Z [DEBUG] (runner) all templates rendered

Finally, a functional Vault agent systemd job that can render out secrets to a file, and restart the systemd service that is going to make use of them. I hope that helps you when it comes to finding how to tune your systemd job and SELinux config to run Vault!