tl;dr

Good Unix tools built on SSH offer ways to tunnel their traffic through a sudo user switch. This article shows examples for the basic tools rsync and sshfs, as well as configuration for higher-level tools like ansible and ServerSpec.

Why sudo? Why ssh?

In contrast to other consultancies, metamorphant does not focus on specific vendor products. Instead, we solve problems. We prefer simple solutions over complex ones.

Most often, simple and powerful solutions

  • don’t have a fancy name,
  • don’t include machine learning / blockchain / «insert other fancy buzzwords here» (we do these as well, but only when it makes sense),
  • are ancient and tested by time,
  • don’t reinvent the wheel, but build on existing generic knowledge and skills.

In short: They aren’t sexy – unless you share my fetish – but they are effective. Enterprise solutions can follow KISS principles. We prove it daily together with our customers. For example, you can cover quite a distance with good old-fashioned Unix technology: sudo + ssh.

Using sudo over ssh for command line operations

Imagine you have a VM running in your company’s data center. All your software runs as a technical user. This technical user has the proper permissions on file shares etc. You don’t. You can connect to the VM using SSH. However, you may not do anything there. In particular, you are not allowed to access the files that are handled by your application. That’s the sole privilege of the technical user.

So, as soon as you have connected, you will want to escalate privileges and become the technical user. sudo enables you to do just that. Because a picture says more than 1000 words:

schematic drawing: sudo tunneled through an SSH session

A typical manual session could look like this:

ssh myuser@remotesystem
sudo -u teamuser -i
do_stuff

As it happens, executing commands and accessing files on remote systems is a typical requirement during operations, e.g. during incident resolution or for handling bug reports and difficult support cases.

This article demonstrates how you can ease your journey and operate everything directly from your host’s command line, 1 command at a time.

Firing commands with ssh

In the first example, you will create a directory:

ssh -n remotesystem sudo -u teamuser -i mkdir -vp /some/path/that/doesnt/exist/yet

By default, i.e. when used without further options, ssh starts an interactive session. However, ssh takes an optional command without further ado. The same applies to sudo.

An important caveat is the OpenSSH client’s default behaviour of consuming STDIN. This becomes a problem during scripting. The purpose of -n is to prevent this and provide a fresh, empty STDIN from /dev/null (see man ssh). Imagine you want to loop over a list of hosts like:

cat hosts | while read hostname; do ssh -n "${hostname}" sudo -u teamuser -i mkdir -vp /some/path/that/doesnt/exist/yet; done

Without the -n you would only address the first host and your command would receive the remaining hosts as STDIN. You can easily test that on your own:

# this will print hallo once; only the first line will be used
echo -e 'localhost\nlocalhost' | while read hostname; do ssh ${hostname} echo hallo; done

# this will print hallo twice; both lines will be looped
echo -e 'localhost\nlocalhost' | while read hostname; do ssh -n ${hostname} echo hallo; done

# this will print localhost once; the first line is consumed by read, the second by cat
echo -e 'localhost\nlocalhost' | while read hostname; do ssh ${hostname} cat; done

# this will print nothing; both lines are consumed by while read, i.e. 2 SSH commands are executed, but for both executions cat receives an empty STDIN
echo -e 'localhost\nlocalhost' | while read hostname; do ssh -n ${hostname} cat; done
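
If you prefer to keep ssh’s STDIN behaviour untouched, an alternative sketch is to read the hostnames from a separate file descriptor (assuming a file hosts with one hostname per line):

# read hostnames via file descriptor 9, leaving STDIN alone for ssh
while read -u 9 hostname; do
  ssh "${hostname}" echo hallo
done 9<hosts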

Running a local bash script on the remote server

You can use the STDIN consumption behaviour of ssh for running local bash scripts on a remote server:

# a harmless, artificial example that does not require preconditions
ssh remotesystem sudo -u teamuser -i bash -s -- <<<'echo hello $2 $1' world great

The -s causes bash to read the script from STDIN. The -- terminates the bash options; all following arguments will be passed to the script. The <<<'...' provides a string literal on STDIN. The more typical case will be running a script file:

ssh remotesystem sudo -u teamuser -i bash -s -- <somescript.sh param1 param2 ...
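
Note the absence of -n here: the script itself travels over STDIN, so STDIN must not be redirected to /dev/null. A harmless somescript.sh for testing the invocation could look like this (hypothetical):

#!/usr/bin/env bash
# hypothetical somescript.sh: show the effective user and the passed arguments
echo "running as $(id -un) with arguments: $*"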

Diffing a local file with a remote file

You can also use the STDOUT, pipe it somewhere else, mangle it, wrangle it, … just like you are used to. Something I need quite often is diffing a local file with a remote file:

ssh -n remotesystem sudo -u teamuser -i cat /some/path/to/file | diff -u - some/local/file
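
The same pattern works for any command producing STDOUT. For example, a sketch for following a remote log file while filtering it locally (the path is, of course, made up):

# follow a remote log and filter it on the local machine
ssh -n remotesystem sudo -u teamuser -i tail -f /some/path/to/logfile | grep -i error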

Syncing files with rsync

In the second example, you will sync files and directories back and forth between the remote host and your local host:

# sync from remote to local
rsync --rsync-path='sudo -u teamuser -i rsync' -Prvltp remotehost:/some/path/to/an/interesting/directory/ synced_directory/

# sync from local to remote
rsync --rsync-path='sudo -u teamuser -i rsync' -Prvltp directory/ remotehost:/some/path/to/an/interesting/synced_directory/

The -Prvltp is just a sensible default. If you want to know more, consult man rsync. The important thing to note here is the --rsync-path.

rsync works by starting a server command on the remote end of the SSH connection. --rsync-path allows you to override this command.
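
If you are unsure what a sync will do, rsync’s --dry-run flag shows the planned transfers without touching any files – a sketch using the same hypothetical paths as above:

# preview the transfer without changing anything
rsync --dry-run --rsync-path='sudo -u teamuser -i rsync' -Prvltp remotehost:/some/path/to/an/interesting/directory/ synced_directory/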

Mounting remote directories using sshfs

sshfs works in a similar way. It builds on SFTP, which uses an SSH session as its transport. SFTP works similarly to rsync: it starts a server process on the other end of the SSH connection. Just like rsync, sshfs allows you to override the command for spawning this server process:

sshfs remotehost:/some/path/to/an/interesting/some_directory/ /some/mountpoint/ -o sftp_server='/usr/bin/sudo -u teamuser -i /usr/libexec/openssh/sftp-server'

Small caveat: The exact path to sftp-server varies by Linux distribution.
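
Since the binary’s location is not always obvious, one way to find it is searching the remote file system (a sketch; adjust the search path to your distribution):

# locate the sftp-server binary on the remote host
ssh -n remotesystem find /usr -name sftp-server 2>/dev/null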

The background

Now that we are familiar with the basic functionality, let’s have a look at the concrete use case we had with our customer. I will show you how we used these techniques to get rid of a bottleneck in the delivery process.

A word about DevOps

Whenever the term “DevOps team” is used in this article, I am not talking about the perversion of having a single centralized “DevOps team”. I am thinking of the original sense of the word: A DevOps team is an autonomous team, building and running one or more independent applications or services. Think of it as a team of engineers who take ownership of and responsibility for their product – instead of tossing artifacts over the fence to a distinct operations department.

Jumping over SSH + Sudo

One of our customers, a global mobility provider, is running a hybrid setup: The more modern systems run in the cloud, while the legacy still runs on-premise. On-premise installations are VM-based, as is still common practice in the industry. The VMs are provided by an infrastructure team; the base operating system (a CentOS Linux installation provisioned by Puppet) is maintained by another specialised team.

schematic drawing: team setups before and after introducing the clear boundary and ssh+sudo provisioning

Former setups required everybody to negotiate any change to the systems with this central team. This made the team a severe bottleneck. It created high costs and long waiting times. How could we overcome this situation and gain more autonomy for the mature DevOps teams using these VMs? We established a clean interface between the DevOps teams and the team providing the base OS installation. The autonomous teams started to maintain their setup completely in code and deliver their platform continuously. Suddenly, it felt like consuming the provisioned VM as a service.

To enable this:

  • Each team has a distinct technical user.
  • Administrators of a team are allowed to log in with their personal user to all hosts via SSH.
  • Technical users are not allowed to log in via SSH.
  • Administrators’ personal users may become the corresponding technical user of a team using sudo (i.e. sudo with NOPASSWD; see the sketch after this list).
  • Administrators’ personal users are not allowed to do anything else.
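
A minimal sudoers rule implementing this could look like the following sketch (the group name team_admins is made up; your actual policy will differ):

# hypothetical /etc/sudoers.d/team entry; group and user names are made up
%team_admins ALL = (teamuser) NOPASSWD: ALL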

The interface between the teams contains a few more mutual expectations, but they are not relevant for this article.

But why so complicated?

The obvious alternative setup would be to just allow SSH login for the technical user.

The reasoning behind our setup is:

  • People leave an audit trail by logging in using their personal users.
  • People log in using their personal credentials. To use the technical user directly, you would have 2 options:
    1. Manage shared credentials (including key rotation, secure storage and exchange, …).
    2. Implement a solution for managing the ssh authorized_keys of the technical user and include 1 personal key per team member.
  • Restricting personal users’ freedom and always escalating to the technical team user ensures that team members don’t create shadow solutions in their private user home. Most importantly, the command history is team-public.
  • sudo allows fine-grained control over further privileges of users (personal and technical).
  • sudo allows NOPASSWD operation, which automation tools like ansible and ServerSpec rely on (in contrast to su).
  • The setup builds on the status quo of this very customer. It was an evolutionary step and can be implemented with minimally invasive means. YMMV.

People implementing such a scheme should be aware that sudo is known to be a security risk (see e.g. the latest sudo disaster), as it runs SUID root and has a rich feature set (i.e. too many lines of code to maintain).

Use for automation tools

As mentioned above, the intention of the setup was more autonomy and better automation for the DevOps teams. It should help them in building, deploying and maintaining their apps. That’s their core competence.

Hence, KISS (keep it simple, stupid) was the most important design criterion. Teams should not have to put up with complex, idiosyncratic tooling. They should be able to build on existing knowledge – to be more precise: on ordinary Unix knowledge. Also, the solution should not create a vendor lock-in for the company. Switching to other implementations should be cheap.

From an architectural point of view, we opted for push-based provisioning instead of installing pull agents on the hosts. One design goal was to get rid of infrastructural components that need long-running processes (which – without additional measures – have a higher risk of invisible failure, i.e. a higher need for human intervention).

Provisioning with ansible

For the purpose of provisioning the apps on the servers, teams can in principle use any tooling they like – as long as it supports the ‘sudo over SSH’ setup. I, for my part, would love to play around with Clojure-based tools like epiccastle’s spire. Clojure’s super-concise modeling capabilities are unmatched. It could save a lot of the maintenance costs that come with inexpressive models and bad abstractions.

Stop dreaming! In an enterprise environment, mainstream compliance is key. Even unimaginative morons should be able to operate and develop the solution. Resources should be readily available on the web. Junior developers should be able to prove their copy-paste-from-Stackoverflow competence. Sounds like a job for ansible.

Actually, ansible was one of our starting points for creating the ‘sudo over SSH’ setup. It provides this feature by default. You just have to add an ansible.cfg configuration file to your ansible project with the following section:

...
[privilege_escalation]
become = True
become_method = sudo
become_user = teamuser
become_flags = -i
...

There are no caveats. The design is well supported by ansible.
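
A quick way to check that the escalation works is an ad-hoc command (assuming an inventory file named hosts); with the configuration above, it should report teamuser for every host:

# should print 'teamuser' for each host if privilege escalation works
ansible all -i hosts -m command -a 'id -un'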

Performing acceptance tests with serverspec

At some point in your DevOps journey you will probably want to perform some automated checks on your target systems. Typical use cases are:

  • Acceptance testing the pre-conditions you expect from a system provided by a different team.
  • Regression testing of the very same pre-conditions after changes by the other team.
  • Testing the properties of your own provisioning routines.

In our projects we often use the RSpec-based ServerSpec for that purpose. Its strength is its concise Ruby DSL for modeling system conditions.

Luckily, ServerSpec is also well-prepared for the ‘sudo over SSH’ setup. In a default ServerSpec project you configure it in your spec/spec_helper.rb:

...
set :disable_sudo, false
set :sudo_options, ['-u', 'teamuser', '-n', '-i']    # -n: make serverspec fail if NOPASSWD sudo is not possible
...

Please note the use of -n. For our other use cases this is irrelevant, but in the case of ServerSpec we want a proper error message if sudo NOPASSWD does not work.
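
You can reproduce the effect of -n directly on a target host: with -n, sudo fails immediately instead of prompting for a password. A minimal check could look like this:

# on the remote host: succeeds silently with NOPASSWD, fails fast otherwise
sudo -n -u teamuser -i true || echo 'NOPASSWD sudo is not available'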

Closing remarks

Have fun with your ‘sudo over SSH’ setup. It’s certainly a powerful pattern for VM-based infrastructure with separate teams for base OS and application.



Post header background image by Peter H from Pixabay.

