@title = "Keys and Certificates"
@summary = "Working with SSH keys, secrets, and X.509 certificates."

Working with SSH
================================

Whenever the `leap` command nees to push changes to a node or gather information from a node, it tunnels this command over SSH. Another way to put this: the security of your servers rests entirely on SSH. Because of this, it is important that you understand how `leap` uses SSH.

SSH related files
-------------------------------

Assuming your provider directory is called 'provider':

* `provider/nodes/crow/crow_ssh.pub` -- The public SSH host key for node 'crow'.
* `provider/users/alice/alice_ssh.pub` -- The public SSH user key for user 'alice'. Anyone with the private key that corresponds to this public key will have root access to all nodes.
* `provider/files/ssh/known_hosts` -- An autogenerated known_hosts, built from combining `provider/nodes/*/*_ssh.pub`. You must not edit this file directly. If you need to change it, remove or change one of the files that is used to generate `known_hosts` and then run `leap compile`.
* `provider/files/ssh/authorized_keys` -- An autogenerated list of all the user SSH keys with root access to the notes. It is created from `provider/users/*/*_ssh.pub`. You must not edit this file directly. If you need to change it, remove or change one of the files that is used to generate `authorized_keys` and then run `leap compile`.

All of these files should be committed to source control.

If you rename, remove, or add a node with `leap node [mv|add|rm]` the SSH key files and the `known_hosts` file will get properly updated.

SSH and local nodes
-----------------------------

Local nodes are run as Vagrant virtual machines. The `leap` command handles SSH slightly differently for these nodes.

Basically, all the SSH security is turned off for local nodes. Since local nodes only exist for a short time on your computer and can't be reached from the internet, this is not a problem.

Specifically, for local nodes:

1. `known_hosts` is never updated with local node keys, since the SSH public key of a local node is different for each user.
2. `leap` entirely skips the checking of host keys when connecting with a local node.
3. `leap` adds the public Vagrant SSH key to the list of SSH keys for a user. The public Vagrant SSH key is a shared and insecure key that has root access to most Vagrant virtual machines.

When SSH host key changes
-------------------------------

If the host key for a node has changed, you will get an error "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED".

To fix this, you need to remove the file `files/nodes/stompy/stompy_ssh.pub` and run `leap node init stompy`, where the node's name is 'stompy'. **Only do this if you are ABSOLUTELY CERTAIN that the node's SSH host key has changed**.

Changing the SSH port
--------------------------------

Suppose you have a node `blinky` that has SSH listening on port 22 and you want to make it port 2200.

First, modify the configuration for `blinky` to specify the variable `ssh.port` as 2200. Usually, this is done in `common.json` or in a tag file.

For example, you could put this in `tags/production.json`:

    {
      "ssh": {
        "port": 2200
      }
    }

Run `leap compile` and open `hiera/blinky.yaml` to confirm that `ssh.port` is set to 2200. The port number must be specified as a number, not a string (no quotes).

Then, you need to deploy this change so that SSH will bind to 2200. You cannot simply run `leap deploy blinky` because this command will default to using the variable `ssh.port` which is now `2200` but SSH on the node is still bound to 22.

So, you manually override the port in the deploy command, using the old port:

    leap deploy --port 22 blinky

Afterwards, SSH on `blinky` should be listening on port 2200 and you can just run `leap deploy blinky` from then on.

Sysadmins with multiple SSH keys
-----------------------------------

The command `leap add-user --self` allows only one SSH key. If you want to specify more than one key for a user, you can do it manually:

    users/userx/userx_ssh.pub
    users/userx/otherkey_ssh.pub

All keys matching 'userx/*_ssh.pub' will be usable.

Removing sysadmin access
--------------------------------

Suppose you want to remove `userx` from having any further ssh access to the servers. Do this:

    rm -r users/userx
    leap deploy

X.509 Certificates
================================

Configuration options
-------------------------------------------

The `ca` option in provider.json provides settings used when generating CAs and certificates. The defaults are as follows:

    {
      "ca": {
        "name": "= global.provider.ca.organization + ' Root CA'",
        "organization": "= global.provider.name[global.provider.default_language]",
        "organizational_unit": "= 'https://' + global.provider.domain",
        "bit_size": 4096,
        "digest": "SHA256",
        "life_span": "10y",
        "server_certificates": {
          "bit_size": 2048,
          "digest": "SHA256",
          "life_span": "1y"
        },
        "client_certificates": {
          "bit_size": 2048,
          "digest": "SHA256",
          "life_span": "2m",
          "limited_prefix": "LIMITED",
          "unlimited_prefix": "UNLIMITED"
        }
      }
    }

You should not need to override these defaults in your own provider.json, but you can if you want to. To see what values are used for your provider, run `leap inspect provider.json`.

NOTE: A certificate `bit_size` greater than 2048 will probably not be recognized by most commercial CAs.

Certificate Authorities
-----------------------------------------

There are three x.509 certificate authorities (CA) associated with your provider:

1. **Commercial CA:** It is strongly recommended that you purchase a commercial cert for your primary domain. The goal of platform is to not depend on the commercial CA system, but it does increase security and usability if you purchase a certificate. The cert for the commercial CA must live at `files/cert/commercial_ca.crt`.
2. **Server CA:** This is a self-signed CA responsible for signing all the **server** certificates. The private key lives at `files/ca/ca.key` and the public cert lives at `files/ca/ca.crt`. The key is very sensitive information and must be kept private. The public cert is distributed publicly.
3. **Client CA:** This is a self-signed CA responsible for signing all the **client** certificates. The private key lives at `files/ca/client_ca.key` and the public cert lives at `files/ca/client_ca.crt`. Neither file is distribute publicly. It is not a big deal if the private key for the client CA is compromised, you can just generate a new one and re-deploy.

To generate both the Server CA and the Client CA, run the command:

    leap cert ca

Server certificates
-----------------------------------

Most every server in your service provider will have a x.509 certificate, generated by the `leap` command using the Server CA. Whenever you modify any settings of a node that might affect it's certificate (like changing the IP address, hostname, or settings in provider.json), you can magically regenerate all the certs that need to be regenerated with this command:

    leap cert update

Run `leap help cert update` for notes on usage options.

Because the server certificates are generated locally on your personal machine, the private key for the Server CA need never be put on any server. It is up to you to keep this file secure.

Client certificates
--------------------------------

Every leap client gets its own time-limited client certificate. This cert is use to connect to the OpenVPN gateway (and probably other things in the future). It is generated on the fly by the webapp using the Client CA.

To make this work, the private key of the Client CA is made available to the webapp. This might seem bad, but compromise of the Client CA simply allows the attacker to use the OpenVPN gateways without paying. In the future, we plan to add a command to automatically regenerate the Client CA periodically.

There are two types of client certificates: limited and unlimited. A client using a limited cert will have its bandwidth limited to the rate specified by `provider.service.bandwidth_limit` (in Bytes per second). An unlimited cert is given to the user if they authenticate and the user's service level matches one configured in `provider.service.levels` without bandwidth limits. Otherwise, the user is given a limited client cert.

Commercial certificates
-----------------------------------

We strongly recommend that you use a commercial signed server certificate for your primary domain (in other words, a certificate with a common name matching whatever you have configured for `provider.domain`). This provides several benefits:

1. When users visit your website, they don't get a scary notice that something is wrong.
2. When a user runs the LEAP client, selecting your service provider will not cause a warning message.
3. When other providers first discover your provider, they are more likely to trust your provider key if it is fetched over a commercially verified link.

The LEAP platform is designed so that it assumes you are using a commercial cert for the primary domain of your provider, but all other servers are assumed to use non-commercial certs signed by the Server CA you create.

To generate a CSR, run:

    leap cert csr

This command will generate the CSR and private key matching `provider.domain` (you can change the domain with `--domain=DOMAIN` switch). It also generates a server certificate signed with the Server CA. You should delete this certificate and replace it with a real one once it is created by your commercial CA.

The related commercial cert files are:

    files/
      cert/
        domain.org.crt    # Server certificate for domain.org, obtained by commercial CA.
        domain.org.csr    # Certificate signing request
        domain.org.key    # Private key for you certificate
        commercial_ca.crt # The CA cert obtained from the commercial CA.

The private key file is extremely sensitive and care should be taken with its provenance.

If your commercial CA has a chained CA cert, you should be OK if you just put the **last** cert in the chain into the `commercial_ca.crt` file. This only works if the other CAs in the chain have certs in the debian package `ca-certificates`, which is the case for almost all CAs.

If you want to add additional fields to the CSR, like country, city, or locality, you can configure these values in provider.json like so:

      "ca": {
        "server_certificates": {
          "country": "US",
          "state": "Washington",
          "locality": "Seattle"
        }
      }

If they are not present, the CSR will be created without them.