From 6a4e2efe9c57dea50119506b3c86b8277c5b5bd0 Mon Sep 17 00:00:00 2001 From: kwadronaut Date: Mon, 4 Nov 2013 00:56:17 +0100 Subject: update documentation of the platform. Todo: known-issues --- doc/development.md | 272 +++++++++++++++++++++++++++++++++++++++ doc/en.md | 4 +- doc/faq.md | 53 ++++++++ doc/guide.md | 57 +++++---- doc/known-issues.md | 6 +- doc/quick-start.md | 336 +++++++++++++++++++++++++++++++++++-------------- doc/troubleshooting.md | 147 ++++++++++++++++++++++ 7 files changed, 754 insertions(+), 121 deletions(-) create mode 100644 doc/development.md create mode 100644 doc/faq.md create mode 100644 doc/troubleshooting.md (limited to 'doc') diff --git a/doc/development.md b/doc/development.md new file mode 100644 index 00000000..7a761418 --- /dev/null +++ b/doc/development.md @@ -0,0 +1,272 @@ +@title = "Development Environment" +@toc = true + +If you are wanting to make local changes to your provider, or want to contribute some fixes back to LEAP, we recommend that you follow this guide to build up a development environment to test your changes first. Using this method, you can quickly test your changes without deploying them to your production environment, while benefitting from the convenience of reverting to known good states in order to retry things from scratch. + +This page will walk you through setting up nodes using [Vagrant](http://www.vagrantup.com/) for convenient deployment testing, snapshotting known good states, and reverting to previous snapshots. + +Requirements +============ + +* Be a real machine with virtualization support in the CPU (VT-x or AMD-V). In other words, not a virtual machine. +* Have at least 4gb of RAM. +* Have a fast internet connection (because you will be downloading a lot of big files, like virtual machine images). + +Install prerequisites +-------------------------------- + +For development purposes, you will need everything that you need for deploying the LEAP platform: + +* LEAP cli +* A provider instance + +You will also need to setup a virtualized Vagrant environment, to do so please make sure you have the following +pre-requisites installed: + +*Debian & Ubuntu* + +Install core prerequisites: + + sudo apt-get install git ruby ruby-dev rsync openssh-client openssl rake make + +Install Vagrant in order to be able to test with local virtual machines (typically optional, but required for this tutorial): + + sudo apt-get install vagrant virtualbox + + + + +Adding development nodes to your provider +========================================= + +Now you will add local-only Vagrant development nodes to your provider. + +You do not need to setup a different provider instance for development, in fact it is more convenient if you do not, but you can if you wish. If you do not have a provider already, you will need to create one and configure it before continuing (it is recommended you go through the [Quick Start](quick-start) before continuing down this path). + + +Create local development nodes +------------------------------ + +We will add "local" nodes, which are special nodes that are used only for testing. These nodes exist only as virtual machines on your computer, and cannot be accessed from the outside. Each "node" is a server that can have one or more services attached to it. We recommend that you create different nodes for different services to better isolate issues. + +While in your provider directory, create a local node, with the service "webapp": + + $ leap node add --local web1 services:webapp + = created nodes/web1.json + = created files/nodes/web1/ + = created files/nodes/web1/web1.key + = created files/nodes/web1/web1.crt + +This command creates a node configuration file in `nodes/web1.json` with the webapp service. + +Starting local development nodes +-------------------------------- + +In order to test the node "web1" we need to start it. Starting a node for the first time will spin up a virtual machine. The first time you do this will take some time because it will need to download a VM image (about 700mb). After you've downloaded the base image, you will not need to download it again, and instead you will re-use the downloaded image (until you need to update the image). + +NOTE: Many people have difficulties getting Vagrant working. If the following commands do not work, please visit the [Vagrant page](vagrant) to troubleshoot your Vagrant install before proceeding. + + $ leap local start web + = created test/ + = created test/Vagrantfile + = installing vagrant plugin 'sahara' + Bringing machine 'web1' up with 'virtualbox' provider... + [web1] Box 'leap-wheezy' was not found. Fetching box from specified URL for + the provider 'virtualbox'. Note that if the URL does not have + a box for this provider, you should interrupt Vagrant now and add + the box yourself. Otherwise Vagrant will attempt to download the + full box prior to discovering this error. + Downloading or copying the box... + Progress: 3% (Rate: 560k/s, Estimated time remaining: 0:13:36) + ... + Bringing machine 'web1' up with 'virtualbox' provider... + [web1] Importing base box 'leap-wheezy'... + 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100% + +Now the virtual machine 'web1' is running. You can add another local node using the same process. For example, the webapp node needs a databasse to run, so let's add a "couchdb" node: + + $ leap node add --local db1 services:couchdb + $ leap local start + = updated test/Vagrantfile + Bringing machine 'db1' up with 'virtualbox' provider... + [db1] Importing base box 'leap-wheezy'... + [db1] Matching MAC address for NAT networking... + [db1] Setting the name of the VM... + [db1] Clearing any previously set forwarded ports... + [db1] Fixed port collision for 22 => 2222. Now on port 2202. + [db1] Creating shared folders metadata... + [db1] Clearing any previously set network interfaces... + [db1] Preparing network interfaces based on configuration... + [db1] Forwarding ports... + [db1] -- 22 => 2202 (adapter 1) + [db1] Running any VM customizations... + [db1] Booting VM... + [db1] Waiting for VM to boot. This can take a few minutes. + [db1] VM booted and ready for use! + [db1] Configuring and enabling network interfaces... + [db1] Mounting shared folders... + [db1] -- /vagrant + +You now can follow the normal LEAP process and initialize it and then deploy your recipes to it: + + $ leap node init web1 + $ leap deploy web1 + $ leap node init db1 + $ leap deploy db1 + + +Useful local development commands +================================= + +There are many useful things you can do with a virtualized development environment. + +Listing what machines are running +--------------------------------- + +Now you have the two virtual machines "web1" and "db1" running, you can see the running machines as follows: + + $ leap local status + Current machine states: + + db1 running (virtualbox) + web1 running (virtualbox) + + This environment represents multiple VMs. The VMs are all listed + above with their current state. For more information about a specific + VM, run `vagrant status NAME`. + +Stopping machines +----------------- + +It is not recommended that you leave your virtual machines running when you are not using them. They consume memory and other resources! To stop your machines, simply do the following: + + $ leap local stop web1 db1 + +Connecting to machines +---------------------- + +You can connect to your local nodes just like you do with normal LEAP nodes, by running 'leap ssh node'. + +However, if you cannot connect to your local node, because the networking is not setup properly, or you have deployed a firewall that locks you out, you may need to access the graphical console. + +In order to do that, you will need to configure Vagrant to launch a graphical console and then you can login as root there to diagnose the networking problem. To do this, add the following to you +$HOME/.leaprc: + + @custom_vagrant_vm_line = 'config.vm.boot_mode = :gui' + +and then start, or restart, your local Vagrant node. You should get a VirtualBox graphical interface presented to you showing you the bootup and eventually the login. + +Snapshotting machines +--------------------- + +A very useful feature of local Vagrant development nodes is the ability to snapshot the current state and then revert to that when you need. + +For example, perhaps the base image is a little bit out of date and you want to get the packages updated to the latest before continuing. You can do that simply by starting the node, connecting to it and updating the packages and then snapshotting the node: + + $ leap local start web1 + $ leap ssh web1 + web1# apt-get -u dist-upgrade + web1# exit + $ leap local save web1 + +Now you can deploy to web1 and if you decide you want to revert to the state before deployment, you simply have to reset the node to your previous save: + + $ leap local reset web1 + +More information +---------------- + +See `leap help local` for a complete list of local-only commands and how they can be used. + + +Limitations +=========== + +Please consult the known issues for vagrant, see the [Known Issues](known-issues), section *Special Environments* + + +Troubleshooting Vagrant +======================= + +To troubleshoot vagrant issues, try going through these steps: + +* Try plain vagrant using the [Getting started guide](http://docs.vagrantup.com/v2/getting-started/index.html). +* If that fails, make sure that you can run virtual machines (VMs) in plain virtualbox (Virtualbox GUI or VBoxHeadless). + We don't suggest a sepecial howto for that, [this one](http://www.thegeekstuff.com/2012/02/virtualbox-install-create-vm/) seems pretty decent, or you follow the [Oracale Virtualbox User Manual](http://www.virtualbox.org/manual/UserManual.html). There's also specific documentation for [Debian](https://wiki.debian.org/VirtualBox) and for [Ubuntu](https://help.ubuntu.com/community/VirtualBox). If you succeeded, try again if you now can start vagrant nodes using plain vagrant (see first step). +* If plain vagrant works for you, you're very close to using vagrant with leap ! If you encounter any problems now, please [contact us](https://leap.se/en/about-us/contact) or use our [issue tracker](https://leap.se/code) + +Known working combinations +-------------------------- + +Please consider that using other combinations might work for you as well, these are just the combinations we tried and worked for us: + + +Debian Wheezy +------------- + +* `virtualbox-4.2 4.2.16-86992~Debian~wheezy` from Oracle and `vagrant 1.2.2` from vagrantup.com + + +Ubuntu Raring 13.04 +------------------- + +* `virtualbox 4.2.10-dfsg-0ubuntu2.1` from Ubuntu raring and `vagrant 1.2.2` from vagrantup.com + + +Using Vagrant with libvirt/kvm +============================== + +Vagrant can be used with different providers/backends, one of them is [vagrant-libvirt](https://github.com/pradels/vagrant-libvirt). Here are the steps how to use it. Be sure to use a recent vagrant version (>= 1.3). + +Install vagrant-libvirt plugin and add box +------------------------------------------ + sudo apt-get install libvirt-bin libvirt-dev + vagrant plugin install vagrant-libvirt + vagrant plugin install sahara + vagrant box add leap-wheezy https://downloads.leap.se/leap-debian-libvirt.box + + +Debugging +--------- + +If you get an error in any of the above commands, try to get some debugging information, it will often tell you what is wrong. In order to get debugging logs, you simply need to re-run the command that produced the error but prepend the command with VAGRANT_LOG=info, for example: + VAGRANT_LOG=info vagrant box add leap-wheezy https://downloads.leap.se/leap-debian-libvirt.box + +Start it +-------- + +Use this example Vagrantfile: + + Vagrant.configure("2") do |config| + config.vm.define :testvm do |testvm| + testvm.vm.box = "leap-wheezy" + testvm.vm.network :private_network, :ip => '10.6.6.201' + end + + config.vm.provider :libvirt do |libvirt| + libvirt.connect_via_ssh = false + end + end + +Then: + + vagrant up --provider=libvirt + +If everything works, you should export libvirt as the VAGRANT_DEFAULT_PROVIDER: + + export VAGRANT_DEFAULT_PROVIDER="libvirt" + +Now you should be able to use the `leap local` commands. + +Known Issues +------------ + +* 'Call to virConnectOpen failed: internal error: Unable to locate libvirtd daemon in /usr/sbin (to override, set $LIBVIRTD_PATH to the name of the libvirtd binary)' - you don't have the libvirtd daemon running or installed, be sure you installed the 'libvirt-bin' package and it is running +* 'Call to virConnectOpen failed: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied' - you need to be in the libvirt group to access the socket, do 'sudo adduser libvirt' and then re-login to your session +* see the [vagrant-libvirt issue list on github](https://github.com/pradels/vagrant-libvirt/issues) +* be sure to use vagrant-libvirt >= 0.0.11 and sahara >= 0.0.16 (which are the latest stable gems you would get with `vagrant plugin install [vagrant-libvirt|sahara]`) for proper libvirt support diff --git a/doc/en.md b/doc/en.md index bdae4630..f1a1fc17 100644 --- a/doc/en.md +++ b/doc/en.md @@ -20,7 +20,7 @@ LEAP maintains a repository of platform recipes, which typically do not need to As these recipes consist in abstract definitions, in order to configure settings for a particular service provider a system administrator has to create a provider instance (see below). -LEAP's platform recipes are distributed as a git repository: `git://leap.se/leap_platform.git` +LEAP's platform recipes are distributed as a git repository: `https://leap.se/git/leap_platform` The provider instance --------------------- @@ -64,7 +64,7 @@ One other significant difference between LEAP and typical system automation is h These two approaches, masterless push and pre-compiled static configuration, allow the sysadmin to manage a set of LEAP servers using traditional software development techniques of branching and merging, to more easily create local testing environments using virtual servers, and to deploy without the added complexity and failure potential of a master server. -The `leap` command line tool is distributed as a git repository: `git://leap.se/leap_cli`. It can be installed with `sudo gem install leap_cli`. +The `leap` command line tool is distributed as a git repository: `https://leap.se/git/leap_cli`. It can be installed with `sudo gem install leap_cli`. Getting started ---------------------------------- diff --git a/doc/faq.md b/doc/faq.md new file mode 100644 index 00000000..2654ce80 --- /dev/null +++ b/doc/faq.md @@ -0,0 +1,53 @@ +@title = 'Frequently asked questions' +@nav_title = 'FAQ' +@toc = true + +Puppet +====== + +Where do i find the time a server was last deployed ? +----------------------------------------------------- + +The puppet state file on the node indicates the last puppetrun: + + ls -la /var/lib/puppet/state/state.yaml + +What resources are touched by puppet/leap_platform (services/packages/files etc.) ? +----------------------------------------------------------------------------------- + +Log into your server and issue: + + grep -v '!ruby/sym' /var/lib/puppet/state/state.yaml | sed 's/\"//' | sort + + +How can i customize the leap_platform puppet manifests ? +-------------------------------------------------------- + +You can create a custom module `site_custom`. The class `site_custom::setup` will get +included in the first part of the deploy process, and `site_custom` during the second part. +Of cause you can also create a different git branch and change whatever you want, if you are +familiar wit git. + +Facter +====== + +How can i see custom facts distributed by leap_platform on a node ? +------------------------------------------------------------------- + +On the server, export the FACTERLIB env. variable to include the path of the custom fact in question: + + export FACTERLIB=/var/lib/puppet/lib/facter:/srv/leap/puppet/modules/stdlib/lib/facter/ + facter + + +Etc +=== + +How do i change the domain of my provider ? +------------------------------------------- + +* First of all, you need to have access to the nameserver config of your new domain. +* Update domain in provider.json +* remove all ca and cert files: `rm files/cert/* files/ca/*` +* create ca, csr and certs : `leap cert ca; leap cert csr; leap cert dh; leap cert update` +* deploy diff --git a/doc/guide.md b/doc/guide.md index dae392e5..52c3b2fa 100644 --- a/doc/guide.md +++ b/doc/guide.md @@ -15,16 +15,11 @@ When adding a new node to your provider, you should ask yourself four questions: Brief overview of the services: -![services diagram](service-diagram.png) - * **webapp**: The web application. Runs both webapp control panel for users and admins as well as the REST API that the client uses. Needs to communicate heavily with `couchdb` nodes. You need at least one, good to have two for redundancy. The webapp does not get a lot of traffic, so you will not need many. * **couchdb**: The database for users and user data. You can get away with just one, but for proper redundancy you should have at least three. Communicates heavily with `webapp` and `mx` nodes. * **soledad**: Handles the data syncing with clients. Typically combined with `couchdb` service, since it communicates heavily with couchdb. (not currently in stable release) * **mx**: Incoming and outgoing MX servers. Communicates with the public internet, clients, and `couchdb` nodes. (not currently in stable release) * **openvpn**: OpenVPN gateway for clients. You need at least one, but want as many as needed to support the bandwidth your users are doing. The `openvpn` nodes are autonomous and don't need to communicate with any other nodes. Often combined with `tor` service. - -Not pictured: - * **monitor**: Internal service to monitor all the other nodes. Currently, you can have zero or one `monitor` nodes. * **tor**: Sets up a tor exit node, unconnected to any other service. * **dns**: Not yet implemented. @@ -157,30 +152,32 @@ Configuration options The `ca` option in provider.json provides settings used when generating CAs and certificates. The defaults are as follows: - "ca": { - "name": "= global.provider.ca.organization + ' Root CA'", - "organization": "= global.provider.name", - "organizational_unit": "= 'https://' + global.provider.name", - "bit_size": 4096, - "digest": "SHA256", - "life_span": "10y", - "server_certificates": { - "bit_size": 2024, - "digest": "SHA256", - "life_span": "1y" - }, - "client_certificates": { - "bit_size": 2024, + { + "ca": { + "name": "= global.provider.ca.organization + ' Root CA'", + "organization": "= global.provider.name[global.provider.default_language]", + "organizational_unit": "= 'https://' + global.provider.domain", + "bit_size": 4096, "digest": "SHA256", - "life_span": "2m", - "limited_prefix": "LIMITED", - "unlimited_prefix": "UNLIMITED" + "life_span": "10y", + "server_certificates": { + "bit_size": 2048, + "digest": "SHA256", + "life_span": "1y" + }, + "client_certificates": { + "bit_size": 2048, + "digest": "SHA256", + "life_span": "2m", + "limited_prefix": "LIMITED", + "unlimited_prefix": "UNLIMITED" + } } } -To see what values are used for your provider, run `leap inspect provider.json`. You can modify the defaults as you wish by adding the values to provider.json. +You should not need to override these defaults in your own provider.json, but you can if you want to. To see what values are used for your provider, run `leap inspect provider.json`. -NOTE: A certificate `bit_size` greater than 2024 will probably not be recognized by most commercial CAs. +NOTE: A certificate `bit_size` greater than 2048 will probably not be recognized by most commercial CAs. Certificate Authorities ----------------------------------------- @@ -245,6 +242,18 @@ The private key file is extremely sensitive and care should be taken with its pr If your commercial CA has a chained CA cert, you should be OK if you just put the **last** cert in the chain into the `commercial_ca.crt` file. This only works if the other CAs in the chain have certs in the debian package `ca-certificates`, which is the case for almost all CAs. +If you want to add additional fields to the CSR, like country, city, or locality, you can configure these values in provider.json like so: + + "ca": { + "server_certificates": { + "country": "US", + "state": "Washington", + "locality": "Seattle" + } + } + +If they are not present, the CSR will be created without them. + Facts ============================== diff --git a/doc/known-issues.md b/doc/known-issues.md index abd28084..960eaad7 100644 --- a/doc/known-issues.md +++ b/doc/known-issues.md @@ -34,15 +34,15 @@ User setup and ssh . If the ssh host key changes, you need to run node init again (see: https://leap.se/en/docs/platform/guide#Working.with.SSH) -. At the moment, only ECDSA ssh host keys are supported. If you get the following error: `= FAILED ssh-keyscan: no hostkey alg (must be missing an ecdsa public host key)` then you should confirm that you have the following line defined in your server's /etc/ssh/sshd_config: -HostKey /etc/ssh/ssh_host_ecdsa_key and that file exists. If you made a change to your sshd_config, then you need to run `/etc/init.d/ssh restart` (see: https://leap.se/code/issues/2373) +. At the moment, only ECDSA ssh host keys are supported. If you get the following error: `= FAILED ssh-keyscan: no hostkey alg (must be missing an ecdsa public host key)` then you should confirm that you have the following line defined in your server's **/etc/ssh/sshd_config**: `HostKey /etc/ssh/ssh_host_ecdsa_key`. If that file doesn't exist, run `ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ""` in order to create it. If you made a change to your sshd_config, then you need to run `/etc/init.d/ssh restart` (see: https://leap.se/code/issues/2373) -. To remove an admin's access to your servers, please remove the directory for that user under the `users/` subdirectory in your provider directory and then remove that user's ssh keys from files/ssh/authorized_keys. When finished you *must* run a `leap deploy` to update that information on the servers (see: https://leap.se/code/issues/1863) +. To remove an admin's access to your servers, please remove the directory for that user under the `users/` subdirectory in your provider directory and then remove that user's ssh keys from files/ssh/authorized_keys. When finished you *must* run a `leap deploy` to update that information on the servers. . At the moment, it is only possible to add an admin who will have access to all LEAP servers (see: https://leap.se/code/issues/2280) . leap add-user --self allows only one key - if you run that command twice with different keys, you will just replace the key with the second key. To add a second key, add it manually to files/ssh/authorized_keys (see: https://leap.se/code/issues/866) + Deploying --------- diff --git a/doc/quick-start.md b/doc/quick-start.md index 5ba28f8d..0bce271a 100644 --- a/doc/quick-start.md +++ b/doc/quick-start.md @@ -1,116 +1,197 @@ @title = 'LEAP Platform Quick Start' @nav_title = 'Quick Start' -This tutorial walks you through the initial process of creating and deploying a service provider running the [LEAP platform](platform). First examples aim to build a provider in a virtual environment, and in the end running in real hardware is targeted. +Quick Start +=========== -First, a few definitions: +This tutorial walks you through the initial process of creating and deploying a minimal service provider running the [LEAP platform](platform). This Quick Start guide will guide you through building a three node OpenVPN provider. +If you are curious how this will look like without trying it out yourself, you can watch our [recorded screencasts](http://shelr.tv/users/524415e69660807910000021). -* **node:** A server that is part of the service provider's infrastructure. All nodes are running the Debian GNU/Linux operating system. -* **sysadmin:** This is you. -* **sysadmin machine:** Your desktop or laptop computer that you use to control the nodes. This machine can be running any variant of Unix, Linux, or Mac OS (however, only Debian derivatives are supported at the moment). +Our goal +------------------ -All the commands in this tutorial are run on your sysadmin machine. In order to complete the tutorial, the sysadmin machine must: +We are going to create a minimal LEAP provider offering OpenVPN service. This basic setup can be expanded by adding more OpenVPN nodes to increase capacity, or more webapp and couchdb nodes to increase availability (performance wise, a single couchdb and a single webapp are more than enough for most usage, since they are only lightly used, but you might want redundancy). -* Be a real machine with virtualization support in the CPU (VT-x or AMD-V). In other words, not a virtual machine. -* Have at least 4gb of RAM. -* Have a fast internet connection (because you will be downloading a lot of big files, like virtual machine images). +Our goal is something like this: -Install prerequisites + $ leap list + NODES SERVICES TAGS + couch1 couchdb + web1 webapp + vpn1 openvpn + +NOTE: You won't be able to run that `leap list` command yet, not until we actually create the node configurations. + +Requirements +------------ + +In order to complete this Quick Start, you will need a few things: + +* You will need three real or paravirtualized virtual machines (KVM, Xen, Openstack, Amazon, but not Vagrant - sorry) that have a basic Debian Stable installed. If you allocate 10G to each node, that should be plenty. +* You should be able to SSH into them remotely, and know their IP addresses and their SSH host keys +* You will need four different IPs, one for each node, and a second one for the VPN gateway +* The ability to create/modify DNS entries for your domain is preferable, but not needed. If you don't have access to DNS, you can workaround this by modifying your local resolver, i.e. editing `/etc/hosts`. +* You need to be aware that this process will make changes to your systems, so please be sure that these machines are a basic install with nothing configured or running for other purposes +* Your machines will need to be connected to the internet, and not behind a restrictive firewall. +* You should work locally on your laptop/workstation (one that you trust and that is ideally full-disk encrypted) while going through this guide. This is important because the provider configurations you are creating contain sensitive data that should not reside on a remote machine. The leap cli utility will login to your servers and configure the services. + +All the commands in this tutorial are run on your sysadmin machine. In order to complete the tutorial, the sysadmin will do the following: + +* Install pre-requisites +* Install the LEAP command-line utility +* Check out the LEAP platform +* Create a provider and its certificates +* Setup the provider's nodes and the services that will reside on those nodes +* Initialize the nodes +* Deploy the LEAP platform to the nodes +* Test that things worked correctly +* Some additional commands + +We will walk you through each of these steps. + + +Prepare your environment +======================== + +There are a few things you need to setup before you can get going. Just some packages, the LEAP cli and the platform. + +Install pre-requisites -------------------------------- *Debian & Ubuntu* Install core prerequisites: - sudo apt-get install git ruby ruby-dev rsync openssh-client openssl rake make - -Install Vagrant in order to be able to test with local virtual machines (typically optional, but required for this tutorial): - - sudo apt-get install vagrant virtualbox + $ sudo apt-get install git ruby ruby-dev rsync openssh-client openssl rake make bzip2 -Install leap +NOTE: leap_cli should work with ruby1.8, but has only been tested using ruby1.9. + + +Install the LEAP command-line utility --------------------- Install `leap` command from source: - git clone git://leap.se/leap_cli.git - cd leap_cli - rake build + $ git clone https://leap.se/git/leap_cli + $ cd leap_cli + $ rake build Then, install as root user (recommended): - sudo rake install + $ sudo rake install Or, install as unprivileged user: - rake install + $ rake install # watch out for the directory leap is installed to, then i.e. - sudo ln -s ~/.gem/ruby/1.9.1/bin/leap /usr/local/bin/leap + $ sudo ln -s ~/.gem/ruby/1.9.1/bin/leap /usr/local/bin/leap With both methods, you can use now /usr/local/bin/leap, which in most cases will be in your $PATH. +If you have successfully installed the LEAP cli, then you should be able to do the following: -Create a provider instance ---------------------------------------- + $ leap --help + +and be presented with the command-line help options. If you receive an error when doing this, please read through the README.md in the LEAP cli source to try and resolve any problems before going forwards. + + +Check out the platform +---------------------- + +The LEAP Platform is a series of puppet recipes and modules that will be used to configure your provider. You will need a local copy of the platform that will be used to setup your nodes and manage your services. To begin with, you will not need to modify the LEAP Platform. + +First we'll create a directory for LEAP things, and then we'll check out the platform code and initalize the modules: + + $ mkdir ~/leap + $ cd ~/leap + $ git clone https://leap.se/git/leap_platform.git + $ cd leap_platform + $ git submodule sync; git submodule update --init -A provider instance is a directory tree, usually stored in git, that contains everything you need to manage an infrastructure for a service provider. In this case, we create one for bitmask.net and call the instance directory 'bitmask'. - mkdir -p ~/leap/bitmask +Provider Setup +============== -Now, we will initialize this directory to make it a provider instance. Your provider instance will need to know where it can find local copy of the git repository leap_platform, which holds the puppet recipes you will need to manage your servers. Typically, you will not need to modify leap_platform. +A provider instance is a directory tree, usually stored in git, that contains everything you need to manage an infrastructure for a service provider. In this case, we create one for example.org and call the instance directory 'example'. - cd ~/leap/bitmask - leap new . + $ mkdir -p ~/leap/example + +Bootstrap the provider +----------------------- + +Now, we will initialize this directory to make it a provider instance. Your provider instance will need to know where it can find the local copy of the git repository leap_platform, which we setup in the previous step. + + $ cd ~/leap/example + $ leap new . + +NOTES: + . make sure you include that trailing dot! The `leap new` command will ask you for several required values: -* domain: The primary domain name of your service provider. In this tutorial, we will be using "bitmask.net". -* name: The name of your service provider. +* domain: The primary domain name of your service provider. In this tutorial, we will be using "example.org". +* name: The name of your service provider (we use "Example"). * contact emails: A comma separated list of email addresses that should be used for important service provider contacts (for things like postmaster aliases, Tor contact emails, etc). -* platform: The directory where you have a copy of the `leap_platform` git repository checked out. If it doesn't exist, it will be downloaded for you. +* platform: The directory where you have a copy of the `leap_platform` git repository checked out. + +You could also have passed these configuration options on the command-line, like so: + + $ leap new --contacts your@email.here --domain leap.example.org --name Example --platform=~/leap/leap_platform . You may want to poke around and see what is in the files we just created. For example: - cat provider.json + $ cat provider.json Optionally, commit your provider directory using the version control software you fancy. For example: - git init - git add . - git commit -m "initial commit" + $ git init + $ git add . + $ git commit -m "initial provider commit" Now add yourself as a privileged sysadmin who will have access to deploy to servers: - leap add-user --self + $ leap add-user --self -NOTE: in most cases, `leap` must be run from within a provider instance directory tree (e.g. ~/leap/bitmask). +NOTE: in most cases, `leap` must be run from within a provider instance directory tree (e.g. ~/leap/example). -Now generate required X509 certificates and keys: +Create provider certificates +---------------------------- - leap cert ca - leap cert csr +Create two certificate authorities, one for server certs and one for client +certs (note: you only need to run this one command to get both): + + $ leap cert ca + +Create a temporary cert for your main domain (you should replace with a real commercial cert at some point) + + $ leap cert csr To see details about the keys and certs that the prior two commands created, you can use `leap inspect` like so: - leap inspect files/ca/ca.crt + $ leap inspect files/ca/ca.crt + +Create the Diffie-Hellman parameters file, needed for forward secret OpenVPN ciphers: + + $ leap cert dh + +NOTE: the files `files/ca/*.key` are extremely sensitive and must be carefully protected. The other key files are much less sensitive and can simply be regenerated if needed. Edit provider.json configuration @@ -119,58 +200,99 @@ Edit provider.json configuration There are a few required settings in provider.json. At a minimum, you must have: { - "domain": "bitmask.net", - "name": "Bitmask", + "domain": "example.org", + "name": "Example", "contacts": { - "default": "email1@domain.org, email2@domain.org" + "default": "email1@example.org" } } For a full list of possible settings, you can use `leap inspect` to see how provider.json is evaluated after including the inherited defaults: - leap inspect provider.json + $ leap inspect provider.json -Create nodes ---------------------- -A "node" is a server that is part of your infrastructure. Every node can have one or more services associated with it. Some nodes are "local" and used only for testing. These local nodes exist only as virtual machines on your computer and cannot be accessed from outside (see `leap help local` for more information). +Setup the provider's nodes and services +--------------------------------------- -Create a local node, with the service "webapp": +A "node" is a server that is part of your infrastructure. Every node can have one or more services associated with it. Some nodes are "local" and used only for testing, see [Development](developmet) for more information. - leap node add --local web1 services:webapp +Create a node, with the service "webapp": -This created a node configuration file in `nodes/web1.json`, but it did not create the virtual machine. In order to test our node "web1", we need to first spin up a virtual machine. The next command will probably take a very long time, because it will need to download a VM image (about 700mb). + $ leap node add web1 ip_address:x.x.x.w services:webapp tags:production - leap local start +NOTE: replace x.x.x.w with the actual IP address of this node -Now that the virtual machine for web1 is running, you need to initialize it and then deploy the recipes to it. You only need to initialize a node once, but there is no harm in doing it multiple times. These commands will take a while to run the first time, as it needs to update the package cache on the new virtual machine. +This created a node configuration file in `nodes/web1.json`, but it did not do anything else. It also added the 'tag' called 'production' to this node. Tags allow us to conveniently group nodes together. When creating nodes, you should give them the tag 'production' if the node is to be used in your production infrastructure. - leap node init web1 - leap deploy web1 +The web application and the VPN nodes require a database, so lets create the database server node: -That is it, you should now have your first running node. However, the LEAP web application requires a database to run, so let's add a "couchdb" node: + $ leap node add couch1 ip_address:x.x.x.x services:couchdb tags:production - leap node add --local db1 services:couchdb - leap local start - leap node init db1 - leap deploy db1 +NOTE: replace x.x.x.x with the actual IP address of this node -Access the web application --------------------------------------------- +Now we need the VPN gateway, so lets create that node: -You should now have two local virtual machines running, one for the web application and one for the database. In order to connect to the web application in your browser, you need to point your domain at the IP address of the web application node (named web1 in this example). + $ leap node add vpn1 ip_address:x.x.x.y openvpn.gateway_address:x.x.x.z services:openvpn tags:production -There are a lot of different ways to do this, but one easy way is to modify your `/etc/hosts` file. First, find the IP address of the webapp node: +NOTE: replace x.x.x.y with the IP address of the machine, and x.x.x.z with the second IP. openvpn gateways must be assigned two IP addresses, one for the host itself and one for the openvpn gateway. We do this to prevent incoming and outgoing VPN traffic on the same IP. Without this, the client might send some traffic to other VPN users in the clear, bypassing the VPN. - leap list webapp --print ip_address -Then modify `/etc/hosts` like so: +Setup DNS +--------- + +Now that you have the nodes configured, you should create the DNS entries for these nodes. + +Set up your DNS with these hostnames: + + $ leap list --print ip_address,domain.full,dns.aliases + couch1 x.x.x.w, couch1.example.org, null + web1 x.x.x.x, web1.example.org, api.example.org, nicknym.example.org + vpn1 x.x.x.y, vpn1.example.org, null + +Alternately, you can adapt this zone file snippet: - 10.5.5.47 DOMAIN + $ leap compile zone -Replacing 'DOMAIN' with whatever you specified as the `domain` in the `leap new` command. +If you cannot edit your DNS zone file, you can still test your provider by adding entries to your local resolver hosts file (`/etc/hosts` for linux): -Next, you can connect to the web application either using a web browser or via the API using the LEAP client. To use a browser, connect to https://DOMAIN. Your browser will complain about an untrusted cert, but for now just bypass this. From there, you should be able to register a new user and login. + x.x.x.w couch1.example.org + x.x.x.x web1.example.org api.example.org example.org + x.x.x.y vpn1.example.org + +Please don't forget about these entries, they will override DNS queries if you setup your DNS later. + + +Initialize the nodes +-------------------- + +Node initialization only needs to be done once, but there is no harm in doing it multiple times: + + $ leap node init production + +This will initialize all nodes with the tag "production". When `leap node init` is run, you will be prompted to verify the fingerprint of the SSH host key and to provide the root password of the server(s). You should only need to do this once. + +If you prefer, you can initalize each node, one at a time: + + $ leap node init web1 + $ leap node init couch1 + $ leap node init vpn1 + +Deploy the LEAP platform to the nodes +-------------------- + +Now you should deploy the platform recipes to the nodes. Deployment can take a while to run, especially on the first run, as it needs to update the packages on the new machine: + + $ leap deploy web1 + +Watch the output for any errors (in red), if everything worked fine, you should now have your first running node. If you do have errors, try doing the deploy again. + +However, to deploy our three-node openvpn setup, we need the database and LEAP web application requires a database to run, so let's deploy to the couchdb and openvpn nodes: + + $ leap deploy couch1 + $ leap deploy vpn1 + +NOTE: the output from deploying can be quite busy, so we often do them each node one by one. What is going on here? -------------------------------------------- @@ -190,17 +312,55 @@ You can run `leap -v2 deploy` to see exactly what commands are being executed. -Additional commands -------------------------------------------- + +Test that things worked correctly +================================= + +You should now have three machines with the LEAP platform deployed to them, one for the web application, one for the database and one for the OpenVPN gateway. + + +Access the web application +-------------------------------------------- + +In order to connect to the web application in your browser, you need to point your domain at the IP address of the web application node (named web1 in this example). + +There are a lot of different ways to do this, but one easy way is to modify your `/etc/hosts` file. First, find the IP address of the webapp node: + + $ leap list webapp --print ip_address + +Then modify `/etc/hosts` like so: + + x.x.x.w leap.example.org + +Replacing 'leap.example.org' with whatever you specified as the `domain` in the `leap new` command. + +Next, you can connect to the web application either using a web browser or via the API using the LEAP client. To use a browser, connect to https://leap.example.org (replacing that with your domain). Your browser will complain about an untrusted cert, but for now just bypass this. From there, you should be able to register a new user and login. + +Use the VPN +----------- + +You should be able to simply test that the OpenVPN gateway works properly by doing the following: + + $ leap test init + $ sudo openvpn test/openvpn/unlimited.ovpn + +Or, you can use the LEAP client (called "bitmask") to connect to your new provider, create a user and then connect to the VPN. + + +Additional information +====================== + +It is useful to know a few additional things. + +Useful commands +--------------- Here are a few useful commands you can run on your new local nodes: * `leap ssh web1` -- SSH into node web1 (requires `leap node init web1` first). * `leap list` -- list all nodes. +* `leap list production` -- list only those nodes with the tag 'production' * `leap list --print ip_address` -- list a particular attribute of all nodes. -* `leap local reset web1` -- return web1 to a pristine state. -* `leap local stop` -- stop all local virtual machines. -* `leap local status` -- get the running state of all the local virtual machines. * `leap cert update` -- generate new certificates if needed. See the full command reference for more information. @@ -223,20 +383,12 @@ Examples: * `leap deploy webapp openvpn` -- deploy to all webapp OR openvpn nodes. * `leap node init vpn1` -- just init the node named vpn1. -Running on real hardware ------------------------------------ - -The steps required to initialize and deploy to nodes on the public internet are basically the same as we have seen so far for local testing nodes. There are a few key differences: - -* Obviously, you will need to acquire a real or virtual machine that you can SSH into remotely. -* When creating the node configuration, you should give it the tag "production" if the node is to be used in your production infrastructure. -* When creating the node configuration, you need to specify the IP address of the node. - -For example: +Keep track of your provider configurations +------------------------------------------ - leap node add db1 tags:production services:couchdb ip_address:4.4.4.4 +You should commit your provider changes to your favorite VCS whenever things change. This way you can share your configurations with other admins, all they have to do is to pull the changes to stay up to date. Every time you make a change to your provider, such as adding nodes, services, generating certificates, etc. you should add those to your VCS, commit them and push them to where your repository is hosted. -Also, running `leap node init NODE_NAME` on a real server will prompt you to verify the fingerprint of the SSH host key and to provide the root password of the server NODE_NAME. You should only need to do this once. +Note that your provider directory contains secrets! Those secrets include passwords for various services. You do not want to have those passwords readable by the world, so make sure that wherever you are hosting your repository, it is not public for the world to read. What's next ----------------------------------- diff --git a/doc/troubleshooting.md b/doc/troubleshooting.md new file mode 100644 index 00000000..bb2fc4b5 --- /dev/null +++ b/doc/troubleshooting.md @@ -0,0 +1,147 @@ +@title = 'Troubleshooting Guide' +@nav_title = 'Troubleshooting' +@toc = true + + +General +======= + +* Please increase verbosity when debugging / filing issues in our issue tracker. You can do this with adding i.e. `-v 5` after the `leap` cmd, i.e. `leap -v 2 deploy`. + +Webapp node +=========== + +Places to look for errors +------------------------- + +* `/var/log/apache2/error.log` +* `/srv/leap/webapp/log/production.log` +* `/var/log/syslog` (watch out for stunnel issues) + +Is haproxy ok ? +--------------- + + + curl -s -X GET "http://127.0.0.1:4096" + +Is couchdb accessible through stunnel ? +--------------------------------------- + + + curl -s -X GET "http://127.0.0.1:4000" + + +Check couchdb acl +----------------- + + + mkdir /etc/couchdb + cat /srv/leap/webapp/config/couchdb.yml.admin # see username and password + echo "machine 127.0.0.1 login admin password " > /etc/couchdb/couchdb-admin.netrc + chmod 600 /etc/couchdb/couchdb-admin.netrc + + curl -s --netrc-file /etc/couchdb/couchdb-admin.netrc -X GET "http://127.0.0.1:4096" + curl -s --netrc-file /etc/couchdb/couchdb-admin.netrc -X GET "http://127.0.0.1:4096/_all_dbs" + + +Couchdb node +============ + +Places to look for errors +------------------------- + +* `/opt/bigcouch/var/log/bigcouch.log` +* `/var/log/syslog` (watch out for stunnel issues) + + +Bigcouch membership +------------------- + +* All nodes configured for the provider should appear here: + + + curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET 'http://127.0.0.1:5986/nodes/_all_docs' + +* All configured nodes should show up under "cluster_nodes", and the ones online and communicating with each other should appear under "all_nodes". This example output shows the configured cluster nodes `couch1.bitmask.net` and `couch2.bitmask.net`, but `couch2.bitmask.net` is currently not accessible from `couch1.bitmask.net` + + + curl -s --netrc-file /etc/couchdb/couchdb.netrc 'http://127.0.0.1:5984/_membership' + {"all_nodes":["bigcouch@couch1.bitmask.net"],"cluster_nodes":["bigcouch@couch1.bitmask.net","bigcouch@couch2.bitmask.net"]} + + + +Databases +--------- + +* Following output shows all neccessary DBs that should be present. Note that the `user-0123456....` DBs are the data stores for a particular user. + + + curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET 'http://127.0.0.1:5984/_all_dbs' + ["customers","identities","sessions","shared","tickets","tokens","user-0","user-9d34680b01074c75c2ec58c7321f540c","user-9d34680b01074c75c2ec58c7325fb7ff","users"] + + + +Design Documents +---------------- + +* Is User `_design doc` available ? + + + curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET "http://127.0.0.1:5984/users/_design/User" + + + +MX node +======= + +Places to look for errors +------------------------- + +* `/var/log/mail.log` +* `/var/log/leap_mx.log` +* `/var/log/syslog` (watch out for stunnel issues) + + +Query leap-mx +------------- + +* for useraccount + + + postmap -v -q "joe@dev.bitmask.net" tcp:localhost:2244 + ... + postmap: dict_tcp_lookup: send: get jow@dev.bitmask.net + postmap: dict_tcp_lookup: recv: 200 + ... + +* for mailalias + + + postmap -v -q "joe@dev.bitmask.net" tcp:localhost:4242 + ... + postmap: dict_tcp_lookup: send: get joe@dev.bitmask.net + postmap: dict_tcp_lookup: recv: 200 f01bc1c70de7d7d80bc1ad77d987e73a + postmap: dict_tcp_lookup: found: f01bc1c70de7d7d80bc1ad77d987e73a + f01bc1c70de7d7d80bc1ad77d987e73a + ... + + + +Mailspool +--------- + +* Any file in the mailspool longer for a few seconds ? + + + ls -la /var/mail/vmail/Maildir/cur/ + + +VPN node +======== + +Places to look for errors +------------------------- + +* `/var/log/syslog` (watch out for openvpn issues) + + -- cgit v1.2.3