From 78e1face3430353e8af95f3eaefa7dc51368e7d0 Mon Sep 17 00:00:00 2001 From: elijah Date: Thu, 25 Feb 2016 14:55:55 -0800 Subject: cleaned up couchdb migration and upgrade notes --- amber/menu.txt | 2 + pages/docs/platform/services/couchdb.md | 195 ++++++++++++++++----------- pages/docs/platform/tutorials/upgrading.md | 90 ------------- pages/docs/platform/upgrading/en.haml | 5 + pages/docs/platform/upgrading/upgrade-0-8.md | 71 ++++++++++ 5 files changed, 193 insertions(+), 170 deletions(-) delete mode 100644 pages/docs/platform/tutorials/upgrading.md create mode 100644 pages/docs/platform/upgrading/en.haml create mode 100644 pages/docs/platform/upgrading/upgrade-0-8.md diff --git a/amber/menu.txt b/amber/menu.txt index 8acf059..ef341f7 100644 --- a/amber/menu.txt +++ b/amber/menu.txt @@ -54,6 +54,8 @@ docs soledad tor webapp + upgrading + upgrade-0-8 troubleshooting tests known-issues diff --git a/pages/docs/platform/services/couchdb.md b/pages/docs/platform/services/couchdb.md index 5b64ada..7bcc48c 100644 --- a/pages/docs/platform/services/couchdb.md +++ b/pages/docs/platform/services/couchdb.md @@ -4,19 +4,76 @@ Topology ------------------------ -`couchdb` nodes communicate heavily with `webapp`, `mx`, and `soledad` nodes. Typically, `couchdb` nodes will also have the `soledad` service. +Required: + +* Nodes with `couchdb` service must also have `soledad` service, if email is enabled. + +Suggested: + +* Nodes with `couchdb` service communicate heavily with `webapp` and `mx`. `couchdb` nodes do not need to be reachable from the public internet, although the `soledad` service does require this. Configuration ---------------------------- -There are no options that should be modified for `couchdb` nodes. +### Nightly dumps + +You can do a nightly couchdb data dump by adding this to your node config: + + "couch": { + "backup": true + } + +Data will get dumped to `/var/backups/couchdb`. 
+ +### Plain CouchDB + +BigCouch is not supported on Platform version 0.8 and higher: only plain CouchDB is possible. For earlier versions, you must do this in order to use plain CouchDB: + + "couch": { + "master": true, + "pwhash_alg": "pbkdf2" + } + +Various Tasks +------------------------------------------------- + +### Re-enabling blocked account + +When a user account gets destroyed from the webapp, there's still a leftover doc in the identities db so other people can't claim that account without an admin's intervention. You can remove this username reservation through the webapp. + +However, here is how you could do it manually, if you wanted to: + +grep the identities db for the email address: + + curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET http://127.0.0.1:5984/identities/_all_docs?include_docs=true|grep test_127@bitmask.net + +lookup "id" and "rev" to delete the doc: + + curl -s --netrc-file /etc/couchdb/couchdb.netrc -X DELETE 'http://127.0.0.1:5984/identities/b25cf10f935b58088f0d547fca823265?rev=2-715a9beba597a2ab01851676f12c3e4a' + +### How to find out which userstore belongs to which identity? + + /usr/bin/curl -s --netrc-file /etc/couchdb/couchdb.netrc '127.0.0.1:5984/identities/_all_docs?include_docs=true' | grep testuser -NOTE: The LEAP platform is designed to support many database nodes. The goal is for you to be able to add nodes and remove nodes and everything should rebalance and work smoothly. Currently, however, we are using a broken CouchDB variant called BigCouch. Until we migrate off BigCouch, you should only have one `couchdb` node. More than one will work most of the time, but there are some bugs that can pop up and that are unfixed. 
+ {"id":"665e004870ee17aa4c94331ff3ecb173","key":"665e004870ee17aa4c94331ff3ecb173","value":{"rev":"2-2e335a75c4b79a5c2ef5c9950706fe1b"},"doc":{"_id":"665e004870ee17aa4c94331ff3ecb173","_rev":"2-2e335a75c4b79a5c2ef5c9950706fe1b","user_id":"665e004870ee17aa4c94331ff3cd59eb","address":"testuser@example.org","destination":"testuser@example.org","keys": ... -Manual Tasks ---------------------- +* search for the "user_id" field +* in this example testuser@example.org uses the database user-665e004870ee17aa4c94331ff3cd59eb + + +### How much disk space is used by a userstore + +Beware that this returns the uncompacted disk size (see http://wiki.apache.org/couchdb/Compaction) + + echo "`curl --netrc -s -X GET 'http://127.0.0.1:5984/user-dcd6492d74b90967b6b874100b7dbfcf'|json_pp|grep disk_size|cut -d: -f 2`/1024"|bc + + +Deprecated BigCouch Tasks +----------------------------------------- + +As of release 0.8, the LEAP platform no longer supports BigCouch. This information is kept here for historical reference. ### Rebalance Cluster @@ -24,52 +81,78 @@ Bigcouch currently does not have automatic rebalancing. It will probably be added after merging into couchdb. If you add a node, or remove one node from the cluster, -. make sure you have a backup of all DBs ! +1. make sure you have a backup of all DBs ! + +1. put the webapp into [[maintenance mode => services/webapp#maintenance-mode]] + +1. Stop all services that access the database: -. put the webapp into [maintenance mode](https://leap.se/en/docs/platform/services/webapp#maintenance-mode) -. Stop all services that access the database: + ``` + workstation$ leap ssh soledad-nodes + server# /etc/init.d/soledad-server stop - * leap-mx - * postfix - * soledad-server - * nickserver + workstation$ leap ssh mx-node + server# /etc/init.d/postfix stop + server# /etc/init.d/leap-mx stop -. 
dump the dbs: + ``` + workstation$ leap ssh webapp + server# /etc/init.d/nickserver stop + ``` + Alternately, you can create a temporary firewall rule to block access (run on couchdb server): + + ``` + server# iptables -A INPUT -p tcp --dport 5984 --jump REJECT + ``` + +1. dump the dbs: + + ``` cd /srv/leap/couchdb/scripts time ./couchdb_dumpall.sh + ``` + +1. delete all dbs + +1. shut down old node -. delete all dbs -. shut down old node -. check the couchdb members +1. check the couchdb members + ``` curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET http://127.0.0.1:5986/nodes/_all_docs curl -s --netrc-file /etc/couchdb/couchdb.netrc http://127.0.0.1:5984/_membership + ``` +1. remove bigcouch from all nodes -. remove bigcouch from all nodes - + ``` apt-get --purge remove bigcouch + ``` +1. deploy to all couch nodes -. deploy to all couch nodes - - leap deploy development +couchdb + ``` + leap deploy couchdb + ``` -. most likely, deploy will fail because bigcouch will complain about not all nodes beeing connected. Let the deploy finish, restart the bigcouch service on all nodes and re-deploy: +1. most likely, deploy will fail because bigcouch will complain about not all nodes being connected. Let the deploy finish, restart the bigcouch service on all nodes and re-deploy: + ``` /etc/init.d/bigcouch restart + ``` +1. restore the backup -. restore the backup - + ``` cd /srv/leap/couchdb/scripts time ./couchdb_restoreall.sh + ``` +### Migrating from BigCouch to plain CouchDB -### Migrating from bigcouch to plain couchdb +Here are the steps needed to replace BigCouch with CouchDB. -At the end of this process, you will have just *one* couchdb server. If you had a bigcouch cluster before, you will be removing all but one of those machines to consolidate them into one couchdb machine. +At the end of this process, you will have just *one* node with `services` property equal to `couchdb`. 
If you had a BigCouch cluster before, you will be removing all but one of those machines to consolidate them into one couchdb machine. 1. if you have multiple nodes with the couchdb service on them, pick one of them to be your couchdb server, and remove the service from the others. If these machines were only doing couchdb, you can remove the nodes completely with `leap node rm ` and then you can decommission the servers @@ -124,6 +207,12 @@ At the end of this process, you will have just *one* couchdb server. If you had workstation$ leap deploy couchdb ``` + If you used the iptables method of blocking access to couchdb, you need to run it again because the deploy just overwrote all the iptables rules: + + ``` + server# iptables -A INPUT -p tcp --dport 5984 --jump REJECT + ``` + 1. restore the backup, this will take approximately the same amount of time as the backup took above: ``` @@ -161,57 +250,3 @@ At the end of this process, you will have just *one* couchdb server. If you had 1. Relax, enjoy a refreshing beverage. -### Re-enabling blocked account - -When a user account gets destroyed from the webapp, there's still a leftover doc in the identities db so other people can't claim that account without an admin's intervention. You can remove this username reservation through the webapp. - -However, here is how you could do it manually, if you wanted to: - -grep the identities db for the email address: - - curl -s --netrc-file /etc/couchdb/couchdb.netrc -X GET http://127.0.0.1:5984/identities/_all_docs?include_docs=true|grep test_127@bitmask.net - -lookup "id" and "rev" to delete the doc: - - curl -s --netrc-file /etc/couchdb/couchdb.netrc -X DELETE 'http://127.0.0.1:5984/identities/b25cf10f935b58088f0d547fca823265?rev=2-715a9beba597a2ab01851676f12c3e4a' - -### How to find out which userstore belongs to which identity? 
- - /usr/bin/curl -s --netrc-file /etc/couchdb/couchdb.netrc '127.0.0.1:5984/identities/_all_docs?include_docs=true' | grep testuser - - {"id":"665e004870ee17aa4c94331ff3ecb173","key":"665e004870ee17aa4c94331ff3ecb173","value":{"rev":"2-2e335a75c4b79a5c2ef5c9950706fe1b"},"doc":{"_id":"665e004870ee17aa4c94331ff3ecb173","_rev":"2-2e335a75c4b79a5c2ef5c9950706fe1b","user_id":"665e004870ee17aa4c94331ff3cd59eb","address":"testuser@example.org","destination":"testuser@example.org","keys": ... - -* search for the "user_id" field -* in this example testuser@example.org uses the database user-665e004870ee17aa4c94331ff3cd59eb - - -### How much disk space is used by a userstore - -Beware that this returns the uncompacted disk size (see http://wiki.apache.org/couchdb/Compaction) - - echo "`curl --netrc -s -X GET 'http://127.0.0.1:5984/user-dcd6492d74b90967b6b874100b7dbfcf'|json_pp|grep disk_size|cut -d: -f 2`/1024"|bc - -## Use plain couchdb instead of bigcouch - -Be aware that latest stable couchdb 1.6 cannot be clustered like bigcouch, so you can use this only if you have a single couchdb node. - -Use this in your couchdb node config: - - "couch": { - "master": true, - "pwhash_alg": "pbkdf2" - } - -Local couch data dumps -====================== - -You can let one or more nodes do a nightly couchdb data dump adding this to your node config: - - "couch": { - "backup": true - } - -Data will get dumped to `/var/backups/couchdb`. - -Be aware that this will gather all data possibly shared over multiple nodes on one node. 
- diff --git a/pages/docs/platform/tutorials/upgrading.md b/pages/docs/platform/tutorials/upgrading.md deleted file mode 100644 index 8b0dd20..0000000 --- a/pages/docs/platform/tutorials/upgrading.md +++ /dev/null @@ -1,90 +0,0 @@ -@title = 'Upgrading' -@nav_title = 'Upgrading' -@summary = "Upgrading the platform and the OS" -@toc = true - - -Upgrading the platform -====================== - -From 0.7.1 to 0.8 -================= - -Next to other changesm 0.8 introduces several major changes that need do get taken into account while upgrading: - -- Dropping Debian Wheezy support. You need to upgrade your nodes to jessie before deploying a platform upgrade. -- Dropping Bigcouch support. LEAP Platform now requires couchdb and therefore you need to migrate from bigcouch to couchdb. - -Here's how to upgrade from wheezy nodes running bigcouch to jessie nodes using couchdb: - -- Follow https://leap.se/en/docs/platform/services/couchdb#migrating-from-bigcouch-to-plain-couchdb, but only until the step where you removed bigouch. - -- Now upgrade to jessie (see the Howto below) - -- Continue with https://leap.se/en/docs/platform/services/couchdb#migrating-from-bigcouch-to-plain-couchdb at the point where you stopped for the first step, and deploy to the couchdb node. - - -Upgrading the operating system -============================== - -From Debian Wheezy to Jessie ----------------------------- - -There are the [Debian release notes on how to upgrade from wheezy to jessie](https://www.debian.org/releases/stable/amd64/release-notes/ch-upgrading.html). 
-He're the steps that worked for us, but please keep in mind that this is not a bullet-prove documentation, so use it on your own risk: - - screen - script -t 2>~/leap_upgrade-jessiestep.time -a ~/upgrade-jessiestep.script - - apt-get autoremove - apt-get update - DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold dist-upgrade - - dpkg --audit - dpkg --get-selections | grep 'hold$' - # if anything is held, you need to resolve it before continuing. - - apt-get clean - - # switch sources to jessie - sed -i 's/wheezy/jessie/g' /etc/apt/sources.list - echo "deb http://deb.leap.se/0.8 jessie main" > /etc/apt/sources.list.d/leap.list - - # remove pinnings to wheezy - rm /etc/apt/preferences - - apt-get update - - # test there is enough space for the upgrade - apt-get -o APT::Get::Trivial-Only=true dist-upgrade - - # do first stage upgrade - DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold upgrade - - # repeat dist-upgrade until it makes no more changes: - DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold dist-upgrade - - # resolve any apt issues if there are some - apt-get -f install - - reboot - - - - - - - - -### Issues - -- Failure restarting some services for OpenSSL upgrade - -If you get this warning: - - The following services could not be restarted for the OpenSSL library upgrade: - postfix - You will need to start these manually by running '/etc/init.d/ start'. - -Just ignore it, it should be fixed on reboot/deploy. 
- diff --git a/pages/docs/platform/upgrading/en.haml b/pages/docs/platform/upgrading/en.haml new file mode 100644 index 0000000..efa0d7c --- /dev/null +++ b/pages/docs/platform/upgrading/en.haml @@ -0,0 +1,5 @@ +- @nav_title = "Upgrading" +- @title = "Upgrading from prior LEAP platform releases" +- @summary = "" + += child_summaries \ No newline at end of file diff --git a/pages/docs/platform/upgrading/upgrade-0-8.md b/pages/docs/platform/upgrading/upgrade-0-8.md new file mode 100644 index 0000000..ca3fbe5 --- /dev/null +++ b/pages/docs/platform/upgrading/upgrade-0-8.md @@ -0,0 +1,71 @@ +@title = 'Upgrade to 0.8' +@toc = false + +LEAP Platform release 0.8 introduces several major changes that need to be taken into account while upgrading: + +* Dropping Debian Wheezy support. You need to upgrade your nodes to jessie before deploying a platform upgrade. +* Dropping BigCouch support. LEAP Platform now requires CouchDB and therefore you need to migrate all your data from BigCouch to CouchDB. + +Here's how to upgrade from Wheezy nodes running BigCouch to Jessie nodes using CouchDB: + +1. Follow [["migrating from BigCouch to plain CouchDB" => services/couchdb#migrating-from-bigcouch-to-plain-couchdb]], but only until the step where you removed BigCouch. +2. Now upgrade to Jessie (see below) +3. Continue with [["migrating from BigCouch to plain CouchDB" => services/couchdb#migrating-from-bigcouch-to-plain-couchdb]] at the point where you stopped for the first step, and deploy to the couchdb node. + +Upgrade from Debian Wheezy to Jessie +------------------------------------------------ + +There are the [Debian release notes on how to upgrade from wheezy to jessie](https://www.debian.org/releases/stable/amd64/release-notes/ch-upgrading.html). Here are the steps that worked for us, but please keep in mind that there is no bullet-proof method that will work in every situation. USE AT YOUR OWN RISK. 
+ + # keep a log of the progress: + screen + script -t 2>~/leap_upgrade-jessiestep.time -a ~/upgrade-jessiestep.script + + # ensure you have a good wheezy install: + apt-get autoremove + apt-get update + DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold dist-upgrade + + # if anything is held, you need to resolve it before continuing: + dpkg --audit + dpkg --get-selections | grep 'hold$' + + # switch sources to jessie + sed -i 's/wheezy/jessie/g' /etc/apt/sources.list + echo "deb http://deb.leap.se/0.8 jessie main" > /etc/apt/sources.list.d/leap.list + + # remove pinnings to wheezy + rm /etc/apt/preferences + rm /etc/apt/preferences.d/* + + # get jessie package lists + apt-get update + + # test if there is enough disk space for the upgrade + apt-get clean + apt-get -o APT::Get::Trivial-Only=true dist-upgrade + + # do first stage upgrade + DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold upgrade + + # repeat dist-upgrade until it makes no more changes: + DEBIAN_FRONTEND=noninteractive apt-get -y -o DPkg::Options::=--force-confold dist-upgrade + + # resolve any apt issues if there are some + apt-get -f install + + reboot + + +### Issues + +**Failure restarting some services for OpenSSL upgrade** + +If you get this warning: + + The following services could not be restarted for the OpenSSL library upgrade: + postfix + You will need to start these manually by running '/etc/init.d/ start'. + +Just ignore it, it should be fixed on reboot/deploy. + -- cgit v1.2.3