10 files changed, 845 insertions, 11 deletions
diff --git a/docs/client/en.md b/docs/client/en.md
index bcb495f..e2d51be 100644
--- a/docs/client/en.md
+++ b/docs/client/en.md
@@ -1,6 +1,7 @@
 @title = 'LEAP Client'
+@summary = "Easy-to-use application for cloud communication services that are client encrypted."
 
-The **LEAP Client** is a GPL3 Licensed multiplatform client, written in python using PyQt4, that supports the features offered by [the LEAP Platform](platform). Currently is being tested on Linux, support for OSX and Windows will come soon.
+The **LEAP Client** is a GPL3 Licensed multiplatform client, written in Python using PyQt4, that supports the features offered by [the LEAP Platform](platform). It is currently being tested on Linux; support for OSX and Windows will come soon.
 
 * [User guide](client/user-guide)
 * [Running latest code](client/bleeding-edge)
diff --git a/docs/design/cuttlefish.md b/docs/design/cuttlefish.md
new file mode 100644
index 0000000..6b2c0f5
--- /dev/null
+++ b/docs/design/cuttlefish.md
@@ -0,0 +1,7 @@
+@title = 'Cuttlefish'
+@toc = true
+@summary = "Federated events and callback notifications."
+
+Not yet written.
+
+About the name: Cuttlefish are able to communicate by creating [different patterns on their skin](http://www.newscientist.com/article/dn3728-mathematics-reveals-the-cuttlefishs-wink.html) and communicate secretly with each other by [changing the polarization of their skin](http://www.ncbi.nlm.nih.gov/pubmed/9319987). Also, cuttlefish are [freakishly smart](http://www.pbs.org/wgbh/nova/nature/spineless-smarts.html).
diff --git a/docs/design/en.haml b/docs/design/en.haml
new file mode 100644
index 0000000..b5f1bc3
--- /dev/null
+++ b/docs/design/en.haml
@@ -0,0 +1,8 @@
+- @title = "Design Docs"
+- @summary = "Design documents and specifications for various LEAP components and protocols."
+
+%h1.first Design Documents
+
+Design documents and specifications for various LEAP components and protocols.
+
+= child_summaries
+\ No newline at end of file
diff --git a/docs/design/nicknym.md b/docs/design/nicknym.md
new file mode 100644
index 0000000..4041d56
--- /dev/null
+++ b/docs/design/nicknym.md
@@ -0,0 +1,445 @@
+@title = 'Nicknym'
+@toc = true
+@summary = "Automatic discovery and validation of public keys."
+
+Introduction
+==========================================
+
+Nicknym is a system to map user nicknames to public keys. With Nicknym, the user is able to think solely in terms of nicknames, while still being able to communicate with a high degree of security (confidentiality, integrity, and authenticity). Essentially, Nicknym is a system for binding human-memorable nicknames to a cryptographic key via automatic discovery and automatic validation.
+
+Nicknym is a federated protocol: a Nicknym address is in the form `username@domain`, just like an email address, and Nicknym includes both a client and a server component. Although the client can fall back to legacy methods of key discovery when needed, domains that run the Nicknym server component enjoy much stronger identity guarantees.
+
+Nicknym is key agnostic, and supports whatever public key information is available for an address (OpenPGP, OTR, X.509, RSA, etc). However, Nicknym enforces a strict one-to-one mapping of address to public key.
+
+Existing forms of secure identity are deeply flawed. These systems rely on either a single trusted entity (e.g. Skype), a vulnerable Certificate Authority system (e.g. S/MIME), or keys that cannot be made human memorable (e.g. OpenPGP, OTR). When an identity system is hard to use, it is effectively compromised because too few people take the time to use it properly.
+
+The broken nature of existing identity systems (either in security or in usability) is especially troubling because identity remains a bedrock precondition for any message security: you cannot ensure confidentiality or integrity without confirming the authenticity of the other party. Nicknym is a protocol to solve this problem in a way that is backward compatible, easy for the user, and includes very strong authenticity when possible.
+
+Goals
+==========================================
+
+**High level goals**
+
+* Pseudo-anonymous and human friendly addresses in the form `username@domain`.
+* Automatic discovery and validation of public keys associated with an address.
+* The user should be able to use Nicknym without understanding anything about public/private keys or signatures.
+
+**Technical goals**
+
+* Wide utility: Nicknym should be a general purpose protocol that can be used in a wide variety of contexts.
+* No revocation: instead of key revocation, support short lived keys that frequently and automatically refresh.
+* Prevent dangerous actions: Nicknym should fail hard when there is a possibility of an attack.
+* Minimize false positives: because Nicknym fails hard, we should minimize false positives where it fails incorrectly.
+* Resistant to malicious actors: Nicknym should be externally auditable in order to assure service providers are not compromised or advertising bogus keys.
+* Resistant to association analysis: Nicknym should not reveal to any actor or network observer a map of a user's associations.
+
+**Non-goals**
+
+* Nicknym does not try to create a decentralized or peer-to-peer identity system.
+
+The binding problem
+=============================================
+
+Nicknym attempts to solve the problem of binding a human memorable identifier to a cryptographic key. If you have the identifier, you should be able to get the key with a high level of confidence, and vice versa. The goal is to have decentralized, human memorable, globally unique public keys. In other words, to violate [Zooko's triangle](https://en.wikipedia.org/wiki/Zooko's_triangle) by making a few concessions.
+
+There are a number of established methods for binding identifier to key:
+
+* [Web of Trust (WOT)](http://en.wikipedia.org/wiki/Web_of_trust)
+* Trust on First Use (TOFU)
+* [X.509 Certificate Authority System](https://en.wikipedia.org/wiki/X.509)
+* [DNSSEC](https://en.wikipedia.org/wiki/Dnssec)
+* [Shared Secret](https://en.wikipedia.org/wiki/Socialist_millionaire)
+* Mail-back Verification
+* [Network Perspective](http://convergence.io/)
+* Global Append-only Log
+* Nonverbal Feedback (a la ZRTP)
+
+The methods differ widely, but they all try to solve the same general problem of proving that a person or organization is in control of a particular key.
+
+Nicknym uses a combination of these methods, utilizing TOFU, X.509, Network Perspective, and additional methods we call "Provider Keys" and "Federated Web of Trust" (FWOT).
+
+1. Nicknym starts with TOFU of user keys, because it is easy to do and backward compatible with legacy providers. In TOFU, your client naively accepts the key of another user when it first encounters it. When you TOFU a user key, you are making a bet that possible attackers against you did not have the foresight to specifically target you with a false key during discovery.
+2. Next, we add X.509. For those providers that publish the public keys of their users, we require that these keys be fetched over validated TLS. This makes third party attacks against TOFU more difficult, but also places a lot of trust in the providers (and the Certificate Authorities).
+3. Next, we add a simple form of Network Perspective where the client can ask one provider what key another provider is distributing. This allows a user's client to be able to audit their provider and keep them honest in an automated manner. If a service provider distributes bogus keys, their users and other providers will be quickly alerted to the problem.
+4. Next, we add Provider Keys. If a service provider has a provider key, the public keys of its users are additionally signed by the provider with the "provider key". If your client has the correct provider key, you no longer need to TOFU the keys of the provider's users. This has the benefit of making it possible for a user to issue new keys, and to add support for very short-lived keys rather than trying to use key revocation. A service provider is much less likely to lose their private key or have it compromised, a significant problem with TOFU of user keys.
+5. Finally, we add a Federated Web of Trust. The system works like this: each service provider is responsible for the due diligence of properly signing the keys of a few other providers, akin to the distributed web of trust model of OpenPGP, but with all the hard work of proper signature validation placed upon the service provider. When a user communicates with another party who happens to use a service provider that participates in the FWOT, the user’s software will automatically trace a chain of signatures from the other party’s key, to their service provider, to the user’s own service provider (with some possible intermediary signatures). This allows for identity that is verified through an end-to-end trust path from any user to any other user in a way that can be automated and is human memorable. Support for a FWOT allows us to entirely bypass X.509 Certificate Authorities, to gracefully handle short lived provider keys, and to handle emergency re-key events if a provider's key is lost.
+
+As we move down this list, each measure taken gets more complicated, requires more provider cooperation, and provides less additional benefit than the one before it. Nevertheless, each measure contributes some important benefit toward the goal of automatic binding of user identity to public key.
+
+**Questions**
+
+*Why not use WOT?* Most users are empirically unable to properly maintain a web of trust. The concepts are hard and it is easy to mess up the signing practice.
+
+*Why not use DNSSEC?* Many reasons. DNS records are slow to update. RSA public keys will soon be too big for UDP packets (though this is not true of ECC), so putting keys in DNS will mean putting a URL to a key in DNS, so you might as well just use TLS. DNSSEC could still be of added benefit if you put the fingerprint in the DNS record. Mostly, however, a simple HTTP GET request is a lot easier to deal with than DNS, both for the client and the server.
+
+*Why not use Shared Secret?* Shared secrets, like with the Socialist Millionaire protocol, are cool in theory but prone to user error and frustration in practice. Was the secret "Invisible Zebra" or "invisibleZebra"?
+
+*Why not use Mail-back Verification?* If the provider distributes user keys, there is not any benefit to mail-back verification. However, it would be good to add support for mail-back verification for non-cooperating legacy providers.
+
+*Why not use Global Append-only Log?* Maybe we should, they are neat. However, current implementations are resource intensive and experimental (e.g. namecoin).
+
+*Why not use Nonverbal Feedback?* ZRTP can use non-verbal cues to establish secure identity because of the nature of a live phone call. This doesn't work for text only messaging.
+
+
+Related work
+===================================
+
+**WebID and BrowserID**
+
+What about WebID or BrowserID? These are both interesting cryptographic identity standards that are gaining support and implementations. So why do we need something new?
+
+These protocols, and the poorly conceived OpenID Connect, are designed to address a fundamentally different problem: authenticating a user to a website. The problem of authenticating users to one another requires a different architecture entirely. There are some similarities, however, and in the long run Nicknym could be combined with something like BrowserID.
+
+**STEED**
+
+[STEED](http://g10code.com/steed.html) is a proposal with very similar goals to Nicknym. In a nutshell, Nicknym basically looks very similar to STEED when the domain owner does not support Nicknym. STEED includes four main ideas:
+
+* trust upon first contact: Nicknym uses this as well, although this is the fallback mechanism when others fail.
+* automatic key distribution and retrieval: Nicknym uses this as well, although it uses HTTP for this instead of DNS.
+* automatic key generation: Nicknym is designed specifically to support automatic key generation, but this is outside the scope of the Nicknym protocol and it is not required.
+* opportunistic encryption: Again, Nicknym is designed to support opportunistic encryption, but does not require it.
+
+Additional differences include:
+
+* Nicknym is key agnostic: Nicknym does not make an assumption about what types of public keys a user wants to associate with their address.
+* Nicknym is protocol agnostic: Nicknym can be used with SMTP, XMPP, SIP, etc.
+* Nicknym relies on service provider adoption: With Nicknym, the strength of verification of public keys rests on the degree to which a service provider adopts Nicknym. If a service provider does not support Nicknym, then effectively Nicknym operates like STEED for that domain.
+
+
+Nicknym protocol
+==============================
+
+Definitions
+-------------------------
+
+* **address**: A globally unique handle in the form user@domain (i.e. an email, SIP, or XMPP address).
+* **provider**: A service provider that offers end-user services on a particular domain.
+* **user key**: A public/private key pair associated with a user address. If not specified, "user key" refers to the public key.
+* **provider key**: A public/private key pair owned by the provider. The address associated with this key is just the domain of the service provider.
+* **validated key**: A key is "validated" if the nickagent has bound the user address to a public key.
+* **nickagent**: Client side program that manages a user's contact list, the public keys they have encountered and validated, and the user's own key pairs.
+* **nickserver**: Server side daemon run by providers who support Nicknym.
+
+Nickserver requests
+-----------------------
+
+A nickagent will attempt to discover the public key for a particular user address by contacting a nickserver. The nickserver returns JSON encoded key information in response to a simple HTTP request with a user's address. For example:
+
+    curl -X POST -d address=alice@domain.org https://nicknym.domain.org:6425
+
+* The port is always 6425.
+* The HTTP verb may be POST or GET.
+* The request must use TLS (see [Query security](#Query.security)).
+* The query data should have a single field 'address'.
+* For POST requests to nicknym.domain.org, the query data may be encrypted to the public OpenPGP key nicknym@domain.org (see [Query security](#Query.security)).
+
+Requests may be local or foreign, and for user keys or for provider keys.
+
+* **local** requests are for information for which the nickserver is authoritative. In other words, when the requested address is for the same domain that the nickserver is running on.
+* **foreign** requests are for information about other domains.
+* **user key** requests are for addresses in the form "username@domain".
+* **provider key** requests are for addresses in the form "domain".
+
+**Local, Provider Key request**
+
+For example:
+
+    https://nicknym.domain.org:6425/?address=domain.org
+
+The response is the authoritative provider key for that domain.
+
+**Local, User Key request**
+
+For example:
+
+    https://nicknym.domain.org:6425/?address=alice@domain.org
+
+The nickserver returns authoritative key information from the provider's own user database. Every public key returned for local requests must be signed by the provider's key.
+
+**Foreign, Provider Key request**
+
+For example:
+
+    https://nicknym.domain.org:6425/?address=otherdomain.org
+
+1. First, check the nickserver's cache database of discovered keys. If the cache is not old, return this key.
+2. Otherwise, fetch provider key from the provider's nickserver, cache the result, and return it.
+
+**Foreign, User Key request**
+
+For example:
+
+    https://nicknym.domain.org:6425/?address=bob@otherdomain.org
+
+* First, check the nickserver's database cache of nicknyms. If the cache is not old, return the key information found in the cache.
+* Otherwise, attempt to contact a nickserver run by the provider of the requested address. If the nickserver exists, query that nickserver, cache the result, and return it in the response.
+* Otherwise, fall back to querying existing SKS keyservers, cache the result and return it.
+* Otherwise, return a 404 error.
+
+If the key returned for a foreign request contains multiple user addresses, they are all ignored by nicknym except for the user address specified in the request.
+
+Nickserver response
+---------------------------------
+
+A nickserver response is a JSON encoded map with a field "address" plus one or more of the following fields: "openpgp", "otr", "rsa", "ecc", "x509-client", "x509-server", "x509-ca".
+
+A nickserver response is always signed with the OpenPGP signing key associated with the address nicknym@domain.org. The signature is ASCII armored and appended to the JSON.
+
+For example:
+
+    {
+      "address": "alice@example.org",
+      "openpgp": "6VtcDgEKaHF64uk1c/crFhRHuFW9kTvgxAWAK01rXXjrxEa/aMOyXnVQuQINBEof...."
+    }
+    -----BEGIN PGP SIGNATURE-----
+    iQIcBAEBCgAGBQJRhWO+AAoJEIaItIgARAAl2IwP/24z9CjKjD0fd27pQs+r+e3h
+    p8KAYDbVac3+c3vm30DjHO/RKF4Zq6+sTAIkrFvXOwYJl9KgjMpQVV/voInjxATz
+    -----END PGP SIGNATURE-----
+
+If the data in the request was encrypted to the public key nicknym@domain.org, then the JSON response and signature are additionally encrypted with the symmetric key found in the request and returned base64 encoded.
+
+Query balancing
+------------------------
+
+A nickagent must choose what IP address to query by selecting randomly from among hosts that resolve from `nicknym.domain.org` (where `domain.org` is the domain name of the provider).
+
+If a host does not respond, a nickagent must skip over it and attempt to contact another host in the pool.
+
+Query security
+--------------------------
+
+TLS is required for all nickserver queries.
+
+When querying https://nicknym.domain.org, nickagent must validate the TLS connection in one of three ways:
+
+1. Using a commercial CA certificate distributed with the host operating system.
+2. Using a seeded CA certificate (see [Discovering nickservers](#Discovering.nickservers)).
+3. Using a custom self-signed CA certificate discovered for the domain, so long as the CA certificate was discovered via #1 or #2. Custom CA certificates may be discovered for a domain by making a provider request of a nickserver (e.g. https://nicknym.known-domain.org/?address=new-domain.org).
+
+Optionally, a nickagent may make an encrypted query like so:
+
+0. Suppose the nickagent wants to make an encrypted query regarding the address alice@x.org.
+1. Nickagent discovers the public key for nicknym@domain.org
+2. Nickagent uses the OpenPGP key for nicknym@domain.org to encrypt the body of the request (using POST). The request body should consist of the address being queried on the first line and a randomly generated 128 bit symmetric key on the second line. The request can be foreign or local.
+3. The body of the nickserver's response is encrypted with AES128 using the symmetric key.
+
+Comment: although it may seem excessive to encrypt both the request via TLS and the request body via OpenPGP, the reason for this is that many requests will not use OpenPGP.
+
+Automatic key validation
+----------------------------------
+
+A key is "validated" if the nickagent has bound the user address to a public key.
+
+Nicknym supports three different levels of key validation:
+
+* Level 3 - path trusted: A path of cryptographic signatures can be traced from a trusted key to the key under evaluation. By default, only the provider key from the user's provider is a "trusted key".
+* Level 2 - provider signed: The key has been signed by a provider key for the same domain, but the provider key is not validated using a trust path (i.e. it is only registered).
+* Level 1 - registered: The key has been encountered and saved, but it has no signatures (that are meaningful to the nickagent).
+
+The nickagent will try to validate using the highest level possible.
+
+Automatic renewal
+-----------------------------
+
+A validated public key is replaced with a new key when:
+
+* The new key is path trusted
+* The new key is provider signed, but the old key is only registered.
+* The new key has a later expiration, and the old key is only registered and will expire "soon" (exact time TBD).
+* The agent discovers a new subkey, but the master signing key is unchanged.
+
+In all other cases, the new key is rejected.
+
+The nickagent will attempt to refresh a key by making a request to a nickserver of its choice when a key is past 3/4 of its lifespan and again when it is about to expire.
+
+Nicknym encourages, but does not require, the use of short lived public keys, in the range of X to Y days. It is recommended that short lived keys are not uploaded to OpenPGP keyservers.
+
+Automatic invalidation
+----------------------------
+
+A key is invalidated if:
+
+* The old key has expired, and no new key can be discovered with equal or greater validation level.
+
+This means validation is a one way street: once a certain level of validation is established for a user address, no client should accept any future keys for that address with a lower level of validation.
+
+Discovering nickservers
+--------------------------------
+
+It is entirely up to the nickagent to decide what nickservers to query. If it wanted to, a nickagent could send all its requests to a single nickserver.
+
+However, nickagents should discover new nickservers and balance their queries to these nickservers for the purposes of availability, load balancing, network perspective, and hiding the user's association map.
+
+Whenever the nickagent is asked by a locally running application for a public key corresponding to an address on the domain `domain.org`, it may check to see if the host `nicknym.domain.org` exists. If the domain resolves, then the nickagent may add it to the pool of known nickservers.
+
+Additionally, a nickagent may be distributed with an initial list of "seed" nickservers. In this case, the nickagent is distributed with a copy of the CA certificate used to validate the TLS connection with each respective seed nickserver.
+
+Cross-provider signatures
+----------------------------------
+
+To be written.
+
+Auditing
+----------------------------
+
+In order to keep the user's provider from handing out bogus public keys, a nickagent should occasionally make foreign queries of the user's own address against nickservers run by third parties.
+
+In order to prevent a nickserver from handing out bogus provider keys, a nickagent should query multiple nickservers before a provider key is registered or path trusted.
+
+Possible attacks:
+
+**Attack 1 - Intercept Outgoing:**
+
+* Attack: provider `A` signs an impostor key for provider `B` and distributes it to users of `A` (in order to intercept outgoing messages sent to `B`).
+* Countermeasure: By querying multiple nickservers for the provider key of `B`, the nickagent can detect if provider `A` is attempting to distribute impostor keys.
+
+**Attack 2 - Intercept Incoming:**
+
+* Attack: provider `A` signs an impostor key for one of its own users, and distributes to users of provider `B` (in order to intercept incoming messages).
+* Countermeasure: By querying for its own keys, a nickagent can detect if a provider is giving out bogus keys for their addresses.
+
+**Attack 3 - Association Mapping:**
+
+* Attack: A provider tracks all the requests for key discovery in order to build a map of association.
+* Countermeasure: By performing foreign key queries via third party nickservers, an agent can prevent any particular entity from tracking their queries.
+
+
+Future enhancements
+---------------------
+
+Should we support additional discovery mechanisms:
+
+* Webfinger includes a standard mechanism for distributing a user's public key via a simple HTTP request. This is very easy to implement on the server, and very easy to consume on the client side.
+* There are multiple competing standards for key discovery via DNS. When and if one of these emerges as predominant, Nicknym should attempt to use this method when available. DNS discovery, however, has some problems. DNS discovery of keys is much harder to implement, because the service provider must run their own customized authoritative nameserver. Also, since (RSA) keys can be too big for domain UDP packets, any future-proof DNS method relies on an HTTP request, thus undermining the potential benefit of decentralization you might get from using DNS rather than webfinger.
+
+
+
+Reference nickagent implementation
+====================================================
+
+There is a reference nickagent implementation called "key manager" written in Python and integrated into the LEAP client. It uses Soledad to store its data.
+
+Public API
+----------------------------
+
+**refresh_keys()**
+
+updates the keys with fresh ones, as needed.
+
+**get_key(address, type)**
+
+returns a single public key for address. type is one of 'openpgp', 'otr', 'x509', or 'rsa'.
+
+**send_key(address, public_key, type)**
+
+authenticates with the appropriate provider and saves the public_key in the user database.
+
+Storage
+--------------------------
+
+Key manager uses Soledad for storage. GPGME, however, requires keys to be stored in keyrings, which are read from disk.
+
+For now, Key Manager deals with this by storing each key in its own keyring. In other words, every key is in a keyring with exactly 1 key, and this keyring is stored in a Soledad document. To keep from confusing this keyring with a normal keyring, I will call it a 'unitary keyring'.
+
+Suppose Alice needs to communicate with Bob:
+
+1. Alice's Key Manager copies to disk her private key and Bob's public key. The key manager gets these from Soledad, in the form of unitary keyrings.
+2. Client code uses GPGME, feeding it these temporary keyring files.
+3. The keyrings are destroyed.
+
+TBD: how best to ensure destruction of the keyring files.
+
+An example Soledad document for an address:
+
+    {
+      "address": "alice@example.org",
+      "keys": [
+        {
+          "type": "openpgp",
+          "key": "binary blob",
+          "keyring": "binary blob",
+          "expires_on": "2014-01-01",
+          "validation": "provider_signed",
+          "first_seen_at": "2013-04-01 00:11:00",
+          "last_audited_at": "2013-04-02 12:00:00"
+        },
+        {
+          "type": "otr",
+          "key": "binary blob",
+          "expires_on": "2014-01-01",
+          "validation": "registered",
+          "first_seen_at": "2013-04-01 00:11:00",
+          "last_audited_at": "2013-04-02 12:00:00"
+        }
+      ]
+    }
+
+Pseudocode
+---------------------------
+
+get_key
+
+    #
+    # return a key for an address
+    #
+    function get_key(address, type)
+      if key for address exists in soledad database?
+        return key
+      else
+        fetch key from nickserver
+        save it in soledad
+        return key
+      end
+    end
+
+send_key
+
+    #
+    # send the user's provider the user's key. this key will get signed by the provider, and replace any prior keys
+    #
+    function send_key(type)
+      if not authenticated:
+        error!
+      end
+      key_data = get_key(self.address, type)
+      send (key_data, type) to the provider
+    end
+
+refresh_keys
+
+    #
+    # update the user's db of validated keys to see if there are changes.
+    #
+    function refresh_keys()
+      for each key in the soledad database (that should be checked?):
+          newkey = fetch_key_from_nickserver(key)
+          if key is about to expire and newkey complies with the renewal parameters:
+              replace key with newkey
+          else if fingerprint(key) != fingerprint(newkey):
+              freak out, something wrong is happening? :)
+              maybe handle revocation, or try to get some voting for a given key and save that one (retrieve it through tor/vpn/etc and see what's the most found key or something like that).
+          else:
+              everything's cool for this key, continue
+          end
+      end
+    end
+
+private fetch_key_from_nickserver
+
+    function fetch_key_from_nickserver(key)
+      randomly pick a subset of the available nickservers we know about
+      send a tcp request to each in this subset in parallel
+      first one that opens a successful socket is used, all the others are terminated immediately
+      make http request
+      parse json for the keys
+      return keys
+    end
+
+
+Reference nickserver implementation
+=====================================================
+
+The reference nickserver is written in Ruby 1.9 and licensed GPLv3. It is lightweight and scalable (supporting high concurrency, and reasonable latency), and uses EventMachine for asynchronous network IO. Data is stored in CouchDB.
+
+For more information, see https://github.com/leapcode/nickserver
+
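Note on docs/design/nicknym.md, "Automatic renewal": the four replacement rules listed there can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the key manager's actual code. The names `Validation`, `KeyInfo`, `should_replace`, and `SOON` are hypothetical, the 30 day value for "soon" is a placeholder (the doc leaves the exact time TBD), and the subkey rule is simplified to a comparison of fingerprint sets.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum


class Validation(IntEnum):
    """Validation levels from the "Automatic key validation" section."""
    REGISTERED = 1
    PROVIDER_SIGNED = 2
    PATH_TRUSTED = 3


@dataclass(frozen=True)
class KeyInfo:
    # Hypothetical record; field names are illustrative, not from the doc.
    master_fingerprint: str
    subkey_fingerprints: frozenset
    validation: Validation
    expires_on: date


# Assumption: the doc leaves "soon" TBD, so 30 days is only a placeholder.
SOON = timedelta(days=30)


def should_replace(old: KeyInfo, new: KeyInfo, today: date) -> bool:
    """Return True if `new` may replace the validated key `old`."""
    # Rule 1: the new key is path trusted.
    if new.validation == Validation.PATH_TRUSTED:
        return True
    old_registered = old.validation == Validation.REGISTERED
    # Rule 2: the new key is provider signed, the old key only registered.
    if new.validation == Validation.PROVIDER_SIGNED and old_registered:
        return True
    # Rule 3: later expiration, old key only registered and expiring soon.
    if (old_registered
            and new.expires_on > old.expires_on
            and old.expires_on - today <= SOON):
        return True
    # Rule 4: a new subkey appeared, but the master signing key is unchanged.
    if (new.master_fingerprint == old.master_fingerprint
            and new.subkey_fingerprints - old.subkey_fingerprints):
        return True
    # In all other cases, the new key is rejected.
    return False
```

For example, a provider signed key would replace a registered one, while the reverse would be rejected, which matches the "one way street" described under "Automatic invalidation".
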
diff --git a/docs/overview.md b/docs/design/overview.md
index 83f5c76..2d257c7 100644
--- a/docs/overview.md
+++ b/docs/design/overview.md
@@ -1,5 +1,6 @@
 @nav_title = "Overview"
 @title = "Overview of LEAP architecture"
+@summary = "Bird's eye view of how all the pieces fit together."
 
 The LEAP Platform allows an organization to deploy and manage a complete infrastructure for providing user communication services.
 
diff --git a/docs/design/soledad.md b/docs/design/soledad.md
new file mode 100644
index 0000000..0e21016
--- /dev/null
+++ b/docs/design/soledad.md
@@ -0,0 +1,367 @@
+@title = 'Soledad'
+@summary = 'A server daemon and client library to provide client-encrypted application data that is kept synchronized among multiple client devices.'
+@toc = true
+
+Introduction
+=====================
+
+Soledad is a system that allows client applications to securely share synchronized document databases. Soledad is based on Ubuntu's U1DB, "a cross-platform, cross-device, syncable database API", but with the addition of client-side encryption of documents stored on the server, and encryption of the local database replica. Soledad is an acronym of "Synchronization of Locally Encrypted Documents Among Devices" and means "solitude" in Spanish.
+
+Key aspects of Soledad include:
+
+* **Client and server:** Soledad includes a server daemon and client application library.
+* **Client-side encryption:** Soledad puts very little trust in the server by encrypting all data before it is synchronized to the server and by limiting ways in which the server can modify the user's data.
+* **Local storage:** All data cached locally is stored in an encrypted database.
+* **Document database:** An application using the Soledad client library is presented with a document-centric database API for storage and sync. Documents may be indexed, searched, and versioned.
+
+The current reference implementation of Soledad is written in Python and distributed under a GPLv3 license.
+
+Goals
+======================
+
+Security goals
+--------------------------------------
+
+* *Client-side encryption:* Before any data is synced to the cloud, it should be encrypted/decrypted on the client device.
+* *Encrypted local storage:* Any data cached or stored on the client should be stored in an encrypted format.
+* *Resistant to offline attacks:* Data stored on the server should be highly resistant to offline attacks (i.e. an attacker with a static copy of data stored on the server would have a very hard time discerning much from the data).
+* *Resistant to online attacks:* Analysis of storing and retrieving data should not leak potentially sensitive information.
+* *Resistance to data tampering:* The server should not be able to provide the client with old or bogus data for a document.
+
+Synchronization goals
+-------------------------------------
+
+* *Consistency:* multiple clients should all get sync'ed with the same data.
+* *Sync flag:* the ability to partially sync data. For example, so a mobile device doesn't need to sync all email attachments.
+* *Multi-platform:* supports both desktop and mobile clients.
+* *Quota:* the ability to identify how much storage space a user is taking up.
+* *Scalable cloud:* distributed master-less storage on the cloud side, with no single point of failure.
+* *Conflict resolution:* conflicts are flagged and handed off to the application logic to resolve.
+
+Usability goals
+---------------------------------
+
+* *Availability*: the user should always be able to access their data.
+* *Recovery*: there should be a mechanism for a user to recover their data should they forget their password.
+
+Known limitations
+------------------------------
+
+* Currently, the server knows when the contents of a document have changed.
+* Currently, there is no facility for sharing documents among multiple users.
+
+Non-goals
+---------------------------
+
+* Soledad is not for filesystem synchronization, storage or backup. It provides an API for application code to synchronize and store arbitrary schema-less JSON documents in one big flat document database. One could model a filesystem on top of Soledad, but it would be a bad fit.
+* Soledad is not intended for decentralized peer-to-peer synchronization, although the underlying synchronization protocol does not require a server. Soledad takes a cloud approach in order to ensure that a client has quick access to an available copy of the data.
+
+Related software
+==================================
+
+[Crypton](https://crypton.io/) - Similar goals to Soledad, but in JavaScript for HTML5 applications.
+
+[U1DB](http://pythonhosted.org/u1db/) - Similar API to Soledad, without encryption.
+
+Protocol
+===================================
+
+Storage secret
+-----------------------------------
+
+When a client application first wants to use Soledad, it must provide the user's password to unlock the `storage_secret`. The `storage_secret` is a long, randomly generated symmetric key used to encrypt both the documents stored on the server and the local replica of these documents.
+
+TO ADD: example code
+
+The `storage_secret` is saved locally on disk in the file `soledad.json`, block encrypted using a derived key. The derived key is obtained from the user's password.
+
+The file `soledad.json` has a field `storage_secrets` that looks like so:
+
+    {
+      "storage_secrets": {
+        "<secret_id>": {
+          "kdf": "scrypt",
+          "kdf_salt": "400$8$5fb$61b499fe3366d947",
+          "kdf_length": 128,
+          "cipher": "aes128",
+          "length": 512,
+          "secret": "<encrypted storage_secret 1>"
+        }
+      }
+    }
+
+The `storage_secrets` entry is a map that stores information about each storage key, indexed by the id of each key. For each storage key, the following fields are stored:
+
+* `kdf`: the key derivation function to use. Only scrypt is currently supported (so for now, this value is ignored).
+* `kdf_salt`: the salt used in the kdf. The salt for scrypt is not random, but encodes important parameters like the limits for time and memory.
+* `kdf_length`: the length of the derived key resulting from the kdf.
+* `secret`: the encrypted `storage_secret`, created by `sym_encrypt(cipher, storage_secret, derived_key)` (base64 encoded).
+* `length`: the length of `storage_secret`, when not encrypted.
+* `cipher`: what cipher to use to encrypt `storage_secret`. It must match kdf_length (i.e. the length of the derived_key).
+* `secret_id`: a handle used to refer to a particular storage_secret and equal to `md5(storage_secret)`.
+
+Other variables:
+
+* `derived_key` is equal to `kdf(user_password, kdf_salt, kdf_length)`.
+* `storage_secret` is equal to `sym_decrypt(cipher, secret, derived_key)`.
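+
+The relationship between the password, `derived_key`, and `secret_id` can be sketched with Python's standard library. This is only an illustration under stated assumptions: `kdf_length` and `length` are taken to be in bits, the scrypt cost parameters (`n`, `r`, `p`) and the random 16-byte salt are placeholder values rather than the encoded `kdf_salt` format shown above, and `sym_encrypt` is left out because the block cipher has not been settled.
+
```python
import hashlib
import os


def derive_key(user_password: str, kdf_salt: bytes, kdf_length_bits: int = 128) -> bytes:
    """derived_key = kdf(user_password, kdf_salt, kdf_length), using scrypt."""
    return hashlib.scrypt(
        user_password.encode("utf-8"),
        salt=kdf_salt,
        n=2 ** 14,  # CPU/memory cost (assumed value)
        r=8,        # block size (assumed value)
        p=1,        # parallelization (assumed value)
        maxmem=64 * 1024 * 1024,
        dklen=kdf_length_bits // 8,
    )


def secret_id(storage_secret: bytes) -> str:
    """secret_id = md5(storage_secret), hex encoded (encoding is an assumption)."""
    return hashlib.md5(storage_secret).hexdigest()


storage_secret = os.urandom(64)  # "length": 512, assumed to mean bits
salt = os.urandom(16)            # placeholder salt, not the real kdf_salt format
derived_key = derive_key("correct horse battery staple", salt)
```
+
+Because `storage_secret` is generated independently of the password, a password change in this sketch only means deriving a new key and re-encrypting `secret`.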
+
+In the current version, only one `storage_secret` is supported.
+
+The `storage_secret` is shared among all devices with access to a particular user's Soledad database. See [Recovery and bootstrap](#Recovery.and.bootstrap) for how the storage_secret is initially installed on a device.
+
+We don't use the derived_key as the storage_secret because we want the user to be able to change their password without needing to re-key.
+
+TO DO: settle on a block cipher.
+
+Unresolved:
+
+* How do devices receive updates if the storage_secret changes?
+
+Document API
+-----------------------------------
+
+This is unchanged and identical to the [API used in U1DB](http://pythonhosted.org/u1db/reference-implementation.html).
+
+* Document storage: `create_doc()`, `put_doc()`, `get_doc()`.
+* Synchronization between database replicas: `sync()`.
+* Document indexing and searching: `create_index()`, `list_indexes()`, `get_from_index()`, `delete_index()`.
+* Document conflict resolution: `get_doc_conflicts()`, `resolve_doc()`.
+
+TO ADD: code examples
+
+Document encryption
+------------------------
+
+Before a JSON document is synced with the server, it is transformed into a document that looks like so:
+
+    {
+      "scheme": "aes128",
+      "secret_id": "1",
+      "ciphertext": "xxxxxxxxx",
+      "mac": "xxxxxxx"
+    }
+
+About these fields:
+
+* `ciphertext`: The original JSON document, encrypted and base64 encoded. `ciphertext` is equal to `sym_encrypt(cipher, content, document_secret)`.
+* `scheme`: Information about the block cipher that is used to encrypt this document.
+* `secret_id`: The id of the storage_secret that was used to generate the `document_secret`.
+* `mac`: Defined as `HMAC(doc_id|rev|ciphertext, document_secret)`. The purpose of this field is to prevent the server from tampering with the stored documents.
+
+Other variables:
+
+* `document_secret`: equal to `HMAC(doc_id, storage_secret)`. This value is unique for every document and only kept in memory. We use document_secret instead of simply storage_secret in order to hinder possible derivation of storage_secret by the server. Every `doc_id` is unique.
+* `content`: equal to `sym_decrypt(cipher, ciphertext, document_secret)`.
+
+When receiving a document with the above structure from the server, Soledad client will verify that the mac is correct, decrypt the `ciphertext` to find `content`, and then store `content` as a cleartext document in the local database replica.
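+
+The MAC scheme above can be sketched with Python's `hmac` module. The hash function (SHA-256 here), the hex encoding, and the reading of `|` as plain concatenation are assumptions made for illustration; the text above does not fix them.
+
```python
import hashlib
import hmac


def document_secret(doc_id: str, storage_secret: bytes) -> bytes:
    """document_secret = HMAC(doc_id, storage_secret)."""
    return hmac.new(storage_secret, doc_id.encode("utf-8"), hashlib.sha256).digest()


def document_mac(doc_id: str, rev: str, ciphertext: str, doc_secret: bytes) -> str:
    """mac = HMAC(doc_id|rev|ciphertext, document_secret), hex encoded."""
    message = (doc_id + rev + ciphertext).encode("utf-8")
    return hmac.new(doc_secret, message, hashlib.sha256).hexdigest()


def verify_mac(doc_id: str, rev: str, encrypted_doc: dict, storage_secret: bytes) -> bool:
    """Check the MAC before any attempt to decrypt the ciphertext."""
    doc_secret = document_secret(doc_id, storage_secret)
    expected = document_mac(doc_id, rev, encrypted_doc["ciphertext"], doc_secret)
    # compare_digest runs in constant time, so it does not leak where the strings differ
    return hmac.compare_digest(expected, encrypted_doc["mac"])
```
+
+A client following this sketch would refuse to store or synchronize any document for which `verify_mac` returns `False`.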
+
+Document synchronization
+-----------------------------------
+
+Soledad follows the U1DB synchronization protocol, with two changes:
+
+* Soledad adds the ability to flag some documents so they are not synchronized by default.
+* Soledad will refuse to synchronize a document if it is encrypted and the MAC is incorrect.
+
+TO ADD: code examples
+
+Document IDs
+--------------------
+
+Like U1DB, Soledad allows the programmer to use whatever ID they choose for each document. However, it is best practice to let the library choose random IDs for each document so as to ensure you don't leak information. In other words, leave the second argument to `create_doc()` empty.
+
+UNRESOLVED: perhaps Soledad should forbid custom document IDs.
+chiiph: I don't think we should forbid this, it's handy for certain cases and the downside isn't too problematic.
+
+Re-keying
+-----------
+
+Sometimes there is a need to change the `storage_secret`. Rather than re-encrypt every document, Soledad implements a system called "lazy revocation" where a new storage_secret is generated and used for all subsequent encryption. The old storage_secret is still retained and used when decrypting older documents that have not yet been re-encrypted with the new storage_secret.
+
+Implementation status: not yet.
+
+TO DO: code example
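+
+Since re-keying is not yet implemented, the following is only a hypothetical sketch of how lazy revocation could be organized. The `SecretRing` class and its method names are invented for illustration; only the `secret_id = md5(storage_secret)` convention and the `secret_id` document field come from this document.
+
```python
import hashlib


class SecretRing:
    """Hypothetical holder for several storage secrets, keyed by secret_id."""

    def __init__(self, storage_secret: bytes):
        self._secrets = {}
        self.current_id = None
        self.add(storage_secret)

    def add(self, storage_secret: bytes) -> str:
        """Store a secret and make it the one used for all new encryption."""
        secret_id = hashlib.md5(storage_secret).hexdigest()
        self._secrets[secret_id] = storage_secret
        self.current_id = secret_id
        return secret_id

    def secret_for_encryption(self) -> bytes:
        """New and re-encrypted documents always use the newest secret."""
        return self._secrets[self.current_id]

    def secret_for_decryption(self, encrypted_doc: dict) -> bytes:
        """Pick whichever secret the document names, old or new."""
        return self._secrets[encrypted_doc["secret_id"]]

    def needs_rekey(self, encrypted_doc: dict) -> bool:
        """True if the document is still encrypted under a retired secret."""
        return encrypted_doc["secret_id"] != self.current_id
```
+
+Under this arrangement a document is only re-encrypted the next time it is written, at which point `needs_rekey` becomes false for it.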
+
+Authentication
+-----------------------
+
+Unlike U1DB, Soledad only supports token authentication and does not support OAuth. Soledad itself does not handle authentication. Instead, this job is handled by a thin middleware layer running in front of the Soledad server daemon.
+
+Recovery and bootstrap
+------------------------------------------
+
+In order to bootstrap Soledad on a new device, the user only needs their login name and password. Everything else is downloaded from the server.
+
+**Recovery database**
+
+In order to support this functionality, the Soledad client stores a recovery document in a special recovery database. This database is shared among all users.
+
+The recovery database supports two functions:
+
+* `get_doc(doc_id)`
+* `put_doc(doc_id, recovery_document_content)`
+
+**Recovery document**
+
+An example recovery document:
+
+    {
+      "doc_id": "xxxxx",
+      "kdf": "scrypt",
+      "kdf_salt": "400$8$5fb$61b499fe3366d947",
+      "kdf_length": 128,
+      "cipher": "aes128",
+      "soledad": "xxxxx"
+    }
+
+About these fields:
+
+* `doc_id` is determined by the client and computed from `hmac(username@domain, user_password)`.
+* `soledad`: the encrypted `soledad.json`, created by `sym_encrypt(cipher, contents(soledad.json), derived_key)` (base64 encoded).
+* `kdf`: the key derivation function to use. Only scrypt is currently supported (so for now, this value is ignored).
+* `kdf_salt`: the salt used in the kdf. The salt for scrypt is not random, but encodes important parameters like the limits for time and memory.
+* `kdf_length`: the length of the derived key resulting from the kdf.
+* `cipher`: what cipher to use to encrypt `soledad`. It must match kdf_length (i.e. the length of the derived_key).
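+
+As a rough sketch, the recovery `doc_id` could be computed as follows. The hash function (SHA-256), the hex encoding, and the use of the password as the HMAC key (by analogy with `HMAC(doc_id, storage_secret)` elsewhere in this document) are assumptions, and `recovery_doc_id` is a hypothetical helper name.
+
```python
import hashlib
import hmac


def recovery_doc_id(username: str, domain: str, user_password: str) -> str:
    """doc_id = hmac(username@domain, user_password), hex encoded."""
    address = f"{username}@{domain}".encode("utf-8")
    key = user_password.encode("utf-8")
    return hmac.new(key, address, hashlib.sha256).hexdigest()
```
+
+Because the id depends on the password, a new device can locate the recovery document from the login name and password alone, without the server learning which user it belongs to.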
+
+**Authentication**
+
+Like other Soledad functions, access to the recovery database requires token authentication. However, the recovery database is shared among all users. Any user can query for any `doc_id`. The purpose of this is to allow the server to not know which user corresponds to which recovery document.
+
+To mitigate the vulnerability created by this design, responses to queries of the recovery database have a very long delay.
+
+TODO: come up with a better authentication scheme.
+TODO: determine the response delay.
+
+
+Client Reference Implementation
+===================================
+
+Dependencies:
+
+* [U1DB](https://launchpad.net/u1db) provides an API and protocol for synchronised databases of JSON documents.
+* [SQLCipher](http://sqlcipher.net/) provides a block-encrypted SQLite database used for local storage.
+* python-gnupg
+
+Local storage
+--------------------------
+
+The U1DB reference implementation in Python has an SQLite backend that implements the object store API over a common SQLite database residing in a local file. To allow for encrypted local storage, Soledad adds a SQLCipher backend, built on top of U1DB's SQLite backend, which adds the [SQLCipher API](http://sqlcipher.net/sqlcipher-api/) to U1DB.
+
+**Responsibilities**
+
+The SQLCipher backend is responsible for:
+
+* Providing the SQLCipher API for U1DB (`PRAGMA` statements that control encryption parameters).
+* Guaranteeing that the local database used for storage is indeed encrypted.
+* Guaranteeing secure synchronization:
+  * All data being sent to a remote replica is encrypted with a symmetric key before being sent.
+  * All data received from a remote replica is checked to be encrypted with a symmetric key when it arrives, and is then decrypted before being included in the local database replica.
+* Correctly representing and handling new Document properties, such as the sync flag.
+
+The Soledad `storage_key` is used directly as the key for the SQLCipher encryption layer. SQLCipher supports the use of a raw 256-bit key if it is provided as a 64-character hex string. This skips the key derivation step (PBKDF2), which is redundant in our case. For example:
+
+    sqlite> PRAGMA key = "x'2DD29CA851E7B56E4697B0E1F08507293D761A05CE4D1B628663F411A8086D99'";
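+
+For illustration, a statement of this form can be built from 32 raw key bytes as below. `sqlcipher_key_pragma` is a hypothetical helper, not part of Soledad or SQLCipher, and the random key stands in for the real storage key.
+
```python
import os


def sqlcipher_key_pragma(raw_key: bytes) -> str:
    """Format a raw 256-bit key as a SQLCipher PRAGMA key statement."""
    if len(raw_key) != 32:
        raise ValueError("expected a 32-byte (256-bit) key")
    # 32 bytes become the 64-character hex string that SQLCipher expects
    return "PRAGMA key = \"x'%s'\";" % raw_key.hex().upper()


pragma = sqlcipher_key_pragma(os.urandom(32))  # placeholder key for the example
```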
+
+**Classes**
+
+SQLCipher backend classes:
+
+* `SQLCipherDatabase`: An extension of SQLitePartialExpandDatabase used by Soledad Client to store data locally using SQLCipher. It implements the following:
+  * Requires a password to instantiate the db.
+  * Verifies that the db instance is indeed encrypted.
+  * Use a LeapSyncTarget for encrypting content before synchronizing over HTTP.
+  * "Syncable" option for documents (users can mark documents as not syncable, so they do not propagate to the server).
+
+Encrypted synchronization target
+--------------------------------------------------
+
+To allow for database synchronization among devices, Soledad uses the following conventions:
+
+* Centralized synchronization scheme: Soledad clients always sync with a server, and never between themselves.
+* The server stores its database in a CouchDB database using a REST API over HTTP.
+* All data sent to the server is encrypted with a symmetric secret before being sent. Note that this ensures all data received by the server and stored in the CouchDB database has been encrypted by the client.
+* All data received from the server is validated as being an encrypted blob, and then is decrypted before being stored in local database. Note that the local database provides a new encryption layer for the data through SQLCipher.
+
+**Responsibilities**
+
+Provide sync between local and remote replicas:
+
+* Encrypt outgoing content.
+* Decrypt incoming content.
+
+**Classes**
+
+Synchronization-related classes:
+
+* `LEAPDocument`: an extension of `u1db.Document` with methods to:
+  * Return a symmetrically encrypted version of the Document's JSON representation.
+  * Set the document's content by symmetrically decrypting an encrypted JSON representation.
+* `LEAPSyncTarget`: an extension of `HTTPSyncTarget` with the following modified methods:
+  * `sync_exchange`: request encrypted version of Document's content before sending it to the network.
+  * `_parse_sync_stream`: set Document's content based on encrypted version right after it arrives as a response from the network.
+
+Server Reference Implementation
+======================================================
+
+Dependencies:
+
+* [CouchDB](https://couchdb.apache.org/) for server storage, via [python client library](https://pypi.python.org/pypi/CouchDB/0.8).
+* WSGI middleware for authentication.
+* [Twisted](http://twistedmatrix.com/trac/) to run the WSGI application.
+
+CouchDB backend
+-------------------------------
+
+On the server side, Soledad stores its database replicas in CouchDB servers. Soledad's CouchDB backend implementation is built on top of the reference `InMemory` implementation, but forces storage and fetch of U1DB data on a remote couch server for every write and read operation, respectively.
+
+CouchDB backend is responsible for:
+
+* Initializing and maintaining the following U1DB replica data in the database:
+  * Transaction log.
+  * Conflict log.
+  * Synchronization log.
+  * Indexes.
+* Mapping the U1DB API to CouchDB API.
+
+**Classes**
+
+* `CouchDatabase`: A backend used by Soledad Server to store data in CouchDB.
+* `CouchSyncTarget`: Just a target for syncing with Couch database.
+* `CouchServerState`: Interface of the WSGI server with the CouchDB backend.
+
+WSGI Server
+-----------------------------------------
+
+The U1DB server reference implementation provides an HTTP API backed by SQLite databases (of minimal usefulness in a production environment!). Soledad extends this with token-based auth HTTP access to CouchDB databases.
+
+* Soledad makes use of `twistd` from Twisted to serve its WSGI application.
+* Authentication is done by means of a token.
+* Soledad implements a WSGI middleware on the server side that:
+  * Uses the provided token to verify read and write access to each user's private databases and write access to the shared recovery database.
+  * Allows reading from the shared remote recovery database.
+  * Uses CouchDB as its backend.
+
+**Classes**
+
+* `SoledadAuthMiddleware`: implements the WSGI middleware with token-based auth as described above.
+* `SoledadApp`: The WSGI application. For now, not different from `u1db.remote.http_app.HTTPApp`.
+
+**Authentication**
+
+The Soledad Server authentication middleware controls access to each user's private databases and to the shared recovery database. The Soledad client provides a token to the Soledad server, which checks the validity of this token for the user's session by querying a certain database.
+
+A valid token for this user's session is required for:
+
+* Read and write access to this user's database.
+* Read and write access to the shared recovery database.
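+
+The general shape of such a middleware can be sketched in plain WSGI. This is not the actual `SoledadAuthMiddleware`: the class name, the `Authorization: Token <token>` header format, and the `verify_token` callable (standing in for the database lookup described above) are all assumptions made for illustration.
+
```python
class TokenAuthMiddleware:
    """Hypothetical WSGI middleware that rejects requests without a valid token."""

    def __init__(self, app, verify_token):
        self.app = app
        self.verify_token = verify_token  # callable(token) -> bool

    def __call__(self, environ, start_response):
        # Assumed header format: "Authorization: Token <token>"
        header = environ.get("HTTP_AUTHORIZATION", "")
        scheme, _, token = header.partition(" ")
        if scheme.lower() != "token" or not self.verify_token(token):
            body = b'{"error": "unauthorized"}'
            start_response("401 Unauthorized", [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Token accepted: hand the request to the wrapped application
        return self.app(environ, start_response)
```
+
+Keeping the check in a wrapper like this leaves the wrapped application unaware of authentication, which matches the split between the middleware and `SoledadApp` described above.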
+
+Tests
+===================
+
+To be sure the newly implemented backends work correctly, we included in Soledad the U1DB tests that are relevant for the new pieces of code (backends, document, http(s) and sync tests). We also added specific tests for the new functionality we are building.
+
+    salt = SCrypt::Engine.generate_salt(:max_time => 10)
+    SCrypt::Engine.hash_secret "my grand secret", salt
+
diff --git a/docs/en.haml b/docs/en.haml
index 3fd68b7..7f827d6 100644
--- a/docs/en.haml
+++ b/docs/en.haml
@@ -1,2 +1,5 @@
-- @title = "Documentation"
-= act_as('overview')
-\ No newline at end of file
+- @title = "Development"
+
+%h1.first LEAP Software Development
+
+= child_summaries
+\ No newline at end of file
diff --git a/docs/platform/en.md b/docs/platform/en.md
index 5af8190..f287624 100644
--- a/docs/platform/en.md
+++ b/docs/platform/en.md
@@ -1,5 +1,6 @@
 @title = 'LEAP Platform for Service Providers'
 @nav_title = 'Provider Platform'
+@summary = 'Software platform to automate the process of running a communication service provider.'
 @toc = true
 
 If you have ever been a sysadmin for an organization or company that provides communication services to end users, you have probably screamed this many times:
diff --git a/docs/development/en.haml b/docs/source/en.haml
index e21ee10..a2f4e68 100644
--- a/docs/development/en.haml
+++ b/docs/source/en.haml
@@ -1,10 +1,7 @@
-- @title = "Development"
+- @title = "Source Code"
+- @summary = "Submit a pull request today!"
 
-%h1.first IRC
-
-Join us on freenode at #leap-dev.
-
-%h1.topgap Source Code
+%h1.first Source Code
 
 %h3 Client
 
diff --git a/menu.txt b/menu.txt
index aeec69b..5ac3d2b 100644
--- a/menu.txt
+++ b/menu.txt
@@ -1,6 +1,10 @@
 docs
-  overview
-  development
+  design
+    overview
+    nicknym
+    soledad
+#    cuttlefish
+  source
   platform
     quick-start
     guide