Diffstat (limited to 'docs/reference')
-rw-r--r-- docs/reference/auth.rst | 3
-rw-r--r-- docs/reference/client-database.rst | 12
-rw-r--r-- docs/reference/document-encryption.rst | 27
-rw-r--r-- docs/reference/document-sync.rst | 90
-rw-r--r-- docs/reference/server-database.rst | 2
-rw-r--r-- docs/reference/storage-secrets.rst | 37
6 files changed, 168 insertions, 3 deletions
diff --git a/docs/reference/auth.rst b/docs/reference/auth.rst
index 07d8865c..ac63b414 100644
--- a/docs/reference/auth.rst
+++ b/docs/reference/auth.rst
@@ -3,9 +3,6 @@
Authentication
==============
-.. contents::
- :local:
-
Authentication with the Soledad server is made using `Twisted's Pluggable
Authentication system
<https://twisted.readthedocs.io/en/latest/core/howto/cred.html>`_. The
diff --git a/docs/reference/client-database.rst b/docs/reference/client-database.rst
new file mode 100644
index 00000000..d8fd3be7
--- /dev/null
+++ b/docs/reference/client-database.rst
@@ -0,0 +1,12 @@
+.. _client-databases:
+
+Client-side databases
+=====================
+
+Soledad Client uses `SQLCipher <https://www.zetetic.net/sqlcipher/>`_ for
+storing data. The symmetric key used to unlock databases is chosen randomly and
+stored encrypted with the user's passphrase (see :ref:`storage-secrets` for
+more details).
+
+:ref:`Documents <document-encryption>` and :ref:`blobs <blobs>` are stored in
+different databases protected with the same symmetric secret.
diff --git a/docs/reference/document-encryption.rst b/docs/reference/document-encryption.rst
new file mode 100644
index 00000000..724c78d1
--- /dev/null
+++ b/docs/reference/document-encryption.rst
@@ -0,0 +1,27 @@
+.. _document-encryption:
+
+Document encryption
+===================
+
+Before a JSON document is sent to the server, Soledad Client symmetrically
+encrypts it using AES-256 operating in GCM mode. That mode of encryption
+automatically calculates a MAC during block encryption, and so gives Soledad
+the ability to encrypt on the fly while transmitting data to the server.
+Similarly, when downloading a symmetrically encrypted document from the server,
+Soledad Client will decrypt it and verify the MAC tag at the end before
+accepting the document.
+
+Soledad Client will always do *symmetric encryption*. Server-side applications
+can define their own encryption schemes and Soledad Client will not try to
+decrypt in those cases. The symmetric key used to encrypt a document is derived
+from the storage secret and the document id, with HMAC using SHA-256 as a hash
+function.
+
+The calculation of the MAC also takes into account the document revision to
+avoid tampering. Soledad Client will refuse to accept a document if it does not
+include a higher revision. In this way, the server cannot roll back a document
+to an older revision. The server also cannot delete a document, since document
+deletion is handled by removing the document contents, marking it as deleted,
+and incrementing the revision. However, a server can withhold from the client
+new documents and new revisions of a document (including withholding document
+deletion).
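Note on the key derivation described in document-encryption.rst above: a minimal sketch, using only the Python standard library, of deriving a per-document key from the storage secret and the document id with HMAC-SHA256. The function name `derive_doc_key`, the choice of the storage secret as the HMAC key with the document id as the message, and the sample inputs are all assumptions made for illustration; the text only names the inputs and the hash function, so Soledad's actual code may differ.

```python
import hashlib
import hmac


def derive_doc_key(storage_secret: bytes, doc_id: str) -> bytes:
    """Derive a per-document key with HMAC-SHA256.

    Assumption: the storage secret is the HMAC key and the document id
    is the message. The documentation does not state the argument order.
    """
    return hmac.new(
        storage_secret, doc_id.encode("utf-8"), hashlib.sha256
    ).digest()


# Hypothetical inputs, for illustration only.
secret = bytes(range(32))
key_a = derive_doc_key(secret, "doc-a")
key_b = derive_doc_key(secret, "doc-b")
```

A SHA-256 digest is 32 bytes long, which matches the key size of AES-256, and each document id yields a different key from the same storage secret.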
diff --git a/docs/reference/document-sync.rst b/docs/reference/document-sync.rst
new file mode 100644
index 00000000..679e0a6f
--- /dev/null
+++ b/docs/reference/document-sync.rst
@@ -0,0 +1,90 @@
+Document synchronization
+========================
+
+Soledad follows `the U1DB synchronization protocol
+<https://pythonhosted.org/u1db/conflicts.html>`_ with some modifications:
+
+* A synchronization always happens between the Soledad Server and one Soledad
+ Client. Many clients can synchronize with the same server.
+
+* Soledad Client :ref:`always encrypts <document-encryption>` before sending
+ data to the server.
+
+* Soledad Client refuses to receive a document if it is encrypted and the MAC
+ is incorrect.
+
+* Soledad Server doesn't try to decide about document convergence based on the
+ document's content, because the content is client-encrypted.
+
+Synchronization protocol
+------------------------
+
+Synchronization between the Soledad Server and one Soledad Client consists of
+the following steps:
+
+1. The client asks the server for the information it has stored about the last
+ time they synchronized (if ever).
+
+2. The client validates that its information regarding the last synchronization
+ is consistent with the server's information, and raises an error if not.
+ (This could happen for instance if one of the replicas was lost and restored
+ from backup, or if a user inadvertently tries to synchronize a copied
+ database.)
+
+3. The client generates a list of changes since the last change the server
+ knows of.
+
+4. The client checks what the last change it knows about on the server is.
+
+5. If there have been no changes on either side that the other side has not
+ seen, the synchronization stops here.
+
+6. The client encrypts and sends the changed documents to the server, along
+ with the latest change that it knows about on the server.
+
+7. The server processes the changed documents, and records the client's latest
+ change.
+
+8. The server responds with the documents that have changes that the client
+ does not yet know about.
+
+9. The client decrypts and processes the changed documents, and records the
+ server's latest change.
+
+10. If the client has seen no changes unrelated to the synchronization during
+ this whole process, it now sends the server what its latest change is, so
+ that the next synchronization does not have to consider changes that were
+ the result of this one.
+
+Synchronization metadata
+------------------------
+
+The synchronization information stored on each database replica consists of:
+
+* The replica id of the other replica. (This should be a globally unique
+ identifier that distinguishes database replicas from one another.)
+
+* The last known generation and transaction id of the other replica.
+
+* The generation and transaction id of this replica at the time of the most
+ recent successfully completed synchronization with the other replica.
+
+Transactions
+------------
+
+Any change to any document in a database constitutes a transaction. Each
+transaction increases the database generation by 1, and is assigned
+a transaction id, which is meant to be a unique random string paired with each
+generation.
+
+The transaction id can be used to detect the case where replica A and replica
+B have previously synchronized at generation N, and subsequently replica B is
+somehow reverted to an earlier generation (say, a restore from backup, or
+somebody made a copy of the database file of replica B at generation < N, and
+tries to synchronize that), and then new changes are made to it. It could end
+up at generation N again, but with completely different data.
+
+Having random unique transaction ids will allow replica A to detect this
+situation, and refuse to synchronize to prevent data loss. (Lesson to be
+learned from this: do not copy databases around, that is what synchronization
+is for.)
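Note on the "Transactions" section of document-sync.rst above: a toy sketch of how random transaction ids let one replica notice that the other was reverted and then changed again. The `Replica` class, its methods, and `is_consistent` are hypothetical names invented for this illustration; they are not part of the U1DB or Soledad APIs, and the real check happens inside the synchronization protocol.

```python
import uuid


class Replica:
    """Toy replica that only tracks its transaction log."""

    def __init__(self, log=None):
        # log[i] holds the transaction id assigned at generation i + 1.
        self.log = list(log or [])

    def change(self):
        """Record one change: bump the generation, assign a random id."""
        self.log.append("T-" + uuid.uuid4().hex)
        return len(self.log), self.log[-1]

    def trans_id_at(self, generation):
        """Return the transaction id for a generation, or None if unknown."""
        if 1 <= generation <= len(self.log):
            return self.log[generation - 1]
        return None


def is_consistent(known_generation, known_trans_id, other):
    """Compare what we remember about `other` with its current log."""
    if known_generation == 0:
        return True  # never synchronized, nothing to contradict
    return other.trans_id_at(known_generation) == known_trans_id


replica_b = Replica()
for _ in range(3):
    known = replica_b.change()  # replica A ends up remembering generation 3

restored_b = Replica(replica_b.log[:2])  # copy taken at generation 2
restored_b.change()  # back at generation 3, but with a new transaction id
```

Here `is_consistent(*known, replica_b)` holds, while `is_consistent(*known, restored_b)` does not: both are at generation 3, but the transaction ids differ, which is the situation in which replica A refuses to synchronize.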
diff --git a/docs/reference/server-database.rst b/docs/reference/server-database.rst
index 591d688d..d3dfdb5f 100644
--- a/docs/reference/server-database.rst
+++ b/docs/reference/server-database.rst
@@ -13,6 +13,8 @@ the server. Authorization for creating, updating, deleting and retrieving
information about the user database as well as performing synchronization is
handled by the `leap.soledad.server.auth` module.
+.. _shared-database:
+
Shared database
---------------
diff --git a/docs/reference/storage-secrets.rst b/docs/reference/storage-secrets.rst
new file mode 100644
index 00000000..039075f3
--- /dev/null
+++ b/docs/reference/storage-secrets.rst
@@ -0,0 +1,37 @@
+.. _storage-secrets:
+
+Storage secrets
+===============
+
+Soledad randomly generates secrets that are used to derive encryption keys for
+protecting all data that is stored in the server and in the local storage.
+These secrets are themselves encrypted using a key derived from the user’s
+passphrase, and saved locally on disk.
+
+The encrypted secrets are stored in a local file as a JSON structure that
+looks like this::
+
+    encrypted = {
+        'version': 2,
+        'kdf': 'scrypt',
+        'kdf_salt': <base64 encoded salt>,
+        'kdf_length': <the length of the derived key>,
+        'cipher': <a code indicating the cipher used for encryption>,
+        'length': <the length of the plaintext>,
+        'iv': <the initialization vector>,
+        'secrets': <base64 encoding of ciphertext>,
+    }
+
+When a client application first wants to use Soledad, it must provide the
+user’s passphrase to unlock the storage secrets. Currently, the storage secrets
+are shared among all devices with access to a particular user's Soledad
+database.
+
+The storage secrets are currently backed up in the provider (encrypted with the
+user's passphrase) for the case where the user loses or resets her device (see
+:ref:`shared-database` for more information). There are plans to make this
+feature optional, allowing for less trust in the provider while increasing the
+responsibility of the user.
+
+If the user loses her passphrase, there is currently no way of recovering her
+data.
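Note on the structure shown in storage-secrets.rst above: a minimal sketch of deriving the unlock key from the passphrase with scrypt, reading `kdf_salt` and `kdf_length` from the stored structure. The function name `derive_unlock_key`, the scrypt cost parameters (`n`, `r`, `p`), the salt size, and the partial `header` dictionary are assumptions made for illustration; the documentation does not specify them. Decrypting the `secrets` field is left out because it depends on the cipher code, which is not described here.

```python
import base64
import hashlib
import os


def derive_unlock_key(passphrase: str, header: dict) -> bytes:
    """Derive the key that protects the storage secrets.

    Assumption: the scrypt cost parameters below are illustrative
    defaults, not values taken from Soledad.
    """
    salt = base64.b64decode(header["kdf_salt"])
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2 ** 14,
        r=8,
        p=1,
        dklen=header["kdf_length"],
    )


# Hypothetical, partial header holding only the KDF-related fields.
header = {
    "version": 2,
    "kdf": "scrypt",
    "kdf_salt": base64.b64encode(os.urandom(16)).decode("ascii"),
    "kdf_length": 32,
}
key = derive_unlock_key("correct horse", header)
```

Because the salt is stored next to the ciphertext, any device that knows the passphrase can recompute the same key, while a different passphrase yields an unrelated key.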