Diffstat (limited to 'docs/reference')
-rw-r--r--  docs/reference/auth.rst                 |  3
-rw-r--r--  docs/reference/client-database.rst      | 12
-rw-r--r--  docs/reference/document-encryption.rst  | 27
-rw-r--r--  docs/reference/document-sync.rst        | 90
-rw-r--r--  docs/reference/server-database.rst      |  2
-rw-r--r--  docs/reference/storage-secrets.rst      | 37
6 files changed, 168 insertions, 3 deletions
diff --git a/docs/reference/auth.rst b/docs/reference/auth.rst
index 07d8865c..ac63b414 100644
--- a/docs/reference/auth.rst
+++ b/docs/reference/auth.rst
@@ -3,9 +3,6 @@
 Authentication
 ==============
 
-.. contents::
-   :local:
-
 Authentication with the Soledad server is made using `Twisted's Pluggable
 Authentication system
 <https://twisted.readthedocs.io/en/latest/core/howto/cred.html>`_. The
diff --git a/docs/reference/client-database.rst b/docs/reference/client-database.rst
new file mode 100644
index 00000000..d8fd3be7
--- /dev/null
+++ b/docs/reference/client-database.rst
@@ -0,0 +1,12 @@
+.. _client-databases:
+
+Client-side databases
+=====================
+
+Soledad Client uses `SQLCipher <https://www.zetetic.net/sqlcipher/>`_ for
+storing data. The symmetric key used to unlock databases is chosen randomly and
+stored encrypted with the user's passphrase (see :ref:`storage-secrets` for
+more details).
+
+:ref:`Documents <document-encryption>` and :ref:`blobs <blobs>` are stored in
+different databases protected with the same symmetric secret.
diff --git a/docs/reference/document-encryption.rst b/docs/reference/document-encryption.rst
new file mode 100644
index 00000000..724c78d1
--- /dev/null
+++ b/docs/reference/document-encryption.rst
@@ -0,0 +1,27 @@
+.. _document-encryption:
+
+Document encryption
+===================
+
+Before a JSON document is sent to the server, Soledad Client symmetrically
+encrypts it using AES-256 operating in GCM mode. That mode of encryption
+automatically calculates a MAC during block encryption, and so gives Soledad
+the ability to encrypt on the fly while transmitting data to the server.
+Similarly, when downloading a symmetrically encrypted document from the server,
+Soledad Client will decrypt it and verify the MAC tag at the end before
+accepting the document.
+
+Soledad Client will always do *symmetric encryption*. Server-side applications
+can define their own encryption schemes and Soledad Client will not try to
+decrypt in those cases. The symmetric key used to encrypt a document is derived
+from the storage secret and the document id, with HMAC using SHA-256 as a hash
+function.
+
+The calculation of the MAC also takes into account the document revision to
+avoid tampering. Soledad Client will refuse to accept a document if it does not
+include a higher revision. In this way, the server cannot roll back a document
+to an older revision. The server also cannot delete a document, since document
+deletion is handled by removing the document contents, marking it as deleted,
+and incrementing the revision. However, a server can withhold from the client
+new documents and new revisions of a document (including withholding document
+deletion).
diff --git a/docs/reference/document-sync.rst b/docs/reference/document-sync.rst
new file mode 100644
index 00000000..679e0a6f
--- /dev/null
+++ b/docs/reference/document-sync.rst
@@ -0,0 +1,90 @@
+Document synchronization
+========================
+
+Soledad follows `the U1DB synchronization protocol
+<https://pythonhosted.org/u1db/conflicts.html>`_ with some modifications:
+
+* A synchronization always happens between the Soledad Server and one Soledad
+  Client. Many clients can synchronize with the same server.
+
+* Soledad Client :ref:`always encrypts <document-encryption>` before sending
+  data to the server.
+
+* Soledad Client refuses to receive a document if it is encrypted and the MAC
+  is incorrect.
+
+* Soledad Server doesn't try to decide about document convergence based on the
+  document's content, because the content is client-encrypted.
+
+Synchronization protocol
+------------------------
+
+Synchronization between the Soledad Server and one Soledad Client consists of
+the following steps:
+
+1. The client asks the server for the information it has stored about the last
+   time they have synchronized (if ever).
+
+2. The client validates that its information regarding the last synchronization
+   is consistent with the server's information, and raises an error if not.
+   (This could happen for instance if one of the replicas was lost and restored
+   from backup, or if a user inadvertently tries to synchronize a copied
+   database.)
+
+3. The client generates a list of changes since the last change the server
+   knows of.
+
+4. The client checks what the last change is it knows about on the server.
+
+5. If there have been no changes on either side that the other side has not
+   seen, the synchronization stops here.
+
+6. The client encrypts and sends the changed documents to the server, along
+   with what the latest change is that it knows about on the server.
+
+7. The server processes the changed documents, and records the client's latest
+   change.
+
+8. The server responds with the documents that have changes that the client
+   does not yet know about.
+
+9. The client decrypts and processes the changed documents, and records the
+   server's latest change.
+
+10. If the client has seen no changes unrelated to the synchronization during
+    this whole process, it now sends the server what its latest change is, so
+    that the next synchronization does not have to consider changes that were
+    the result of this one.
+
+Synchronization metadata
+------------------------
+
+The synchronization information stored on each database replica consists of:
+
+* The replica id of the other replica. (This should be a globally unique
+  identifier to distinguish database replicas from one another.)
+
+* The last known generation and transaction id of the other replica.
+
+* The generation and transaction id of this replica at the time of the most
+  recent successfully completed synchronization with the other replica.
+
+Transactions
+------------
+
+Any change to any document in a database constitutes a transaction. Each
+transaction increases the database generation by 1, and is assigned
+a transaction id, which is meant to be a unique random string paired with each
+generation.
+
+The transaction id can be used to detect the case where replica A and replica
+B have previously synchronized at generation N, and subsequently replica B is
+somehow reverted to an earlier generation (say, a restore from backup, or
+somebody made a copy of the database file of replica B at generation < N, and
+tries to synchronize that), and then new changes are made to it. It could end
+up at generation N again, but with completely different data.
+
+Having random unique transaction ids will allow replica A to detect this
+situation, and refuse to synchronize to prevent data loss. (Lesson to be
+learned from this: do not copy databases around, that is what synchronization
+is for.)
diff --git a/docs/reference/server-database.rst b/docs/reference/server-database.rst
index 591d688d..d3dfdb5f 100644
--- a/docs/reference/server-database.rst
+++ b/docs/reference/server-database.rst
@@ -13,6 +13,8 @@ the server. Authorization for creating, updating, deleting and retrieving
 information about the user database as well as performing synchronization is
 handled by the `leap.soledad.server.auth` module.
 
+.. _shared-database:
+
 Shared database
 ---------------
 
diff --git a/docs/reference/storage-secrets.rst b/docs/reference/storage-secrets.rst
new file mode 100644
index 00000000..039075f3
--- /dev/null
+++ b/docs/reference/storage-secrets.rst
@@ -0,0 +1,37 @@
+.. _storage-secrets:
+
+Storage secrets
+===============
+
+Soledad randomly generates secrets that are used to derive encryption keys for
+protecting all data that is stored in the server and in the local storage.
+These secrets are themselves encrypted using a key derived from the user’s
+passphrase, and saved locally on disk.
+
+The encrypted secrets are stored in a local file in a JSON
+structure that looks like this::
+
+    encrypted = {
+        'version': 2,
+        'kdf': 'scrypt',
+        'kdf_salt': <base64 encoded salt>,
+        'kdf_length': <the length of the derived key>,
+        'cipher': <a code indicating the cipher used for encryption>,
+        'length': <the length of the plaintext>,
+        'iv': <the initialization vector>,
+        'secrets': <base64 encoding of ciphertext>,
+    }
+
+When a client application first wants to use Soledad, it must provide the
+user’s password to unlock the storage secrets. Currently, the storage secrets
+are shared among all devices with access to a particular user's Soledad
+database.
+
+The storage secrets are currently backed up in the provider (encrypted with the
+user's passphrase) for the case where the user loses or resets her device (see
+:ref:`shared-database` for more information). There are plans to make this
+feature optional, allowing for less trust in the provider while increasing the
+responsibility of the user.
+
+If the user loses her passphrase, there is currently no way of recovering her
+data.
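
Illustrative note on the key handling described in storage-secrets.rst and document-encryption.rst: the sketch below shows, using only the Python standard library, how a key could be derived from the passphrase with scrypt according to the `kdf`, `kdf_salt` and `kdf_length` fields, and how a per-document key could be derived from a storage secret and a document id with HMAC-SHA256. It is not taken from the Soledad code base. The function names, the scrypt cost parameters, the reading of `kdf_length` as a byte count, and the choice of the secret as HMAC key with the document id as message are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import os

# Hypothetical scrypt cost parameters: the documentation does not state them.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2 ** 14, 8, 1


def derive_passphrase_key(passphrase, metadata):
    """Derive a key from the passphrase using the KDF fields of the metadata.

    Assumes 'kdf_length' is a length in bytes.
    """
    if metadata['kdf'] != 'scrypt':
        raise ValueError('unsupported kdf: %r' % (metadata['kdf'],))
    salt = base64.b64decode(metadata['kdf_salt'])
    return hashlib.scrypt(
        passphrase.encode('utf-8'),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P,
        dklen=metadata['kdf_length'])


def derive_document_key(storage_secret, doc_id):
    """Derive a 32-byte per-document key with HMAC-SHA256.

    Assumption: the storage secret is the HMAC key and the document id is
    the message; the documentation only says both are inputs.
    """
    return hmac.new(storage_secret, doc_id.encode('utf-8'),
                    hashlib.sha256).digest()


# Only the KDF-related fields of the structure are needed for this sketch.
metadata = {
    'version': 2,
    'kdf': 'scrypt',
    'kdf_salt': base64.b64encode(os.urandom(16)).decode('ascii'),
    'kdf_length': 32,
}
passphrase_key = derive_passphrase_key('example passphrase', metadata)

# Stand-in for a decrypted storage secret; real secrets come from the file.
storage_secret = os.urandom(64)
doc_key = derive_document_key(storage_secret, 'doc-1')
```

Because the document id is an input to the derivation, each document gets its own key while only the storage secret has to be kept and backed up.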