Document attachments
====================

.. contents:: Contents:
   :local:

Reasoning
---------

The content of a Soledad document is `JSON <http://www.json.org/>`_,
which is well suited to efficient lookup and indexing. JSON is, however, a
poor fit for storing larger amounts of binary data, because:

* JSON can only hold binary data as a Unicode string, which takes more space
  than the raw bytes themselves.

* upon synchronization, the content of a Soledad document needs to be
  completely transferred and decrypted before the document is available for
  use.
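
The space overhead mentioned above can be measured with a minimal sketch. It
assumes the binary payload is base64-encoded before being placed in the JSON
string, which is one common choice and not necessarily what any particular
application does; the ``json_overhead`` helper is hypothetical.

.. code-block:: python

    import base64
    import json
    import os


    def json_overhead(payload):
        """Return (raw_size, json_size) for a binary payload stored in JSON."""
        # JSON has no binary type, so the bytes are base64-encoded (an
        # assumption of this sketch) into an ASCII string first.
        encoded = base64.b64encode(payload).decode('ascii')
        document = json.dumps({'data': encoded})
        return len(payload), len(document.encode('utf-8'))


    raw_size, json_size = json_overhead(os.urandom(3 * 1024))
    # base64 emits 4 characters for every 3 input bytes, so the JSON
    # document is roughly a third larger than the raw payload.
    print(raw_size, json_size)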

Document attachments were introduced to store large payloads of binary data
efficiently, and to make a document's content available without waiting for
its attachment to be transferred.

Client-side
-----------

In the client, attachments are stored as (SQLite) BLOBs in a separate SQLCipher
database. Before data is sent to the server, it is encrypted in the same way
as Soledad documents' content during the usual synchronization process
(AES-256 in GCM mode).

See :ref:`client-side-attachment-api` for reference.
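
To illustrate what storing an attachment as a BLOB looks like, here is a
minimal sketch using the standard library ``sqlite3`` module. It uses a plain,
unencrypted in-memory database as a stand-in for SQLCipher, and the table
layout and the ``put_blob``/``get_blob`` helpers are hypothetical, not the
schema or API that Soledad uses.

.. code-block:: python

    import sqlite3

    # Plain in-memory SQLite stands in for the encrypted SQLCipher database.
    conn = sqlite3.connect(':memory:')
    # Hypothetical schema: one row per attachment, keyed by an identifier.
    conn.execute(
        'CREATE TABLE blobs (blob_id TEXT PRIMARY KEY, payload BLOB)')


    def put_blob(blob_id, data):
        """Store raw bytes under blob_id, replacing any previous payload."""
        with conn:  # commits on success, rolls back on error
            conn.execute(
                'INSERT OR REPLACE INTO blobs VALUES (?, ?)',
                (blob_id, sqlite3.Binary(data)))


    def get_blob(blob_id):
        """Return the stored bytes, or None if blob_id is unknown."""
        row = conn.execute(
            'SELECT payload FROM blobs WHERE blob_id = ?',
            (blob_id,)).fetchone()
        return None if row is None else bytes(row[0])


    put_blob('hackers', b'\x00\x01binary payload\xff')
    assert get_blob('hackers') == b'\x00\x01binary payload\xff'

Unlike a JSON string, the BLOB column holds the bytes as they are, with no
text encoding step.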

Usage example
^^^^^^^^^^^^^

The attachments API is currently available in the `Document` class, and the
document needs to know about the store to be able to manage attachments. A
document created through Soledad already knows about the store that created
it, and can put, get or delete an attachment:

.. code-block:: python

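    from twisted.internet.defer import inlineCallbacks

    # Assumed import location for the state flags; adjust it if your
    # Soledad version exposes them elsewhere.
    from leap.soledad.client._document import AttachmentStates


    @inlineCallbacks
    def attachment_example(soledad):
        doc = yield soledad.create_doc({})

        # a new document has no attachment and no unsaved changes
        state = yield doc.get_attachment_state()
        dirty = yield doc.is_dirty()
        assert state == AttachmentStates.NONE
        assert not dirty

        # attaching a file makes the attachment available locally and
        # marks the document as dirty until it is saved
        with open('hackers.txt', 'rb') as fd:
            yield doc.put_attachment(fd)
        state = yield doc.get_attachment_state()
        dirty = yield doc.is_dirty()
        assert state & AttachmentStates.LOCAL
        assert dirty

        # saving the document clears the dirty flag
        yield soledad.put_doc(doc)
        dirty = yield doc.is_dirty()
        assert not dirty

        # after the upload, the attachment also exists on the server
        yield doc.upload_attachment()
        state = yield doc.get_attachment_state()
        assert state & AttachmentStates.REMOTE
        assert state == AttachmentStates.SYNCED

        # the attachment is read back as a file-like object
        fd = yield doc.get_attachment()
        with open('hackers.txt', 'rb') as original:
            assert fd.read() == original.read()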

Server-side
-----------

In the server, a simple REST API is served by a `Twisted Resource
<https://twistedmatrix.com/documents/current/api/twisted.web.resource.Resource.html>`_
and attachments are stored in the filesystem as they come in without
modification.

Listing, getting, putting and deleting attachments requires a token, which
has to be sent in an HTTP ``Authorization`` header, as in::

    Authorization: Token <base64-encoded uuid:token>

Check out the :ref:`server-side-attachments-rest-api` for more information on
how to interact with the server using HTTP.
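
As a minimal sketch, the ``Authorization`` header shown above could be built
as follows. The ``token_auth_header`` helper and the sample values are
hypothetical; only the ``Token <base64-encoded uuid:token>`` layout comes from
the description above.

.. code-block:: python

    import base64


    def token_auth_header(uuid, token):
        """Build the Authorization header for the given uuid and token."""
        # The credentials are the uuid and token joined by a colon...
        credentials = '{}:{}'.format(uuid, token).encode('utf-8')
        # ...then base64-encoded and prefixed with the "Token" scheme.
        encoded = base64.b64encode(credentials).decode('ascii')
        return {'Authorization': 'Token ' + encoded}


    # Placeholder values, for illustration only.
    headers = token_auth_header('user-uuid', 'secret-token')

The resulting dictionary can be passed as the request headers of whichever
HTTP client is used to talk to the server.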

The :ref:`IBlobsBackend <i-blobs-backend>` interface is provided so that other
ways of storing attachments on the server side (third-party storage, for
example) can be added in the future. Currently, the
:ref:`FilesystemBlobsBackend <filesystem-blobs-backend>` is the only backend
that implements that interface.

Some characteristics of the :ref:`FilesystemBlobsBackend
<filesystem-blobs-backend>` are:

* Configurable storage path.
* Quota support.
* Username, blob_id and user storage directory sanitization.
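
The characteristics listed above can be illustrated with a rough sketch of a
filesystem store. This is not the ``FilesystemBlobsBackend`` implementation:
the ``SketchBlobStore`` class, its method names and the ``QuotaExceeded``
exception are hypothetical, and the checks are deliberately simplified.

.. code-block:: python

    import os
    import tempfile


    class QuotaExceeded(Exception):
        """Raised when a write would exceed the user's quota."""


    class SketchBlobStore(object):
        """Hypothetical store; not the real FilesystemBlobsBackend."""

        def __init__(self, base_path, quota_bytes):
            # configurable storage path and per-user quota
            self.base_path = os.path.realpath(base_path)
            self.quota_bytes = quota_bytes

        def _child(self, parent, name):
            # Sanitization: the resolved path must be a direct child of
            # parent, which rejects '..', separators and absolute paths.
            path = os.path.realpath(os.path.join(parent, name))
            if os.path.dirname(path) != parent:
                raise ValueError('unsafe path component: %r' % (name,))
            return path

        def write_blob(self, user, blob_id, data):
            user_dir = self._child(self.base_path, user)
            path = self._child(user_dir, blob_id)
            os.makedirs(user_dir, exist_ok=True)
            # Quota: sum the sizes of the blobs the user already stores.
            used = sum(
                os.path.getsize(os.path.join(user_dir, name))
                for name in os.listdir(user_dir))
            if used + len(data) > self.quota_bytes:
                raise QuotaExceeded('quota exceeded for %s' % (user,))
            with open(path, 'wb') as blob_file:
                blob_file.write(data)


    store = SketchBlobStore(tempfile.mkdtemp(), quota_bytes=10)
    store.write_blob('alice', 'blob1', b'12345678')

With a quota of 10 bytes, a second write of 3 bytes for the same user raises
``QuotaExceeded``, and a ``blob_id`` such as ``'../evil'`` is rejected with a
``ValueError`` before anything is written.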