summaryrefslogtreecommitdiff
path: root/common/src/leap/soledad
AgeCommit message (Collapse)Author
2017-06-24[pkg] unify client and server into a single python packagedrebs
We have been discussing about this merge for a while. Its main goal is to simplify things: code navigation, but also packaging. The rationale is that the code is more cohesive in this way, and there's only one source package to install. Dependencies that are only for the server or the client will not be installed by default, and they are expected to be provided by the environment. There are setuptools extras defined for the client and the server. Debianization is still expected to split the single source package into 3 binaries. Another avantage is that the documentation can now install a single package with a single step, and therefore include the docstrings into the generated docs. - Resolves: #8896
2017-05-08[refactor] _database -> _dbVictor Shyba
2017-05-01[refactor] create client _database moduledrebs
2017-03-17[refactor] Improve python3 compatibilityefkin
With this commit all tests on py34 tox environment are collected.
2017-02-09[refactor] remove leftover code from previous wsgi authdrebs
2017-02-09[feat] use twisted web http auth and credsdrebs
2017-02-09[feat] reuse the url mapper instead of creating it for every requestdrebs
2016-12-12[style] fix pep8 and confsVictor Shyba
Fixes setup.cfg, adding current exclude rules, simplified tox.ini to use setup.cfg and fixed all.
2016-12-12[feature] batch based on payload sizeVictor Shyba
batch is slower than usual insert for a single doc, so, if a document exceeds the buffer, commit the batch (if any) and put the huge load by traditional insert. refactor coming.
2016-12-12[feature] fix and enable batchVictor Shyba
Batching is now decided by server, this commits enables it.
2016-12-12[feature] make reading attachments optionalVictor Shyba
Will put a file object on doc json string if read_content is False, otherwise it will fetch and fill as usual. This is useful for improving server througput on sync download stream by receiving a bulk-get without attachments and consume the file-objects as they come.
2016-12-12[feature] server download stream from file objectVictor Shyba
couchdb lib returns a file object representing the attachment. This commit dumps the read() call into the wsgi write() call. Doc representation uses 2 lines also, separating metadata from content.
2016-12-12[feature] use transactions on syncVictor Shyba
We were using 1 transaction per doc, which is bad. Reference: http://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite Code now uses 1 transaction for the whole sync.
2016-12-12[feature] get attachments as generator runsVictor Shyba
Instead of getting the attachments as the generator runs, get_docs will now get as needed. Also, deepcopy solves a memory issue where we were feeding the couchdb lib view with blobs while modifying it unintentionally.
2016-11-27[bug] patch twisted logger so it works with twistd --syslogdrebs
2016-11-22[feat] improve missing couch config doc error loggingdrebs
2016-10-21[docs] explain CouchServerState parametersVictor Shyba
create_cmd lacked an explanation and check_schema_versions lacked reasoning on why it defaults to False.
2016-10-21[tests] make check_schema_versions default to FalseVictor Shyba
CouchServerState is spread across test codebase and this option is intended to be used only on server startup. This commit makes it default to False and explicitly set it to True on where it's necessary.
2016-10-03[feature] check for user dbs couch schema versionsdrebs
2016-09-30[test] add flake8 code check and generalize name of tox envdrebs
2016-09-22[feat] centralize logging and use twisted.logger by defaultdrebs
2016-08-29[pkg] remove leftover simplejson imports from l2dbdrebs
2016-08-17[bug] remove misleading ensure_ddocVictor Shyba
ensure_ddoc doesnt make sense anymore as we dont have any ddoc other than _security, which has its own method for setting. 'ensure_security' is explicit and is set internally when user is creating a database, otherwise it will be False as it's only used during creation. This isn't exposed externally (of couch module) to avoid confusion. This confusion was making create-user-db fail to create a security ddoc as it wasn't passing ensure_ddocs=True. -- Resolves: #8388
2016-08-05[bug] create gen document after saving the actual document in couchdrebs
If we create the gen document before saving the actual document in couch, we may run into problems if more than one client is syncing and trying to save documents with the same id at the same time. By moving the gen document creation to after the actual document save in couch, we rely on couch/u1db resolution of conflicts before actually allocating a new generation, and the problem above doesn't occur.
2016-08-01[bug] retry allocation of gen instead of using a lockdrebs
The use of a lock to allocate the next generation of a change in couch backend suffers from at least 2 problems: 1. all modification to the couch database would have to be made through a soledad server entrypoint, otherwise the lock would have no effect. 2. introducing a lock makes code uglier, harder to debug, and prone to undesired blocks. The solution implemented by this commit is not so elegant, but works for what we need right now. Now, concurrent threads updating the couch database will race for the allocation of a new generation, and retry when they fail to do so. There's no high risk of getting blocked for too much time in the while loop because (1) there's always one thread that wins (what makes the expected number of retries to be N/2 if N is the number of concurrent threads), and (2) the number of concurrent attempts to update the user database is limited by the number of devices syncing at the same time.
2016-08-01[feat] use couch _all_docs for get_docs() and get_all_docs()drebs
The previous solution would make use of concurrent get's to couch backend in a pool of threads to implement the get_docs() and get_all_docs() CouchDatabase backend methods. This commit replaces those by a simpler implementation use the `_all_docs` couchdb view api. It passes all needed IDs to the view and r etrieves all documents with content in the same request. A comparison between both implementations shows an improvement of at least 15 times for large number of documents. The table below shows the time for different implementations of get_all_docs() for different number of documents and threads versus _all_docs implementation: +-------+-----------------+------------------+-------------+ | | threads | _all_docs | improvement | +-------+-----------------+------------------+-------------+ | 10 | 0.0728030204773 | 0.00782012939453 | 9.3 | | 100 | 0.609349966049 | 0.0377721786499 | 16.1 | | 1000 | 5.86522197723 | 0.370730876923 | 15.8 | | 10000 | 66.1713931561 | 3.61764383316 | 18.3 | +-------+-----------------+------------------+-------------+
2016-08-01[refactor] simplify couch whats_changed calculationdrebs
2016-08-01[bug] use couch lock to atomize saving of documentdrebs
2016-08-01[feat] standardize metadata storage in couch backend.drebs
2016-08-01[feat] use a lock for updating couch gen datadrebs
2016-08-01[bug] fix order of multipart serialization when writing to couchdrebs
The couch backend makes use of attachments and multipart structure for writing the document to the couch database. For that to work, the order in which attachments are described must match the actual order in which attachments are written to the couch http stream. This was not being properly taken care of, and eventually the json serializer was arbitrarilly ordering the attachments description in a way that it didn't match the actual order of attachments writing. This commit fixes that by using json.dumps() sort_keys parameter and making sure conflicts are always written before content.
2016-08-01[feat] remove usage of design documents in couchdrebs
Design documents are slow and we already have alternatives to all uses we used to make of them, so this commit completelly removes all usage of design documents.
2016-07-25[feat] do not use couch views for sync metadatadrebs
When compared to plain couch document get, the use of the simplest view functions takes around double the time, while the use of the simplest list function can take more than 8 times: get 100 docs: total: 0.440337 secs mean: 0.004403 query 100 views: total: 0.911425 secs mean: 0.009114 query 100 lists: total: 3.711537 secs mean: 0.037115 Besides that, the current implementation of sync metadata storage over couch is dependent of timestamps of document puts, what can lead to metadata corruption if the clock of the system is changed for any reason. Because of these reasons, we seek to change the implementation of database metadata. This commit implements the storage of transaction log data on couch documents with special ids, in the form "gen-xxxxxxxxxx", where the x's are replaced by the generation index. Each generation document holds a dictionary containing the generation, doc_id and transaction_id for the changed document. For each modified document, a generation document is inserted holding the transaction metadata.
2016-07-25[feat] use _local couch docs for metadata storagedrebs
2016-07-13[style] pep8Kali Kaneko
2016-07-12[pkg] remove unneeded oauth codedrebs
2016-07-12[test] toxify testsdrebs
- move tests to root directory - split tests in different subdirectories - setup a small package with common test dependencies in /testing/test_soledad - add tox.ini that will: - install the test_soledad package and other test dependencies - install soledad common, client, server from the repository - run tests contianed in /testing/tests directory using pytest This commit also removes all oauth code from tests, as we have removed the u1db dependency (by importing it into the repo and naming it l2db) and don't neet oauth at all right now.
2016-07-12[refactor] make tests use l2db submoduleKali Kaneko
From this moment on, we embed a fork of u1db called l2db.
2016-07-12[refactor] fork u1dbKali Kaneko
2016-06-22pep8Kali Kaneko
2016-06-22[bug] fix test processing orderNavaL
This moves the reactor time to the loopingcall period. This is necessary as the decryption is now deferred to a thread. The test will exit before the task is executed otherwise.
2016-06-22[style] pep8 compatibility: indent and white spaceNavaL
It was breaking E126 and E202 before
2016-06-08[tests] avoid using get_all_docs on assertsVictor Shyba
EncryptedSyncTestCase.test_sync_very_large_files is still getting an excessive amount of memory on very slow machines (specially on old spinning magnetic disks). This commit checks each doc at a time instead of getting them all. More refinement is necessary for this test to pass on any machine.
2016-06-06[style] remove misused lambdasTulio Casagrande
Pep8 was warning about assignment of lambdas. These lambdas should be partials
2016-06-06[test] use inline deferreds in test for change passphrasedrebs
2016-06-06[test] turn test for _gen_secret into many unit testsdrebs
2016-06-06[bug] ensures docs_received table has the sync_id columnNavaL
For the case where the user already has data synced, this commit will migrate the docs_received table to have the column sync_id. That is required by the refactoring in the previous commits.
2016-06-06[refactor] encdecpool queries and testingVictor Shyba
This commit adds tests for doc ordering and encdecpool control (start/stop). Also optimizes by deleting in batch and checking for a sequence in memory before asking the local staging for documents.
2016-05-23[refactor] remove user_id argument from Soledad initCaio Carrara
The constructor method of Soledad was receiving two arguments for user id. One of them was optional with None as default. It could cause an inconsistent state with uuid set but userid unset. This change remove the optional user_id argument from initialization method and return the uuid if anyone call Soledad.userid method.
2016-05-16[style] pep8Kali Kaneko