soledad.git - [soledad]

Age	Commit message	Author
2016-12-12	[feature] use transactions on sync	Victor Shyba
	We were using one transaction per doc, which is slow for bulk inserts in SQLite. Reference: http://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite The code now uses a single transaction for the whole sync.
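	The idea can be illustrated with a minimal sketch using the standard-library sqlite3 module. The table name, columns, and helper function below are hypothetical and are not taken from the Soledad code; the sketch only shows many inserts sharing one transaction.

```python
import sqlite3


def insert_docs_single_transaction(conn, docs):
    """Insert all docs inside one transaction instead of one per doc."""
    # Using the connection as a context manager commits once on success
    # and rolls the whole batch back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO docs (doc_id, content) VALUES (?, ?)", docs)


# Hypothetical schema, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, content TEXT)")

docs = [("doc-%d" % i, "{}") for i in range(1000)]
insert_docs_single_transaction(conn, docs)
count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```

	A side effect of this design is that a failure in the middle of the batch leaves none of its rows behind.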
2016-12-12	[feature] get attachments as generator runs	Victor Shyba
	Instead of getting the attachments as the generator runs, get_docs will now get them as needed. Also, deepcopy solves a memory issue where we were feeding the couchdb lib view with blobs while modifying it unintentionally.
2016-11-27	[bug] patch twisted logger so it works with twistd --syslog	drebs

2016-11-22	[feat] improve missing couch config doc error logging	drebs

2016-10-21	[docs] explain CouchServerState parameters	Victor Shyba
	create_cmd lacked an explanation and check_schema_versions lacked reasoning on why it defaults to False.
2016-10-21	[tests] make check_schema_versions default to False	Victor Shyba
	CouchServerState is spread across the test codebase, and this option is intended to be used only on server startup. This commit makes it default to False and explicitly sets it to True where necessary.
2016-10-03	[feature] check for user dbs couch schema versions	drebs

2016-09-30	[test] add flake8 code check and generalize name of tox env	drebs

2016-09-22	[feat] centralize logging and use twisted.logger by default	drebs

2016-08-29	[pkg] remove leftover simplejson imports from l2db	drebs

2016-08-17	[bug] remove misleading ensure_ddoc	Victor Shyba
	ensure_ddoc doesn't make sense anymore, as we don't have any ddoc other than _security, which has its own method for setting. 'ensure_security' is explicit and is set internally when the user is creating a database; otherwise it will be False, as it's only used during creation. This isn't exposed externally (outside the couch module) to avoid confusion. This confusion was making create-user-db fail to create a security ddoc, as it wasn't passing ensure_ddocs=True. -- Resolves: #8388
2016-08-05	[bug] create gen document after saving the actual document in couch	drebs
	If we create the gen document before saving the actual document in couch, we may run into problems if more than one client is syncing and trying to save documents with the same id at the same time. By moving the gen document creation to after the actual document save in couch, we rely on couch/u1db resolution of conflicts before actually allocating a new generation, and the problem above doesn't occur.
2016-08-01	[bug] retry allocation of gen instead of using a lock	drebs
	The use of a lock to allocate the next generation of a change in the couch backend suffers from at least 2 problems: 1. all modifications to the couch database would have to be made through a soledad server entrypoint, otherwise the lock would have no effect. 2. introducing a lock makes the code uglier, harder to debug, and prone to undesired blocks. The solution implemented by this commit is not so elegant, but works for what we need right now. Now, concurrent threads updating the couch database will race for the allocation of a new generation, and retry when they fail to do so. There's no high risk of getting blocked for too long in the while loop because (1) there's always one thread that wins (which makes the expected number of retries N/2 if N is the number of concurrent threads), and (2) the number of concurrent attempts to update the user database is limited by the number of devices syncing at the same time.
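	The race-and-retry approach described above can be sketched as follows. This is only an illustration under stated assumptions: FakeStore, Conflict, and allocate_gen are hypothetical names, and the in-memory store merely emulates a database that atomically rejects a write to an id that already exists.

```python
import threading


class Conflict(Exception):
    """Raised when a generation is already taken (hypothetical)."""


class FakeStore(object):
    """In-memory stand-in for a database that rejects duplicate ids."""

    def __init__(self):
        self._docs = {}
        # This mutex only emulates the atomicity of a single server-side
        # write; it is not a lock around the whole allocation.
        self._mutex = threading.Lock()

    def last_gen(self):
        with self._mutex:
            return max(self._docs) if self._docs else 0

    def put_new(self, gen, data):
        with self._mutex:
            if gen in self._docs:
                raise Conflict(gen)
            self._docs[gen] = data


def allocate_gen(store, data):
    """Race for the next generation and retry on conflict."""
    while True:
        gen = store.last_gen() + 1
        try:
            store.put_new(gen, data)
            return gen
        except Conflict:
            continue  # another writer won this round; try again


store = FakeStore()
results = []


def worker(n):
    results.append(allocate_gen(store, "change-%d" % n))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

	Every writer ends up with a distinct generation, and in each round at least one writer succeeds, so the loop always makes progress.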
2016-08-01	[feat] use couch _all_docs for get_docs() and get_all_docs()	drebs
	The previous solution would make use of concurrent gets to the couch backend in a pool of threads to implement the get_docs() and get_all_docs() CouchDatabase backend methods. This commit replaces those by a simpler implementation using the `_all_docs` couchdb view api. It passes all needed IDs to the view and retrieves all documents with content in the same request. A comparison between both implementations shows an improvement of at least 15 times for large numbers of documents. The table below shows the time for the threaded implementation of get_all_docs() versus the _all_docs implementation, for different numbers of documents:
	+-------+-----------------+------------------+-------------+
	| docs  | threads         | _all_docs        | improvement |
	+-------+-----------------+------------------+-------------+
	|    10 | 0.0728030204773 | 0.00782012939453 |         9.3 |
	|   100 | 0.609349966049  | 0.0377721786499  |        16.1 |
	|  1000 | 5.86522197723   | 0.370730876923   |        15.8 |
	| 10000 | 66.1713931561   | 3.61764383316    |        18.3 |
	+-------+-----------------+------------------+-------------+
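	As a rough illustration of the single-request approach, the sketch below builds a CouchDB `_all_docs` request that asks for several ids at once (a POST with a `keys` list and `include_docs=true`) and extracts the documents from a parsed response. The helper names, base URL, and database name are assumptions made for this example; the commit itself goes through the couchdb Python library rather than building requests by hand.

```python
import json
from urllib.parse import quote


def build_all_docs_request(base_url, db_name, doc_ids):
    """Return (url, body) for one POST that fetches all given ids."""
    url = "%s/%s/_all_docs?include_docs=true" % (
        base_url.rstrip("/"), quote(db_name, safe=""))
    body = json.dumps({"keys": list(doc_ids)})
    return url, body


def docs_from_response(response):
    """Extract documents from a parsed _all_docs response.

    Rows for missing or deleted ids carry no usable "doc" and are skipped.
    """
    return [row["doc"] for row in response["rows"]
            if row.get("doc") is not None]


url, body = build_all_docs_request(
    "http://localhost:5984/", "user-abc", ["doc-1", "doc-2"])

# Hand-written sample of what a parsed response could look like.
sample_response = {"rows": [
    {"id": "doc-1", "key": "doc-1", "value": {"rev": "1-x"},
     "doc": {"_id": "doc-1", "_rev": "1-x"}},
    {"key": "doc-2", "error": "not_found"},
]}
found = docs_from_response(sample_response)
```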
2016-08-01	[refactor] simplify couch whats_changed calculation	drebs

2016-08-01	[bug] use couch lock to atomize saving of document	drebs

2016-08-01	[feat] standardize metadata storage in couch backend.	drebs

2016-08-01	[feat] use a lock for updating couch gen data	drebs

2016-08-01	[bug] fix order of multipart serialization when writing to couch	drebs
	The couch backend makes use of attachments and multipart structure for writing the document to the couch database. For that to work, the order in which attachments are described must match the actual order in which attachments are written to the couch http stream. This was not being properly taken care of, and eventually the json serializer was arbitrarily ordering the attachments description in a way that didn't match the actual order in which attachments were written. This commit fixes that by using the json.dumps() sort_keys parameter and making sure conflicts are always written before content.
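	A small sketch of the ordering issue: with sort_keys=True the serialized description has a deterministic key order, so the parts can be written in that same order. The attachment names "conflicts" and "content" and their fields are placeholders chosen for this example, not necessarily the names used by the couch backend.

```python
import json
from collections import OrderedDict

# Hypothetical attachment stubs; insertion order is deliberately "wrong".
attachments = {
    "content": {"follows": True, "length": 11},
    "conflicts": {"follows": True, "length": 2},
}

# sort_keys=True makes the description order independent of dict order.
description = json.dumps({"_attachments": attachments}, sort_keys=True)

# Writing the parts in sorted-name order matches the description,
# and here it puts conflicts before content.
write_order = sorted(attachments)

# Read the description back, preserving the serialized key order.
described_order = list(
    json.loads(description, object_pairs_hook=OrderedDict)["_attachments"])
```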
2016-08-01	[feat] remove usage of design documents in couch	drebs
	Design documents are slow and we already have alternatives to all the uses we used to make of them, so this commit completely removes all usage of design documents.
2016-07-25	[feat] do not use couch views for sync metadata	drebs
	When compared to a plain couch document get, the use of the simplest view functions takes around double the time, while the use of the simplest list function can take more than 8 times as long:
	  get 100 docs:    total: 0.440337 secs  mean: 0.004403
	  query 100 views: total: 0.911425 secs  mean: 0.009114
	  query 100 lists: total: 3.711537 secs  mean: 0.037115
	Besides that, the current implementation of sync metadata storage over couch depends on the timestamps of document puts, which can lead to metadata corruption if the clock of the system is changed for any reason. For these reasons, we seek to change the implementation of database metadata. This commit implements the storage of transaction log data on couch documents with special ids, in the form "gen-xxxxxxxxxx", where the x's are replaced by the generation index. Each generation document holds a dictionary containing the generation, doc_id and transaction_id for the changed document. For each modified document, a generation document is inserted holding the transaction metadata.
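	A minimal sketch of such generation documents follows. It assumes the ten x's stand for a zero-padded decimal generation index, and the field names in the dictionary are hypothetical; the commit message only says that the generation, doc_id and transaction_id are stored.

```python
def gen_doc_id(generation):
    """Build an id like "gen-0000000042" (zero padding is an assumption)."""
    return "gen-%010d" % generation


def make_gen_doc(generation, doc_id, transaction_id):
    """Build the metadata document for one change (hypothetical fields)."""
    return {
        "_id": gen_doc_id(generation),
        "gen": generation,
        "doc_id": doc_id,
        "trans_id": transaction_id,
    }


gen_doc = make_gen_doc(42, "doc-abc", "T-123")
ids = [gen_doc_id(g) for g in (10, 2, 1)]
```

	With fixed-width padding, sorting the ids as strings gives the same order as sorting by generation number, and no timestamps are involved.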
2016-07-25	[feat] use _local couch docs for metadata storage	drebs

2016-07-13	[style] pep8	Kali Kaneko

2016-07-12	[pkg] remove unneeded oauth code	drebs

2016-07-12	[test] toxify tests	drebs
	- move tests to root directory
	- split tests in different subdirectories
	- set up a small package with common test dependencies in /testing/test_soledad
	- add tox.ini that will:
	  - install the test_soledad package and other test dependencies
	  - install soledad common, client, server from the repository
	  - run tests contained in /testing/tests directory using pytest
	This commit also removes all oauth code from tests, as we have removed the u1db dependency (by importing it into the repo and naming it l2db) and don't need oauth at all right now.
2016-07-12	[refactor] make tests use l2db submodule	Kali Kaneko
	From this moment on, we embed a fork of u1db called l2db.
2016-07-12	[refactor] fork u1db	Kali Kaneko

2016-06-22	pep8	Kali Kaneko

2016-06-22	[bug] fix test processing order	NavaL
	This moves the reactor time to the loopingcall period. This is necessary as the decryption is now deferred to a thread. The test will exit before the task is executed otherwise.
2016-06-22	[style] pep8 compatibility: indent and white space	NavaL
	It was breaking E126 and E202 before
2016-06-08	[tests] avoid using get_all_docs on asserts	Victor Shyba
	EncryptedSyncTestCase.test_sync_very_large_files is still using an excessive amount of memory on very slow machines (especially on old spinning magnetic disks). This commit checks one doc at a time instead of getting them all. More refinement is necessary for this test to pass on any machine.
2016-06-06	[style] remove misused lambdas	Tulio Casagrande
	Pep8 was warning about assignment of lambdas. These lambdas should be partials
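	For illustration, the kind of change described could look like the sketch below. The log function and its arguments are made up for this example; the warning in question is presumably pycodestyle's E731 (assignment of a lambda expression).

```python
from functools import partial


def log(level, message):
    """Hypothetical helper used only for this illustration."""
    return "[%s] %s" % (level, message)


# Before (flagged by the style checker):
#     warn = lambda message: log("WARN", message)
# After: bind the first argument with functools.partial instead.
warn = partial(log, "WARN")

line = warn("disk almost full")
```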
2016-06-06	[test] use inline deferreds in test for change passphrase	drebs

2016-06-06	[test] turn test for _gen_secret into many unit tests	drebs

2016-06-06	[bug] ensures docs_received table has the sync_id column	NavaL
	For the case where the user already has data synced, this commit will migrate the docs_received table to have the column sync_id. That is required by the refactoring in the previous commits.
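	A migration of this kind can be sketched with the standard-library sqlite3 module. The helper name, the column type, and the other columns of docs_received are assumptions made for this example, and the actual client database is SQLCipher-based, so this is not the code from the commit.

```python
import sqlite3


def ensure_sync_id_column(conn):
    """Add the sync_id column to docs_received if it is missing."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, default, pk).
    columns = [row[1] for row in
               conn.execute("PRAGMA table_info(docs_received)")]
    if "sync_id" not in columns:
        conn.execute("ALTER TABLE docs_received ADD COLUMN sync_id TEXT")
        conn.commit()


# Simulate a database created before the refactoring (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs_received (doc_id TEXT, content TEXT)")
conn.execute("INSERT INTO docs_received VALUES ('doc-1', '{}')")
conn.commit()

ensure_sync_id_column(conn)
ensure_sync_id_column(conn)  # running it again is a no-op

columns = [row[1] for row in conn.execute("PRAGMA table_info(docs_received)")]
rows = conn.execute("SELECT doc_id, sync_id FROM docs_received").fetchall()
```

	Existing rows are kept and simply get a NULL sync_id.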
2016-06-06	[refactor] encdecpool queries and testing	Victor Shyba
	This commit adds tests for doc ordering and encdecpool control (start/stop). Also optimizes by deleting in batch and checking for a sequence in memory before asking the local staging for documents.
2016-05-23	[refactor] remove user_id argument from Soledad init	Caio Carrara
	The constructor method of Soledad was receiving two arguments for user id. One of them was optional, with None as default. It could cause an inconsistent state with uuid set but userid unset. This change removes the optional user_id argument from the initialization method and returns the uuid if anyone calls the Soledad.userid method.
2016-05-16	[style] pep8	Kali Kaneko

2016-04-26	[refactor] cleanup bootstrap process	drebs

2016-04-26	[refactor] remove shared db locking from tests	drebs

2016-04-26	[refactor] remove shared db locking from client	drebs
	Shared db locking was used to avoid the case in which two different devices try to store/modify remotely stored secrets at the same time. We want to avoid remote locks because of the problems they create, and prefer to crash locally. For the record, we are currently using the user's password to encrypt the secrets stored in the server, and while we continue to do this we will have to re-encrypt the secrets and update the remote storage whenever the user changes her password.
2016-04-01	[pkg] update to versioneer 0.16	Kali Kaneko

2016-01-21	[Fix] fix concurrency problem in test_sync_deferred	Folker Bernitt
	- Use dbsyncer (SQLCipherU1DBSync) instead of SQLCipherDatabase as only the first one supports multiple threads while syncing and is actually used by Soledad.sync
2015-12-22	[docs] incomplete doc for security config parameter	Victor Shyba
	The database_security parameter was either undocumented or incompletely documented. This commit adds a bit more documentation to make it consistent with the latest changes. Closes #7689
2015-12-14	[fix] remove trailing whitespace to please pep8	Christoph Kluenter

2015-12-14	[bug] fix failing tests after last events modification	Kali Kaneko

2015-12-03	[feat] set default to False on batching for now	Victor Shyba
	All batching code has no effect by default with this commit. Since we know that this is a dangerous new feature, we will enable it only on our test servers and check it manually before setting it as default or adding more configuration features. Use SyncTarget and the server conf file to enable it for testing.
2015-12-03	[feat] generation caching during a batch	Victor Shyba
	The generation cache was removed to simplify processing and it should not come back, but during a batch the server won't change its generation. So a little trick is needed to hold that temporary information until the batch finishes.
2015-12-03	[feat] add configuration to disable batching	Victor Shyba
	Batch support is optional. This commit adds a 'batching' configuration option to disable it.
2015-12-03	[feat] checks staged docs inside batch	Victor Shyba
	This commit adds a consistency check on batches. When a doc is needed during a batched sync and it doesn't exist in the database, the current code will make a partial batch to avoid processing it as if it didn't exist.