Age | Commit message | Author |
|
|
|
|
|
We were using 'x'*size as the payload, but in real usage the payload will
be random. This commit randomizes the payload using a predefined seed, so
the random payload is the same across benchmark runs.
Random payloads also make measurements of compression or encoding
impact more accurate, and we will be evaluating those changes for
resource usage issues.
Also note that base64 is applied to the payload. That was needed for
UTF-8 safety, but the size overhead it adds was removed so that payloads
keep the size defined by the benchmarks.
Base64 was also chosen due to its popular use in MIME encoding, which is
used for mail attachments (our current scenario).
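
A minimal sketch of the idea described above, not the code from this commit: the function name `payload`, the seed value and the trimming strategy are assumptions made for illustration. It builds a reproducible random payload, base64-encodes it for text safety, and trims the result back to the requested size so the encoding overhead does not inflate the benchmark size.

```python
import base64
import random


def payload(size, seed=1337):
    """Return a reproducible, text-safe payload of exactly `size` characters.

    The seed value and the trimming approach are illustrative assumptions.
    """
    rng = random.Random(seed)  # private generator, so global state is untouched
    raw = bytes(rng.getrandbits(8) for _ in range(size))
    encoded = base64.b64encode(raw).decode('ascii')
    # base64 expands data by roughly 4/3; trim back to the requested size.
    return encoded[:size]


small = payload(10 * 1000)
```

Because the generator is seeded, two calls with the same size return the same string, which keeps runs comparable.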
|
|
TestSyncEncrypterPool.test_encrypt_doc_and_get_it_back was trying to do
an operation and asserting the number of attempts. This test is about
putting a doc on the encrypter pool and getting it encrypted. If we don't
wait for the encryption operation to succeed, then complex
trial-and-error happens, but if we just ask Twisted to wait for one
operation before going to the other, this is not needed.
-- Resolves: #8398
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If the moving of the config document is the last action of the couch
schema migration script, then we can test for successful migration of a
certain db by checking if the config document was already moved. This
commit just changes the order of migration actions to enforce this
situation.
|
|
Previous versions of the couchdb schema used documents "u1db_sync_log"
and "u1db_sync_state" to store sync metadata. At some point this was
changed, but the documents might have stayed as leftovers. This commit
adds the deletion of such documents to the migration script.
|
|
|
|
This isn't a test, but a benchmark. "Initialization" sounds more like an
operation, while "instance" is just a thing.
|
|
|
|
We are using lower values on test_encdecpool due to high memory usage,
described in #7370. Added a comment to explain it.
|
|
The defer parameter wasn't clear.
|
|
Otherwise it will add unrelated overhead to results.
|
|
|
|
They aren't used so far and using empty dicts to make them work is ugly.
Removing them leaves the return function in the setup code clean and readable.
|
|
1000 docs at 100k~500k were exhausting memory (4 GB + 4 GB swap).
Changed to 100 docs in order to be able to get measurements at higher
loads. The sizes are now 10k, 100k and 500k.
|
|
Hypothesis: raw vs doc
Added the same sizes set (10k, 100k, 500k, 1M, 10M, 50M) as the document
crypto test, so we can compare how close to raw the higher level
operation is.
|
|
10k, 100k, 500k, 1M, 10M and 50M for encryption and decryption of a
whole document.
|
|
This was discovered during load tests: trying to process more than 999
docs triggers an error on SQLite because a select query does not support
more than 999 values to query.
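
One common way around this kind of limit is to split the ids into batches. The sketch below is an illustration under assumptions, not the fix applied by this commit: the `docs` table, its columns and the helper names are hypothetical, and 999 is taken as the batch size because it is the historical default for SQLite's maximum number of bound variables.

```python
import sqlite3

# Historical default for SQLite's bound-variable limit; assumed here.
SQLITE_MAX_VARIABLES = 999


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def get_docs(conn, doc_ids, batch=SQLITE_MAX_VARIABLES):
    """Fetch rows for all `doc_ids`, never binding more than `batch` values."""
    rows = []
    for part in chunks(list(doc_ids), batch):
        placeholders = ','.join('?' * len(part))
        query = ('SELECT doc_id, content FROM docs '
                 'WHERE doc_id IN (%s)' % placeholders)
        rows.extend(conn.execute(query, part).fetchall())
    return rows


# Hypothetical schema, only to make the example self-contained.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE docs (doc_id TEXT PRIMARY KEY, content TEXT)')
ids = ['doc-%d' % i for i in range(2500)]
conn.executemany('INSERT INTO docs VALUES (?, ?)', [(i, 'x') for i in ids])
fetched = get_docs(conn, ids)
```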
|
|
Most of them are commented as memory usage is going out of control for
now.
|
|
It involves heavy scrypt hashing, which has room for improvement.
|
|
Syncing without any changes was reported as slow. This benchmark will
help measure it.
|
|
Use a new one to avoid reusing the same database.
|
|
function is the default scope, so there is no need to pass this
parameter. Previously, one of the scopes was 'module', but it is a
nested function that fires on demand, so it should clean up itself from
test to test in order to avoid conflict while putting.
|
|
|
|
Creating 20/500k, 100/100k and 1000/10k.
|
|
If we have many scenarios (like 20/500k, 100/100k, 1000/10k) then making
a nested function to generate tests based on scenario parameters
simplifies the code a lot.
|
|
Adapted pytest-benchmark, which is synchronous, to work with Twisted and
added fixtures for benchmarking.
|
|
|
|
ensure_ddoc doesn't make sense anymore as we don't have any ddoc other
than _security, which has its own method for setting. 'ensure_security'
is explicit and is set internally when the user is creating a database;
otherwise it will be False, as it's only used during creation. This isn't
exposed outside the couch module, to avoid confusion.
This confusion was making create-user-db fail to create a security ddoc
as it wasn't passing ensure_ddocs=True.
-- Resolves: #8388
|
|
tox was configured to change to the testing/tests directory before
executing pytest, by using tox's "changedir" configuration option. The
reason why this was the case is that we wanted to discover tests inside
the testing/tests directory only.
The problem with that approach is that if we wanted to point to a
specific test file, for example "tests/perf/test_sync.py", we would have
to omit the "tests" part and write "tox perf/test_sync.py" because the
argument would be understood as relative to the changed dir. That is not
practical as it doesn't allow the use of shell autocomplete, and it is
also not the only way to achieve what we want.
Actually, pytest has a configuration option called "testpaths" where you
can indicate where it should discover tests. This commit replaces one
approach with the other and allows the use of shell autocomplete for ease
of testing during development.
|
|
"leapcode" is the LEAP docker hub organisation varac could squat
(https://hub.docker.com/r/leap/ was already taken).
|
|
|
|
We will not maintain support for older versions of debian as that
introduces some unneeded complexity for now. Also, the version pinned
for couchdb python library has a bug that makes some requests slow.
Because of those, we remove the pinning for now.
|
|
If we create the gen document before saving the actual document in
couch, we may run into problems if more than one client is syncing and
trying to save documents with the same id at the same time.
By moving the gen document creation to after the actual document save in
couch, we rely on couch/u1db resolution of conflicts before actually
allocating a new generation, and the problem above doesn't occur.
|
|
|
|
test_processing_order aims to check that unordered docs won't be
processed, but if we let the pool start and advance the Twisted
LoopingCall clock right before calling the processing method manually,
the processing method will run concurrently and cause a race condition.
|
|
|
|
|
|
"tox -e pep8" runs it standalone and "tox" includes the pep8 env.
|
|
|
|
The use of a lock to allocate the next generation of a change in couch
backend suffers from at least 2 problems:
1. all modifications to the couch database would have to be made through
a soledad server entrypoint, otherwise the lock would have no effect.
2. introducing a lock makes code uglier, harder to debug, and prone to
undesired blocks.
The solution implemented by this commit is not so elegant, but works for
what we need right now. Now, concurrent threads updating the couch
database will race for the allocation of a new generation, and retry
when they fail to do so.
There's no high risk of getting blocked for too much time in the while
loop because (1) there's always one thread that wins (which makes the
expected number of retries N/2 if N is the number of concurrent
threads), and (2) the number of concurrent attempts to update the user
database is limited by the number of devices syncing at the same time.
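
The race-and-retry idea can be sketched with an in-memory stand-in for couch. Everything here is hypothetical (the `FakeCouch` class, the `gen-N` document ids, the helper names); the only behaviour borrowed from couch is that creating a document whose id already exists fails, which is what lets exactly one contender win each generation.

```python
import threading


class FakeCouch(object):
    """In-memory stand-in: creating an existing doc id fails, as in couch."""

    def __init__(self):
        self._docs = {}
        # Emulates the atomicity of a single document write only;
        # callers never hold this across a whole allocation.
        self._guard = threading.Lock()

    def last_generation(self):
        with self._guard:
            return len(self._docs)

    def create(self, doc_id, content):
        with self._guard:
            if doc_id in self._docs:
                return False  # somebody else won this generation
            self._docs[doc_id] = content
            return True


def allocate_generation(db, change):
    """Race for the next generation, retrying until a creation succeeds."""
    retries = 0
    while True:
        gen = db.last_generation() + 1
        if db.create('gen-%d' % gen, change):
            return gen, retries
        retries += 1


def run(n_threads):
    """Let `n_threads` race and return the sorted generations they obtained."""
    db = FakeCouch()
    results = []

    def worker(i):
        results.append(allocate_generation(db, 'change-%d' % i))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(gen for gen, _ in results)
```

Every thread ends up with a distinct generation, and the loop terminates because each round has a winner.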
|
|
The previous solution made use of concurrent GETs to the couch backend
in a pool of threads to implement the get_docs() and get_all_docs()
CouchDatabase backend methods.
This commit replaces those with a simpler implementation that uses the
`_all_docs` couchdb view api. It passes all needed IDs to the view and
retrieves all documents with content in the same request.
A comparison between both implementations shows an improvement of at
least 15 times for large numbers of documents. The table below shows the
time taken by the threaded implementation of get_all_docs() versus the
_all_docs implementation for different numbers of documents:
+-------+-----------------+------------------+-------------+
| docs  | threads         | _all_docs        | improvement |
+-------+-----------------+------------------+-------------+
| 10    | 0.0728030204773 | 0.00782012939453 | 9.3         |
| 100   | 0.609349966049  | 0.0377721786499  | 16.1        |
| 1000  | 5.86522197723   | 0.370730876923   | 15.8        |
| 10000 | 66.1713931561   | 3.61764383316    | 18.3        |
+-------+-----------------+------------------+-------------+
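
For illustration, a single `_all_docs` request for many documents might be built like this. The helper name and database name are hypothetical and this is not the code from the commit; it only relies on the CouchDB `_all_docs` endpoint accepting a JSON body with a `keys` list and an `include_docs=true` query parameter. No network call is made here.

```python
import json


def all_docs_request(db_name, doc_ids):
    """Return the (path, body) of one POST that fetches all `doc_ids`."""
    # include_docs=true asks couch to return document contents, not just revs.
    path = '/%s/_all_docs?include_docs=true' % db_name
    body = json.dumps({'keys': list(doc_ids)})
    return path, body


# 'user-db' is a placeholder database name.
path, body = all_docs_request('user-db', ['doc-1', 'doc-2', 'doc-3'])
```

The resulting path and body can be sent with any HTTP client as a POST with a JSON content type, replacing one GET per document.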
|
|
|