|
|
|
|
|
It's not being used
|
|
|
|
|
|
|
|
This was discovered during load tests: trying to process more than 999
docs triggers an error on SQLite, because a select query cannot bind
more than 999 values.
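The limit in question is SQLite's SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999 bound parameters per statement. A minimal sketch of the workaround, chunking the IN(...) query (table and column names here are illustrative, not the actual soledad schema):

```python
import sqlite3

SQLITE_MAX_VARS = 999  # default SQLITE_MAX_VARIABLE_NUMBER

def select_docs(conn, doc_ids):
    """Run the IN(...) query in chunks so no single statement
    binds more than 999 values."""
    rows = []
    for i in range(0, len(doc_ids), SQLITE_MAX_VARS):
        chunk = doc_ids[i:i + SQLITE_MAX_VARS]
        marks = ",".join("?" * len(chunk))
        rows.extend(conn.execute(
            "SELECT id, content FROM docs WHERE id IN (%s)" % marks,
            chunk))
    return rows
```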
|
|
test_processing_order aims to check that unordered docs won't be
processed, but if we let the pool start and advance the Twisted
LoopingCall clock right before calling the processing method manually,
the process method will run concurrently and cause a race condition.
|
|
|
|
|
|
For the case where the user already has synced data, this commit
migrates the docs_received table to add the sync_id column,
which is required by the refactoring in the previous commits.
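Such a migration can be sketched with stdlib sqlite3 (the table and column names come from the message; everything else is illustrative):

```python
import sqlite3

def migrate_docs_received(conn):
    """Add the sync_id column to docs_received if it is missing,
    so users with already-synced data get the new schema."""
    cols = [row[1] for row in conn.execute(
        "PRAGMA table_info(docs_received)")]
    if "sync_id" not in cols:
        conn.execute(
            "ALTER TABLE docs_received ADD COLUMN sync_id TEXT")
```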
|
|
Docs created by one failed sync would still be there for the next one,
possibly causing a lot of hard-to-find errors. This commit adds a
sync_id field to track each sync's documents in isolation and cleans up
the pool on start instead of in the constructor.
|
|
This commit adds tests for doc ordering and encdecpool control
(start/stop). It also optimizes by deleting in batches and checking for
a sequence in memory before asking the local staging area for documents.
|
|
This commit removes the multiprocessing pool and is a step closer to
making encdecpool simpler. Download speed is now constant, CPU usage is
lower, and the reactor responds quickly when running with an HTTP server
like Pixelated.
|
|
- Move them to a thread so the reactor can continue
processing e.g. http requests
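With stdlib asyncio the same move looks like this, run_in_executor standing in for twisted's deferToThread (names are illustrative):

```python
import asyncio

def delete_batch(db, doc_ids):
    # blocking database work that would otherwise stall the reactor
    for doc_id in doc_ids:
        db.pop(doc_id, None)

async def delete_in_thread(db, doc_ids):
    # hand the blocking call to a worker thread so the event loop
    # stays free to serve e.g. http requests meanwhile
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, delete_batch, db, doc_ids)
```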
|
|
|
|
This line was missing a yield; without it we end up inserting a
document that is still being retrieved, and bad things happen.
This is the core fix from yesterday's debugging session. During
sequential syncs the pool was inserting and querying at the same time,
sometimes repeating documents or failing to delete them.
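This bug class is easy to reproduce outside Twisted. A stdlib asyncio sketch with hypothetical names, where a missing `await` plays the role of the missing `yield`:

```python
import asyncio

async def retrieve_doc(db, doc_id):
    await asyncio.sleep(0)              # simulated async db round trip
    return db.get(doc_id)

async def insert_doc(db, doc_id, content):
    # the fix: actually wait for retrieval before inserting; without
    # the await/yield, insertion races against the pending retrieval
    existing = await retrieve_doc(db, doc_id)
    if existing is None:
        db[doc_id] = content
    return existing
```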
|
|
If we reset the vars after firing the finish callback, another thread
can pick up dirty state due to concurrency.
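The safe ordering can be sketched with a stdlib Future standing in for the finish deferred (attribute names are illustrative):

```python
from concurrent.futures import Future

class SyncState:
    def __init__(self):
        self.processed = 0
        self.finished = Future()

    def finish(self):
        # snapshot the callback target and reset all state FIRST,
        # then fire: anything that runs from the callback sees a
        # clean pool instead of dirty leftovers
        fut, self.finished = self.finished, Future()
        self.processed = 0
        fut.set_result(True)
```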
|
|
We are getting "too many open files" while running tests with a limit
of 1024 open files. This is a leak from close methods. Some of them
should be fixed by this commit, but further investigation may be
necessary.
|
|
|
|
Previous to this modification, the initialization of the sync decrypter pool
could happen concurrently with other database operations. That could cause the
pool to hang because it could be waiting for something that was mistakenly
deleted because of the wrong order of database operations.
This commit implements a pattern we already use in leap.keymanager and
leap.mail, which makes some methods wait for the initialization
operation to finish before they actually run.
Closes: #7386
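That pattern can be sketched with stdlib asyncio (the real code uses Twisted deferreds; the names here are illustrative):

```python
import asyncio
import functools

def ensure_initialized(method):
    """Delay a method until self._init_task has completed, so no
    operation can race against initialization."""
    @functools.wraps(method)
    async def wrapper(self, *args, **kwargs):
        await self._init_task
        return await method(self, *args, **kwargs)
    return wrapper

class DecrypterPool:
    def __init__(self):
        self._ready = False
        self._init_task = asyncio.ensure_future(self._initialize())

    async def _initialize(self):
        await asyncio.sleep(0)          # simulated table creation
        self._ready = True

    @ensure_initialized
    async def insert_doc(self, doc):
        assert self._ready              # guaranteed by the decorator
        return doc
```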
|
|
multiprocessing.Queue is suitable for inter-process communication, but
it's not ideal for a reactor model. This commit changes it to
DeferredQueue, where consumers and producers don't block and Twisted
can handle them better.
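A stripped-down, stdlib-only illustration of the DeferredQueue semantics, in which get() never blocks and instead registers a callback that fires on put():

```python
from collections import deque

class MiniDeferredQueue:
    """Toy model of twisted.internet.defer.DeferredQueue."""
    def __init__(self):
        self.pending = deque()   # items put before any consumer asked
        self.waiting = deque()   # consumer callbacks awaiting an item

    def put(self, item):
        if self.waiting:
            self.waiting.popleft()(item)   # wake one waiting consumer
        else:
            self.pending.append(item)

    def get(self, callback):
        if self.pending:
            callback(self.pending.popleft())
        else:
            self.waiting.append(callback)
```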
|
|
The encryption pool could be stopped twice and would break
on the second attempt because it deletes the encryption queue
variable. Added a condition to make sure it only deletes the
encryption queue if it exists, making it more idempotent.
|
|
|
|
|
|
* change close method name to stop
* add start/stop methods to both enc/dec classes
* remove any delayed calls on pool shutdown
|
|
Because of how the incoming document queue is implemented, it could be
the case that a document was sent to the async decryption queue more
than once. This commit keeps a list of documents pending decryption, so
we avoid sending the same document to the queue more than once.
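The dedup bookkeeping can be sketched as follows (hypothetical names):

```python
class DecryptStage:
    def __init__(self):
        self._queued_ids = set()   # doc_ids already sent to the queue
        self.queue = []

    def maybe_enqueue(self, doc_id, doc):
        """Send a doc to decryption at most once."""
        if doc_id in self._queued_ids:
            return False
        self._queued_ids.add(doc_id)
        self.queue.append(doc)
        return True
```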
|
|
Deferred encryption was disabled because the soledad u1db wrapper for adbapi
did not correctly update the parameter that controls it. Also, it did not
contain the encrypter pool. This commit moves the sync db and encrypt pool to
the main api, so they can be passed to the wrapper and deferred encryption
can work.
|
|
It makes the code simpler and clearer to use a deferred instead of
having to poll 'has_finished'.
- Related: #7234
|
|
after suggestions in the review
|
|
|
|
|
|
When decrypting asynchronously, we want to finish as fast as possible. When
encrypting, though, we are in no such rush. With an encryption loop period of
2 seconds, we're able to encrypt 30 documents in one minute (the current
bitmask client sync period), which is meaningful: it should moderately use
the processor while not syncing and relieve it of some work when actually
syncing.
|
|
Previous to this change, the actual encryption method used to run on its own
thread. When the close method was called from another thread, the queue could
be deleted after the encryption method loop had started, but before the queue
was checked for new items.
By removing that thread and moving the encryption loop to the reactor, that
race condition should disappear.
Closes: #7088.
|
|
Queue exceptions are not in the multiprocessing.Queue module, but in the
plain Queue module instead.
|
|
|
|
When handling this exception Python got lost because the import was
incorrect: Queue.Empty comes from Queue, not from multiprocessing.Queue.
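In code, the fix is a one-line import change (the Python 2 Queue module is spelled queue in Python 3; the drain helper is illustrative):

```python
import multiprocessing
from queue import Empty   # NOT multiprocessing.Queue.Empty

def drain(q):
    """Pull everything currently in the queue without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except Empty:      # the exception lives in the plain queue module
            return items
```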
|
|
The whole idea of the encrypter/decrypter pool is to be able to use multiple
cores to allow parallel encryption/decryption. Previous to this commit, the
encryptor/decryptor pools could be configured to not use workers and instead
do encryption/decryption inline. That was meant for testing purposes and
defeated the purpose of the pools.
This commit removes the possibility of inline encrypting/decrypting when using
the pools. It also refactors the enc/dec pool code so any failures while using
the pool are correctly caught and raised to the top of the sync deferred
chain.
|
|
When we initialized the async decrypter pool in the target's init method we
needed a proxy to ensure we could update the insert doc callback with the
correct method later on. Now we initialize the decrypter only when we need it,
so we don't need this proxy anymore. This commit removes the unneeded proxy.
|
|
We have to make sure any failures in the asynchronous decryption code are
caught and properly transmitted up the deferred chain so they can be logged.
This commit adds errbacks in the decryption pool that catch any failure, and
a check on the http target that re-raises the failure if one occurred.
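A stdlib sketch of the intent, with asyncio's exception propagation standing in for twisted's errback wiring (names are illustrative):

```python
import asyncio

async def decrypt_doc(doc):
    if doc is None:
        raise ValueError("undecryptable document")
    return doc.upper()

async def decrypt_all(docs):
    # any failure in a worker propagates to the caller instead of
    # being silently swallowed inside the pool
    return await asyncio.gather(*(decrypt_doc(d) for d in docs))
```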
|
|
|
|
This change uses twisted deferreds for the whole syncing process and paves the
way to implementing other transport schemes. It removes a lot of threaded code
that used locks and was very difficult to maintain, and lets twisted do the
dirty work. Furthermore, all blocking network i/o is now handled
asynchronously by twisted.
This commit removes the possibility of interrupting a sync, and we should
reimplement it using cancellable deferreds if we need it.
|
|
The access to the sync db was modified to use twisted.enterprise.adbapi, but
only the asynchronous decryption of incoming documents during sync was
adapted. This commit modifies the asynchronous encryption of documents to also
use the adbapi for accessing the sync db.
|
|
This commit actually does some different things:
* When doing asynchronous decryption of incoming documents in soledad client
during a sync, there was the possibility that a document corresponding to
a newer generation would be decrypted and inserted in the local database
before a document corresponding to an older generation. When this
happened, the metadata about the target database (i.e. its locally-known
generation) would be first updated to the newer generation, and then an
attempt to insert a document corresponding to an older generation would
cause the infamous InvalidGeneration error.
To fix that we use the sync-index information that is contained in the
sync stream to correctly find the insertable docs to be inserted in the
local database, thus avoiding the problem described above.
* Refactor the sync encrypt/decrypt pool to its own file.
* Fix the use of twisted adbapi with multiprocessing.
Closes: #6757.
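The ordering fix described in the first point boils down to only inserting a gapless run of generations, holding later docs back until the stragglers arrive (a sketch with hypothetical data shapes):

```python
def insertable_docs(received, last_known_gen):
    """Return docs forming a gapless generation sequence starting
    right after last_known_gen; docs past a gap must wait, so the
    locally-known generation is never advanced past a missing doc."""
    docs = []
    gen = last_known_gen + 1
    while gen in received:
        docs.append(received[gen])
        gen += 1
    return docs
```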
|