.. _benchmarks:

Benchmarks
==========

We currently use `pytest-benchmark <https://pytest-benchmark.readthedocs.io/>`_
to write tests that assess the time and resources taken by various tasks.

To run the benchmark tests, once inside a cloned Soledad repository, do the
following::

    tox -e benchmark

Results of automated benchmarking for each commit in the repository can be
seen at https://benchmarks.leap.se/.

Benchmark tests also depend on `tox` and `CouchDB`. See the :ref:`tests` page
for more information on how to set up the test environment.

Test repetition
---------------

``pytest-benchmark`` runs tests multiple times so it can provide meaningful
statistics for the time taken by a typical run of a test function. The number
of times that a test is run can be configured manually or automatically.

When configured automatically, the number of runs is decided by taking into
account multiple ``pytest-benchmark`` configuration parameters. See `the
corresponding documentation
<https://pytest-benchmark.readthedocs.io/en/stable/calibration.html>`_ for more
details on how automatic calibration works.

To achieve both a reasonable number of repetitions and a reasonable total
running time, we let ``pytest-benchmark`` choose the number of repetitions for
faster tests, and manually limit the number of repetitions for slower tests.

Currently, tests for `synchronization` and `sqlcipher asynchronous document
creation` are fixed to run 4 times each. All the other tests are left for
``pytest-benchmark`` to decide how many times to run. With this setup, the
benchmark suite takes approximately 7 minutes to run on our CI server. As the
benchmark suite is run twice (once for time and CPU stats and a second time
for memory stats), the whole benchmark run takes around 15 minutes.

The actual number of times a test is run when calibration is done
automatically by ``pytest-benchmark`` depends on many parameters: the time
taken for a sample run and the configured minimum number of rounds and maximum
time allowed for a benchmark. For a snapshot of the number of rounds for each
test function, see `the soledad benchmarks wiki page
<https://0xacab.org/leap/soledad/wikis/benchmarks>`_.
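As an illustration of the two modes, below is a minimal sketch of how a
benchmark test can either let ``pytest-benchmark`` calibrate the number of
rounds or pin it explicitly. The test functions and the ``make_payload``
helper are hypothetical stand-ins; the ``benchmark`` fixture and its
``pedantic`` mode are the actual ``pytest-benchmark`` API::

    def make_payload(size=1000):
        # Hypothetical helper standing in for real work such as document
        # creation; it only exists to give the benchmark something to time.
        return b"x" * size

    def test_payload_calibrated(benchmark):
        # pytest-benchmark calibrates the number of rounds automatically,
        # based on parameters such as min_rounds and max_time.
        benchmark(make_payload)

    def test_payload_fixed_rounds(benchmark):
        # pedantic mode takes full control of repetition: this runs the
        # function exactly 4 times, the way slower tests are pinned here.
        benchmark.pedantic(make_payload, rounds=4, iterations=1)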
Sync size statistics
--------------------

Currently, the main use of Soledad is to synchronize client-encrypted email
data. Because of that, it makes sense to measure the time and resources taken
to synchronize an amount of data that is realistically comparable to a user's
mailbox.

In order to determine a good example dataset for synchronization tests, we
used the sizes of messages in one week of incoming and outgoing email flow of
a friendly provider. The statistics that came out of that are (all sizes are
in KB):

+--------+-----------+-----------+
|        | outgoing  | incoming  |
+========+===========+===========+
| min    |     0.675 |     0.461 |
+--------+-----------+-----------+
| max    | 25531.361 | 25571.748 |
+--------+-----------+-----------+
| mean   |   252.411 |   110.626 |
+--------+-----------+-----------+
| median |     5.320 |    14.974 |
+--------+-----------+-----------+
| mode   |     1.404 |     1.411 |
+--------+-----------+-----------+
| stddev | 1376.930  |  732.933  |
+--------+-----------+-----------+

Sync test scenarios
-------------------

Ideally, we would want to run tests for a big dataset (i.e. a high number of
documents and a big payload size), but that may be infeasible given time and
resource limitations. Because of that, we choose a smaller dataset and assume
that the behaviour is roughly linear in order to extrapolate to larger sets.

Supposing a dataset with a total size of 10MB, some possibilities for the
number of documents and document sizes for testing download and upload can be
seen below. Scenarios marked in bold are the ones that are actually run in the
current sync benchmark tests, and you can see the current graphs for each one
by following the corresponding links:

* 10 x 1M
* **20 x 500K** (`upload <https://benchmarks.leap.se/test-dashboard_test_upload_20_500k.html>`_, `download <https://benchmarks.leap.se/test-dashboard_test_download_20_500k.html>`_)
* **100 x 100K** (`upload <https://benchmarks.leap.se/test-dashboard_test_upload_100_100k.html>`_, `download <https://benchmarks.leap.se/test-dashboard_test_download_100_100k.html>`_)
* 200 x 50K
* **1000 x 10K** (`upload <https://benchmarks.leap.se/test-dashboard_test_upload_1000_10k.html>`_, `download <https://benchmarks.leap.se/test-dashboard_test_download_1000_10k.html>`_)

In each of the above scenarios all the documents have the same size. If we
want to account for some variability in document sizes, it is sufficient to
come up with a simple scenario where the average, minimum and maximum sizes
are roughly consistent with the above statistics, like the following one:

* 60 x 15KB + 1 x 1MB
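One way such scenarios can be expressed in a test suite is by parametrizing a
single benchmark over (number of documents, payload size) pairs, as in the
sketch below. The ``soledad_client`` fixture and its ``create_doc``/``sync``
calls are assumptions made for illustration; ``pytest.mark.parametrize``, the
``benchmark`` fixture and ``pedantic`` are real APIs::

    import pytest

    # The bold scenarios above, each totalling roughly 10MB:
    # (number of documents, payload size in bytes).
    SCENARIOS = [
        (20, 500 * 1024),
        (100, 100 * 1024),
        (1000, 10 * 1024),
    ]

    @pytest.mark.parametrize('num_docs,payload_size', SCENARIOS)
    def test_upload(benchmark, soledad_client, num_docs, payload_size):
        # soledad_client is a hypothetical fixture that yields a client
        # whose local database syncs against a test server.
        payload = 'x' * payload_size

        def upload():
            for _ in range(num_docs):
                soledad_client.create_doc({'payload': payload})
            soledad_client.sync()

        # Synchronization tests are slow, so the number of rounds is
        # pinned (see the "Test repetition" section above).
        benchmark.pedantic(upload, rounds=4, iterations=1)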