Age | Commit message | Author |
|
Previously we didn't check if an os_process was in use by a process
before closing it. This ended up generating noproc errors in the
couch_view_updaters which would then spider out to the couch_view_group
processes causing client errors and resetting compaction.
BugzId: 13798
|
|
We have observed the number of couchjs processes spiking into the hundreds
and thousands for short periods of time since the new couch_proc_manager
was released. Today I happened to catch one in the act and poked at
couch_proc_manager's ets table. There seemed to be a few more couchjs
processes with clients than I would have expected so I skimmed the code
looking for a place where we didn't clear the client value (which would
prevent the process from being reused, so it would eventually just time out).
I found a case where if the Pid that checked out the process dies
without the OS process dying, we were forgetting to clear the client in
the ets table. This patch refactors the two places we return processes
into a single function call which clears the OS process client.
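
The leak described above can be illustrated with a minimal sketch. It is
written in Python rather than the Erlang of couch_proc_manager, and every
name in it (ProcTable, checkout, checkin, client_died) is hypothetical. The
point it shows is only that both return paths go through one helper that
clears the client, so a process whose client died becomes reusable again.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Proc:
    pid: int
    client: Optional[str] = None  # None means the process is idle and reusable


class ProcTable:
    """Toy stand-in for the ets table; all names are illustrative only."""

    def __init__(self) -> None:
        self.procs: Dict[int, Proc] = {}

    def add(self, pid: int) -> None:
        self.procs[pid] = Proc(pid)

    def checkout(self, client: str) -> Optional[Proc]:
        # Hand out the first process that has no client recorded.
        for proc in self.procs.values():
            if proc.client is None:
                proc.client = client
                return proc
        return None

    def _return_proc(self, proc: Proc) -> None:
        # The single place a process is handed back: always clear the client.
        proc.client = None

    def checkin(self, proc: Proc) -> None:
        # Normal path: the client returns the process explicitly.
        self._return_proc(proc)

    def client_died(self, client: str) -> None:
        # Failure path: the client exited while the OS process is still alive.
        for proc in self.procs.values():
            if proc.client == client:
                self._return_proc(proc)
```

Without the shared helper, the failure path could skip the reset and leave
the entry marked as in use until it timed out.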
|
|
* Design doc languages are converted to lists
* Make sure to monitor every client correctly
|
|
I misread the docs on what was expected for ets:select_reverse/3.
|
|
For large numbers of os processes it's possible that we have a slowdown
when requesting a new process. The old code matches all possible
processes out of the table to find an appropriate candidate.
We avoid the issue by using ets:select_reverse, which also means we prefer
keeping newer processes and releasing longer-lived processes. Length of life
is inferred from the implicit sorting of pids, with newer pids sorting larger.
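
A rough sketch of the selection idea, in Python rather than Erlang. The
function name and the dict-based table are assumptions made for
illustration, and the explicit sort stands in for the key ordering that an
ordered table would already provide. It walks pids from largest to smallest
and stops at the first idle one instead of collecting every candidate.

```python
from typing import Dict, Optional


def pick_newest_idle(table: Dict[int, Optional[str]]) -> Optional[int]:
    """Return the largest pid that has no client, or None if all are busy.

    `table` maps a numeric pid to its client (None when idle). Larger pids
    are assumed to be newer, as the commit message describes.
    """
    for pid in sorted(table, reverse=True):
        if table[pid] is None:
            return pid  # stop at the first match
    return None
```

Because the newest idle process is always chosen first, older processes are
the ones left idle and therefore the ones released.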
|
|
Otherwise we'll never reuse it
|
|
When system load exceeds the ability of os_process_soft_limit to keep
up with demand we enter a fork-use-kill (FUK) cycle. The constant
spawning and destruction of these processes thrashes system resources
and causes general instability.
This patch changes the behavior from killing each process as it's
returned to letting it idle for a configurable amount of time (default
five minutes) which allows it to be reused by other clients. This way we
can avoid adding unnecessary load when demand for couchjs processes
exceeds os_process_soft_limit.
As a happy benefit this should also allow os_process_soft_limit to be
set much lower since the number of processes will now more closely
follow actual demand (instead of provisioning for the worst case
scenario).
Conflicts:
apps/couch/src/couch_os_process.erl
apps/couch/src/couch_proc_manager.erl
Conflicts:
apps/couch/src/couch_os_process.erl
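
The idle-instead-of-kill behavior described in this message can be sketched
as follows. This is a Python illustration under stated assumptions, not the
code in couch_proc_manager.erl: the IdlePool class, its method names, and the
explicit `now` argument are all hypothetical. Only the five-minute default
comes from the message.

```python
from typing import Dict, List, Optional

IDLE_LIMIT_SECONDS = 300.0  # five minutes, the default named in the message


class IdlePool:
    """Keeps returned processes around for reuse until they idle too long."""

    def __init__(self, idle_limit: float = IDLE_LIMIT_SECONDS) -> None:
        self.idle_limit = idle_limit
        self.idle_since: Dict[int, float] = {}  # pid -> time it was returned

    def checkin(self, pid: int, now: float) -> None:
        # Instead of killing the process on return, remember when it went idle.
        self.idle_since[pid] = now

    def checkout(self) -> Optional[int]:
        # Reuse an idle process if one exists; the caller spawns otherwise.
        if not self.idle_since:
            return None
        pid = next(iter(self.idle_since))
        del self.idle_since[pid]
        return pid

    def reap(self, now: float) -> List[int]:
        # Return (and forget) the pids that have idled past the limit.
        expired = [pid for pid, since in self.idle_since.items()
                   if now - since >= self.idle_limit]
        for pid in expired:
            del self.idle_since[pid]
        return expired
```

Passing the clock in as `now` keeps the sketch deterministic; a real
implementation would more likely use timers.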
|
|
Squashed commit of the following:
commit a9cd9681f6c88f0f3c019e98e2edfef55cad0129
commit eb38bca08ffbf778b69fbb2d612e23733af82ff5
commit 98a03a079ab24f2c7bd9e0d6d7fac5fa62bfd4eb
commit 9b8ec059165d981e4cd743008ecdf393a4f37f61
commit 3a891c1dd9a17fdd267c423b340dd09c31c89d7a
commit 68351dd181c8a92b5baa9ac23f25c7c191484394
commit e4384a517e2efeac9231701898a6c67213642319
commit cd954661422d0ef146b5bd7792f835dcc4220c84
commit 3bcca92c7c0102d5722dfc6b2c332766cfe0370c
commit 82d15f40f503b2609cf785ce2837e1280edaaa43
commit 70051abbd699e076452d772587c32ee5e09bdcbc
commit 7f01d37781e7774015f6cb34f795b28db9ecc9f5
BugzID: 11572
See also COUCHDB-901
A new config setting is introduced. The following block controls the
maximum number of OS processes that will be reused. Additional OS
processes will still be spawned on-demand, but they'll be terminated
when the clients are through with them.
[query_server_config]
os_process_soft_limit = 100
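
As a hedged illustration of how such a soft limit might be applied when a
client returns a process (Python, with a hypothetical function name; the
exact boundary condition is an assumption, not taken from the source):

```python
def on_return(total_procs: int, soft_limit: int = 100) -> str:
    """Decide what to do with a returned OS process.

    `total_procs` is the number of OS processes alive at the time of the
    return, including the one being returned. Processes within the soft
    limit are kept for reuse; extras spawned on demand are terminated.
    """
    return "keep" if total_procs <= soft_limit else "terminate"
```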
|