We have observed periods of couchjs processes spiking into the hundreds
and thousands for short periods of time since the new couch_proc_manager
was released. Today I happened to catch one in the act and poked at
couch_proc_manager's ets table. There seemed to be a few more couchjs
processes with clients than I would have expected so I skimmed the code
looking for a place where we didn't clear the client value (which would
prevent it from being reused, so that it would eventually just time out).
I found a case where, if the Pid that checked out the process dies
without the OS process dying, we forgot to clear the client in
the ets table. This patch refactors the two places we return processes
into a single function call which clears the OS process client.
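A minimal sketch of the shape of the fix (record and function names here are illustrative, not the actual couch_proc_manager code):

    -record(proc, {pid, lang, client}).

    %% Both return paths (a normal check-in and the client Pid dying while
    %% the OS process survives) go through this one helper, so the client
    %% field is always cleared and the couchjs process can be checked out
    %% again instead of sitting unused until it times out.
    return_proc(Tab, #proc{} = Proc) ->
        true = ets:insert(Tab, Proc#proc{client = undefined}),
        ok.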
|
|
* Design doc languages are converted to lists
* Make sure to monitor every client correctly
|
|
I misread the docs on what was expected for ets:select_reverse/3.
|
|
For large numbers of OS processes it's possible that we have a slowdown
when requesting a new process. The old code matches all possible
processes out of the table to find an appropriate candidate.
We avoid the issue by using ets:select_reverse, which also prefers keeping
newer processes and releasing longer-lived processes. Length of life is
based on the implicit ordering of pids, with newer pids sorting larger.
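Roughly, the lookup becomes something like this sketch (the record and match spec are illustrative and assume an ordered_set table keyed on the process's pid):

    -record(proc, {pid, lang, client}).

    %% ets:select_reverse/3 walks an ordered_set table from the largest key
    %% down and stops after Limit matches, so we avoid matching every row
    %% and prefer the newest (largest) pids, letting older processes idle out.
    find_free_proc(Tab, Lang) ->
        MatchSpec = [{#proc{lang = Lang, client = undefined, _ = '_'}, [], ['$_']}],
        case ets:select_reverse(Tab, MatchSpec, 1) of
            {[Proc], _Continuation} -> {ok, Proc};
            '$end_of_table' -> not_found
        end.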
|
|
Otherwise we'll never reuse it
|
|
|
|
When system load exceeds the ability of os_process_soft_limit to keep
up with demand we enter a fork-use-kill (FUK) cycle. The constant
spawning and destruction of these processes thrashes system resources
and causes general instability.
This patch changes the behavior from killing each process as it's
returned to letting it idle for a configurable amount of time (default
five minutes), which allows it to be reused by other clients. This way we
can avoid adding unnecessary load when demand for couchjs processes
exceeds os_process_soft_limit.
As a happy benefit this should also allow os_process_soft_limit to be
set much lower since the number of processes will now more closely
follow actual demand (instead of provisioning for the worst case
scenario).
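For illustration, the idle handling reads something like the sketch below; the "os_process_idle_limit" key name is an assumption made for this example, not necessarily the setting added by this patch.

    %% Sketch: read the idle limit (seconds, default five minutes) and
    %% decide whether a returned, unused couchjs process should be kept
    %% for reuse or closed. couch_config:get/3 is the usual config
    %% accessor; the key name is hypothetical.
    idle_limit() ->
        list_to_integer(
            couch_config:get("query_server_config", "os_process_idle_limit", "300")).

    maybe_close(IdleSeconds) ->
        case IdleSeconds > idle_limit() of
            true  -> close;   %% idled past the limit, let it die
            false -> keep     %% keep it warm for the next client
        end.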
Conflicts:
apps/couch/src/couch_os_process.erl
apps/couch/src/couch_proc_manager.erl
Conflicts:
apps/couch/src/couch_os_process.erl
|
|
|
|
Our current implementation for closing an LRU DB involves a full scan
of a public ets table. This scan blocks all other activity in
couch_server and can become a serious bottleneck when the LRU cache hit
rate drops too low. In the worst-case all_dbs_active scenario we end up
with O(N**2) algorithmic complexity.
This patch adds a new index keyed on LRU for faster access to the least
recently used databases. It also moves the ets table to a dict on the
couch_server heap. The downside is an increased message rate inbound on
the couch_server, as clients are no longer allowed to update the LRU
data structures without sending a message.
BugzID: 12879
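A rough sketch of the new structure (names and exact types are illustrative; the point is an index ordered by last use, kept on the couch_server heap):

    %% Names maps DbName -> LruKey; Tree is a gb_tree keyed on LruKey, so
    %% the least recently used database is a gb_trees:smallest/1 call away
    %% instead of a full table scan. Both live in couch_server state, so
    %% clients update them by message rather than writing a public table.
    update(DbName, {Names, Tree}) ->
        Key = erlang:now(),
        Tree1 = case dict:find(DbName, Names) of
            {ok, OldKey} -> gb_trees:delete(OldKey, Tree);
            error        -> Tree
        end,
        {dict:store(DbName, Key, Names), gb_trees:insert(Key, DbName, Tree1)}.

    close_candidate({_Names, Tree}) ->
        %% {LruKey, DbName} of the least recently used database.
        gb_trees:smallest(Tree).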
Conflicts:
apps/couch/src/couch_server.erl
|
|
When a call is made to retrieve a specific revision, latest=true will
retrieve any descendant leaves instead. This enables the replicator to
better keep up with edits that occur whilst it's retrieving revisions.
BugzID: 14241
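For example (request shown for illustration only):

    GET /db/doc?rev=1-967a00dff5e02add41819138abb3284d&latest=true

returns the current leaf revision(s) descended from the requested
revision rather than the (possibly superseded) revision itself.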
|
|
We were pulling a list of design documents and then ignoring the result
when the #db was a partition of a clustered database. Also, the call to
fabric:reset_validation_funs/1 can occasionally cause a stray rexi_EXIT
message to arrive in the db_updater mailbox (and subsequently kill the
server) if a worker fails. I don't think that's desired behavior,
though it's a debatable point. This patch spawns a middleman process to
act as a sink for those stray messages.
BugzID: 13087
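The middleman amounts to roughly this (a sketch; only fabric:reset_validation_funs/1 is from the patch itself):

    %% Let a throwaway process own the fabric call. If a worker fails,
    %% the stray rexi_EXIT message lands in that process's mailbox and is
    %% discarded with it, instead of arriving at (and killing) the
    %% db_updater.
    reset_validation_funs(DbName) ->
        spawn(fun() -> fabric:reset_validation_funs(DbName) end),
        ok.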
|
|
This patch also adds extra tests of the key tree merging logic as well
as edoc-formatted documentation for the module and a few of the merge
functions. Closes COUCHDB-902.
Thanks Paul Davis, Bob Dionne, Klaus Trainer.
git-svn-id: https://svn.apache.org/repos/asf/couchdb/trunk@1065471 13f79535-47bb-0310-9956-ffa450edef68
|
|
Allow ?callback= for any external that returns JSON (i.e.,
uses "json":{} instead of "data":"data").
BugzID: 12748
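For example (illustrative only), an external response of the form

    {"code": 200, "json": {"ok": true}}

can now be wrapped when the request includes ?callback=cb, whereas a
response carrying a raw "data" body is left as-is.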
|
|
...but log a message instead. Fixes COUCHDB-1445.
|
|
It's not safe to assume we require, or will receive, exactly N replies
(where N is read from the "n" key of the "cluster" section of the
configuration). This needs proper fabric-ification.
This commit will at least allow replication tests with clusters of
less than N nodes where the documents have attachments (which triggers
the multipart code).
BugzID: 14258
|
|
|
|
|
|
|
|
Conflicts:
apps/couch/src/couch_rep.erl
|
|
This reverts commit faf9071260147275bbac1633b599e85b4a302e8b.
|
|
|
|
|
|
This fixes stale=update_after.
|
|
|
|
|
|
A replication with both an HTTP source and target on the same host and
port could end up in a deadlock due to ibrowse replication pipelining
when attachments are present on the source. The ibrowse http worker
would end up forming a multipart/mime body using anonymous reader
functions for attachment stubs. When the attachment stub functions are
executed it is possible that they end up assigned to the same ibrowse
worker.
This is a bit of a long path, but the end result is equivalent to
calling gen_server:call(self(), Args, infinity) from a gen_server
callback.
A quick workaround for users is to set up a DNS alias (possibly in
/etc/hosts) or to use a combination of hostname and IP address so that
ibrowse assigns the requests to different pools.
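For example (illustrative names and address), an /etc/hosts entry such as

    127.0.0.1   localhost couch-target

lets the replication use http://localhost:5984/ for the source and
http://couch-target:5984/ for the target, so ibrowse treats them as
different hosts and keeps the requests in separate connection pools.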
|
|
|
|
|
|
|
|
As Filipe correctly points out, we want the parent to die if the child dies.
|
|
|
|
BugzID: 13133
|
|
|
|
|
|
BigCouch 0.3 cannot parse requests of the form /db/_changes?since="123-foo" so
the recent ?JSON_ENCODE addition to Since in two places causes 0.3 <-> 0.4
replication to fail with json_encode/badterm errors.
This patch applies JSON encoding only when the Since value is not already a
binary (i.e., when it's an [integer(), binary()]) and interop is restored.
BugzID: 12833
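The guard boils down to something like this sketch (variable names illustrative; ?JSON_ENCODE is the macro from couch_db.hrl):

    %% A 0.3-style sequence arrives as a plain binary such as <<"123-foo">>
    %% and is passed through untouched; only the clustered
    %% [integer(), binary()] form is JSON-encoded.
    Since1 = case is_binary(Since) of
        true  -> Since;
        false -> ?JSON_ENCODE(Since)
    end,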
|
|
BigCouch 0.3 cannot parse requests of the form /db/_changes?since="123-foo" so
the recent ?JSON_ENCODE addition to Since in two places causes 0.3 <-> 0.4
replication to fail with json_encode/badterm errors.
This patch applies JSON encoding only when the Since value is not already a
binary (i.e., when it's an [integer(), binary()]) and interop is restored.
BugzID: 12833
|
|
Conflicts:
acinclude.m4.in
configure.ac
couchjs/c_src/http.c
src/erlang-oauth/Makefile.am
src/erlang-oauth/oauth.app.in
src/erlang-oauth/oauth_hmac_sha1.erl
src/erlang-oauth/oauth_http.erl
src/erlang-oauth/oauth_plaintext.erl
src/etap/etap_web.erl
|
|
Our headers start with a <<1>> and then four bytes indicating the length
of the header and its checksum. When the header is larger than 4090
bytes it will be split across multiple blocks in the file and will need
to be reassembled on read. The reassembly consists of stripping out
<<0>> from the beginning of each subsequent block in the
remove_block_prefixes/2 function. The bug here is that we tell
remove_block_prefixes that we're starting 1 byte into the current block
instead of 5, so it ends up removing one good byte from the header and
injecting one or more random <<0>>s.
Headers larger than 4k are very rare and generally require a view group
with a huge number of indexes or indexes with fairly large reductions,
which explains why this bug has gone undetected until now.
Closes COUCHDB-1319.
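For reference, the reassembly is roughly the following (a simplified sketch of couch_file's block handling, with the 4096-byte block size written out):

    %% Every block after the one the header starts in begins with a <<0>>
    %% prefix byte that has to be stripped. The first argument says how far
    %% into the current block we already are; the bug was passing 1 (as if
    %% only the <<1>> marker preceded the data) instead of 5 (marker plus
    %% the four length/checksum bytes), so a good byte was dropped and the
    %% block boundaries drifted.
    remove_block_prefixes(_BlockOffset, <<>>) ->
        [];
    remove_block_prefixes(0, <<_BlockPrefix, Rest/binary>>) ->
        remove_block_prefixes(1, Rest);
    remove_block_prefixes(BlockOffset, Bin) ->
        BlockBytesAvailable = 4096 - BlockOffset,
        case size(Bin) of
            Size when Size > BlockBytesAvailable ->
                <<DataBlock:BlockBytesAvailable/binary, Rest/binary>> = Bin,
                [DataBlock | remove_block_prefixes(0, Rest)];
            _Size ->
                [Bin]
        end.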
|
|
Our headers start with a <<1>> and then four bytes indicating the length
of the header and its checksum. When the header is larger than 4090
bytes it will be split across multiple blocks in the file and will need
to be reassembled on read. The reassembly consists of stripping out
<<0>> from the beginning of each subsequent block in the
remove_block_prefixes/2 function. The bug here is that we tell
remove_block_prefixes that we're starting 1 byte into the current block
instead of 5, so it ends up removing one good byte from the header and
injecting one or more random <<0>>s.
Headers larger than 4k are very rare and generally require a view group
with a huge number of indexes or indexes with fairly large reductions,
which explains why this bug has gone undetected until now.
Closes COUCHDB-1319.
|
|
|
|
BugzID: 12741
|
|
The race condition in couch_server's ets table usage rears its ugly head
by leaving an entry in couch_lru. This patch just addresses the issue by
allowing the client pid to use the db and ignoring the fact that, for the
duration, it's over the max_dbs_open setting.
|
|
|
|
|
|
|
|
I was slaving away on this merge for two days.
|
|
|
|
Conflicts:
apps/couch/include/couch_db.hrl
apps/couch/src/couch_db.erl
apps/couch/src/couch_os_process.erl
apps/couch/src/couch_query_servers.erl
apps/couch/src/couch_rep.erl
apps/couch/src/couch_replication_manager.erl
apps/couch/src/couch_view_compactor.erl
apps/couch/src/couch_view_group.erl
apps/couch/src/couch_view_updater.erl
configure.ac
couchjs/c_src/http.c
couchjs/c_src/main.c
couchjs/c_src/utf8.c
etc/windows/couchdb.iss.tpl
src/couchdb/priv/Makefile.am
src/couchdb/priv/couch_js/main.c
test/etap/160-vhosts.t
test/etap/200-view-group-no-db-leaks.t
test/etap/Makefile.am
BugzID: 12645
|
|
The `created` event is emitted by Apache CouchDB when a database is
created. This patch re-adds it to BigCouch.
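For reference (shown as an illustrative sketch, assuming the CouchDB 1.x notifier API), the event travels through the db update notifier:

    %% Emitting the event when a database is created:
    couch_db_update_notifier:notify({created, DbName}),

    %% and a subscriber sees it as a {created, DbName} tuple:
    {ok, _Pid} = couch_db_update_notifier:start_link(fun
        ({created, Name}) -> io:format("db created: ~s~n", [Name]);
        (_Other) -> ok
    end).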
|
|
|