Since our tests are time bounded, creating a connection eats time that could instead be used to put a denser workload on Tarantool.
I haven't investigated the code deeply with this question in mind, but I see the problem in the bank-lua test:
jepsen.tarantool/src/tarantool/bank.clj
Lines 54 to 64 in 6165e6e
```clojure
:transfer
(let [{:keys [from to amount]} (:value op)
      con (cl/open (first (db/primaries test)) test)
      table (clojure.string/upper-case table-name)
      r (-> con
            (sql/query [(str "SELECT _WITHDRAW('" table "'," from "," to "," amount ")")])
            first
            :COLUMN_1)]
  (if (false? r)
    (assoc op :type :fail, :value {:from from :to to :amount amount})
    (assoc op :type :ok))))))
```
(Consider `cl/open`.)
Moreover, the `(db/primaries test)` call also creates its own connection and performs a request:
jepsen.tarantool/src/tarantool/client.clj
Lines 63 to 68 in 6165e6e
```clojure
(defn primary
  [node]
  (let [conn (open node test)
        leader (:COLUMN_1 (first (sql/query conn ["SELECT _LEADER()"])))]
    ;(assert leader)
    leader))
```
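One way to avoid the extra connection in the leader lookup would be to pass in a connection the caller already holds. A minimal sketch, reusing the `_LEADER` query and `:COLUMN_1` key from the snippet above; the function name `primary-via` is hypothetical, not existing code:

```clojure
;; Sketch: look up the leader over an existing connection
;; instead of opening a new one per call.
(defn primary-via
  [conn]
  (:COLUMN_1 (first (sql/query conn ["SELECT _LEADER()"]))))
```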
How it should work, I think:
- We should keep a connection until it dies. With several instances of a replica set, I'm not sure whether to keep a connection to the current instance only or to all of them; the latter seems better.
- We should request leadership information over the existing connection / connection pool.
- We should NOT re-discover the leader on every operation; instead, do it only when the leader is gone:
  - when the instance becomes unavailable;
  - for write requests, we can catch the error about the read-only state;
  - for read requests that must be executed on the leader (if there are such cases), we should implement some server-side check of whether the node executing the request is the leader.
- Re-create dead connections, or configure tarantool-java to survive long unavailability while still reconnecting quickly.
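The points above could be sketched as a small wrapper that caches the leader connection and only re-discovers the leader when an operation fails. This is a sketch, not a proposed patch: it assumes `cl/open` and `db/primaries` as used in the bank test above, and the names `leader-conn`, `discover-leader!`, and `with-leader` are hypothetical:

```clojure
;; Sketch: cache one connection to the leader and renew it
;; lazily, only when an operation on it fails.
(def leader-conn (atom nil))

(defn discover-leader!
  "Open a fresh connection to the current leader and cache it."
  [test]
  (let [conn (cl/open (first (db/primaries test)) test)]
    (reset! leader-conn conn)
    conn))

(defn with-leader
  "Run `f` on the cached leader connection. If the call fails
  (dead node, read-only error, ...), re-discover the leader once
  and retry."
  [test f]
  (let [conn (or @leader-conn (discover-leader! test))]
    (try
      (f conn)
      (catch Exception _
        (f (discover-leader! test))))))
```

With something like this, the `:transfer` branch would call `(with-leader test #(sql/query % [...]))` instead of opening a connection per operation.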