Don't allow write entries with same entry id multiple times in bookie #1066

@sijie

Description

FEATURE REQUEST

  1. Please describe the feature you are requesting.

Multiple entries with the same entry id can be written to one bookie in the following 3 cases:

  • If ensemble change is disabled, the client reattempts writing an entry to the same bookie after a failure. The failure can be a timeout, in which case the first attempt may in fact have reached the bookie.
  • If the ensemble is flapping, e.g. [A, B, C] -> [A, B, D] -> [A, B, C].
  • Auto recovery can write entries to bookies that used to be excluded from an ensemble.

Currently a bookie accepts writes of the same entry id multiple times. As a result, multiple copies of the same entry id can appear in the entry log files, but only one copy's location is recorded in the ledger cache.
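
To make the current behavior concrete, here is a minimal Java sketch of a toy bookie store. It is not the real BookKeeper storage code: `EntryLogSketch`, its fields and methods are hypothetical names, and the "last write wins" indexing is an assumption made for illustration (the description above only says that one copy's location ends up in the ledger cache).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not the real BookKeeper storage classes.
class EntryLogSketch {
    // Append-only stand-in for the entry log: every accepted write lands here,
    // duplicates included.
    private final List<byte[]> entryLog = new ArrayList<>();

    // Stand-in for the ledger cache: entry id -> position of one copy in the log.
    // Assumption: the most recent write overwrites the earlier location.
    private final Map<Long, Integer> index = new HashMap<>();

    void addEntry(long entryId, byte[] payload) {
        entryLog.add(payload.clone());
        index.put(entryId, entryLog.size() - 1);
    }

    byte[] readEntry(long entryId) {
        Integer position = index.get(entryId);
        return position == null ? null : entryLog.get(position);
    }

    int copiesInLog() {
        return entryLog.size();
    }

    int indexedEntries() {
        return index.size();
    }

    public static void main(String[] args) {
        EntryLogSketch bookie = new EntryLogSketch();
        byte[] payload = {1, 2, 3};

        // The client times out on the first attempt and resubmits entry 100.
        bookie.addEntry(100L, payload);
        bookie.addEntry(100L, payload);

        System.out.println("copies in entry log: " + bookie.copiesInLog());
        System.out.println("indexed entries:     " + bookie.indexedEntries());
    }
}
```

In this model a retried write is never rejected: the log grows by one copy per attempt, while reads only ever see the copy the index points at.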

Theoretically this is fine most of the time. In practice, however, we may want to disallow it, because it raises potential "inconsistency" concerns if an entry is corrupted (e.g. memory corruption, client NIC corruption) between retries.

An improvement can be:

  • enforce write-side checksums (see "verify the checksum in bookie write path" #1046)
  • compare the entries if a bookie receives a duplicated entry
  • if the duplicated entry is the same, respond with success without writing the duplicate;
  • if the duplicated entry is not the same, reject the write and respond with a special error code.
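
The steps in the list above could look roughly like the following Java sketch. Everything in it is an assumption for illustration: `DuplicateCheckSketch`, the `Result` codes, and the use of `java.util.zip.CRC32` as the checksum are hypothetical stand-ins, not BookKeeper's actual digest or protocol codes, and the map models a single ledger held in memory.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Sketch of the proposed write path; all names here are hypothetical.
class DuplicateCheckSketch {
    // Hypothetical result codes; the proposal only asks for "a special error code".
    enum Result { OK, OK_DUPLICATE_IGNORED, CHECKSUM_MISMATCH, DUPLICATE_MISMATCH }

    // Entries already stored for one ledger, keyed by entry id.
    private final Map<Long, byte[]> stored = new HashMap<>();

    // CRC32 is a stand-in for whatever checksum the client attaches.
    static long crc32(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    Result addEntry(long entryId, byte[] payload, long clientChecksum) {
        // Step 1: write-side checksum verification.
        if (crc32(payload) != clientChecksum) {
            return Result.CHECKSUM_MISMATCH;
        }
        byte[] existing = stored.get(entryId);
        if (existing == null) {
            stored.put(entryId, payload.clone());
            return Result.OK;
        }
        // Steps 2-4: compare the duplicate with what is already stored.
        // Same bytes: acknowledge without writing again. Different bytes: reject.
        return Arrays.equals(existing, payload)
                ? Result.OK_DUPLICATE_IGNORED
                : Result.DUPLICATE_MISMATCH;
    }

    public static void main(String[] args) {
        DuplicateCheckSketch bookie = new DuplicateCheckSketch();
        byte[] original = {1, 2, 3};
        byte[] corrupted = {1, 2, 4};

        System.out.println(bookie.addEntry(100L, original, crc32(original)));
        System.out.println(bookie.addEntry(100L, original, crc32(original)));
        System.out.println(bookie.addEntry(100L, corrupted, crc32(corrupted)));
    }
}
```

Comparing full payloads keeps the sketch simple; comparing stored checksums instead would avoid reading the existing entry back, at the cost of trusting the checksum to detect every difference.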
  2. Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

must-have

  3. Provide any additional detail on your proposed use case for this feature.

More details from the conversation between @jvrao and @sijie:

jujjuri [4:55 PM]
Hi, I have a question about bk.disableEnsembleChangeFeature.isAvailable().

[4:55 PM]
I think this is @sijie’s checkin.
[4:56 PM]
If this feature is enabled and on bookie write failure we are doing unsetSuccessAndSendWriteRequest
[4:56 PM]
it is possible that the bookie write failed for various reasons including timeout
[4:56 PM]
so in that case we could be sending duplicate entries to the bookie

sijie [4:57 PM]
@jujjuri checking

sijie [5:02 PM]
@jujjuri: yes. that change is to reattempt sending the requests. because disableEnsembleChange is enabled.
[5:03 PM]
disableEnsembleChange is only enabled when you configure a feature provider
[5:03 PM]
the default feature provider disables all features.

jujjuri [5:03 PM]
I understand @sijie even if we enable that feature
[5:03 PM]
I am saying we could send duplicate entries to bookies

sijie [5:04 PM]
yes

jujjuri [5:04 PM]
and bookies can handle duplicate entries ?

sijie [5:04 PM]
we can send duplicate entries. but what is the concern of sending duplicate entries?
[5:05 PM]
yes it handles duplicate entries

jujjuri [5:05 PM]
what happens ? does it fail?
[5:05 PM]
the second write? and we keep retrying it?
[5:06 PM]
tell me a case -

sijie [5:06 PM]
the second write doesn’t fail.

jujjuri [5:06 PM]
we are trying to write to bookie1 entryId 100
[5:06 PM]
it failed for timeout
[5:06 PM]
and we resubmitted it

sijie [5:07 PM]
you might have multiple entries in entrylog files, and only one will be indexed.

jujjuri [5:07 PM]
first write succeeded; but second write failed
[5:07 PM]
with entryExists or something right?
[5:07 PM]
hmm
[5:07 PM]
that is my question

sijie [5:07 PM]
no the second write doesn’t fail

jujjuri [5:07 PM]
if we end up with multiple entries in the entrylog
[5:07 PM]
what if they are different?

sijie [5:08 PM]
why they will be different
[5:08 PM]
it is same situation even without this code path

jujjuri [5:08 PM]
I don't know. some bug in the client code or buffer corruption;

sijie [5:08 PM]
we will still send an entry multiple times.
[5:08 PM]
think about ensemble changes.

jujjuri [5:08 PM]
sure
[5:09 PM]
but as per metadata

sijie [5:09 PM]
an ensemble is changed from [A, B, C] to [A, B, D], then back to [A, B, C]

jujjuri [5:09 PM]
the failed bookie is not considered
[5:09 PM]
sure..
[5:11 PM]
Another Q: As part of 34e8bf200c6f3797bd6fa4c5d86646e9eb7f0d3b @ivankelly added tracking for pendingWriteRequests in PendingAddOp.java; I don't see the use of this. The buffer is refcounted and released at the BookieClient level. I am wondering if there is any real use of this counter.

sijie [5:12 PM]
so the question isn’t related to whether that feature is enabled or not. the question is more about whether bookies are allowed to have multiple entries with the same entry id. that is a valid question/concern. we can enforce 1) write side checksum 2) compare the entries if a duplicated entry is received 3) if the entry is the same, respond successfully but don’t write again; if the entry isn’t the same, reject the write and respond with some error code.
