
Conversation

@jiseongnoh
Contributor

@jiseongnoh jiseongnoh commented Nov 8, 2024

Proposed changes

Propose a framework for quantitatively assessing the performance and stability of candidates and validators in the Kaia Chain network

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • KIP Proposal
  • KIP Improvement

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • Used the suggested template: https://github.com/kaiachain/KIPs/blob/main/kip-template.md
  • I have read the CLA and signed it by commenting "I have read the CLA Document and I hereby sign the CLA" (first-time contribution)
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Related issues

  • Please leave the issue numbers or links related to this PR here.

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you proposed and what alternatives you have considered, etc.

@github-actions

github-actions bot commented Nov 8, 2024

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

@jiseongnoh
Contributor Author

I have read the CLA Document and I hereby sign the CLA

@hyeonLewis hyeonLewis changed the title from "Kip-202: Candidate and Validator Evaluation" to "KIP-227: Candidate and Validator Evaluation" on Jan 10, 2025
@hyeonLewis
Collaborator

@jiseongnoh I assigned 227 to this KIP according to the numbering rule in KIP-201. Thanks.

Contributor

@hyunsooda hyunsooda left a comment

I understood that this KIP addresses how to measure the node's performance. How to use the measured numbers is outside the scope of this KIP and is delegated to KIP-286 (validator lifecycle). Is that correct?

| Constant | Value/Definition |
| :--------------------------------- | :---------------------------------------------------------------------- |
| `FORK_BLOCK` | TBD |
| `CANDIDATE_READY_TIMEOUT` | 200 milliseconds (0.2 seconds) |
Contributor

Does this timeout apply to each message type individually? Specifically, is there a 200ms limit for each stage: Pre-prepare, Prepare, and Commit?

Contributor Author

Thanks for the question.
CANDIDATE_READY_TIMEOUT is evaluation-only and applies only to the VRank CandidateReady reception deadline. It is not related to IBFT consensus and does not apply to IBFT message types.

Contributor

Ah, I see. I previously assumed candidates participated in consensus without voting power, but I was mistaken. Is that message just an artificial message, like a ping-pong?

Contributor Author

Yes, CandidateReady is synthetic in the sense that it exists purely for measurement. However, a reporter cannot fabricate a success because it requires a valid candidate signature bound to the target block.

Contributor

Understood. How are message transmission and reception processed? I couldn’t find a minimal protocol for this, so I assume message testing occurs over the full EPOCH_LENGTH (one day). Additionally, if these are just artificial 'ping-pong' messages, it would be difficult to infer a candidate's machine specifications. Verifying computational capability may not align with simple message transfers. Are there any major parts that I've missed?


**Definition**: Measures the number of times a candidate fails to transmit the expected `CandidateReady` message during a block proposal cycle, after removing the highest `F` failure counts to address measurement distortions.

**Measurement Method**: During the evaluation period, if the next proposer `N+1` receives a block proposal from proposer `N` and does not receive the `CandidateReady` message within the specified timeout (`CANDIDATE_READY_TIMEOUT`), the total failure count for `C` increases by 1.
Contributor

As I understood it, only one of the validators measures the candidate, and that node is the next block proposer.

  • Is it correct that only one validator measures the candidate's performance for each block?
  • Why does the next block proposer measure, rather than the current proposer? (maybe relevant to the round change)

Contributor Author

mfReport for block N is committed in block N+1 by the proposer of the next finalized block (N+1).
Reporters rotate every block, so candidates are observed by many validators over time, which makes per-reporter counts and TMFS top-F filtering meaningful against Byzantine reporters.
We commit in N+1 for consistency with pfReport and to avoid a timing race. CandidateReady has a strict deadline (200ms), and including it in block N would force delaying block production.
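
To make the timing concrete, here is a minimal reporter-side sketch in Python (the `inbox` plumbing and function shape are hypothetical illustrations, not the actual client code):

```python
import queue
import time

CANDIDATE_READY_TIMEOUT = 0.2  # seconds, per the constants table

def measure_candidates(candidates, inbox):
    """Sketch: the proposer of block N+1 measuring candidates for block N.

    `inbox` is a hypothetical queue.Queue of (candidate_id, signature)
    tuples delivered by the network layer as CandidateReady messages arrive.
    """
    deadline = time.monotonic() + CANDIDATE_READY_TIMEOUT
    ready = {}
    while (remaining := deadline - time.monotonic()) > 0:
        try:
            candidate_id, sig = inbox.get(timeout=remaining)
        except queue.Empty:
            break
        if candidate_id in candidates:
            ready[candidate_id] = sig
    # Candidates absent from `ready` count one failure each for this block;
    # the (id, signature) pairs that did arrive are what gets committed as
    # the report in block N+1, so block N production is never delayed.
    misses = [c for c in candidates if c not in ready]
    return ready, misses
```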

Contributor

Thanks for the details. I see that both are committed at N+1.

  • mfReport: Wondering why it is committed at N+1? Even if it were committed in N, I believe the distribution of diversity would be similar.
  • pfReport: Got it. Is that specific number 200 derived from the 250ms execution time, because the response must be guaranteed to arrive before making the proposal?

Contributor Author

We commit mfReport/crReport in N+1 to protect block time for N. If embedded in N, the proposer must wait up to CANDIDATE_READY_TIMEOUT before finalizing the header, which delays block time.
By committing in N+1, we keep block N production on time and treat candidate readiness as an asynchronous measurement artifact that can be recorded in the next block without delaying block time.

@jiseongnoh
Contributor Author

> I understood that this KIP addresses how to measure the node's performance. How to use the measured numbers is outside the scope of this KIP and is delegated to KIP-286 (validator lifecycle). Is that correct?

Yes, that is correct.
KIP-227 specifies (1) what data is reported in the block header (vrank), and (2) how performance metrics (PFS/TMFS/CMFS) are measured and computed in a verifiable way. It does not define operational policies such as promotion, demotion, validator lifecycle transitions, or penalty actions.

How these measured metrics are used for validator/candidate lifecycle decisions is out of scope for this KIP and is expected to be defined in a separate proposal.


### Changes to Block Validation Process

Once `FORK_BLOCK` is reached, validators must validate the newly added `vrank` field in the block header. The values of the subfields (`pfReport` and `mfReport`) are used to evaluate node performance using the components of the VRank framework.
Contributor

> validators must validate the newly added vrank field

Validate what? For pfReport, validate that it only contains the proposers that caused RC?
How about mfReport? Signature validation?

Contributor Author

Let’s add explicit validation rules for vrank after FORK_BLOCK:
(i) pfReport MUST be a deterministic, verifiable list of round-change proposers in round order.
(ii) crReport MUST contain at most one entry per candidate, and each entry MUST carry a valid CandidateReady signature bound to (block_number, proposal_hash).
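
For illustration, a rough Python sketch of those checks (the decoded header shape and the `verify_sig` callback are hypothetical stand-ins for the client's actual types):

```python
def validate_vrank(header, observed_rc_proposers, candidates, verify_sig):
    """Sketch of the vrank header checks after FORK_BLOCK.

    `observed_rc_proposers`: the round-change proposers this validator saw
    for the previous block, in round order.
    `verify_sig(candidate_id, payload, sig)`: hypothetical check that `sig`
    is a valid candidate signature over `payload`.
    """
    # (i) pfReport must match the locally observed round-change proposers.
    if header.vrank.pfReport != observed_rc_proposers:
        return False

    # (ii) crReport: at most one entry per known candidate, each carrying a
    # valid CandidateReady signature bound to (block_number, proposal_hash).
    seen = set()
    for candidate_id, sig in header.vrank.crReport:
        if candidate_id in seen or candidate_id not in candidates:
            return False
        seen.add(candidate_id)
        payload = (header.number - 1, header.parent_hash)  # assumed binding
        if not verify_sig(candidate_id, payload, sig):
            return False
    return True
```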


## VRank Score Components

The VRank framework evaluates node performance using three independent metrics. These metrics apply separately to validators and candidates, allowing for a more focused assessment of each role's responsibilities.
Contributor

Suggested change
The VRank framework evaluates node performance using three independent metrics. These metrics apply separately to validators and candidates, allowing for a more focused assessment of each role's responsibilities.
The VRank framework evaluates node performance using three independent metrics. Each metric measures events that occur within an epoch and resets at the epoch's start block. These metrics apply separately to validators and candidates, allowing for a more focused assessment of each role's responsibilities.


**Measurement Method**: If a validator fails to propose a block, resulting in a round change, the proposal failure count increases by one.

**Consensus Method**: The proposer of block `N+1` records the proposal failure information of block `N` in the form of a list of `(round number, proposer)` within the block header. Validators compare their own records of proposal failures in block `N` with the records in the header of block `N+1` to reach consensus.
Contributor

I think we should record all round changes of block N in block N itself (not block N+1), so we can remove the failed proposer from the council when generating block N+1. cc @hyeonLewis

**Measurement Method**: If a validator fails to propose a block, resulting in a round change, the proposal failure count increases by one.

**Consensus Method**: The proposer of block `N+1` records the proposal failure information of block `N` in the form of a list of `(round number, proposer)` within the block header. Validators compare their own records of proposal failures in block `N` with the records in the header of block `N+1` to reach consensus.

Contributor

Let's add the following:

Suggested change
**Score**: Given all headers in the range `[k*EPOCH_LENGTH, (k+1)*EPOCH_LENGTH - 1]`, count each validator's proposal failures from the `pfReport` field.
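
Something like this Python sketch would pin the computation down (the decoded-header access is hypothetical):

```python
from collections import Counter

def pfs_scores(headers, k, EPOCH_LENGTH=86_400):
    """Sketch: per-validator proposal-failure counts for epoch k.

    `headers` is a hypothetical mapping from block number to a decoded
    header whose `vrank.pfReport` is a list of (round, proposer) pairs.
    """
    scores = Counter()
    # Covers [k*EPOCH_LENGTH, (k+1)*EPOCH_LENGTH - 1] inclusive.
    for n in range(k * EPOCH_LENGTH, (k + 1) * EPOCH_LENGTH):
        for _round, proposer in headers[n].vrank.pfReport:
            scores[proposer] += 1
    return scores
```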

**Definition**: Measures the number of times a candidate fails to transmit the expected `CandidateReady` message during a block proposal cycle, after removing the highest `F` failure counts to address measurement distortions.

**Measurement Method**: During the evaluation period, if the next proposer `N+1` receives a block proposal from proposer `N` and does not receive the `CandidateReady` message within the specified timeout (`CANDIDATE_READY_TIMEOUT`), the total failure count for `C` increases by 1.

Contributor

Suggested change
**Consensus Method**: The proposer of block N+1 records the candidate failure information of block N in the form of a list of `(candidate ID, signature)` within the block header.
**Score**: Given all headers in the range `[k*EPOCH_LENGTH + 1, (k+1)*EPOCH_LENGTH - 1]`, for each candidate, count the candidate failures from the `mfReport` field grouped by each proposer. Sum the failure counts per proposer, then exclude the highest `F` counts to produce the filtered score.

Btw, I think the range should start with k*EPOCH_LENGTH + 1 (unlike PFS) because a new candidate set will be refreshed from getCandidate(k*EPOCH_LENGTH), and their messages will be recorded starting from the next block. cc @hyeonLewis
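
For reference, a compact sketch of the top-`F` filtering, assuming the per-reporter failure counts for one candidate have already been tallied from the headers:

```python
from collections import Counter

def tmfs_score(per_reporter_failures, F):
    """TMFS for one candidate: drop the F highest per-reporter counts so
    that up to F Byzantine reporters cannot inflate the score."""
    counts = sorted(per_reporter_failures.values(), reverse=True)
    return sum(counts[F:])

# e.g. four reporters, F = 1: the single worst count (9) is discarded
print(tmfs_score(Counter({"v1": 2, "v2": 9, "v3": 1, "v4": 0}), F=1))  # 3
```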

Collaborator

@hyeonLewis hyeonLewis Jan 27, 2026

Please correct me if I'm wrong.

If the candidate sends the msg to the `k*EPOCH_LENGTH` proposer for block `k*EPOCH_LENGTH-1`, it means the candidate would have to send the message before it enters the CandTesting state, since it is still at `k*EPOCH_LENGTH-1`. So the candidates should start by sending the msg to the `k*EPOCH_LENGTH+1` proposer for block `k*EPOCH_LENGTH`.

But for the last candidate msg, we can't accept it until `(k+1)*EPOCH_LENGTH`, since we must finalize the state transition at the initialization of block `(k+1)*EPOCH_LENGTH`, which leaves no time to wait for the candidate msg. So we can select only the `[k*EPOCH_LENGTH+1, (k+1)*EPOCH_LENGTH-1]` range (which is 86,399 blocks).
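
For the record, the range size checks out (assuming `EPOCH_LENGTH` = 86,400, i.e. one-second blocks over a day):

```python
EPOCH_LENGTH = 86_400  # assumed: one day of one-second blocks
k = 1                  # any epoch index gives the same count
start = k * EPOCH_LENGTH + 1
end = (k + 1) * EPOCH_LENGTH - 1
assert end - start + 1 == 86_399  # matches the range size above
```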

Contributor

@ian0371 ian0371 Jan 27, 2026

@hyeonLewis You're right, thanks for clarifying.

> which leaves no time to wait for the candidate msg.

`getCandidate((k+1)*EPOCH_LENGTH)` should be determined before block `(k+1)*EPOCH_LENGTH` reaches consensus, so we cannot physically collect the `(k+1)*EPOCH_LENGTH` header data.

I think we should add this to the rationale as well.

Collaborator

@hyeonLewis hyeonLewis Jan 28, 2026

Sending the block-N msg directly to the proposer of N also works, but I couldn't find a rationale. @jiseongnoh Could you share the rationale for this?

It's obvious, since proposer N would have to re-broadcast the candidate message to the validators. I think we should add this to the rationale as well.

**Measurement Method**:
If 10 consecutive proposers report that candidate `C` fails to send the `CandidateReady` message, it is recorded as 1 instance of 10-CFs, and the CMFS increases by 1 after 15 such instances.
If 15 consecutive proposers report that candidate `C` fails to send the `CandidateReady` message, it is recorded as 1 instance of 15-CFs, and the CMFS increases by 2 after 10 such instances.

Contributor

Suggested change
**Consensus Method**: Same as TMFS
**Score**: Given all headers in the range `[k*EPOCH_LENGTH + 1, (k+1)*EPOCH_LENGTH - 1]`, for each candidate, track consecutive candidate failures from the `mfReport` field. Count instances of 10-CFs and 15-CFs. The score is `(count_10_cf // 15) * 1 + (count_15_cf // 10) * 2`.
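
Reading the two rules together, a possible sketch of the computation (the streak/reset semantics here are one interpretation of the spec pseudocode, not normative):

```python
def cmfs_score(failed_flags):
    """Sketch: CMFS for one candidate over an epoch.

    `failed_flags[i]` is True when the i-th consecutive reporter in the
    epoch recorded a CandidateReady miss for this candidate. A streak is
    consumed once it reaches 15, mirroring the cf_start reset in the spec
    fragment below; overlap handling is an assumption, not normative.
    """
    count_10_cf = 0
    count_15_cf = 0
    streak = 0
    for failed in failed_flags:
        if failed:
            streak += 1
            if streak == 10:
                count_10_cf += 1
            if streak == 15:
                count_15_cf += 1
                streak = 0  # reset the start point
        else:
            streak = 0
    return (count_10_cf // 15) * 1 + (count_15_cf // 10) * 2
```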

```python
# Reset the start point
state['cf_start'] = 0
```

Contributor

@ian0371 ian0371 Jan 27, 2026

Let's define CandidateReady something like this:

```
message CandidateReady {
	uint64 block_number;
	byte[32] proposal_hash;
	signature sig; // sign_ecdsa(proposal_hash)
}
```

**The `vrank` field comprises two subfields: `pfReport` and `mfReport`**
**`pfReport`**: A list of block proposers that caused a round change in the previous block, recorded in the order of the rounds.

**`mfReport`**: A mapping that records the candidates who submitted the `CandidateReady` message in the previous block along with their signatures, formatted as `[candidate ID, signature]`.
Contributor

Why is it mfReport? Shouldn't it be cfReport?

Contributor Author

mfReport was originally intended as shorthand for message failure report, since TMFS/CMFS treat a missing CandidateReady entry (within CANDIDATE_READY_TIMEOUT) as a failure when computing scores.

However, we agree the name is ambiguous and invites confusion (e.g., “mf” vs “cf”). To make the semantics explicit and reduce reviewer/implementer ambiguity, we will rename mfReport to crReport (CandidateReady Report) across the spec and implementation.


Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

Contributor

delete?
