Skip to content

Conversation

kevin-wu24
Copy link
Contributor

@kevin-wu24 kevin-wu24 commented Sep 17, 2025

Just because a controller node sets --no-initial-controllers flag does
not mean it is necessarily running kraft.version=1. The more precise
meaning is that the controller node being formatted does not know what
kraft version the cluster should be in, and therefore it is only safe to
assume kraft.version=0. Only by setting
--standalone,--initial-controllers, or --no-initial-controllers
AND not specifying the controller.quorum.voters static config, is it
known kraft.version > 0.

For example, it is a valid configuration (although confusing) to run a
static quorum definedby controller.quorum.voters but have all the
controllers format with --no-initial-controllers. In this case,
specifying --no-initial-controllers alongside a metadata version that
does not support kraft.version=1 causes formatting to fail, which is
a regression.

Additionally, the formatter should not check the kraft.version against
the release version, since kraft.version does not actually depend on any
release version. It should only check the kraft.version against the
static voters config/format arguments.

This PR also cleans up the integration test framework to match the
semantics of formatting an actual cluster.

Reviewers: TengYao Chi kitingiao@gmail.com, Kuan-Po Tseng
brandboat@gmail.com, Chia-Ping Tsai chia7712@gmail.com

@github-actions github-actions bot added triage PRs from the community core Kafka Broker kraft small Small PRs labels Sep 17, 2025
@github-actions github-actions bot removed the small Small PRs label Sep 18, 2025
Copy link
Member

@brandboat brandboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider deprecating --no-initial-controllers? It seems redundant and has no effect for either dynamic or static voter configurations.

Copy link
Contributor

@frankvicky frankvicky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @brandboat
After this patch, it seems that the --no-initial-controller flag doesn't provide too much value.

@chia7712
Copy link
Member

@kevin-wu24 thanks for this patch.

it is a valid configuration to run a static quorum defined
by controller.quorum.voters but have all the controllers format with
--no-initial-controllers

Excuse me, what is the purpose of having both controller.quorum.voters and --no-initial-controllers when starting a static quorum cluster?

I assumed --no-initial-controllers is used only in building a dynamic quorum cluster.

@kevin-wu24
Copy link
Contributor Author

kevin-wu24 commented Sep 18, 2025

Thanks for the reviews @chia7712 @brandboat @frankvicky. Hopefully I can clear up the intended semantics of --no-initial-controllers a bit.

Should we consider deprecating --no-initial-controllers? It seems redundant and has no effect for either dynamic or static voter configurations.

No, we should not deprecate --no-initial-controllers, since it does have an effect for a dynamic voter configuration, but not for static voter configuration. Specifically, when the static config controller.quorum.voters is not set in kafka.properties, one of --standalone, --initial-controllers, or --no-initial-controllers, must be set. Otherwise, the storage tool fails. This shows clear intention from the user to format a dynamic quorum.

Actually, when we specify --no-initial-controllers, and also do not specify controller.quorum.voters, it is safe to assume kraft.version > 0, so I will update the PR accordingly.

Excuse me, what is the purpose of having both controller.quorum.voters and --no-initial-controllers when starting a static quorum cluster?

I assumed --no-initial-controllers is used only in building a dynamic quorum cluster.

--no-initial-controllers is only used in building a dynamic quorum cluster, meaning that when a node specifies controller.quorum.voters the flag acts as a no-op. However, specifying the flag should never break formatting because the meaning of --no-initial-controllers is that "this node may not know the intended KRaft version of cluster, and will rely on KRaft to figure out the version". This flag breaking previously valid static configurations is a regression from #17836.

Comment on lines -185 to +207
StringBuilder quorumVoterStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
quorumVoterStringBuilder.append(prefix).
append(nodeId).
append("@").
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
if (!standalone && initialVoterSet.isEmpty()) {
StringBuilder quorumVoterStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
quorumVoterStringBuilder.append(prefix).
append(nodeId).
append("@").
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
}
props.put(QuorumConfig.QUORUM_VOTERS_CONFIG, quorumVoterStringBuilder.toString());
} else {
StringBuilder bootstrapServersStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
bootstrapServersStringBuilder.append(prefix).
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
}
props.put(QuorumConfig.QUORUM_BOOTSTRAP_SERVERS_CONFIG, bootstrapServersStringBuilder.toString());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the integration test framework to better mirror a real cluster for dynamic quorum (i.e. voters config should not be specified and we should use bootstrap servers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this.

@kevin-wu24 kevin-wu24 changed the title KAFKA-19719: --no-intial-controllers should not assume kraft.version=1 KAFKA-19719: --no-initial-controllers should not assume kraft.version=1 Sep 18, 2025
@github-actions github-actions bot removed the triage PRs from the community label Sep 19, 2025
Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevin-wu24 thanks for this patch. Should we backport this fix to 4.0 and 4.1?

(Option(initialControllers).isDefined || isStandalone)) {
val staticVotersEmpty = config.quorumConfig.voters().isEmpty
formatter.setHasDynamicQuorum(staticVotersEmpty)
if (!staticVotersEmpty && (Option(initialControllers).isDefined || isStandalone)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still feel it is weird that users configure controller.quorum.voters with --no-initial-controllers, but you are right that we should keep the compatibility.

Should we add another warning to remind users that --no-initial-controllers should not be used with static voters even though it is no-op?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add another warning to remind users that --no-initial-controllers should not be used with static voters even though it is no-op?

I can add it to the flag's help output.

new TestKitNodes.Builder().
setNumBrokerNodes(1).
setNumControllerNodes(1).
setFeature(KRaftVersion.FEATURE_NAME, KRaftVersion.KRAFT_VERSION_1.featureLevel()).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the method setFeature is useless now. Could you please remove it?

);
formatter.setInitialControllers(DynamicVoters.parse(dynamicVotersBuilder.toString()));
}
formatter.setHasDynamicQuorum(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a clarifying comment to explain the actual case here? For example, the nodeId != CONTROLLER_ID_OFFSET means the controller nodes are using --no-initial-controllers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't under the added comment. This execute when standalone. Irrespective of the node id. Doesn't that mean --standalone and --no-initial-controllers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what the comment is saying. One node formats with --standalone and the others are formatting with --no-initial-controllers. For both arguments we need to call formatter.setHasDynamicQuorum(true); for this configuration.

@kevin-wu24
Copy link
Contributor Author

kevin-wu24 commented Sep 19, 2025

Thanks for the review @chia7712.

Should we backport this fix to 4.0 and 4.1?

Yeah, we should backport it since it affects both versions.

@chia7712
Copy link
Member

I send a message to 4.0.1 RC thread (https://lists.apache.org/thread/4gsgcdgybyrnwg3hodnlyxg33ckj30wy). wait for the response from @clolov

Copy link
Member

@FrankYang0529 FrankYang0529 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. Leave some minor comments.

"--no-initial-controllers."
"Cannot set kraft.version to " + configuredKRaftVersionLevel.get() +
" unless controller.quorum.voters is empty and one of the flags --standalone, " +
"--initial-controllers, or -no-initial-controllers is used. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: -no-initial-controllers -> --no-initial-controllers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will fix

"--no-initial-controllers is used. For dynamic controllers support, try using one of " +
"--standalone, --initial-controllers, or --no-initial-controllers.",
"Cannot set kraft.version to 1 unless controller.quorum.voters is empty and " +
"one of the flags --standalone, --initial-controllers, or -no-initial-controllers is used. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

"--no-initial-controllers is used. For dynamic controllers support, try using one of " +
"--standalone, --initial-controllers, or --no-initial-controllers.",
"Cannot set kraft.version to 1 unless controller.quorum.voters is empty and " +
"one of the flags --standalone, --initial-controllers, or -no-initial-controllers is used. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member

@jsancio jsancio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @kevin-wu24

if (config.quorumConfig.voters().isEmpty && formatter.initialVoters().isEmpty) {
if (!namespace.getBoolean("no_initial_controllers") &&
config.processRoles.contains(ProcessRole.ControllerRole) &&
staticVotersEmpty && formatter.initialVoters().isEmpty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a newline after && and before formatter.initialVoters().isEmpt.


reconfigurableQuorumOptions.addArgument("--no-initial-controllers", "-N")
.help("Used to initialize a server without a dynamic quorum topology.")
.help("Used to initialize a server without a dynamic quorum topology. When setting this flag, " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the word topology. It matches the phrase used for --standalone.

.action(storeTrue())

reconfigurableQuorumOptions.addArgument("--initial-controllers", "-I")
.help("Used to initialize a server with a specific dynamic quorum topology. The argument " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Do you mind removing the word topology?

Comment on lines 220 to 221
public Formatter setHasDynamicQuorum(boolean staticVotersEmpty) {
this.hasDynamicQuorum = staticVotersEmpty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter name should be something like hasDynamicQuorum.

"--no-initial-controllers."
"Cannot set kraft.version to " + configuredKRaftVersionLevel.get() +
" unless controller.quorum.voters is empty and one of the flags --standalone, " +
"--initial-controllers, or --no-initial-controllers is used. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space after --no-initial-controllers

FormatterContext formatter1 = testEnv.newFormatter();
formatter1.formatter.setReleaseVersion(MetadataVersion.IBP_3_8_IV0);
// This MV does not support kraft.version = 1
formatter1.formatter.setHasDynamicQuorum(emptyStaticVoters);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit odd. Why not name the test parameter hasDynamicQuorum?

Comment on lines 517 to 521
if (emptyStaticVoters) {
assertEquals((short) 1, formatter1.formatter.featureLevels.getOrDefault("kraft.version", (short) 0));
} else {
assertEquals((short) 0, formatter1.formatter.featureLevels.getOrDefault("kraft.version", (short) 1));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to differentiate between missing value and the default in the test? Why not use featureLevels.get("kraft.version")? This comment applies to a few places in this file.

Comment on lines -185 to +207
StringBuilder quorumVoterStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
quorumVoterStringBuilder.append(prefix).
append(nodeId).
append("@").
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
if (!standalone && initialVoterSet.isEmpty()) {
StringBuilder quorumVoterStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
quorumVoterStringBuilder.append(prefix).
append(nodeId).
append("@").
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
}
props.put(QuorumConfig.QUORUM_VOTERS_CONFIG, quorumVoterStringBuilder.toString());
} else {
StringBuilder bootstrapServersStringBuilder = new StringBuilder();
String prefix = "";
for (int nodeId : nodes.controllerNodes().keySet()) {
bootstrapServersStringBuilder.append(prefix).
append("localhost").
append(":").
append(socketFactoryManager.getOrCreatePortForListener(nodeId, controllerListenerName));
prefix = ",";
}
props.put(QuorumConfig.QUORUM_BOOTSTRAP_SERVERS_CONFIG, bootstrapServersStringBuilder.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this.

);
formatter.setInitialControllers(DynamicVoters.parse(dynamicVotersBuilder.toString()));
}
formatter.setHasDynamicQuorum(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't under the added comment. This execute when standalone. Irrespective of the node id. Doesn't that mean --standalone and --no-initial-controllers?

Comment on lines -96 to -99
public Builder setFeature(String featureName, short level) {
this.bootstrapMetadata = bootstrapMetadata.copyWithFeatureRecord(featureName, level);
return this;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finally removing this configuration.

Comment on lines 347 to 353
.help("Used to initialize a server without a dynamic quorum topology. When setting this flag, " +
.help("Used to initialize a controller without specifying a dynamic quorum. When setting this flag, " +
"the controller.quorum.voters config should not be set, and controller.quorum.bootstrap.servers is set instead.")
.action(storeTrue())

reconfigurableQuorumOptions.addArgument("--initial-controllers", "-I")
.help("Used to initialize a server with a specific dynamic quorum topology. The argument " +
.help("Used to initialize a controller with the specified dynamic quorum. The argument " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--standalone can only be used on the controller node but --no-initial-controllers and --initial-controllers can be used in the controller and broker nodes.

I.e. you can change the message back to using the word server instead of the word controller.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevin-wu24 thanks for this patch. just a couple of small comments left

Option(initialControllers).
foreach(v => formatter.setInitialControllers(DynamicVoters.parse(v)))
if (isStandalone) {
formatter.setInitialControllers(createStandaloneDynamicVoters(config))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that --initial-controllers becomes a no-op when --standalone is defined. Should we add a warning for this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the dynamic quorum arguments, you cannot specify multiple at the same time:

val reconfigurableQuorumOptions = formatParser.addMutuallyExclusiveGroup()

configuredKRaftVersionLevel.get() +
" if one of the flags --standalone, --initial-controllers, or --no-initial-controllers is used. " +
"For dynamic controllers support, try removing the --feature flag for kraft.version."
"Cannot set kraft.version to " + configuredKRaftVersionLevel.get() +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind using 0 instead of configuredKRaftVersionLevel.get()? for example:

                    throw new FormatterException(
                        "Cannot set kraft.version to 0 if controller.quorum.voters is empty and one of the flags --standalone, " +
                        "--initial-controllers, or --no-initial-controllers is used. For dynamic controllers support, " +
                        "try removing the --feature flag for kraft.version."
                    );

private static final int DEFAULT_NODE_ID = 1;

private static final Uuid DEFAULT_CLUSTER_ID = Uuid.fromString("b3dGE68sQQKzfk80C_aLZw");
private static final String KRAFT_VERSION = "kraft.version";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be replaced by KRaftVersion.FEATURE_NAME?

StringBuilder dynamicVotersBuilder = new StringBuilder();
String prefix = "";
if (standalone) {
if (nodeId == TestKitDefaults.CONTROLLER_ID_OFFSET) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The true port of the standalone controller is TestKitDefaults.BROKER_ID_OFFSET + TestKitDefaults.CONTROLLER_ID_OFFSET. We will likely not change the value of BROKER_ID_OFFSET in the future, but should we still use BROKER_ID_OFFSET + CONTROLLER_ID_OFFSET instead?

https://github.com/apache/kafka/blob/trunk/test-common/test-common-runtime/src/main/java/org/apache/kafka/common/test/TestKitNodes.java#L175

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll change it just in case we ever change BROKER_ID_OFFSET.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevin-wu24 thanks for updates. LGTM

@jsancio Did you get around to reviewing it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants