Description
Describe the bug
I observed apache/pulsar#18236 when doing a helm upgrade while testing out 3.0.0-candidate-1 for #326.
To Reproduce
My steps are listed in apache/pulsar#18236. The only difference is that, instead of forcefully restarting the broker, I started the cluster with:
helm install test -f pulsar-chart-3.0.0/examples/values-minikube.yaml --version 2.9.4 apache-pulsar-dist-dev/pulsar
and then upgraded it with:
helm upgrade test -f pulsar-chart-3.0.0/examples/values-minikube.yaml --version 3.0.0 apache-pulsar-dist-dev/pulsar
The upgrade triggered a broker shutdown.
Here are the values:
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
## deployed with emptyDir
volumes:
  persistence: true
# disabled AntiAffinity
affinity:
  anti_affinity: false
# disable auto recovery
components:
  autorecovery: false
  pulsar_manager: true
zookeeper:
  replicaCount: 1
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "Always"
bookkeeper:
  replicaCount: 1
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "Always"
broker:
  replicaCount: 1
  configData:
    ## Enable `autoSkipNonRecoverableData` since bookkeeper is running
    ## without persistence
    autoSkipNonRecoverableData: "true"
    # storage settings
    managedLedgerDefaultEnsembleSize: "1"
    managedLedgerDefaultWriteQuorum: "1"
    managedLedgerDefaultAckQuorum: "1"
    PULSAR_EXTRA_OPTS: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"
## disable monitoring stack
kube-prometheus-stack:
  enabled: false
  prometheusOperator:
    enabled: false
  grafana:
    enabled: false
  alertmanager:
    enabled: false
  prometheus:
    enabled: false
proxy:
  replicaCount: 1
Expected behavior
The helm upgrade should shut down the broker gracefully. I suspect that it did not, because of the code path that was executed afterwards: the broker had to load the cursor data from BookKeeper instead of ZooKeeper, as described in apache/pulsar#18237.
The main goal of this issue is to verify what kind of shutdown the broker goes through during the upgrade. There may be an issue with clean shutdown in Apache Pulsar 2.9.3.
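A rough sketch of how the shutdown type could be checked is below. It relies on several assumptions that are not confirmed in this issue: the broker pod is named test-pulsar-broker-0 (derived from the release name "test"), it runs in the default namespace, and the helper names classify_exit, show_grace_period, and capture_broker_logs are hypothetical. The idea is to record the pod's termination grace period, stream the broker logs to a file before running helm upgrade, and interpret the container exit code if one can be observed while the pod is terminating.

```shell
#!/usr/bin/env bash
# Sketch only: pod name and namespace are assumptions, adjust them to
# whatever `kubectl get pods` shows for the release.
POD="${POD:-test-pulsar-broker-0}"
NAMESPACE="${NAMESPACE:-default}"

# Map a container exit code to the signal convention (128 + signal number).
classify_exit() {
  case "$1" in
    # Process returned normally.
    0)   echo "clean" ;;
    # 128+9, SIGKILL: killed without a chance to clean up, e.g. after the
    # termination grace period expired or because of an OOM kill.
    137) echo "sigkill" ;;
    # 128+15, SIGTERM: a JVM normally exits with 143 after running its
    # shutdown hooks, so this alone neither proves nor rules out a clean shutdown.
    143) echo "sigterm" ;;
    *)   echo "other" ;;
  esac
}

# Print how long Kubernetes waits after SIGTERM before sending SIGKILL.
show_grace_period() {
  kubectl get pod "$POD" -n "$NAMESPACE" \
    -o jsonpath='{.spec.terminationGracePeriodSeconds}'
}

# Stream broker logs to a file; start this in the background before
# running `helm upgrade` so the shutdown sequence is captured.
capture_broker_logs() {
  kubectl logs -f "$POD" -n "$NAMESPACE" --timestamps > "${1:-broker-shutdown.log}"
}
```

Usage would be along the lines of running show_grace_period, then capture_broker_logs in the background, then the helm upgrade command from the reproduction steps. If the captured log ends abruptly and the shutdown takes as long as the grace period, that points to a SIGKILL rather than a graceful stop.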
Additional context
I reproduced this issue in minikube and GKE. Both were running Kubernetes 1.23.