Helm upgrade errors imply broker is not shut down gracefully

**Describe the bug**
I observed https://github.com/apache/pulsar/issues/18236 when doing a helm upgrade while testing out 3.0.0-candidate-1 for #326.

**To Reproduce**
My steps are listed in https://github.com/apache/pulsar/issues/18236. The only difference is that I did not restart the broker forcefully. Instead, I started the cluster with `helm install test -f pulsar-chart-3.0.0/examples/values-minikube.yaml --version 2.9.4 apache-pulsar-dist-dev/pulsar` and then upgraded it with `helm upgrade test -f pulsar-chart-3.0.0/examples/values-minikube.yaml --version 3.0.0 apache-pulsar-dist-dev/pulsar`, which triggered a broker shutdown.

Here are the values:

```yaml
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

## deployed with emptyDir
volumes:
  persistence: true

# disabled AntiAffinity
affinity:
  anti_affinity: false

# disable auto recovery
components:
  autorecovery: false
  pulsar_manager: true

zookeeper:
  replicaCount: 1
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "Always"

bookkeeper:
  replicaCount: 1
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "Always"

broker:
  replicaCount: 1
  configData:
    ## Enable `autoSkipNonRecoverableData` since bookkeeper is running
    ## without persistence
    autoSkipNonRecoverableData: "true"
    # storage settings
    managedLedgerDefaultEnsembleSize: "1"
    managedLedgerDefaultWriteQuorum: "1"
    managedLedgerDefaultAckQuorum: "1"
    PULSAR_EXTRA_OPTS: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"

## disable monitoring stack
kube-prometheus-stack:
  enabled: false
  prometheusOperator:
    enabled: false
  grafana:
    enabled: false
  alertmanager:
    enabled: false
  prometheus:
    enabled: false

proxy:
  replicaCount: 1
```

**Expected behavior**
The helm upgrade should shut down the broker gracefully. I suspect it did not, because of the code path that was executed afterwards: the broker had to load the cursor data from BookKeeper instead of ZooKeeper, as described in https://github.com/apache/pulsar/pull/18237.

The main goal of this issue is to verify whether the broker shuts down gracefully or is terminated forcefully during a helm upgrade. It may be that the problem lies in the clean shutdown logic of Apache Pulsar 2.9.3 rather than in the chart.
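
One way to check this is to stream the broker's logs while the upgrade runs and compare the last log lines against the pod's termination grace period. The following is a minimal sketch, not a verified procedure. The pod name `test-pulsar-broker-0`, the `default` namespace, and the log file name are assumptions derived from the release name `test`; adjust them to match `kubectl get pods`.

```shell
#!/usr/bin/env bash
# Sketch only: the names below are assumptions, not taken from the chart docs.
NAMESPACE="${NAMESPACE:-default}"                  # assumed namespace
BROKER_POD="${BROKER_POD:-test-pulsar-broker-0}"   # assumed broker pod name
LOG_FILE="${LOG_FILE:-broker-shutdown.log}"        # hypothetical output file

# Print how long Kubernetes waits after SIGTERM before sending SIGKILL.
show_grace_period() {
  kubectl get pod "$BROKER_POD" -n "$NAMESPACE" \
    -o jsonpath='{.spec.terminationGracePeriodSeconds}{"\n"}'
}

# Follow the broker logs in the background, run the upgrade from the
# reproduction steps, then show the last lines the old broker wrote.
capture_logs_during_upgrade() {
  kubectl logs -f "$BROKER_POD" -n "$NAMESPACE" --timestamps > "$LOG_FILE" &
  local logs_pid=$!
  helm upgrade test --namespace "$NAMESPACE" \
    -f pulsar-chart-3.0.0/examples/values-minikube.yaml \
    --version 3.0.0 apache-pulsar-dist-dev/pulsar
  # The log stream ends when the old broker container terminates.
  wait "$logs_pid"
  tail -n 50 "$LOG_FILE"
}
```

Calling `show_grace_period` followed by `capture_logs_during_upgrade` should leave the old broker's final log lines, with timestamps, in `broker-shutdown.log`. If the log stops abruptly at roughly the grace period after shutdown begins, that would suggest the container was killed rather than exiting on its own.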

**Additional context**
I reproduced this issue in minikube and GKE. Both were running Kubernetes 1.23.
