Skip to content

Fix pre-existing lint issues (partial)#60

Merged
renecannao merged 12 commits intomasterfrom
issue55-lint-cleanup
Mar 24, 2026
Merged

Fix pre-existing lint issues (partial)#60
renecannao merged 12 commits intomasterfrom
issue55-lint-cleanup

Conversation

@renecannao
Copy link
Copy Markdown

@renecannao renecannao commented Mar 24, 2026

Summary

Addresses #55 by fixing the majority of pre-existing golangci-lint issues across the codebase.

Changes by category:

  • unused (12 fixed): Removed dead functions, variables, constants, and struct fields
  • ineffassign (10 fixed): Removed ineffectual variable assignments
  • errcheck (~90 fixed): Added explicit error handling for unchecked return values from log calls, Close(), metrics.Register(), JSON operations, and goroutine-launched functions
  • govet (2 fixed): Fixed copylocks warnings by avoiding value copies of types containing sync primitives
  • staticcheck (~50 fixed):
    • Replaced deprecated io/ioutil with os/io
    • Fixed error string capitalization and punctuation (ST1005)
    • Removed redundant type declarations (ST1023)
    • Simplified bool comparisons and return statements (S1002, S1008)
    • Used strings.Contains, strings.ReplaceAll, bytes.Equal (S1003, QF1004, S1004)
    • Fixed function return order to put error last (ST1008)
    • Added explicit types to all iota constants (SA9004)
    • Renamed error var to ErrFoo convention (ST1012)
    • Removed deprecated rand.Seed and PreferServerCipherSuites
    • Replaced fmt.Println(fmt.Sprintf(...)) with fmt.Printf (S1038)
    • Used raw strings for regexp patterns (S1007)

Remaining issues (39)

A tail of ~39 issues remains, primarily:

  • ST1006 receiver naming ("this" used pervasively in legacy code)
  • Errcheck calls in deeply nested goroutine patterns
  • A few ineffassign in complex control flow

These are tracked for a follow-up to avoid an overly large PR.

Test plan

  • go build -o /dev/null ./go/cmd/orchestrator succeeds
  • go test ./go/... -count=1 all packages pass
  • golangci-lint run ./go/... reduced from 124 to 39 issues

Summary by CodeRabbit

  • Chores

    • Broad internal maintenance: modernized APIs, updated resource cleanup, and standardized string helpers.
    • Unified logging/error-call handling and suppressed non-actionable errors to reduce noisy logs.
    • Strengthened type declarations and removed unused helpers for clearer code.
    • Tests and tooling updated to use current filesystem and IO APIs.
  • Refactor

    • Internal function signatures and return-value orderings adjusted where needed; no user-facing behavior changes.

Remove 12 unused functions, variables, constants, and struct fields
identified by the golangci-lint unused checker.
Fix 10 cases of variables assigned but never subsequently read,
by using blank identifiers or removing unnecessary initialization.
Add explicit error handling or blank identifier assignments for 50
unchecked return values from log calls, Close(), metrics.Register(),
and other functions.
Fix 2 copylocks warnings by passing TopologyRecovery as pointer in
log call and changing ElectedNode to return *NodeHealth instead of
NodeHealth (which contains sync.Once).
- Fix error string capitalization and punctuation (ST1005)
- Omit redundant type declarations (ST1023)
- Remove nil pointer dereference risk (SA5011)
- Rename test var with unit-specific suffix (ST1011)
- Remove empty branches (SA9003)
- Use strings.ReplaceAll and strings.Contains (QF1004, S1003)
- Remove unused value assignments (SA4006)
- Simplify bool comparisons (S1002)
- Return error as last argument per convention (ST1008)
- Add explicit types to all iota constants (SA9004)
- Replace deprecated ioutil with os/io (SA1019)
- Remove deprecated PreferServerCipherSuites (SA1019)
- Use bytes.Equal instead of bytes.Compare (S1004)
- Rename error var to ErrFoo convention (ST1012)
- Remove deprecated rand.Seed call (SA1019)
- Fix receiver names from "this" to short form (ST1006)
- Remove unnecessary map access guard (S1036)
- Add nolint for x509 deprecated functions with no replacement
Apply golangci-lint auto-fixes for remaining staticcheck findings
and replace deprecated io/ioutil usage in remaining files.
Handle unchecked error returns from log calls, Close(), metrics,
JSON operations, and goroutine-launched functions by adding blank
identifier assignments or wrapping goroutine calls.
Continue fixing unchecked error returns including goroutine-launched
functions, multi-return function assignments, and remaining log/Close
calls across http, raft, logic, and inst packages.
Continue the errcheck cleanup with another pass fixing goroutine
wrapping, multi-return assignments, and remaining unchecked calls.
Fix staticcheck (error strings, fmt.Printf, SA4006 unused values,
S1008 simplification, S1007 raw strings, ST1006 receiver names),
errcheck (log calls, metrics, audit operations, Close calls),
and ineffassign issues. Replace deprecated ioutil in remaining files.
Address remaining errcheck, staticcheck, and ineffassign findings
from the latest linter pass.
Copilot AI review requested due to automatic review settings March 24, 2026 12:47
@gemini-code-assist
Copy link
Copy Markdown

Warning

Gemini is experiencing higher than usual traffic and was unable to create the summary. Please try again in a few hours by commenting /gemini summary.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a58e8270-2e55-412b-8a91-639694542cca

📥 Commits

Reviewing files that changed from the base of the PR and between 3960b59 and 7a32584.

📒 Files selected for processing (5)
  • go/http/api.go
  • go/inst/pool_dao.go
  • go/os/process.go
  • go/raft/file_snapshot.go
  • go/ssl/ssl.go

📝 Walkthrough

Walkthrough

This PR applies widespread code hygiene changes: modernizes deprecated ioutil/os usages, replaces implicit ignored return-values with explicit blank-identifier assignments, simplifies string/regex handling, removes unused helpers/imports, and updates several constant typings and a set of exported function return-order signatures in instance topology.

Changes

Cohort / File(s) Summary
Error & Return-Value Handling
go/agent/agent_dao.go, go/app/cli.go, go/http/api.go, go/http/health.go, go/http/render.go, go/inst/instance.go, go/inst/instance_topology.go, go/logic/orchestrator.go, go/raft/raft.go, go/inst/instance_dao.go, go/inst/analysis_dao.go, go/inst/audit_dao.go
Systematically convert previously unchecked returns (logs, audits, metrics, ops) to explicit discards (_ = ... or _, _ = ...) and wrap async calls in closures to ignore returned errors.
Deprecated ioutil → io/os Modernization
go/agent/agent_dao.go, go/kv/consul_test.go, go/os/process.go, go/raft/file_snapshot.go, go/raft/http_client.go, go/ssl/ssl.go, go/ssl/ssl_test.go
Replace ioutil APIs with io/os counterparts (ReadAll, ReadFile, CreateTemp, WriteFile), and adjust related imports and error handling.
Return-Signature / API Surface Changes (topology)
go/inst/instance_topology.go
Reordered return values for several exported functions: swap positions of error and []error in multiple topology functions; updated all internal call sites to match new ordering.
String/Regex & Small API Simplifications
go/db/tls.go, go/inst/instance_binlog.go, go/inst/instance_binlog_dao.go, go/inst/instance_dao_test.go, go/inst/tag.go, go/logic/topology_recovery.go, go/metrics/graphite.go, go/inst/oracle_gtid_set_entry.go
Replace strings.Replace(..., -1) with strings.ReplaceAll, strings.Index(...) >= 0 with strings.Contains, switch some splits to strings.Split, and minor parsing/regex literal adjustments.
Resource Cleanup Hardening
go/app/http.go, go/db/db.go, go/proxysql/client.go, go/proxysql/topology.go, go/raft/file_snapshot.go, go/raft/fsm.go, go/raft/http_client.go
Convert plain defer Close() to guarded closures that explicitly ignore Close()/Flush() errors (defer func(){ _ = ... }()) and check Commit() errors where applicable.
Constant Typing & Small Type Changes
go/inst/analysis.go, go/inst/instance_topology_dao.go, go/inst/replication_thread_state.go, go/logic/topology_recovery.go, go/process/health.go
Add explicit type annotations to previously untyped constants and adjust small type/field changes (e.g., ActiveNode*NodeHealth).
Unused Code/Helper Removal
go/attributes/attributes_dao.go, go/config/config.go, go/db/db.go, go/http/httpbase.go, go/inst/resolve_dao.go, go/kv/consul_txn.go
Remove unused helpers, imports, and dead code blocks (several internal helper functions and unused variables/imports deleted).
Concurrency / KV Store Adjustments
go/kv/consul.go, go/kv/consul_txn.go
Removed a mutex field from consul txn store types and simplified grouping logic; many logging calls converted to explicit discards (may affect observability/concurrency assumptions).
Tests & Minor Test Modernizations
go/collection/collection_test.go, go/kv/consul_test.go, go/ssl/ssl_test.go
Test updates for modern IO APIs and explicit ignoring of returned errors in writers/encoders; minor naming/variable tweaks.
Receiver Renames & Minor Refactors
go/kv/zk.go, go/logic/snapshot_data.go, go/util/token.go, go/inst/instance_key.go
Rename method receivers (this → conventional names), use raw-string regex literals, and simplify trivial formatting wrappers.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 I hopped through code with linting cheer,

Blank IDs placed where returns were queer,
Deprecated crumbs swept into os/io,
Strings trimmed clean so parsers glow,
A rabbit’s nibble left the tree more clear.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix pre-existing lint issues (partial)' accurately summarizes the main objective of this changeset, which is to resolve golangci-lint issues across the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 80.25% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue55-lint-cleanup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR incrementally reduces the repository’s pre-existing golangci-lint findings by removing dead code, modernizing deprecated APIs (notably io/ioutil), and making previously-ignored return values explicit (logging, closes, JSON encodes, goroutine calls, etc.).

Changes:

  • Replaced deprecated io/ioutil usage with os/io equivalents across multiple packages.
  • Made many previously-ignored return values explicit (e.g., log calls, Close(), JSON encodes, goroutine-invoked functions) to satisfy linting.
  • Performed assorted staticcheck-style cleanups (string ops simplifications, error message normalization, signature return-order adjustments, removal of unused code).

Reviewed changes

Copilot reviewed 72 out of 72 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
go/util/token.go Receiver naming cleanup (this -> t).
go/ssl/ssl_test.go Temp file creation updated to os.CreateTemp; unlink handling made explicit.
go/ssl/ssl.go Replaced ioutil reads with os.ReadFile; updated error strings; added staticcheck suppressions for deprecated x509 PEM APIs.
go/raft/rel_store.go Made transaction rollback errors explicitly ignored.
go/raft/raft.go Renamed exported error var to Err...; made multiple return values explicit (including goroutines).
go/raft/http_client.go Replaced ioutil.ReadAll with io.ReadAll; added NewRequest error handling; explicit close handling.
go/raft/fsm.go Explicitly ignored logging and close return values.
go/raft/file_snapshot.go Replaced ioutil.ReadDir with os.ReadDir; CRC compare simplification; explicit closes/flushes; signature tweak for unused param.
go/proxysql/topology.go Explicitly ignored close errors.
go/proxysql/init.go Explicitly ignored log return value.
go/proxysql/hook.go Explicitly ignored log return values.
go/proxysql/client.go Explicitly ignored DB close errors.
go/process/host.go Explicitly ignored fatal log return.
go/process/health_dao.go Explicitly ignored ExecOrchestrator result; explicit logging return ignored.
go/process/health.go Avoid copylocks by changing ActiveNode to pointer; explicit types for constants; explicit error ignoring.
go/process/election_dao.go Changed ElectedNode to return *NodeHealth (avoids copying sync primitives).
go/os/unixcheck.go Explicitly ignored log return values.
go/os/process.go Replaced ioutil.TempFile/WriteFile with os.CreateTemp/os.WriteFile; explicit log return ignored.
go/metrics/graphite.go Simplified string replacements using strings.ReplaceAll.
go/logic/topology_recovery_dao.go Explicitly ignored return values from parsing helpers.
go/logic/topology_recovery.go Wide set of staticcheck/lint cleanups: ReplaceAll, error string normalization, return-order adjustments, explicit ignored errors.
go/logic/snapshot_data.go Receiver naming cleanup; explicit ignored returns.
go/logic/orchestrator.go Explicitly ignored metric registration errors; many goroutine callsites wrapped to consume returns.
go/logic/command_applier.go Explicitly ignored return values from side-effect functions.
go/kv/zk.go Removed deprecated rand.Seed; receiver naming cleanup.
go/kv/consul_txn.go Simplified map grouping; removed unused mutex field; explicit ignored log returns.
go/kv/consul_test.go Replaced ioutil.ReadAll with io.ReadAll; explicit ignored JSON encode/Fprintf returns.
go/kv/consul.go Removed unused mutex field; explicit ignored log returns.
go/inst/tag_dao.go Simplified string var initialization.
go/inst/tag.go ReplaceAll and minor string formatting simplification.
go/inst/resolve_dao.go Explicitly ignored metric register returns; removed unused helper; simplified var init.
go/inst/resolve.go Explicitly ignored log return.
go/inst/replication_thread_state.go Added explicit type for constants.
go/inst/pool_dao.go Explicitly ignored rollback/commit returns.
go/inst/oracle_gtid_set_entry.go Corrected SplitN(...,-1) to Split.
go/inst/instance_topology_dao.go Explicit type on constants; error string normalization; explicit ignored log/close returns.
go/inst/instance_topology.go Multiple API signature return-order adjustments (error last) and corresponding callsite updates; simplifications and explicit ignored log returns.
go/inst/instance_test.go Removed unused test variable.
go/inst/instance_key_map_test.go Explicitly ignored ReadJson return.
go/inst/instance_key.go Raw string literal for regexp pattern.
go/inst/instance_dao_test.go ReplaceAll simplification.
go/inst/instance_dao.go Explicitly ignored metric/register/append/log/scan returns; string helper simplifications; switch simplification.
go/inst/instance_binlog_dao.go Removed unused constant; Contains simplifications; minor var simplifications.
go/inst/instance_binlog.go Contains simplification for skip checks.
go/inst/instance.go Simplified boolean return; explicit ignored log return; string join simplification.
go/inst/downtime_dao.go Explicitly ignored AuditOperation return.
go/inst/downtime.go Simplified duration calc using time.Until.
go/inst/dead_instance_filter.go Explicitly ignored metric register return.
go/inst/cluster_alias_dao.go Explicitly ignored AuditOperation return.
go/inst/candidate_database_instance_dao.go Removed unused fmt usage for raw query string; explicit ignored AuditOperation return.
go/inst/audit_dao.go Explicitly ignored metric register and syslog/info returns; refactored goroutine write path to satisfy errcheck.
go/inst/analysis_dao.go Explicitly ignored metric register return; removed unused consensus helper; explicit goroutine wrapper for errcheck.
go/inst/analysis.go Added explicit AnalysisCode typing to most constants.
go/http/web.go Explicitly ignored fmt.Fprintf returns.
go/http/render.go Explicitly ignored log returns on encode/template errors.
go/http/raft_reverse_proxy.go Explicitly ignored log returns.
go/http/httpbase.go Removed unused helpers related to cluster hint parsing.
go/http/health.go Explicitly ignored JSON encode returns.
go/http/api.go Many explicit ignored returns and error-message normalization; updated multiple callsites for changed inst API signatures.
go/http/agents_api.go Explicitly ignored w.Write returns.
go/discovery/queue.go Explicitly ignored warning log return.
go/db/tls.go Explicitly ignored metric register/log return; ReplaceAll simplification.
go/db/db.go Removed unused helper; explicit ignored init/migration calls; multiple explicit ignored log returns.
go/config/config.go Removed unused deprecated var list and dead code; explicit ignored fatal/log returns; lowercased some error strings.
go/collection/collection_test.go Renamed local var; explicit ignored Append return.
go/collection/collection.go Explicitly ignored expiration return; removed redundant nil-check after lock.
go/cmd/orchestrator/main.go Explicitly ignored fatal/syslog/audit syslog returns.
go/attributes/attributes_dao.go Removed unused HTTP helper; explicit ignored log return.
go/app/http.go Adjusted ContinuousRegistration call to pass string; explicit ignored warning; explicit unix listener close.
go/app/cli.go Removed unused helper; many explicit ignored fatal/log returns; updated callsites for inst signature changes; output simplifications.
go/agent/instance_topology_agent.go Explicitly ignored JSON unmarshal and AuditOperation returns.
go/agent/agent_dao.go Replaced ioutil.ReadAll with io.ReadAll; explicit close; explicit goroutine wrapper; explicit ignored log/unmarshal returns.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread go/ssl/ssl_test.go
Comment on lines 186 to 192
func writeFakeFile(content string) string {
f, err := ioutil.TempFile("", "ssl_test")
f, err := os.CreateTemp("", "ssl_test")
if err != nil {
return ""
}
ioutil.WriteFile(f.Name(), []byte(content), 0644)
_ = os.WriteFile(f.Name(), []byte(content), 0644)
return f.Name()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeFakeFile creates a temp file but never closes the returned *os.File, and it ignores the os.WriteFile error. Close the file handle (or write via the opened file and close), and propagate/handle the write error so tests fail reliably when file creation/writes fail.

Copilot uses AI. Check for mistakes.
Comment thread go/http/api.go Outdated
}
if act != "FQDN" && act != "hostname" && act != "custom" {
log.Warning("PrependMessagesWithOrcIdentity option has unsupported value '%+v'")
_ = log.Warning("PrependMessagesWithOrcIdentity option has unsupported value '%+v'")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log call uses log.Warning (non-formatting) with a printf-style placeholder and no argument, so the message will literally contain "%+v" and won't include the unsupported value. Use log.Warningf and pass act (or change the message to not use formatting).

Suggested change
_ = log.Warning("PrependMessagesWithOrcIdentity option has unsupported value '%+v'")
_ = log.Warningf("PrependMessagesWithOrcIdentity option has unsupported value '%+v'", act)

Copilot uses AI. Check for mistakes.
Comment thread go/http/api.go Outdated
extraConfigFile := r.URL.Query().Get("config")
config.Reload(extraConfigFile)
inst.AuditOperation("reload-configuration", nil, "Triggered via API")
_ = inst.AuditOperation("reload-configuration", nil, "Triggered via API")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not gofmt-formatted/indented (starts at column 1 inside the function). Please run gofmt on the file to match repo formatting conventions and avoid noisy diffs.

Suggested change
_ = inst.AuditOperation("reload-configuration", nil, "Triggered via API")
_ = inst.AuditOperation("reload-configuration", nil, "Triggered via API")

Copilot uses AI. Check for mistakes.
Comment thread go/inst/pool_dao.go
Comment on lines 35 to 49
tx, err := dbh.Begin()
if _, err := tx.Exec(`delete from database_instance_pool where pool = ?`, pool); err != nil {
tx.Rollback()
_ = tx.Rollback()
return log.Errore(err)
}
query := `insert into database_instance_pool (hostname, port, pool, registered_at) values (?, ?, ?, now())`
for _, instanceKey := range instanceKeys {
if _, err := tx.Exec(query, instanceKey.Hostname, instanceKey.Port, pool); err != nil {
tx.Rollback()
_ = tx.Rollback()
return log.Errore(err)
}
}
tx.Commit()
_ = tx.Commit()

return nil
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writePoolInstances ignores the error from tx.Commit(). A commit failure should be returned so callers don't get a false success (and to avoid leaving the transaction open). Capture and return the commit error (and consider a deferred rollback for early returns).

Copilot uses AI. Check for mistakes.
Comment thread go/raft/file_snapshot.go
Comment on lines 3 to 16
import (
"bufio"
"bytes"
"encoding/json"
"fmt"
"github.com/proxysql/golib/log"
"hash"
"hash/crc64"
"io"
"io/ioutil"


"os"
"path/filepath"
"sort"
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import block here isn't gofmt'd (extra blank lines and mixed stdlib/third-party imports). Please run gofmt on this file so imports are grouped/ordered consistently.

Copilot uses AI. Check for mistakes.
Comment thread go/inst/analysis.go
FirstTierReplicaFailingToConnectToMaster AnalysisCode = "FirstTierReplicaFailingToConnectToMaster"
BinlogServerFailingToConnectToMaster AnalysisCode = "BinlogServerFailingToConnectToMaster"
// Group replication problems
DeadReplicationGroupMemberWithReplicas = "DeadReplicationGroupMemberWithReplicas"
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeadReplicationGroupMemberWithReplicas is the only constant in this block that isn't explicitly typed as AnalysisCode. For consistency (and to avoid subtle type inference differences), declare it as AnalysisCode = ... like the others.

Suggested change
DeadReplicationGroupMemberWithReplicas = "DeadReplicationGroupMemberWithReplicas"
DeadReplicationGroupMemberWithReplicas AnalysisCode = "DeadReplicationGroupMemberWithReplicas"

Copilot uses AI. Check for mistakes.
Comment thread go/ssl/ssl.go
Comment on lines 148 to 156
pemBlock, rest := pem.Decode(pemData)
if len(rest) > 0 {
log.Warning("Didn't parse all of", pemFile)
_ = log.Warning("Didn't parse all of", pemFile)
}

if x509.IsEncryptedPEMBlock(pemBlock) {
if x509.IsEncryptedPEMBlock(pemBlock) { //nolint:staticcheck // SA1019 no replacement available
// Decrypt and get the ASN.1 DER bytes here
pemData, err = x509.DecryptPEMBlock(pemBlock, pemPass)
pemData, err = x509.DecryptPEMBlock(pemBlock, pemPass) //nolint:staticcheck // SA1019 no replacement available
if err != nil {
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadPEMData calls pem.Decode and then passes pemBlock directly into x509.IsEncryptedPEMBlock / DecryptPEMBlock. If pem.Decode fails, pemBlock will be nil and this will panic. Handle the nil pemBlock case and return a descriptive error (or treat as unencrypted) before calling x509 helpers.

Copilot uses AI. Check for mistakes.
Comment thread go/ssl/ssl.go
Comment on lines 189 to +193
pemBlock, _ := pem.Decode(pemData)
if len(pemBlock.Bytes) == 0 {
return false
}
return x509.IsEncryptedPEMBlock(pemBlock)
return x509.IsEncryptedPEMBlock(pemBlock) //nolint:staticcheck // SA1019 no replacement available
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsEncryptedPEM assumes pem.Decode returns a non-nil block and dereferences pemBlock.Bytes. If pem.Decode fails, pemBlock is nil and this will panic. Add a nil check on pemBlock before accessing pemBlock.Bytes / calling x509.IsEncryptedPEMBlock.

Copilot uses AI. Check for mistakes.
Comment thread go/os/process.go
Comment on lines 75 to 82
commandBytes := []byte(commandText)
tmpFile, err := ioutil.TempFile("", "orchestrator-process-cmd-")
tmpFile, err := os.CreateTemp("", "orchestrator-process-cmd-")
if err != nil {
return nil, "", log.Errorf("generateShellScript() failed to create TempFile: %v", err.Error())
}
// write commandText to temporary file
ioutil.WriteFile(tmpFile.Name(), commandBytes, 0640)
_ = os.WriteFile(tmpFile.Name(), commandBytes, 0640)
shellArguments := append([]string{}, tmpFile.Name())
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generateShellScript leaks the temp file descriptor returned by os.CreateTemp (tmpFile is never closed) and ignores the error from os.WriteFile. Close tmpFile and handle the write error (ideally write to tmpFile directly, Sync/Close, and return an error on failure).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (21)
go/inst/tag_dao.go (1)

1-1: ⚠️ Potential issue | 🔴 Critical

Critical: Fix formatting issues before merging.

The pipeline reports that gofmt -s -w go/ produced diffs, indicating the code is not properly formatted. Please run gofmt -s -w go/ locally and commit the formatting changes to pass CI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/tag_dao.go` at line 1, The repository has go formatting issues; run
gofmt with simplification across the Go sources and commit the results: execute
`gofmt -s -w go/` locally (or your editor's equivalent), verify the changes (git
add and commit), and push so CI no longer reports diffs for the package files
such as those under go/ (e.g., tag_dao.go).
go/inst/tag.go (1)

1-1: ⚠️ Potential issue | 🔴 Critical

Critical: Fix formatting issues before merging.

The pipeline reports that gofmt -s -w go/ produced diffs, indicating the code is not properly formatted. Please run gofmt -s -w go/ locally and commit the formatting changes to pass CI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/tag.go` at line 1, The code in go/inst/tag.go (and other files under
the go/ directory) is not gofmt-formatted; run the formatter and commit the
results by executing `gofmt -s -w go/` locally, verify the changes include
go/inst/tag.go, stage and commit those formatting changes so CI no longer
reports diffs.
go/inst/dead_instance_filter.go (1)

1-141: ⚠️ Potential issue | 🟡 Minor

CI formatting check is failing; run gofmt before merge.

The pipeline reports gofmt -s -w go/ produced diffs. Please run gofmt and commit the resulting formatting changes to unblock CI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/dead_instance_filter.go` around lines 1 - 141, Run gofmt -s -w on the
Go sources (e.g. run `gofmt -s -w .` or `gofmt -s -w go/`) and commit the
resulting changes to fix CI; ensure the formatting changes apply to this file
(dead_instance_filter.go) which contains deadInstancesFilter, RegisterInstance,
UnregisterInstance, and InstanceRecheckNeeded so the file compiles and matches
gofmt's style before pushing.
go/raft/rel_store.go (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Pipeline failure: gofmt formatting check failed.

The CI pipeline reports that gofmt -s -w go/ produced diffs and exited with code 1. Please run gofmt -s -w go/raft/rel_store.go locally and commit the formatting fixes before merging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/raft/rel_store.go` at line 1, Run gofmt with the simplify flag on the
failing Go source to fix formatting: execute `gofmt -s -w` for the offending
file (rel_store.go) or the package, stage and commit the changes so CI will
pass; ensure there are no remaining gofmt diffs before pushing.
go/cmd/orchestrator/main.go (1)

1-174: ⚠️ Potential issue | 🟡 Minor

Pipeline failure: gofmt formatting check failed.

The CI indicates that gofmt -s -w go/ produced diffs. Please run the formatter locally to fix any formatting issues:

gofmt -s -w go/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/cmd/orchestrator/main.go` around lines 1 - 174, gofmt check failed: the
file containing package main and function main (go/cmd/orchestrator/main.go) is
not gofmt'ed; run the formatter and commit the changes (e.g., run `gofmt -s -w
go/` locally) to rewrite imports/spacing/formatting in main.go (including the
import block and the main function/flag declarations) so the file conforms to
gofmt rules, then add and push the formatted file.
go/inst/instance.go (1)

1-662: ⚠️ Potential issue | 🔴 Critical

Critical: Fix formatting to pass CI checks.

The pipeline reports that gofmt -s -w go/ produced diffs, indicating the code does not conform to Go's standard formatting. Run gofmt -s -w go/inst/instance.go locally to fix the formatting before merging.

#!/bin/bash
# Run this command to fix formatting issues:
gofmt -s -w go/inst/instance.go
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/instance.go` around lines 1 - 662, The file containing the Instance
type and associated methods (e.g. Instance struct, NewInstance, MarshalJSON,
CanReplicateFrom, HumanReadableDescription) is not gofmt-formatted; run gofmt -s
-w on that file to apply standard Go formatting (and commit the changes) so CI
passes.
go/attributes/attributes_dao.go (2)

110-119: ⚠️ Potential issue | 🔴 Critical

Fix empty-result guard to prevent panic.

Line 116 checks attributeName length, but Line 119 dereferences attributes[0]. If no rows are returned, this panics. Guard len(attributes) instead.

💡 Proposed fix
 func GetHostAttribute(hostname string, attributeName string) (string, error) {
 	whereClause := `where hostname=? and attribute_name=?`
 	attributes, err := getHostAttributesByClause(whereClause, sqlutils.Args(hostname, attributeName))
 	if err != nil {
 		return "", err
 	}
-	if len(attributeName) == 0 {
+	if len(attributes) == 0 {
 		return "", log.Errorf("No attribute found for %+v, %+v", hostname, attributeName)
 	}
 	return attributes[0].AttributeValue, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/attributes/attributes_dao.go` around lines 110 - 119, In GetHostAttribute,
the empty-result guard is checking len(attributeName) instead of the returned
rows, which allows a panic when attributes is empty; change the guard to check
len(attributes)==0, return an error (use log.Errorf or fmt.Errorf consistently)
with a message that includes hostname and attributeName, and avoid dereferencing
attributes[0] unless the slice has at least one element (i.e., move the
attributes[0].AttributeValue return into the else path after the length check).

1-143: ⚠️ Potential issue | 🟠 Major

CI is currently blocked by formatting.

Pipeline reports gofmt -s -w go/ diffs. Please run gofmt and commit the resulting changes before merge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/attributes/attributes_dao.go` around lines 1 - 143, The file fails gofmt
checks; run gofmt -s -w on the Go sources (e.g., run `gofmt -s -w go/`) and
commit the formatted changes so CI passes; make sure to reformat this file which
contains functions like SetHostAttributes, getHostAttributesByClause,
GetHostAttribute, SetGeneralAttribute and GetHostAttributesByAttribute and
include only the formatting edits in the commit.
go/inst/oracle_gtid_set_entry.go (1)

81-85: ⚠️ Potential issue | 🟡 Minor

Rerun gofmt -s on this file before merge.

strings.Split(...) is a fine simplification here, but CI is still failing the formatting check. Please format go/inst/oracle_gtid_set_entry.go so this lint-cleanup PR goes green.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/oracle_gtid_set_entry.go` around lines 81 - 85, Run gofmt -s to fix
formatting issues around the block using strings.Split(gtid_str[1], ":") and the
following variable declarations (default_iv, tag_ivs, tip); reformat that
section so it passes gofmt (simplify/align var declarations and spacing) and
re-run CI to ensure the lint/format check succeeds.
go/config/config.go (1)

1-721: ⚠️ Potential issue | 🟡 Minor

Formatting gate is still failing in CI.

gofmt -s -w go/ produced diffs in the pipeline. Please run gofmt and include the formatting-only changes before merge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/config/config.go` around lines 1 - 721, CI is failing due to gofmt
formatting changes; run gofmt -s -w on the repository (at least the go/
directory or this file) and commit the formatting-only changes so the file
containing package config and functions like newConfiguration,
postReadAdjustments, read, Read, ForceRead, Reload, MarkConfigurationLoaded, and
WaitForConfigurationToBeLoaded is properly formatted; ensure you only include
the gofmt-induced edits, then push the commit to satisfy the formatting gate.
go/ssl/ssl_test.go (1)

3-12: ⚠️ Potential issue | 🟡 Minor

Fix import ordering to satisfy gofmt.

In the import block, net/http must be sorted before os alphabetically. Go's gofmt tool enforces alphabetical ordering of imports regardless of aliases.

Suggested fix
 import (
 	"crypto/tls"
 	"crypto/x509"
 	"encoding/pem"
 	"fmt"
-	"os"
 	nethttp "net/http"
+	"os"
 	"reflect"
 	"strings"
 	"syscall"
 	"testing"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/ssl/ssl_test.go` around lines 3 - 12, The import block in ssl_test.go is
not alphabetically ordered for gofmt (net/http aliased as nethttp must come
before os); reorder the imports so that nethttp ("net/http") appears before "os"
while preserving the alias and other imports (look for the import list
containing nethttp, "os", "reflect", etc.) to satisfy gofmt.
go/ssl/ssl.go (1)

3-11: ⚠️ Potential issue | 🟡 Minor

Fix import ordering to pass gofmt checks.

net/http should be sorted before os according to lexicographic ordering; the current order is incorrect.

Suggested diff
 import (
 	"crypto/tls"
 	"crypto/x509"
 	"encoding/pem"
 	"errors"
 	"fmt"
-	"os"
 	nethttp "net/http"
+	"os"
 	"strings"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/ssl/ssl.go` around lines 3 - 11, Imports in the ssl.go import block are
not lexicographically ordered; move the nethttp "net/http" entry so it appears
before "os" in the import list (i.e., reorder the import block in go/ssl/ssl.go
so nethttp is listed prior to os) and then run gofmt/goimports to ensure
formatting passes; no other code changes required.
go/process/health_dao.go (1)

40-48: ⚠️ Potential issue | 🟡 Minor

History insert failures are still silently dropped.

Line 40 ignores the ExecOrchestrator error with _, _, but inside onceHistory.Do this means a transient DB error becomes non-retryable for the process lifetime. The UPDATE operation below (line 51) and error handling patterns elsewhere in the file (line 125) show that DB errors should be logged.

Suggested fix
 	nodeHealth.onceHistory.Do(func() {
-		_, _ = db.ExecOrchestrator(`
+		if _, execErr := db.ExecOrchestrator(`
 			insert ignore into node_health_history
 				(hostname, token, first_seen_active, extra_info, command, app_version)
 			values
 				(?, ?, NOW(), ?, ?, ?)
 			`,
 			nodeHealth.Hostname, nodeHealth.Token, nodeHealth.ExtraInfo, nodeHealth.Command,
 			nodeHealth.AppVersion,
-		)
+		); execErr != nil {
+			_ = log.Errorf("WriteRegisterNode: failed inserting node_health_history for %s: %+v", nodeHealth.Hostname, execErr)
+		}
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/process/health_dao.go` around lines 40 - 48, The insert into
node_health_history currently swallows errors (using "_, _ =
db.ExecOrchestrator") inside onceHistory.Do; capture the returned error from
ExecOrchestrator instead and log it using the file's existing DB error pattern
(same style as the later UPDATE and the handling around line 125) so transient
insert failures are recorded (include nodeHealth.Hostname/token/context in the
log). Ensure you keep the onceHistory.Do semantics but do not ignore the
ExecOrchestrator error — log it.
go/raft/file_snapshot.go (1)

3-15: ⚠️ Potential issue | 🟡 Minor

Run gofmt on the import block.

CI is already failing here because of the extra blank line in the imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/raft/file_snapshot.go` around lines 3 - 15, The import block in
file_snapshot.go has an extra blank line and inconsistent formatting causing
gofmt/CI failure; open the import block (the grouped import section at the top
of the file), remove the stray blank line between "io" and "os"/"path/filepath"
and then run "gofmt" (or "go fmt") to reformat the imports so the grouped
imports are properly ordered and formatted.
go/logic/snapshot_data.go (1)

157-178: ⚠️ Potential issue | 🟠 Major

Don't treat a partial snapshot restore as success.

Both error paths are now explicitly ignored. If inst.ForgetInstance or inst.WriteInstance fails, Restore still returns nil and leaves local topology state out of sync with the snapshot.

Suggested fix
 		for _, existingKey := range existingKeys {
 			if !snapshotInstanceKeyMap.HasKey(existingKey) {
-				_ = inst.ForgetInstance(&existingKey)
+				if err := inst.ForgetInstance(&existingKey); err != nil {
+					return log.Errore(err)
+				}
 				discardedKeys++
 			}
 		}
@@
 		for _, minimalInstance := range snapshotData.MinimalInstances {
 			if !existingKeysMap.HasKey(minimalInstance.Key) {
-				if err := inst.WriteInstance(minimalInstance.ToInstance(), false, nil); err == nil {
-					discoveredKeys++
-				} else {
-					_ = log.Errore(err)
-				}
+				if err := inst.WriteInstance(minimalInstance.ToInstance(), false, nil); err != nil {
+					return log.Errore(err)
+				}
+				discoveredKeys++
 			}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/logic/snapshot_data.go` around lines 157 - 178, The loops in Restore
ignore errors from inst.ForgetInstance and inst.WriteInstance, causing partial
snapshot restores to be treated as success; modify the code in Restore so that
when inst.ForgetInstance returns an error you stop processing and return that
error (wrapped with context, e.g., "forget instance <key> failed"), and
similarly when inst.WriteInstance returns a non-nil error return that error
(wrapped with context, e.g., "write instance <key> failed") instead of logging
and continuing; update the handling around snapshotInstanceKeyMap/existingKeys
and MinimalInstances loops to propagate these errors back to the caller so
Restore fails on any persistence/topology operation error.
go/db/db.go (1)

283-333: ⚠️ Potential issue | 🔴 Critical

Return errors from deployStatements instead of terminating with log.Fatale().

The function declares an error return type but always returns nil, ignoring critical failures in db.Begin(), Scan(), Exec(), and tx.Commit() operations. These errors are handled via log.Fatale(), which terminates the process instead of propagating the error to the caller. Additionally, the transaction is never rolled back on error. The function contract should be honored by properly returning errors and rolling back the transaction on failure.

Suggested fix
 	tx, err := db.Begin()
 	if err != nil {
-		_ = log.Fatale(err)
+		return log.Fatale(err)
 	}
@@
 	originalSqlMode := ""
 	if config.Config.IsMySQL() {
-		_ = tx.QueryRow(`select @@session.sql_mode`).Scan(&originalSqlMode)
+		if err := tx.QueryRow(`select @@session.sql_mode`).Scan(&originalSqlMode); err != nil {
+			_ = tx.Rollback()
+			return log.Fatale(err)
+		}
 		if _, err := tx.Exec(`set @@session.sql_mode=REPLACE(@@session.sql_mode, 'NO_ZERO_DATE', '')`); err != nil {
-			_ = log.Fatale(err)
+			_ = tx.Rollback()
+			return log.Fatale(err)
 		}
 		if _, err := tx.Exec(`set @@session.sql_mode=REPLACE(@@session.sql_mode, 'NO_ZERO_IN_DATE', '')`); err != nil {
-			_ = log.Fatale(err)
+			_ = tx.Rollback()
+			return log.Fatale(err)
 		}
 	}
@@
 	if config.Config.IsMySQL() {
 		if _, err := tx.Exec(`set session sql_mode=?`, originalSqlMode); err != nil {
-			_ = log.Fatale(err)
+			_ = tx.Rollback()
+			return log.Fatale(err)
 		}
 	}
 	if err := tx.Commit(); err != nil {
-		_ = log.Fatale(err)
+		return log.Fatale(err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/db/db.go` around lines 283 - 333, The deployStatements flow currently
calls log.Fatale on db.Begin, tx.QueryRow/Scan, tx.Exec and tx.Commit which
exits the process and never returns errors or rolls back the transaction; change
these to propagate errors to the caller by returning them (wrap with context
where helpful), and ensure the transaction is properly rolled back on any error
(use defer to rollback if tx is non-nil and commit hasn't succeeded). Replace
the log.Fatale calls around db.Begin, originalSqlMode Scan, all tx.Exec calls,
and tx.Commit with returns of the error (or wrapped fmt.Errorf) and only log
non-fatal issues where appropriate (e.g., use log.Errorf when continuing); also
ensure the session sql_mode restore (set session sql_mode=?) runs in a defer or
is executed even on early returns so originalSqlMode is restored before
returning.
go/app/cli.go (1)

1321-1327: ⚠️ Potential issue | 🟠 Major

Propagate PutInstanceTag failures to the CLI.

This command prints a success result even when the tag write fails, so callers get a false green response.

💡 Suggested fix
 			tag, err := inst.ParseTag(*config.RuntimeCLIFlags.Tag)
 			if err != nil {
 				log.Fatale(err)
 			}
-			_ = inst.PutInstanceTag(instanceKey, tag)
+			if err := inst.PutInstanceTag(instanceKey, tag); err != nil {
+				_ = log.Fatale(err)
+			}
 			fmt.Println(instanceKey.DisplayString())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/app/cli.go` around lines 1321 - 1327, The CLI currently ignores errors
from inst.PutInstanceTag and prints instanceKey.DisplayString regardless; update
the flow so that after calling inst.PutInstanceTag(instanceKey, tag) you check
its returned error, and if non-nil propagate it to the CLI (e.g. return an
error/exit non-zero or call log.Fatal with the error) and only print
instanceKey.DisplayString() on success; touch the block around
FigureInstanceKey, ParseTag, PutInstanceTag and ensure PutInstanceTag's error is
handled instead of being discarded.
go/inst/instance_dao.go (2)

1102-1107: ⚠️ Potential issue | 🟠 Major

Don't silently drop gtid_subtract failures.

If this query errors, instance.GtidErrant stays empty and the instance stops reporting errant GTIDs altogether. That hides a topology problem instead of surfacing it.

🐛 Proposed fix
-				_ = db.QueryRow("select gtid_subtract(?, ?)", redactedExecutedGtidSet.String(), redactedMasterExecutedGtidSet.String()).Scan(&instance.GtidErrant)
+				if err := db.QueryRow(
+					"select gtid_subtract(?, ?)",
+					redactedExecutedGtidSet.String(),
+					redactedMasterExecutedGtidSet.String(),
+				).Scan(&instance.GtidErrant); err != nil {
+					logReadTopologyInstanceError(instanceKey, "select gtid_subtract(...)", err)
+				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/instance_dao.go` around lines 1102 - 1107, The code currently ignores
errors from db.QueryRow(...).Scan(&instance.GtidErrant) causing failures in
gtid_subtract to be dropped; change the block that calls db.QueryRow with
redactedExecutedGtidSet.String() and redactedMasterExecutedGtidSet.String() so
you capture the error returned by Scan, and on err != nil log it (including the
two GTID set strings and error) and set instance.GtidErrant to a visible error
marker (e.g. formatted error string) instead of leaving it empty; ensure you
reference the same symbols (redactedExecutedGtidSet, NewOracleGtidSet,
instance.masterExecutedGtidSet, instance.MasterUUID, db.QueryRow,
instance.GtidErrant) so the change is made in the same place.

1425-1435: ⚠️ Potential issue | 🟠 Major

Don't treat JSON decode failures as empty topology.

These blank assignments make malformed slave_hosts or replication_group_members payloads look like “no replicas/members”. That silently corrupts the in-memory topology view.

🐛 Proposed fix
-	_ = instance.Replicas.ReadJson(replicasJSON)
+	if err := instance.Replicas.ReadJson(replicasJSON); err != nil {
+		_ = log.Errorf("readInstanceRow: cannot decode slave_hosts for %+v: %v", instance.Key, err)
+	}
@@
-	_ = instance.ReplicationGroupMembers.ReadJson(m.GetString("replication_group_members"))
+	if err := instance.ReplicationGroupMembers.ReadJson(m.GetString("replication_group_members")); err != nil {
+		_ = log.Errorf("readInstanceRow: cannot decode replication_group_members for %+v: %v", instance.Key, err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/instance_dao.go` around lines 1425 - 1435, The code currently ignores
JSON decoding errors from instance.Replicas.ReadJson(replicasJSON) and
instance.ReplicationGroupMembers.ReadJson(...), which can silently treat
malformed payloads as empty topology; change both calls to capture and handle
their returned error values (from ReadJson) and propagate or log them instead of
discarding — for example, check the error from instance.Replicas.ReadJson and
instance.ReplicationGroupMembers.ReadJson, return or wrap that error from the
surrounding function (or at minimum log it with context including instance.Key
and the field name like "replicas" or "replication_group_members") so malformed
slave_hosts/replication_group_members payloads do not silently overwrite the
in-memory topology.
go/kv/consul_txn.go (1)

82-98: ⚠️ Potential issue | 🟠 Major

Don't keep returning a store from the constructor error branch.

If initialization fails here, store.client stays nil, but later methods in this file call this.client.Txn() / this.client.Catalog() unconditionally. That turns a startup config problem into a later nil-pointer panic.

🛡️ Minimal fix
func (this *consulTxnStore) doWriteTxn(txnOps consulapi.TxnOps, queryOptions *consulapi.QueryOptions) (err error) {
	if this.client == nil {
		return fmt.Errorf("consul client not initialized")
	}
	// existing logic...
}

func (this *consulTxnStore) DistributePairs(kvPairs []*KVPair) (err error) {
	if this.client == nil {
		return fmt.Errorf("consul client not initialized")
	}
	// existing logic...
}

updateDatacenterKVPairs should also only run behind the same guard.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/kv/consul_txn.go` around lines 82 - 98, The constructor can leave
consulTxnStore.client nil which causes nil-pointer panics when methods call
this.client.Txn() / this.client.Catalog(); guard all public methods that use the
client (e.g., consulTxnStore.doWriteTxn, consulTxnStore.DistributePairs, and
updateDatacenterKVPairs) with an early check that returns a clear error like
"consul client not initialized" if this.client == nil, or alternatively ensure
the constructor returns an error and does not return a usable store when
consulapi.NewClient fails; apply the same guard to any other methods that call
this.client to prevent runtime panics.
go/inst/instance_topology.go (1)

2575-2590: ⚠️ Potential issue | 🔴 Critical

Guard promotedBinlogServer before dereferencing it.

getMostUpToDateActiveBinlogServer() can return promotedBinlogServer == nil with err == nil when there are binlog-server replicas but none are currently valid. Line 2590 then dereferences promotedBinlogServer.Key and panics inside a recovery path instead of returning a normal error.

Nil-check before repointing
 	promotedBinlogServer, binlogServerReplicas, err = getMostUpToDateActiveBinlogServer(masterKey)

 	if err != nil {
 		return resultOnError(err)
 	}
+	if promotedBinlogServer == nil {
+		return resultOnError(fmt.Errorf("no active binlog server replica found for %+v", *masterKey))
+	}

 	repointedBinlogServers, _, err = RepointTo(binlogServerReplicas, &promotedBinlogServer.Key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/instance_topology.go` around lines 2575 - 2590, In
RegroupReplicasBinlogServers, guard promotedBinlogServer before dereferencing
promotedBinlogServer.Key: after calling
getMostUpToDateActiveBinlogServer(checking its returned err), add a nil-check
for promotedBinlogServer and return via resultOnError or an explicit error if
it's nil (i.e., when there are replicas but none valid) instead of calling
RepointTo; only call RepointTo(binlogServerReplicas, &promotedBinlogServer.Key)
when promotedBinlogServer != nil so you never dereference a nil
promotedBinlogServer.
🟡 Minor comments (7)
go/os/unixcheck.go-45-58 (1)

45-58: ⚠️ Potential issue | 🟡 Minor

Resolve formatting diffs before merge.

CI reports gofmt -s -w go/ produced diffs. Please run gofmt and commit the formatting changes so the formatting gate passes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/os/unixcheck.go` around lines 45 - 58, The CI is failing because files
under go/ (including go/os/unixcheck.go around the user.LookupGroupId /
currentUser.GroupIds code paths) are not gofmt-formatted; run gofmt -s -w ./go
(or gofmt -s -w on the repository root) to apply the canonical formatting to
files like unixcheck.go (the log.Errorf lines referencing authUser, currentUser,
gids, user.LookupGroupId), verify the changes, and commit the resulting
formatting-only diff so the formatting gate passes.
go/collection/collection_test.go-99-99 (1)

99-99: ⚠️ Potential issue | 🟡 Minor

Assert Append succeeds instead of discarding its error.

Line 99 suppresses failures from Append, which can hide regressions in this test path.

Suggested test fix
-		_ = c.Append(tm)
+		if err := c.Append(tm); err != nil {
+			t.Fatalf("TestAppend: Append returned error: %v", err)
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/collection/collection_test.go` at line 99, The test currently discards the
error from c.Append(tm); change it to assert the call succeeds by capturing its
return value and failing the test on non-nil error (e.g., err := c.Append(tm);
if err != nil { t.Fatalf("Append failed: %v", err) } or use require.NoError(t,
err) if testify/require is in use) so failures from Append are surfaced; update
the line with the assertion and keep the rest of the test unchanged.
go/db/tls.go-66-66 (1)

66-66: ⚠️ Potential issue | 🟡 Minor

Run gofmt on this file.

CI is already failing on Line 66 because of the manually aligned comment after strings.ReplaceAll(...).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/db/tls.go` at line 66, Run gofmt on the file to fix the formatting issue:
remove the manually aligned comment spacing after the assignment to query (the
line with query = strings.ReplaceAll(query, "%", "%%") and its trailing comment)
so it conforms to gofmt rules; you can either put a single space before the
comment or move the comment to its own line. Ensure the line referencing
strings.ReplaceAll and the variable query is formatted by gofmt.
go/kv/consul.go-40-44 (1)

40-44: ⚠️ Potential issue | 🟡 Minor

Fix struct field alignment causing gofmt failure.

The pipeline specifically reports a gofmt failure at line 38-38 for struct field alignment in consulStore. The distributionReentry field appears to have inconsistent indentation compared to other fields.

Suggested fix
 type consulStore struct {
-	client                        *consulapi.Client
-	kvCache                       *cache.Cache
-	distributionReentry int64
+	client              *consulapi.Client
+	kvCache             *cache.Cache
+	distributionReentry int64
 }

Run gofmt -s -w go/kv/consul.go to auto-fix the alignment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/kv/consul.go` around lines 40 - 44, The struct consulStore has
inconsistent field alignment (the distributionReentry field) causing gofmt
failure; open the consulStore type declaration and realign the fields so
indentation matches the other fields (ensure distributionReentry int64 lines up
with client and kvCache), then run gofmt -s -w go/kv/consul.go (or apply gofmt
styling) to fix formatting; references: consulStore, distributionReentry,
go/kv/consul.go.
go/inst/audit_dao.go-63-76 (1)

63-76: ⚠️ Potential issue | 🟡 Minor

Run gofmt on this goroutine block.

CI is already failing on this hunk, and the indentation here does not match gofmt output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/audit_dao.go` around lines 63 - 76, The goroutine block using an
inner anonymous function (the go func() { _ = func() error { ... }() }) is
misformatted and fails gofmt; reformat this block to match gofmt output (fix
indentation and brace alignment) around the anonymous goroutine and inner func
that opens the audit file with os.OpenFile(config.Config.AuditLogFile, ...),
variable declarations f and err, the deferred close, the fmt.Sprintf line
building text, and the f.WriteString call; you can simply run gofmt (or
goimports) on the file to correct spacing and alignment so the go func, inner
func, defer, and return statements are properly indented and braces align.
go/kv/consul_txn.go-69-73 (1)

69-73: ⚠️ Potential issue | 🟡 Minor

Run gofmt on the struct block.

This alignment change is one of the CI-reported formatting diffs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/kv/consul_txn.go` around lines 69 - 73, Run gofmt on the consulTxnStore
struct block to fix CI formatting: reformat the struct definition for type
consulTxnStore so the fields (client, kvCache, distributionReentry) use standard
gofmt alignment/spacing (tabs) and consistent indentation; you can run `gofmt
-w` (or `goimports -w`) on the file to apply the change.
go/http/api.go-3162-3165 (1)

3162-3165: ⚠️ Potential issue | 🟡 Minor

Fix the gofmt blocker in ReloadConfiguration.

The audit line is misindented, which matches the current CI failure for this file.

🧹 Proposed fix
 	extraConfigFile := r.URL.Query().Get("config")
 	config.Reload(extraConfigFile)
-_ = inst.AuditOperation("reload-configuration", nil, "Triggered via API")
+	_ = inst.AuditOperation("reload-configuration", nil, "Triggered via API")
 
 	Respond(w, &APIResponse{Code: OK, Message: "Config reloaded", Details: extraConfigFile})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/http/api.go` around lines 3162 - 3165, The misindented audit call in
ReloadConfiguration causes a gofmt failure; locate the ReloadConfiguration
handler and fix the indentation of the audit call (the line invoking
inst.AuditOperation("reload-configuration", ...)) so it aligns with surrounding
statements (same indent level as config.Reload(...) and Respond(...)), then run
gofmt/goimports to ensure the file is formatted; keep the call and return
behavior unchanged.
🧹 Nitpick comments (7)
go/inst/instance_key_map_test.go (1)

63-70: Consider checking the error rather than discarding it in tests.

While _ = m.ReadJson(json) satisfies the errcheck linter, silently discarding errors in tests can mask failures—if ReadJson fails, the subsequent assertions would fail with confusing messages unrelated to the root cause.

Suggested improvement
 func TestInstanceKeyMapReadJSON(t *testing.T) {
 	json := `[{"Hostname":"host1","Port":3306},{"Hostname":"host2","Port":3306}]`
 	m := *NewInstanceKeyMap()
-	_ = m.ReadJson(json)
+	err := m.ReadJson(json)
+	test.S(t).ExpectNil(err)
 	test.S(t).ExpectEquals(len(m), 2)
 	test.S(t).ExpectTrue(m[key1])
 	test.S(t).ExpectTrue(m[key2])
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/instance_key_map_test.go` around lines 63 - 70, In
TestInstanceKeyMapReadJSON, don’t discard the error returned by
m.ReadJson(json); capture it (err := m.ReadJson(json)) and fail the test
immediately if non-nil (e.g. t.Fatalf("ReadJson failed: %v", err) or use your
test helper) before proceeding to assertions about len(m), m[key1], and m[key2];
this ensures ReadJson failures surface clearly in TestInstanceKeyMapReadJSON
rather than causing misleading assertion failures.
go/inst/dead_instance_filter.go (1)

48-48: Consider explicit error handling for metric registration calls.

The metric registration pattern _ = metrics.Register(...) is used consistently throughout the codebase to suppress errors. While the "discoveries.dead_instances" key itself is unique and not duplicated, explicitly handling registration errors (logging, GetOrRegister pattern, or panicking on critical metrics) would improve observability and make failure modes explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/inst/dead_instance_filter.go` at line 48, Replace the ignored
metrics.Register call for the "discoveries.dead_instances" counter with explicit
error handling: call metrics.Register("discoveries.dead_instances",
deadInstancesCounter) into an err variable and handle the error (log via your
logger, fallback to metrics.GetOrRegister("discoveries.dead_instances",
deadInstancesCounter), or panic for critical failure) instead of discarding it;
update the registration in dead_instance_filter.go where deadInstancesCounter is
registered so failures are observable and have a clear fallback.
go/raft/rel_store.go (1)

108-122: Rollback error handling changes look correct.

Assigning rollback errors to the blank identifier is appropriate here—when we're already returning the original failure, there's no meaningful recovery if the rollback also fails. The original error is preserved and returned.

One observation: the first Prepare failure (lines 105-107) exits without calling rollback, while subsequent failures do rollback. This inconsistency is pre-existing, but you may want to address it for consistency.

♻️ Optional: consistent rollback on all error paths
 	stmt, err := tx.Prepare(`delete from raft_store where store_key = ?`)
 	if err != nil {
+		_ = tx.Rollback()
 		return err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/raft/rel_store.go` around lines 108 - 122, The Prepare call that fails
before subsequent rollback paths (the tx.Prepare(...) just after the first
stmt.Exec) returns without rolling back; update that error path to call
tx.Rollback() (assigning its error to the blank identifier) before returning the
original err so all failure paths consistently attempt to rollback; modify the
code around the tx.Prepare(...) and its err check in rel_store.go (referencing
tx.Prepare, stmt, tx.Rollback) to perform the rollback then return the original
error.
go/agent/instance_topology_agent.go (1)

75-75: Consider logging ignored audit failures for visibility.

Line 75 fully drops the audit error. Even if audit is best-effort, logging failure helps incident/debug workflows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/agent/instance_topology_agent.go` at line 75, The call to
inst.AuditOperation("align-via-relaylogs", &instance.Key, fmt.Sprintf("aligned
%+v by relaylogs from %+v", instance.Key, otherInstance.Key)) currently ignores
the returned error; change it to capture the error and log any non-nil result so
audit failures are visible (e.g., err := inst.AuditOperation(...); if err != nil
{ /* use existing logger or inst.Logger to log the error with context: operation
name and instance keys */ }). Ensure the log includes the operation string
"align-via-relaylogs" and the instance.Key/otherInstance.Key for debugging.
go/logic/topology_recovery_dao.go (1)

561-561: Acceptable for reading historical/audit data, but consider logging parse failures.

Discarding parse errors from ReadReplicaHostsFromString and ReadCommaDelimitedList during recovery audit reads is acceptable since partial data is preferable to failing the entire query. However, silently ignoring these errors could mask data quality issues.

Consider logging at debug level if parsing fails, to aid troubleshooting data inconsistencies:

if err := topologyRecovery.LostReplicas.ReadCommaDelimitedList(...); err != nil {
    log.Debugf("Failed to parse lost_slaves: %v", err)
}

Note: Pipeline indicates gofmt formatting differences. Please run gofmt -s -w go/.

Also applies to: 571-572, 733-733

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/logic/topology_recovery_dao.go` at line 561, The code currently discards
errors from parsing calls (e.g.,
topologyRecovery.AnalysisEntry.ReadReplicaHostsFromString(m.GetString("slave_hosts"))
and LostReplicas.ReadCommaDelimitedList(...) at the other sites); update those
call sites to capture the returned error and, on failure, write a debug-level
log (e.g., using log.Debugf or the package's logger) describing which field
failed to parse and the error (include the input string via m.GetString("...")
for context) so parsing failures are visible without aborting recovery reads;
also run gofmt -s -w go/ after changes to fix formatting.
go/logic/command_applier.go (2)

299-300: Consider propagating the error instead of discarding it.

Similar to injectedPseudoGTID, healthReport discards the error from ReportToRaftLeader while other handlers return their errors. This could hide raft communication failures.

Suggested fix
 func (applier *CommandApplier) healthReport(value []byte) interface{} {
 	var authenticationToken string
 	if err := json.Unmarshal(value, &authenticationToken); err != nil {
 		return log.Errore(err)
 	}
-	_ = orcraft.ReportToRaftLeader(authenticationToken)
-	return nil
+	return orcraft.ReportToRaftLeader(authenticationToken)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/logic/command_applier.go` around lines 299 - 300, The healthReport handler
currently discards the error from orcraft.ReportToRaftLeader (similar to
injectedPseudoGTID); change the handler to capture the returned error from
orcraft.ReportToRaftLeader(authenticationToken) and propagate it (return the
error or wrap with context) instead of assigning to "_" so raft communication
failures surface to callers; update healthReport's signature/return path as
needed to pass the error through consistently with other handlers like
injectedPseudoGTID.

117-118: Consider propagating the error instead of discarding it.

Unlike other command handlers in this file (e.g., forget, forgetCluster, beginDowntime) that return the error from the underlying operation, injectedPseudoGTID silently discards the error from RegisterInjectedPseudoGTID. This inconsistency could mask registration failures.

Suggested fix
 func (applier *CommandApplier) injectedPseudoGTID(value []byte) interface{} {
 	var clusterName string
 	if err := json.Unmarshal(value, &clusterName); err != nil {
 		return log.Errore(err)
 	}
-	_ = inst.RegisterInjectedPseudoGTID(clusterName)
-	return nil
+	return inst.RegisterInjectedPseudoGTID(clusterName)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/logic/command_applier.go` around lines 117 - 118, The injectedPseudoGTID
handler currently discards the result of inst.RegisterInjectedPseudoGTID by
doing "_ = inst.RegisterInjectedPseudoGTID(clusterName)" and returning nil;
change it to propagate the error (e.g., capture the returned error from
inst.RegisterInjectedPseudoGTID and return it or wrap it with context) so
failures aren't silently ignored—update the injectedPseudoGTID function to
return the error from RegisterInjectedPseudoGTID instead of dropping it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a70f8bfd-77c8-42ed-8e48-f14d7fcfc755

📥 Commits

Reviewing files that changed from the base of the PR and between 5740466 and 3960b59.

📒 Files selected for processing (72)
  • go/agent/agent_dao.go
  • go/agent/instance_topology_agent.go
  • go/app/cli.go
  • go/app/http.go
  • go/attributes/attributes_dao.go
  • go/cmd/orchestrator/main.go
  • go/collection/collection.go
  • go/collection/collection_test.go
  • go/config/config.go
  • go/db/db.go
  • go/db/tls.go
  • go/discovery/queue.go
  • go/http/agents_api.go
  • go/http/api.go
  • go/http/health.go
  • go/http/httpbase.go
  • go/http/raft_reverse_proxy.go
  • go/http/render.go
  • go/http/web.go
  • go/inst/analysis.go
  • go/inst/analysis_dao.go
  • go/inst/audit_dao.go
  • go/inst/candidate_database_instance_dao.go
  • go/inst/cluster_alias_dao.go
  • go/inst/dead_instance_filter.go
  • go/inst/downtime.go
  • go/inst/downtime_dao.go
  • go/inst/instance.go
  • go/inst/instance_binlog.go
  • go/inst/instance_binlog_dao.go
  • go/inst/instance_dao.go
  • go/inst/instance_dao_test.go
  • go/inst/instance_key.go
  • go/inst/instance_key_map_test.go
  • go/inst/instance_test.go
  • go/inst/instance_topology.go
  • go/inst/instance_topology_dao.go
  • go/inst/oracle_gtid_set_entry.go
  • go/inst/pool_dao.go
  • go/inst/replication_thread_state.go
  • go/inst/resolve.go
  • go/inst/resolve_dao.go
  • go/inst/tag.go
  • go/inst/tag_dao.go
  • go/kv/consul.go
  • go/kv/consul_test.go
  • go/kv/consul_txn.go
  • go/kv/zk.go
  • go/logic/command_applier.go
  • go/logic/orchestrator.go
  • go/logic/snapshot_data.go
  • go/logic/topology_recovery.go
  • go/logic/topology_recovery_dao.go
  • go/metrics/graphite.go
  • go/os/process.go
  • go/os/unixcheck.go
  • go/process/election_dao.go
  • go/process/health.go
  • go/process/health_dao.go
  • go/process/host.go
  • go/proxysql/client.go
  • go/proxysql/hook.go
  • go/proxysql/init.go
  • go/proxysql/topology.go
  • go/raft/file_snapshot.go
  • go/raft/fsm.go
  • go/raft/http_client.go
  • go/raft/raft.go
  • go/raft/rel_store.go
  • go/ssl/ssl.go
  • go/ssl/ssl_test.go
  • go/util/token.go
💤 Files with no reviewable changes (2)
  • go/http/httpbase.go
  • go/inst/instance_test.go

Comment thread go/agent/agent_dao.go
Comment on lines 359 to 364
mySQLRunningUri := fmt.Sprintf("%s/mysql-status?token=%s", uri, token)
body, err := readResponse(httpGet(mySQLRunningUri))
if err == nil {
err = json.Unmarshal(body, &agent.MySQLRunning)
_ = json.Unmarshal(body, &agent.MySQLRunning)
}
// Actually an error is OK here since "status" returns with non-zero exit code when MySQL not running
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C3 'mysql-status|seed-command-completed|seed-command-succeeded|MySQLRunning|seedCommandCompleted|seedCommandSucceeded|Checking MySQL status on target|10 iterations have passed without progress' go

Repository: ProxySQL/orchestrator

Length of output: 4012


Handle JSON decode failures separately from legitimate false values.

Lines 359–364 and 522–533 silently discard json.Unmarshal errors, causing boolean fields to default to false when responses contain malformed JSON. This pattern masks decode failures:

  • Line 685 checks MySQLRunning to prevent executing MySQL operations on a running instance
  • Line 751's polling loop depends on seedCommandCompleted returning an accurate completion state

If an agent endpoint returns malformed JSON with a successful HTTP status, the code treats it identically to a legitimate false response. Consider explicitly handling decode errors or separating error states from boolean results.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/agent/agent_dao.go` around lines 359 - 364, The code currently swallows
json.Unmarshal errors when populating agent.MySQLRunning (from
readResponse(httpGet(...))) and similarly for seedCommandCompleted, causing
malformed JSON to be indistinguishable from a legitimate false; update the
unmarshalling logic to detect and surface decode errors instead of silently
defaulting to false: when calling json.Unmarshal for the MySQL status and for
seed command/status responses, capture the error and either (a) return/log it
and avoid overwriting the existing boolean, or (b) unmarshal into a *bool
(pointer) or a small struct so you can tell nil (decode failure) vs false, and
propagate that error/state up (or set a separate error field like
MySQLRunningErr/SeedCommandErr) so callers (e.g., the MySQL guard that checks
MySQLRunning and the polling loop that calls seedCommandCompleted) can
distinguish malformed responses from a real false.

Comment thread go/agent/agent_dao.go
Comment on lines 545 to 549
for _, seedOperation := range seedOperations {
AbortSeedCommand(seedOperation.TargetHostname, seedId)
_, _ = AbortSeedCommand(seedOperation.TargetHostname, seedId)
AbortSeedCommand(seedOperation.SourceHostname, seedId)
}
updateSeedComplete(seedId, errors.New("Aborted"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't mark the seed aborted when the abort RPCs are unchecked.

This loop discards both agent-side abort results and then unconditionally marks the seed as aborted on Line 549. If either agent misses the abort, orchestrator can report success while the copy is still running.

Proposed fix
 func AbortSeed(seedId int64) error {
 	seedOperations, err := AgentSeedDetails(seedId)
 	if err != nil {
 		return log.Errore(err)
 	}

+	var abortErr error
 	for _, seedOperation := range seedOperations {
-		_, _ = AbortSeedCommand(seedOperation.TargetHostname, seedId)
-		AbortSeedCommand(seedOperation.SourceHostname, seedId)
+		if _, err := AbortSeedCommand(seedOperation.TargetHostname, seedId); err != nil && abortErr == nil {
+			abortErr = err
+		}
+		if _, err := AbortSeedCommand(seedOperation.SourceHostname, seedId); err != nil && abortErr == nil {
+			abortErr = err
+		}
 	}
+	if abortErr != nil {
+		return log.Errore(abortErr)
+	}
 	updateSeedComplete(seedId, errors.New("Aborted"))
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/agent/agent_dao.go` around lines 545 - 549, The loop currently ignores
AbortSeedCommand results and then unconditionally calls
updateSeedComplete(seedId, errors.New("Aborted")); change it to inspect and
aggregate the errors returned by AbortSeedCommand for both
seedOperation.TargetHostname and seedOperation.SourceHostname (for each
seedOperation in seedOperations), and only mark the seed as aborted via
updateSeedComplete(seedId, err) when all abort RPCs succeeded (or pass a
combined error when any fail); specifically, replace the discarded returns from
AbortSeedCommand with captured error variables, accumulate non-nil errors (e.g.,
in a slice or combined error) and use that aggregated error when calling
updateSeedComplete so the orchestrator won’t report success if any abort RPC
failed.

Comment thread go/agent/agent_dao.go
Comment on lines +705 to +706
sourceAgent, _ = GetAgent(sourceHostname)
_, _ = submitSeedStateEntry(seedId, fmt.Sprintf("MySQL data volume on source host %s is %d bytes", sourceHostname, sourceAgent.MountPoint.MySQLDiskUsage), "")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve the GetAgent error after mounting.

If this refresh fails, sourceAgent.MountPoint.MySQLDiskUsage falls back to stale/zero data and the free-space guard on Line 720 can pass incorrectly. That can wipe the target and continue the seed without ever re-reading the mounted volume size.

Proposed fix
-	sourceAgent, _ = GetAgent(sourceHostname)
-	_, _ = submitSeedStateEntry(seedId, fmt.Sprintf("MySQL data volume on source host %s is %d bytes", sourceHostname, sourceAgent.MountPoint.MySQLDiskUsage), "")
+	sourceAgent, err = GetAgent(sourceHostname)
+	if err != nil {
+		_, _ = Unmount(sourceHostname)
+		return updateSeedStateEntry(seedStateId, err)
+	}
+	_, _ = submitSeedStateEntry(seedId, fmt.Sprintf("MySQL data volume on source host %s is %d bytes", sourceHostname, sourceAgent.MountPoint.MySQLDiskUsage), "")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/agent/agent_dao.go` around lines 705 - 706, After remounting the source
volume you must preserve and check the error returned by GetAgent instead of
discarding it; update the code around the call to GetAgent so that the returned
error is assigned (e.g., sourceAgent, err := GetAgent(sourceHostname)), verify
err is nil before using sourceAgent.MountPoint.MySQLDiskUsage, and if GetAgent
fails return or log and abort the seed operation (using seedId and
submitSeedStateEntry to record the failure) so stale/zero MySQLDiskUsage cannot
bypass the free-space guard.

var content string
onResponse := func(contentBytes []byte) {
json.Unmarshal(contentBytes, &content)
_ = json.Unmarshal(contentBytes, &content)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

head -n 100 go/agent/instance_topology_agent.go | tail -n 75

Repository: ProxySQL/orchestrator

Length of output: 2621


🏁 Script executed:

sed -n '1,100p' go/agent/instance_topology_agent.go

Repository: ProxySQL/orchestrator

Length of output: 2655


🏁 Script executed:

rg -n "func RelaylogContentsTail" --type go

Repository: ProxySQL/orchestrator

Length of output: 218


🏁 Script executed:

rg -A 20 "func RelaylogContentsTail" --type go | head -50

Repository: ProxySQL/orchestrator

Length of output: 663


🏁 Script executed:

rg -A 30 "func executeAgentCommand" go/agent/agent_dao.go | head -50

Repository: ProxySQL/orchestrator

Length of output: 1896


Don't ignore relay-log JSON decode failures.

Line 33 discards json.Unmarshal errors; that allows invalid or empty payloads to flow into ApplyRelaylogContents. The callback is invoked synchronously before RelaylogContentsTail returns, so you can capture the error and propagate it through the existing error handling path.

Suggested fix
 func SyncReplicaRelayLogs(instance, otherInstance *inst.Instance) (*inst.Instance, error) {
 	var err error
+	var decodeErr error
 	var found bool
 	var nextCoordinates *inst.BinlogCoordinates
 	var content string
 	onResponse := func(contentBytes []byte) {
-		_ = json.Unmarshal(contentBytes, &content)
+		if e := json.Unmarshal(contentBytes, &content); e != nil {
+			decodeErr = e
+		}
 	}
@@
 	if _, err := RelaylogContentsTail(otherInstance.Key.Hostname, nextCoordinates, &onResponse); err != nil {
 		goto Cleanup
 	}
+	if decodeErr != nil {
+		err = decodeErr
+		goto Cleanup
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
_ = json.Unmarshal(contentBytes, &content)
func SyncReplicaRelayLogs(instance, otherInstance *inst.Instance) (*inst.Instance, error) {
var err error
var decodeErr error
var found bool
var nextCoordinates *inst.BinlogCoordinates
var content string
onResponse := func(contentBytes []byte) {
if e := json.Unmarshal(contentBytes, &content); e != nil {
decodeErr = e
}
}
// ... (intervening code)
if _, err := RelaylogContentsTail(otherInstance.Key.Hostname, nextCoordinates, &onResponse); err != nil {
goto Cleanup
}
if decodeErr != nil {
err = decodeErr
goto Cleanup
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/agent/instance_topology_agent.go` at line 33, Replace the ignored
json.Unmarshal call so that decoding errors are captured and propagated: change
the `_ = json.Unmarshal(contentBytes, &content)` usage in RelaylogContentsTail
(the call that eventually invokes ApplyRelaylogContents) to check the returned
error, log or return it through the existing error path, and avoid calling
ApplyRelaylogContents with invalid/empty content; ensure the function signature
and callers of RelaylogContentsTail (and any error-handling path already
present) are used to propagate the JSON decode error instead of discarding it.

Comment thread go/http/api.go
Comment on lines 360 to 365
if orcraft.IsRaftEnabled() {
orcraft.PublishCommand("forget-cluster", clusterName)
_, _ = orcraft.PublishCommand("forget-cluster", clusterName)
} else {
inst.ForgetCluster(clusterName)
_ = inst.ForgetCluster(clusterName)
}
Respond(w, &APIResponse{Code: OK, Message: fmt.Sprintf("Cluster forgotten: %+v", clusterName)})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Propagate command submission failures before returning OK.

These handlers now discard the raft/local command error and still reply with success. If the publish fails—or inst.ForgetCluster fails in the non-raft path—the caller gets a false positive while nothing changed.

🐛 Proposed fix
 	if orcraft.IsRaftEnabled() {
-		_, _ = orcraft.PublishCommand("forget-cluster", clusterName)
+		_, err = orcraft.PublishCommand("forget-cluster", clusterName)
 	} else {
-		_ = inst.ForgetCluster(clusterName)
+		err = inst.ForgetCluster(clusterName)
+	}
+	if err != nil {
+		Respond(w, &APIResponse{Code: ERROR, Message: err.Error()})
+		return
 	}
 	Respond(w, &APIResponse{Code: OK, Message: fmt.Sprintf("Cluster forgotten: %+v", clusterName)})
-	_, _ = orcraft.PublishYield(chi.URLParam(r, "node"))
+	if _, err := orcraft.PublishYield(chi.URLParam(r, "node")); err != nil {
+		Respond(w, &APIResponse{Code: ERROR, Message: err.Error()})
+		return
+	}
 	Respond(w, &APIResponse{Code: OK, Message: "Asynchronously yielded"})
@@
-	_, _ = orcraft.PublishYieldHostnameHint(hint)
+	if _, err := orcraft.PublishYieldHostnameHint(hint); err != nil {
+		Respond(w, &APIResponse{Code: ERROR, Message: err.Error()})
+		return
+	}
 	Respond(w, &APIResponse{Code: OK, Message: fmt.Sprintf("Asynchronously yielded by hint %s", hint), Details: hint})

Also applies to: 3023-3039

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/http/api.go` around lines 360 - 365, The current handler ignores errors
from orcraft.PublishCommand and inst.ForgetCluster and always returns OK;
capture the returned error from orcraft.PublishCommand(...) and from
inst.ForgetCluster(clusterName), and if err != nil call Respond with a non-OK
APIResponse that includes the error message (instead of proceeding to the
success Respond). Update the branch that uses orcraft.IsRaftEnabled() to check
the publish error and the non-raft branch to check the inst.ForgetCluster error,
returning early on failure via Respond(APIResponse{Code: <error code>, Message:
fmt.Sprintf("failed to forget cluster: %v", err)}) so callers don’t receive a
false success.

Comment thread go/os/process.go Outdated
Comment thread go/process/health.go
}
}
health.AvailableNodes, err = ReadAvailableNodes(true)
health.AvailableNodes, _ = ReadAvailableNodes(true)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Surface ReadAvailableNodes failures from HealthTest.

This query is part of the health payload, but its error is dropped here. That can return a cached-looking healthy result while the read side is broken and leave callers with no signal beyond an empty AvailableNodes.

💡 Suggested fix
-	health.AvailableNodes, _ = ReadAvailableNodes(true)
+	if health.AvailableNodes, err = ReadAvailableNodes(true); err != nil {
+		health.Error = err
+		return health, log.Errore(err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/process/health.go` at line 138, The call to ReadAvailableNodes(true) drops
its error and can mask read-side failures in HealthTest; change the code around
health.AvailableNodes to capture the error (e.g., available, err :=
ReadAvailableNodes(true)), and if err is non-nil propagate or include it in the
HealthTest response (return the error or set an explicit failure field/message)
instead of discarding it so callers receive a clear signal when
ReadAvailableNodes fails.

Comment thread go/raft/file_snapshot.go
Comment on lines +463 to +467
defer func() { _ = fh.Close() }()

// Buffer the file IO
buffered := bufio.NewWriter(fh)
defer buffered.Flush()
defer func() { _ = buffered.Flush() }()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "file_snapshot.go" -type f

Repository: ProxySQL/orchestrator

Length of output: 143


🏁 Script executed:

cat -n go/raft/file_snapshot.go | head -500 | tail -100

Repository: ProxySQL/orchestrator

Length of output: 2892


🏁 Script executed:

# Get the exact function context around lines 463-467
sed -n '450,480p' go/raft/file_snapshot.go

Repository: ProxySQL/orchestrator

Length of output: 744


🏁 Script executed:

# Get the full writeMeta function
sed -n '430,490p' go/raft/file_snapshot.go

Repository: ProxySQL/orchestrator

Length of output: 1338


🏁 Script executed:

# Also check how writeMeta is called to understand return value handling
rg "writeMeta" go/raft/file_snapshot.go -A 3 -B 3

Repository: ProxySQL/orchestrator

Length of output: 601


Check Flush and Close return errors before reporting metadata success.

enc.Encode only writes into the buffer. The deferred _ = buffered.Flush() and _ = fh.Close() silently ignore errors, allowing writeMeta to return success even when Flush fails due to disk-full or I/O errors, leaving a partial meta.json on disk.

Suggested fix
-	defer func() { _ = fh.Close() }()
-
 	// Buffer the file IO
 	buffered := bufio.NewWriter(fh)
-	defer func() { _ = buffered.Flush() }()
 
 	// Write out as JSON
 	enc := json.NewEncoder(buffered)
 	if err := enc.Encode(&s.meta); err != nil {
+		_ = fh.Close()
 		return err
 	}
-	return nil
+	if err := buffered.Flush(); err != nil {
+		_ = fh.Close()
+		return err
+	}
+	return fh.Close()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
defer func() { _ = fh.Close() }()
// Buffer the file IO
buffered := bufio.NewWriter(fh)
defer buffered.Flush()
defer func() { _ = buffered.Flush() }()
// Buffer the file IO
buffered := bufio.NewWriter(fh)
// Write out as JSON
enc := json.NewEncoder(buffered)
if err := enc.Encode(&s.meta); err != nil {
_ = fh.Close()
return err
}
if err := buffered.Flush(); err != nil {
_ = fh.Close()
return err
}
return fh.Close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/raft/file_snapshot.go` around lines 463 - 467, writeMeta currently encodes
into a bufio.Writer (buffered) and defers buffered.Flush() and fh.Close() while
ignoring their return errors, which can hide disk/full I/O failures; change the
logic in writeMeta (the enc.Encode -> buffered -> fh sequence) to call
buffered.Flush() and fh.Close() explicitly after enc.Encode returns, check and
propagate any errors from Flush() and Close(), and only report metadata write
success when both Flush and Close succeed (include context that the failure
occurred while flushing/writing meta.json).

Comment thread go/ssl/ssl_test.go
Comment on lines 186 to 192
func writeFakeFile(content string) string {
f, err := ioutil.TempFile("", "ssl_test")
f, err := os.CreateTemp("", "ssl_test")
if err != nil {
return ""
}
ioutil.WriteFile(f.Name(), []byte(content), 0644)
_ = os.WriteFile(f.Name(), []byte(content), 0644)
return f.Name()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and read the specific file and function
fd "ssl_test.go" go/ | head -5

Repository: ProxySQL/orchestrator

Length of output: 83


🏁 Script executed:

# Read the file around the specified lines
cat -n go/ssl/ssl_test.go | sed -n '180,200p'

Repository: ProxySQL/orchestrator

Length of output: 890


🏁 Script executed:

# Let's also check the full context of the writeFakeFile function
cat -n go/ssl/ssl_test.go | grep -A 20 "func writeFakeFile"

Repository: ProxySQL/orchestrator

Length of output: 1166


🏁 Script executed:

# Search for usages of writeFakeFile
rg "writeFakeFile" go/ssl/ssl_test.go -A 2 -B 2

Repository: ProxySQL/orchestrator

Length of output: 1449


🏁 Script executed:

# Check if there are cleanup mechanisms or deferred closes
rg "os.Remove|defer" go/ssl/ssl_test.go | head -20

Repository: ProxySQL/orchestrator

Length of output: 571


writeFakeFile leaks file descriptors and silently ignores write errors.

Line 187 creates a file handle via os.CreateTemp that is never closed, and line 191 ignores write errors with the blank assignment. Tests calling this function accumulate open file descriptors, which can cause flaky test failures when hitting OS limits. Additionally, if the write fails, the function returns the filename anyway, potentially causing tests to fail with missing or incomplete files.

Suggested fix
 func writeFakeFile(content string) string {
 	f, err := os.CreateTemp("", "ssl_test")
 	if err != nil {
 		return ""
 	}
-	_ = os.WriteFile(f.Name(), []byte(content), 0644)
+	if _, err := f.WriteString(content); err != nil {
+		_ = f.Close()
+		_ = os.Remove(f.Name())
+		return ""
+	}
+	if err := f.Close(); err != nil {
+		_ = os.Remove(f.Name())
+		return ""
+	}
 	return f.Name()
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func writeFakeFile(content string) string {
f, err := ioutil.TempFile("", "ssl_test")
f, err := os.CreateTemp("", "ssl_test")
if err != nil {
return ""
}
ioutil.WriteFile(f.Name(), []byte(content), 0644)
_ = os.WriteFile(f.Name(), []byte(content), 0644)
return f.Name()
func writeFakeFile(content string) string {
f, err := os.CreateTemp("", "ssl_test")
if err != nil {
return ""
}
if _, err := f.WriteString(content); err != nil {
_ = f.Close()
_ = os.Remove(f.Name())
return ""
}
if err := f.Close(); err != nil {
_ = os.Remove(f.Name())
return ""
}
return f.Name()
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go/ssl/ssl_test.go` around lines 186 - 192, The helper writeFakeFile leaks
the os.CreateTemp file descriptor and ignores write errors; update writeFakeFile
to close the temp file handle (f.Close()) and check the result of the write
operation (either check error returned by os.WriteFile or use f.Write and check
that error), and if writing fails remove the temp file and return an empty
string so callers don't get a path to an incomplete file; reference the function
writeFakeFile, the temp file handle f (f.Name()), os.CreateTemp and os.WriteFile
when making the change.

Comment thread go/ssl/ssl.go
@renecannao renecannao merged commit c38a16b into master Mar 24, 2026
3 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants