Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion analyze/prompts/_base.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,19 @@

## Functions (mutually recursive group)
{{range .Functions}}
### {{.Name}}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the reasons I avoided including comments is, because my gut feeling is that it might make LLM think of the wrong thing. i.e. comment contains something that the code contradicts.

There might be a better way to handle this though. e.g. cross check against docs and the func content instead of just trusting godoc/body together.

{{if .Godoc}}{{.Godoc}}
{{end}}
```go
{{.Body}}
```
{{end -}}
{{- else}}
{{- if .Godoc}}

## Documentation
{{.Godoc}}
{{end -}}
```go
{{.Body}}
```
Expand Down Expand Up @@ -77,7 +85,9 @@ Signature: {{.Signature}}

{{- define "issues-format"}}

Respond with JSON. The "line" field must be the line number. The "code" field must contain the exact line of code where the issue occurs, copied verbatim from the code block above.
Severity scale: "critical" = exploitable security flaw or data corruption; "high" = incorrect behavior in normal use; "medium" = incorrect in edge cases or degraded reliability; "low" = poor practice with high probability of causing future bugs; "info" = minor concern.

Respond with JSON. The "line" field must be the line number within the function code block shown above, counting from 1 at the first line of the function — not the absolute file line number. The "code" field must contain the exact line of code where the issue occurs, copied verbatim from the code block above.

Example:
{"issues": [{"function": "ParseConfig", "line": 42, "code": " query := \"SELECT * FROM users WHERE id=\" + id", "severity": "critical", "message": "SQL injection via string concatenation", "suggestion": "Use parameterized query"}]}
Expand Down
23 changes: 21 additions & 2 deletions analyze/prompts/baseline.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,25 @@
You are reviewing Go code for common issues and mistakes.
You are reviewing Go code for logical correctness.
{{template "function-context" .}}
{{- template "summary-context" .}}

Is this function correct?
Does this function correctly implement what it claims to do? Look for:

Implementation vs. Intent:
- Behavior that contradicts the declared purpose, invariants, or documented error conditions
- A code path that returns success when it should return an error, or vice versa
- A precondition stated in the summary that the implementation does not actually enforce

Logic Errors:
- Inverted boolean conditions (== where != is needed, < where > is needed)
- Wrong logical operator (&& vs ||) that produces incorrect short-circuit behavior
- Off-by-one errors in loop bounds, index expressions, or size calculations
- Unreachable code following an unconditional return, panic, or continue
- A switch or if-else chain that silently omits a case that must be handled
- Integer overflow or underflow in arithmetic that is not guarded

Control Flow:
- A loop that cannot terminate under reachable inputs
- Early return that skips required cleanup or postcondition logic
- Error path that falls through to success-path code

{{- template "issues-format" .}}
25 changes: 21 additions & 4 deletions analyze/prompts/concurrency.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,16 +5,33 @@ You are reviewing Go code for concurrency issues.
{{- template "external-funcs-context" .}}

## Concurrency Review Checklist
Look for:

Safety:
- Race conditions on shared state without synchronization
- Goroutine leaks (goroutines that never terminate)
- Missing mutex locks or unlocks
- Deadlock potential from lock ordering
- Channel misuse (send on closed channel, nil channel operations)
- Missing select default cases causing blocking
- select without a default case where non-blocking dispatch was clearly intended
- Improper use of sync.WaitGroup (Add after Wait, wrong count)
- Context cancellation not checked in long operations
- Shared state modified in goroutines without protection
- Context cancellation not checked in long-running operations
- sync.Mutex copied by value
- sync.RWMutex used but reads acquire a full write lock instead of RLock — serializes all reads unnecessarily

Design:
- Multiple goroutines writing to the same external resource — designate one owner goroutine;
all others produce immutable values and send them to the owner via a channel
- Mutex reached for as the first response to a shared-state problem when an immutable-value-
and-channel design would eliminate the shared state entirely
- errgroup or fan-in channel not used to collect results from parallel goroutines — instead,
results accumulated via a shared slice with a mutex, or goroutines write directly to an
external resource rather than sending immutable values to an owner
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see all of these Design category things being necessary in some scenarios due to performance.


Time:
- time.Now() compared against another time.Now() across goroutines to determine ordering —
wall clocks drift and NTP can move time backward; use a monotonic measurement (time.Since,
time.Now().Add) for elapsed time and timeouts; use wall-clock time only for display
- time.Now(), rand, or os.Getenv called inside concurrent logic instead of injected at the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These while not ideal, I'm not sure how well this would catch issues.

boundary — hidden global state causes flaky tests and makes concurrent behavior non-deterministic

{{- template "issues-format" .}}
36 changes: 36 additions & 0 deletions analyze/prompts/context-auth.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
You are reviewing Go code for context and authorization pattern violations.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context use issues and authorization seem very different categories to me and can be handled separately.

{{template "function-context" .}}
{{- template "summary-context" .}}
{{- template "callees-context" .}}

## Context Patterns

- context.Context stored as a struct field — it must be passed as an explicit function parameter
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite always the case. Sometimes contexts are used to control lifecycle for several goroutines. Alternatively, it could be passed inside a wrapper struct, which contains additional info.

- context.Value used to retrieve a typed domain value with an untyped key (string, int
literal) or an exported key type — keys must be unexported typed constants
(type contextKey int; const userContextKey contextKey = iota) so different packages cannot
collide; provide package-level accessors (UserFromContext) rather than exposing the key
- context.Value used to pass business data, feature flags, or configuration that should be
explicit function parameters — Value is for request-scoped cross-cutting concerns only
(trace IDs, auth identity)
- Context not propagated to a downstream call — DB methods (QueryContext, ExecContext),
outbound HTTP (http.NewRequestWithContext), and gRPC calls must receive the caller's ctx

## Authorization

Layer responsibility: the transport layer (HTTP handler or middleware) authenticates the caller
and stores identity on the context. Service and data layers extract identity from context and
enforce authorization. Look for violations of this boundary:

- Service or repository method that calls context.WithValue to attach user identity — only
the transport layer should set auth context; downstream layers only read it
- Authorization enforced by fetching all rows and then filtering in application code — the
constraint must be embedded in the SQL WHERE clause so the database enforces it, not
application code after the fact
- Service method that accepts userID as an explicit parameter and passes it as an optional
filter rather than extracting it from context and treating it as a mandatory constraint
- Authorization decision made exclusively in middleware with no enforcement in the service or
data layer — callers that bypass the middleware (background jobs, CLI, tests) get no enforcement

Only report actual violations.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether these comments actually help LLMs. A separate stage to filter out non-problems might be a better approach.

{{- template "issues-format" .}}
22 changes: 20 additions & 2 deletions analyze/prompts/correctness.txt
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ You are reviewing Go code for correctness issues.

Error Handling:
- Ignored errors (assigned to _ or not checked)
- Missing error wrapping (use fmt.Errorf with %w)
- Errors propagated without any wrapping or context (use fmt.Errorf with %w, or Op-based domain wrapping if a domain Error type is in use)
- Panic for recoverable errors
- Error shadowing in inner scopes

Expand All @@ -23,7 +23,25 @@ Resource Management:
- Missing defer for cleanup
- defer in loops (accumulates until function returns)
- Resources not released on error paths
- Context cancellation not checked in long operations
- Missing timeouts on network operations

Error Type Discipline:
- Lookup-by-ID returning (nil, nil) instead of an ENOTFOUND error
- External errors (sql.ErrNoRows, os.ErrNotExist) escaping the implementation boundary untranslated
- Leaf errors carrying both Code and Err (leaf: Code+Message only; wrapping: Op+Err only)
- Missing Op wrap on significant function boundaries
- Type-asserting *Error directly at the call site instead of calling ErrorCode() and
ErrorMessage() — direct type assertions couple callers to the concrete Error struct and
break when the error is wrapped

SQL / Data Access:
- defer tx.Rollback() not called immediately after a successful BeginTx — without it a failed
commit or a panic will leave the transaction open and hold database locks
- rows.Close() not deferred immediately after a successful QueryContext call
- Result slice initialized with var (encodes as JSON null) instead of make([]*T, 0)
- rows.Err() not checked after the iteration loop
- N+1 query pattern: one query issued per record inside a loop when a JOIN or batch fetch would retrieve the same data in one round trip
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These last three aren't really correctness issues. It might make sense to have a separate database related pass.

- Transaction held open while waiting for user input or an external HTTP response — transactions must be short; never span a user interaction
- Transaction exposed to service callers — *Tx or *sql.Tx accepted as a parameter or returned from a service method; transactions are an implementation detail and must not leak across the service boundary

{{- template "issues-format" .}}
84 changes: 84 additions & 0 deletions analyze/prompts/distributed.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
You are reviewing Go code for distributed systems reliability violations.
{{template "function-context" .}}
{{- template "summary-context" .}}
{{- template "callees-context" .}}
{{- template "external-funcs-context" .}}

## Timeouts

- Outbound HTTP call made with http.DefaultClient or a client with no Timeout set — a slow
upstream will hold the goroutine indefinitely, eventually exhausting the pool
- Database call made with a context that has no deadline — long-running queries will not be
cancelled when the caller gives up or the request is aborted
- Context passed to a network call is context.Background() or context.TODO() rather than the
request-scoped context received from the caller

## Retries

- Retry loop uses time.Sleep with a fixed interval — use exponential backoff with random jitter
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.Sleep is not great in those scenarios, because it doesn't respond to context cancellation.

(base * 2^n + jitter) to avoid coordinated retry storms when many clients fail simultaneously
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also need an upper limit. Or maybe in general "use best practice exponential backoff" without specifying exactly which one.

- Retry loop does not classify errors as retryable before retrying — 400, 404, and validation
errors must not be retried; only transient errors (timeouts, 503, connection reset) should be
- Retry loop has no maximum attempt count or deadline — will retry indefinitely under sustained failure

## Idempotency

- Mutating operation (create entity, charge, send notification, enqueue job) that does not
check or assign a client-supplied idempotency key before executing — duplicate requests
produce duplicate side effects
- Idempotency key stored only in an in-memory map or cache — must be persisted durably so
deduplication survives process restarts; in-memory deduplication is lost on crash

## Fan-out and Isolation

- Independent downstream calls issued sequentially when they could run concurrently — use
errgroup and issue them in parallel; sequential fan-out multiplies latency unnecessarily
- No per-upstream goroutine limit or semaphore — a single slow dependency can exhaust all
available goroutines and starve unrelated request handling

## Dual Writes

- Two separate systems (database + cache, database + search index, two databases) written to
in sequence without a compensating mechanism — if the second write fails the systems are
permanently inconsistent; treat the event log or primary database as the source of truth and
let derived systems subscribe via CDC

## Consistency

- Read-modify-write cycle (SELECT then UPDATE) without SELECT FOR UPDATE or a compare-and-
swap on a version/ETag column — concurrent requests will silently overwrite each other
- SELECT FOR UPDATE issued without a context that has a deadline — if the lock holder stalls,
the waiting transaction will block indefinitely
- Counter or balance incremented with a read-modify-write cycle instead of an atomic
UPDATE counter = counter + ? — the read value may be stale by the time the write executes
- Last-write-wins (LWW) conflict resolution used where data loss is unacceptable — LWW
silently discards concurrent writes; use a version column, vector clock, or merge function
- Distributed lock used without a fencing token — if the lock holder pauses (GC, network
stall) after acquiring the lock, another node will acquire it; both nodes then believe they
hold the lock; every write to the shared resource must include the monotonic token and the
resource must reject writes with a stale token
- Database table polled in a tight loop to detect new work — polling causes table-scan load
and lock contention; use a purpose-built message queue (Kafka, RabbitMQ, SQS) or
LISTEN/NOTIFY instead

## Caching

- Cache used as the sole copy of data with no independent source of truth — when the cache is
empty the system must be able to repopulate it; if it cannot, the cache is actually a database
and should be treated as one
- Cached value written without an explicit TTL or invalidation strategy — unbounded TTLs cause
stale data to persist indefinitely across deployments and schema changes
- Cache miss under high load not protected against stampede — when a popular key expires many
requests miss simultaneously and all rebuild concurrently; use probabilistic early expiration,
request coalescing, or a background refresh pattern

## Encoding

- json.Marshal / json.Encode used to serialize data crossing an internal service boundary —
use binary encoding (Protocol Buffers, Thrift, Avro) for internal inter-service communication;
JSON is acceptable only for external-facing APIs
- Protocol Buffers field removed from a message definition without marking it reserved —
old consumers will re-use the field number for a different type, silently corrupting data

Only report actual violations.
{{- template "issues-format" .}}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what's the limit of how many checks an LLM can reasonably keep track of. My gut feeling says it's maybe 10-20 related things, then it starts to miss specific issues.

I haven't looked at research or done any experiments, so it would be nice to know.

Either way, this does feel a rather long list.

47 changes: 47 additions & 0 deletions analyze/prompts/domain-types.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
You are reviewing Go code for domain type and CRUD convention violations.
{{template "function-context" .}}
{{- template "summary-context" .}}
{{- template "callees-context" .}}

## CRUD Conventions

FindByID / FindOne:
- Returns (nil, nil) when the entity does not exist — must return ENOTFOUND
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not that clear.

- Does not wrap the callee's error with an Op before returning
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was already handled somewhere else.


FindMany / List:
- Result slice declared with var (encodes as JSON null) — initialize with make([]*T, 0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make([]*T) would be fine, the 0 doesn't add anything.

- Returns ENOTFOUND for an empty result set — an empty list is not an error
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ENOTFOUND is more of a C style error name.

- Second return value omitted — FindMany must return ([]*T, int, error) where int is the
total count for pagination regardless of Limit/Offset

Create:
- Does not mutate the input pointer to set ID, CreatedAt, or UpdatedAt on success
- Returns the new entity as a separate value rather than populating the caller's struct

Update:
- Does not return the updated object on error — callers need it to replay a form
- Accepts individual fields rather than an *Update struct with pointer fields
- Uses a non-pointer field type in an Update struct — nil cannot express "do not change this field"

Delete:
- Accepts more than the primary key as parameters — authorization and ownership checks belong
inside the implementation, not in the signature
- Checks authorization outside the implementation (in the handler) rather than enforcing it
inside the data layer using UserIDFromContext

Filter structs:
- Filter field is a non-pointer type — nil cannot express "not filtered on"
- Accepts a raw offset integer for pagination — use a cursor (last-seen ID) to avoid skipped
or duplicated rows under concurrent writes

Interface contracts:
- Exported service interface method missing a godoc comment that names the error codes it
can return — callers cannot know which errors to handle without this contract
- Service interface method whose first parameter is not context.Context — all service methods
must accept context.Context as their first argument for cancellation and auth propagation
- Caching or decoration wrapper that re-implements domain logic rather than delegating to the
wrapped service — wrappers must satisfy the same interface and add one concern only

Only report actual violations of these conventions.
{{- template "issues-format" .}}
50 changes: 50 additions & 0 deletions analyze/prompts/functional-core.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
You are reviewing Go code for functional core / imperative shell violations.
{{template "function-context" .}}
{{- template "summary-context" .}}
{{- template "callees-context" .}}
{{- template "external-funcs-context" .}}

## I/O Mixed into Domain Logic

A domain or business logic function should accept values and return new values without
performing I/O. A function that contains computation (sorting, filtering, ranking, validation,
pricing, scheduling) is difficult to test and reason about when it also performs I/O. Look for
functions that mix both:

- Direct calls into database/sql, sqlx, pgx, or any ORM package alongside business logic
- Outbound HTTP calls (http.Get, http.Post, http.Client.Do) or gRPC client calls alongside
computation — move the I/O to the caller and pass the result in as a parameter
- time.Now() — inject the current time as a parameter; a hidden time.Now() call makes the
function non-deterministic and causes intermittent test failures
- rand.Float64(), rand.Intn(), or any global random call — inject the rand.Source or the
already-generated value; global random state cannot be controlled in tests
- os.Getenv, os.ReadFile, os.Open, or direct filesystem access — read at the boundary and
pass values in; domain functions should not know where configuration comes from
- Dependency accepted as a concrete struct type rather than an interface — concrete types
make the function untestable without the real implementation; accept the narrowest interface
that provides the methods the function actually calls

## Mutation

- Method mutates a field on its receiver when it could return a new value — domain types
should be transformed (accept value, return new value), not mutated in place; receiver
mutation makes behavior invisible to callers that hold the original value
- Function modifies a slice or map passed as a parameter — return a new collection so callers
are not surprised by hidden side effects on their own data structures

## Shell Code Absorbing Logic

The imperative shell (service methods, HTTP handlers, background workers) should be a flat
sequence of I/O calls and value assignments with minimal branching. When the shell accumulates
logic it can no longer be understood without reading it all, and that logic cannot be tested
without triggering I/O. Look for:

- Sorting, filtering, ranking, or business rule evaluation done inline in a service method or
handler body rather than delegated to a pure function — extract the computation; call it
with real values in the shell
- Shell code with three or more conditional branches where the branch condition depends only
on values already in scope (no I/O needed to resolve it) — this logic belongs in a pure function

Do not flag I/O in functions whose documented purpose is I/O (repository methods, HTTP
handlers, background pollers, main wiring code).
{{- template "issues-format" .}}
Loading