
Log patterns: Log patterns related code #300

Open

dawkaka wants to merge 35 commits into master from log-patterns-chp

Conversation

@dawkaka
Contributor

@dawkaka dawkaka commented Jan 22, 2026

Saving and presenting log patterns in the UI, plus log pattern anomaly detection:

  • Unusual increases in log pattern volume within a time window
  • Newly detected log patterns

Closes #

How to test

@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.



-- | Get a pattern by ID
getLogPatternById :: DB es => LogPatternId -> Eff es (Maybe LogPattern)
Contributor

Use the _selectWhere pattern instead of enumerating the fields one by one.
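A sketch of that suggestion, assuming the project's DB layer exposes pg-entity's query builders (_selectWhere and the field quasiquoter) plus OverloadedLists; the PG.query wrapper shape is this PR's own:

-- Sketch only: assumes an Entity instance for LogPattern
getLogPatternById :: DB es => LogPatternId -> Eff es (Maybe LogPattern)
getLogPatternById lpId =
  listToMaybe <$> PG.query (_selectWhere @LogPattern [[field| id |]]) (Only lpId)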

Comment on lines 250 to 264
Issues.LogPattern ->
"Describe this log pattern issue and its implications.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Issues.LogPatternRateChange ->
"Describe this log pattern rate change and its implications.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service

Contributor

Please use the [text|] or any other quasiquoter, so it's easier to visually see the shape of this message without the Haskell semigroup noise.
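For illustration, the same prompt with [text|] and the ${} interpolation already used elsewhere in this PR (the local bindings here are a sketch, not the final code):

Issues.LogPattern ->
  let svc = fromMaybe "unknown-service" issue.service
      title = issue.title
   in [text|
        Describe this log pattern issue and its implications.
        Title: ${title}
        Service: ${svc}
      |]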

@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

Add no-focus-ring class to exclude AI search input from global
:focus-visible outline styling.
@claude

This comment was marked as outdated.

Comment on lines 288 to 307
SELECT
lp.id,
lp.project_id,
lp.log_pattern,
lp.pattern_hash,
lp.baseline_state,
lp.baseline_volume_hourly_mean,
lp.baseline_volume_hourly_stddev,
COALESCE(counts.current_count, 0)::INT AS current_hour_count
FROM apis.log_patterns lp
LEFT JOIN (
SELECT log_pattern, COUNT(*) AS current_count
FROM otel_logs_and_spans
WHERE project_id = ?
AND timestamp >= date_trunc('hour', NOW())
AND log_pattern IS NOT NULL
GROUP BY log_pattern
) counts ON counts.log_pattern = lp.log_pattern
WHERE lp.project_id = ?
AND lp.state != 'ignored' AND lp.baseline_state = 'established'
Contributor

How is log_pattern supposed to join on otel_logs_and_spans when they're not in the same database?

Or is log_patterns supposed to be a timeseries table in timefusion as well? If that's the case, then you can't make queries on timeseries tables that don't depend on a timestamp range.

Contributor

Oh, and you never join two time series tables. Some databases might support the join operation, but the performance is always horrible in that case.

Comment on lines 573 to 584
[text|
SELECT lp.log_pattern, count(*) as p_count
FROM apis.log_patterns lp
INNER JOIN otel_logs_and_spans ols
ON lp.log_pattern = ols.log_pattern AND lp.project_id::text = ols.project_id
WHERE lp.project_id = ?
AND lp.state != 'ignored'
AND ${whereCondition}
GROUP BY lp.log_pattern
ORDER BY p_count DESC
OFFSET ? LIMIT 15
|]
Contributor

Same as above: how is log_pattern joining on otel_logs_and_spans?

Contributor Author

otel_logs_and_spans has a log_pattern column

Contributor

Is log_pattern going to be in timefusion or postgres? If it's in timefusion, you can't do joins. And you can't query it without a time range being part of the query.


-- | Get pattern stats from otel_logs_and_spans
-- Returns median and MAD (Median Absolute Deviation) for robust baseline calculation
getPatternStats :: DB es => Projects.ProjectId -> Text -> Int -> Eff es (Maybe PatternStats)
Contributor

Why are you not using our widgets, or at least KQL, for stats and numbers? Isn't this for display?

Contributor Author

This is for detecting spikes


-- | Calculate baselines for log patterns
-- Uses hourly counts from otel_logs_and_spans over the last 7 days
calculateLogPatternBaselines :: Projects.ProjectId -> ATBackgroundCtx ()
Contributor

I don't want to support magic alerts. If we're detecting spikes, we should implement an alert system that our users can enable on any metric as well, so it's the same code and logic for all cases. Not magic logic we run in the background.

@claude

This comment was marked as outdated.

@claude

claude bot commented Jan 24, 2026

PR Review: Log Patterns Feature

Summary

This is a solid implementation of log patterns functionality with anomaly detection. The code follows Haskell best practices and makes good use of the available GHC extensions. However, there are opportunities for improvement in performance, security, and code succinctness.


🔴 Critical Issues

1. Unbounded Query Vulnerability (src/Models/Apis/LogPatterns.hs:96-110)

getLogPatterns :: DB es => Projects.ProjectId -> Int -> Int -> Eff es [LogPattern]
getLogPatterns pid limit offset = PG.query q (pid, limit, offset)

Issue: No validation on limit parameter - malicious caller could request millions of rows.
Fix: Add bounds: let safeLimit = min limit 1000

2. Race Condition in Baseline Calculation (src/BackgroundJobs.hs:1674-1695)

Sequential pattern processing without locking could cause race conditions when multiple jobs run concurrently.
Fix: Use database-level locking or optimistic concurrency control (add version column).
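A minimal sketch of the optimistic-concurrency option, assuming a new version column and a PG.execute-style wrapper (both assumptions, not this PR's code):

-- Affects 0 rows when another job already bumped the version,
-- signalling the caller to re-read and retry.
updateBaselineVersioned :: DB es => LogPatternId -> Int -> Double -> Double -> Eff es Int64
updateBaselineVersioned lpId expectedVersion mean stddev =
  PG.execute
    [sql| UPDATE apis.log_patterns
          SET baseline_volume_hourly_mean = ?
            , baseline_volume_hourly_stddev = ?
            , version = version + 1
          WHERE id = ? AND version = ? |]
    (mean, stddev, lpId, expectedVersion)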

3. Missing Permission Check (src/Models/Apis/LogPatterns.hs:138-148)

acknowledgeLogPatterns :: DB es => Users.UserId -> V.Vector Text -> Eff es Int64

Issue: No validation that user has permission to acknowledge patterns for the project.
Fix: Join with project membership table in the query.
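A hedged sketch of folding the membership check into the update itself; the membership table and column names below are assumptions, not this codebase's actual schema:

acknowledgeLogPatterns :: DB es => Users.UserId -> V.Vector Text -> Eff es Int64
acknowledgeLogPatterns uid hashes =
  PG.execute
    [sql| UPDATE apis.log_patterns lp
          SET state = 'acknowledged', acknowledged_by = ?, acknowledged_at = NOW()
          WHERE lp.pattern_hash = ANY(?)
            AND EXISTS (SELECT 1 FROM projects.project_members pm  -- hypothetical table
                        WHERE pm.project_id = lp.project_id AND pm.user_id = ?) |]
    (uid, hashes, uid)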

4. Potential Orphaned Jobs (src/BackgroundJobs.hs:1748-1752)

If insertIssue fails, the enhancement job is still created.
Fix: Wrap in transaction or use bracket pattern.
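A sketch of the transactional option using postgresql-simple's withTransaction; how this threads through the effect stack, and insertIssue taking a connection, are assumptions:

import Database.PostgreSQL.Simple (withTransaction)

-- Either both the issue insert and the enhancement job commit, or neither does.
createIssueWithJob conn issue =
  withTransaction conn $ do
    insertIssue conn issue
    void $ createJob conn "background_jobs" (EnhanceIssuesWithLLM issue.projectId (V.singleton issue.id))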


⚠️ Performance Issues

1. N+1 Query Pattern (src/BackgroundJobs.hs:1678-1695)

patterns <- LogPatterns.getLogPatterns pid 1000 0
forM_ patterns \lp -> do
  statsM <- LogPatterns.getPatternStats pid lp.logPattern 168

Impact: 1 + N queries for baseline calculation.
Fix: Create batch query using SQL window functions to compute all pattern statistics in one go.

2. Missing Composite Index (static/migrations/0026_log_patterns.sql)

Spike detection query filters by state and baseline_state, but no composite index exists.
Fix:

CREATE INDEX idx_log_patterns_spike_detection 
  ON apis.log_patterns(project_id, state, baseline_state)
  WHERE state != 'ignored' AND baseline_state = 'established';

3. Memory Leak Risk (src/Pkg/Drain.hs:230-236)

getAllLogGroups tree =
  let levelOnes = children tree
      levelTwos = V.concatMap nodes levelOnes
      allLogGroups = V.concatMap logGroups levelTwos
   in V.map (\grp -> (grp.exampleLog, templateStr grp, logIds grp)) allLogGroups

Issue: Multiple intermediate vectors without forced evaluation.
Fix: Add bang patterns or use strict vector operations.
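One option, sketched: force the intermediate vectors so they stop retaining the tree (V.force copies into right-sized storage; the bang patterns need BangPatterns, which this codebase's extension set likely already covers):

getAllLogGroups tree =
  let !levelTwos    = V.force $ V.concatMap nodes (children tree)
      !allLogGroups = V.force $ V.concatMap logGroups levelTwos
   in V.map (\grp -> (grp.exampleLog, templateStr grp, logIds grp)) allLogGroups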


📝 Code Succinctness Improvements

1. Use MultiWayIf (src/BackgroundJobs.hs:1691-1693)

-- Current:
newState = case lp.baselineState of
  BSEstablished -> BSEstablished 
  BSLearning -> if newMean > 100 || patternAgeDays > 1
                then BSEstablished 
                else BSLearning

-- Better:
newState = if
  | lp.baselineState == BSEstablished -> BSEstablished
  | newMean > 100 || patternAgeDays > 1 -> BSEstablished
  | otherwise -> BSLearning

2. Use ViewPatterns (src/Pkg/Drain.hs:156-168)

-- Current: 13 lines
leastRecentlyUsedIndex logGroups =
  V.ifoldl'
    ( \acc i g ->
        case acc of
          Nothing -> Just (i, lastSeen g)
          Just (j, t) ->
            if lastSeen g < t
              then Just (i, lastSeen g)
              else Just (j, t)
    )
    Nothing
    logGroups
    & maybe 0 fst

-- More concise: 6 lines
leastRecentlyUsedIndex = maybe 0 fst . V.ifoldl' step Nothing
  where
    step acc i g = case acc of
      Nothing -> Just (i, lastSeen g)
      Just (j, t) | lastSeen g < t -> Just (i, lastSeen g)
                  | otherwise -> Just (j, t)

3. Replace Case with fromMaybe (src/Models/Apis/LogPatterns.hs:248-252)

-- Current:
results <- PG.query q (pid, pattern')
case results of
  [Only count] -> return count
  _ -> return 0

-- Concise:
fromMaybe 0 . fmap fromOnly . listToMaybe <$> PG.query q (pid, pattern')

4. Define Data Types Instead of Large Tuples (src/BackgroundJobs.hs:1720)

-- Current:
forM_ spikeData \(patternId, patternHash, currentRate, mean, stddev) -> do

-- Better:
data SpikeInfo = SpikeInfo
  { patternId :: LogPatternId
  , patternHash :: Text
  , currentRate :: Double
  , baselineMean :: Double
  , baselineStddev :: Double
  } deriving stock (Show, Generic)

forM_ (toSpikeInfo <$> spikeData) \info -> ...

Better type safety and more readable.

5. Simplify Vector Conversions (src/BackgroundJobs.hs:512-515)

-- Current:
existingPatterns <- LogPatterns.getLogPatternTexts pid
let known = V.fromList $ map (\pat -> ("",False,pat,Nothing,Nothing,Nothing)) existingPatterns
    combined = known <> ((\(logId, content, trId, serviceName, level) -> (logId, True, content, trId, serviceName, level)) <$> events)

-- Better:
let known = (\pat -> ("", False, pat, Nothing, Nothing, Nothing)) <$> V.fromList existingPatterns
    combined = known <> (second5 (const True) <$> events)  -- Define second5 helper

🔒 Security Concerns

1. Missing Input Validation (src/Models/Apis/LogPatterns.hs:151-165)

Pattern content has no length limits - large strings could cause DoS.
Fix: guard (T.length pattern < 10000) >> upsertLogPattern ...

2. Unvalidated Bulk Actions (src/Pages/Anomalies.hs:141-160)

Bulk actions endpoint trusts client-supplied IDs without verifying project ownership.
Fix: Add WHERE project_id = ? to all anomaly ID lookups.


📦 Database Schema Issues

static/migrations/0026_log_patterns.sql

  1. Line 15 & 24: Use enum types instead of TEXT for state and baseline_state:

    state apis.log_pattern_state NOT NULL DEFAULT 'new'
  2. Missing index: acknowledged_by column lacks index, causing slow joins:

    CREATE INDEX idx_log_patterns_acknowledged_by ON apis.log_patterns(acknowledged_by);
  3. Race condition in trigger (line 58): log_pattern_created_notify could create duplicate jobs.
    Fix: Use INSERT ... ON CONFLICT DO NOTHING in the background job queue.


✅ Positive Highlights

  • Excellent use of deriving strategies and newtypes
  • Good separation of concerns (Models, BackgroundJobs, Pages)
  • Comprehensive database migration with indexes
  • Well-structured issue types with proper sum types
  • Good use of V.Vector for performance
  • LLM enhancement integration is well-architected
  • Proper use of Maybe for optional fields

Test Coverage

The test file test/unit/Pkg/DrainSpec.hs was reformatted (49 lines changed). Ensure all tests still pass and no logic was altered during formatting.


Recommended Priority

  1. High: Fix unbounded query (add limit validation)
  2. High: Add permission check to acknowledgeLogPatterns
  3. High: Add composite index for spike detection
  4. Medium: Batch N+1 queries in baseline calculation
  5. Medium: Add transaction wrapper for issue creation
  6. Low: Refactor large tuples to data types
  7. Low: Apply succinctness improvements

Overall, this is a well-structured feature that follows good Haskell practices. Addressing the critical issues will make it production-ready. Great work! 🚀

[text|
SELECT id, created_at, updated_at, project_id, issue_type::text, endpoint_hash, acknowledged_at, acknowledged_by, archived_at, title, service, critical,
CASE WHEN critical THEN 'critical' ELSE 'info' END, affected_requests, affected_clients, NULL::double precision,
CASE WHEN critical THEN 'critical' ELSE 'info' END, 0::int, 0::int, NULL::double precision,
Contributor

Why did you set these to 0?

Contributor Author

Those are not part of the new issues table.

Contributor

Why are we adding it to the query?

Comment on lines 149 to 162
Issues.LogPattern ->
"Generate a concise title for this log pattern issue.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Issues.LogPatternRateChange ->
"Generate a concise title for this log pattern rate change.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Contributor

Use the text quasiquotes here.

@@ -0,0 +1,68 @@
BEGIN;

CREATE TABLE IF NOT EXISTS apis.log_patterns (
Contributor

We will support patterns on different fields, not just the default body/message field. This table doesn't seem aware of this expectation.

CREATE INDEX IF NOT EXISTS idx_log_patterns_last_seen ON apis.log_patterns(project_id, last_seen_at DESC);
CREATE INDEX IF NOT EXISTS idx_log_patterns_service ON apis.log_patterns(project_id, service_name);

CREATE OR REPLACE FUNCTION apis.new_log_pattern_proc() RETURNS trigger AS $$
Contributor

Again, the question remains: is this a timeseries table or a regular table? If it's a timeseries table (to be on timefusion), then it won't be in the same database as the background jobs, and hence would be unable to queue jobs from within the db.

Contributor Author

It's a regular table.

@claude

claude bot commented Jan 27, 2026

Code Review: Log Patterns Implementation

Overall Assessment: This is solid, production-quality Haskell code with excellent architecture. The implementation demonstrates strong type safety, proper database design, and statistical rigor. However, there are some critical issues and opportunities for improvement.


Critical Issues 🔴

1. Division by Zero Risk

Location: src/BackgroundJobs.hs:1712

let zScore = (currentRate - mean) / stddev

Missing check for mean > 0. If mean = 0 but currentRate > 0, this should likely be flagged as a spike for a new pattern gaining activity.

Fix:

| stddev > 0 && mean > 0 ->
    let currentRate = fromIntegral lpRate.currentHourCount
        zScore = (currentRate - mean) / stddev
        isSpike = abs zScore > 3.0 && currentRate > mean
| mean == 0 && currentHourCount > 10 -> 
    Just (lpRate.patternId, ...) -- New pattern with activity

2. Trigger Performance Issue

Location: static/migrations/0026_log_patterns.sql:40-58

The trigger fires on ALL inserts, including ON CONFLICT DO UPDATE cases. For 1000 pattern upserts per batch, this creates 1000 duplicate background jobs.

Fix:

CREATE TRIGGER log_pattern_created_notify 
AFTER INSERT ON apis.log_patterns 
FOR EACH ROW 
WHEN (NEW.xmax = 0)  -- Only real inserts, not upserts
EXECUTE PROCEDURE apis.new_log_pattern_proc();

3. Memory Risk with Large Pattern Sets

Location: src/BackgroundJobs.hs:512, 1678

Loading all patterns into memory without pagination. For projects with 10K+ patterns, this could cause OOM.

Fix: Add pagination or limit to getLogPatterns:

-- Process in batches of 1000
let batchSize = 1000
patterns <- getLogPatterns pid batchSize 0

High Priority Issues ⚠️

4. Missing Database Constraints

Location: static/migrations/0026_log_patterns.sql:15, 24

State columns lack CHECK constraints for data integrity.

Fix:

state TEXT NOT NULL DEFAULT 'new' CHECK (state IN ('new', 'acknowledged', 'ignored')),
baseline_state TEXT NOT NULL DEFAULT 'learning' CHECK (baseline_state IN ('learning', 'established')),

5. Hardcoded Magic Numbers

Throughout src/BackgroundJobs.hs, thresholds are hardcoded:

  • Line 1713: 3.0 (3-sigma threshold)
  • Line 1681: 168 (7-day baseline window)
  • Line 1743: 10000 (min events for new pattern issue)
  • Line 1692: 100 (min hourly events for baseline)

Fix: Extract to named constants with documentation:

-- Constants for baseline establishment
minEventsForEstablishment :: Double
minEventsForEstablishment = 100  -- Minimum hourly events to establish baseline

minAgeForEstablishment :: Double  
minAgeForEstablishment = 1  -- Minimum age in days

sigmaThreshold :: Double
sigmaThreshold = 3.0  -- Standard deviations for spike detection

6. Missing Index

Location: static/migrations/0026_log_patterns.sql

No index on pattern_hash alone for cross-project queries.

Fix:

CREATE INDEX idx_log_patterns_hash ON apis.log_patterns(pattern_hash);

Code Quality Improvements

7. Redundant SQL in LogPatterns.hs

Location: src/Models/Apis/LogPatterns.hs:95-110, 119-134

SELECT clauses repeat all 20 fields multiple times.

Improvement: Leverage the Entity deriving more:

getLogPatterns pid limit offset = 
  PG.query [sql| SELECT * FROM apis.log_patterns 
                 WHERE project_id = ? 
                 ORDER BY last_seen_at DESC 
                 LIMIT ? OFFSET ? |] (pid, limit, offset)

8. Use Enum for Change Direction

Location: src/Models/Apis/Issues.hs:669

direction :: Text should be type-safe:

data ChangeDirection = Spike | Drop deriving (Show, Eq, Generic, ToJSON, FromJSON)

9. MultiWayIf for Cleaner Conditionals

Location: src/BackgroundJobs.hs:1690-1693

Already using many extensions, but MultiWayIf would help:

let newState = if
  | patternAgeDays < 1 -> BSLearning
  | newMean < 100 && lp.baselineState == BSLearning -> BSLearning  
  | otherwise -> BSEstablished

Performance Considerations

Strengths ✅

  • Excellent use of median + MAD instead of mean + stddev for outlier robustness
  • Batch queries to prevent N+1 (lines 315-318)
  • HashMap for O(1) pattern lookup (line 1719)
  • Proper use of coerce for zero-cost conversions

Opportunities

  • Pattern Loading: Consider streaming for very large pattern sets
  • Drain Tree: Not persisted between runs (CPU vs memory trade-off is reasonable)
  • Query Optimization: Lines 285-306 use LEFT JOIN with subquery - consider materialized view for 1000s of patterns

Security ✅

  • ✅ All queries use parameterized statements (no SQL injection)
  • ✅ Proper access control via project_id filtering
  • ⚠️ DoS potential: High log volume → thousands of patterns → thousands of baseline calculations (consider rate limiting)

Missing Functionality

  1. No drop detection - only spike detection (line 1713: currentRate > mean prevents drop detection)
  2. No pattern cleanup - consider retention policy for patterns not seen in 30+ days
  3. No error handling/retry - baseline/spike calculation failures are silent
  4. No tests - missing unit tests for baseline edge cases, integration tests for spike detection

What I Really Like 🌟

  1. Statistical Rigor: Using median + MAD with proper 1.4826 scaling factor shows deep understanding
  2. Type Safety: Excellent use of newtypes (LogPatternId), derivingVia, and type-safe database operations
  3. Modern Haskell: Great use of RecordDotSyntax, BlockArguments, DerivingVia
  4. State Machine: Proper baseline state management (BSLearning → BSEstablished)
  5. Database Design: Comprehensive indexes and foreign key constraints
  6. Code Organization: Clean separation between data models, background jobs, and UI

Recommendations

Must Fix Before Merge

  1. ✅ Fix division-by-zero in spike detection (BackgroundJobs.hs:1712)
  2. ✅ Fix trigger to avoid duplicate jobs (migration)
  3. ✅ Add CHECK constraints for state columns

Should Fix Soon

  1. Add pagination for pattern loading (prevent OOM)
  2. Extract magic numbers to named constants
  3. Add basic error handling and logging
  4. Add index on pattern_hash

Nice to Have

  1. Implement drop detection
  2. Add unit tests for statistical calculations
  3. Add pattern cleanup/retention policy

Verdict: Strong implementation with minor critical fixes needed. The architecture is sound and the code demonstrates excellent Haskell practices. Once the critical issues are addressed, this will be production-ready.

@claude

claude bot commented Jan 27, 2026

PR #300 Review: Log Patterns Functionality

Reviewed 12 files with +1102/-374 lines. Overall: Good implementation with room for improvement.

Priority Issues

  1. Security: Missing Input Validation (LogPatterns.hs:151) - No length validation on patterns/hashes
  2. Security: No Rate Limiting - Pattern creation lacks rate limiting
  3. Performance: Memory Issue (BackgroundJobs.hs:512) - getLogPatternTexts loads ALL patterns into memory
  4. Bug: Race Condition (BackgroundJobs.hs:1743-1754) - TOCTOU between event count check and pattern retrieval
  5. Bug: Fragile LLM Parsing (Enhancement.hs:86-88) - Silent failures on unexpected format

Performance Optimizations

  • Vector Inefficiency (BackgroundJobs.hs:536): O(n) V.find - use HashMap for O(1)
  • DB Query Performance (LogPatterns.hs:218-243): Add index on (project_id, log_pattern, timestamp)
  • Magic Numbers (BackgroundJobs.hs:1692): Extract to constants

Code Succinctness (GHC Extensions)

  • Use guards instead of case on bools (LogPatterns.hs:138-141)
  • Use ViewPatterns for cleaner pattern matching (Drain.hs:100-107)
  • Use RecordWildCards to simplify long functions (Issues.hs:669-694)
  • Break long lines (Issues.hs:694 is 140+ chars)

Test Coverage

Good: Comprehensive DRAIN algorithm tests
Missing: LogPatterns DB ops, baseline calc, spike detection, integration tests

Package Usage

Consider using statistics package for robust median/MAD/stddev calculations
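If adding the dependency is overkill, the median/MAD math is small enough to sketch dependency-free (illustrative only; the statistics package's exact function names aren't assumed here):

import Data.List (sort)

median :: [Double] -> Double
median [] = 0
median xs
  | odd n = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort xs
    n = length xs
    mid = n `div` 2

-- MAD scaled by 1.4826 approximates a standard deviation for normal data
mad :: [Double] -> Double
mad xs = 1.4826 * median [abs (x - m) | x <- xs]
  where m = median xs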

Summary Score

  • Code Quality: 7/10
  • Security: 6/10 (needs input validation & rate limiting)
  • Performance: 7/10
  • Test Coverage: 5/10
  • Maintainability: 7/10

Recommendation: Request changes for security/performance issues, then approve.

@tonyalaribe
Contributor

Rename patterns function from 15mins to 5mins.

@claude

claude bot commented Jan 28, 2026

PR Review: Log Pattern Anomaly Detection

Great work on implementing log pattern anomaly detection! The implementation is thoughtful with good test coverage. However, I've identified several critical issues and optimization opportunities.


🚨 CRITICAL: Security Issue

SQL Injection Vulnerability in src/Models/Apis/RequestDumps.hs lines 1344-1356:

The target parameter is interpolated into SQL without proper validation. An attacker could manipulate targetM to inject SQL.

-- VULNERABLE CODE
target = fromMaybe "log_pattern" targetM
let q = [text|
  SELECT log_pattern, count(*) as p_count
  FROM otel_logs_and_spans
  WHERE ${whereCondition} AND log_pattern = ANY(?)
  ...
|]

Fix: Use a whitelist:

target = case targetM of
  Just "log_pattern" -> "log_pattern"
  Just "summary_pattern" -> "summary_pattern"
  _ -> "log_pattern"

⚡ Performance Issues

1. Expensive Baseline Calculations

Lines 1213-1248 use PERCENTILE_CONT twice per pattern, which is computationally expensive. For projects with hundreds of patterns, this will be slow.

Recommendations:

  • Create materialized view for hourly counts
  • Consider simpler approximations (e.g., 1.4826 * stddev_samp instead of MAD)
  • Batch process patterns instead of individual queries

2. Missing Indexes Causing Table Scans

Lines 89-100: Query filtering on (summary_pattern IS NULL OR log_pattern IS NULL) will cause full table scans.

Add partial indexes:

CREATE INDEX idx_otel_logs_missing_log_pattern 
  ON otel_logs_and_spans(project_id, timestamp) 
  WHERE log_pattern IS NULL;

3. Inefficient Pattern Lookup

Lines 113-119: getLogPatternTexts loads ALL patterns into memory for every batch. For projects with thousands of patterns, this is wasteful.

Recommendation: Filter at database level instead of in-memory.

4. Job Storm

Lines 76-81: Scheduling jobs for ALL projects at once creates a job storm.

Recommendation: Stagger job creation:

run_at = NOW() + random() * INTERVAL '1 hour'

🐛 Bugs

1. Pattern Metadata Extraction

Lines 139-146: If the first log ID doesn't match any event, metadata is lost.

-- BUG: Uses first ID which might not exist
let (serviceName, logLevel, logTraceId) = case ids V.!? 0 of
      Just logId | logId /= "" ->
        case V.find (\(i, _, _, _, _) -> i == logId) events of

Fix: Find any matching event:

let metadata = case V.find (\(i, _, _, _, _) -> V.elem i ids) events of

2. Inconsistent Z-Score Logic

Line 224: Using abs but then checking direction is inconsistent.

-- INCONSISTENT
let zScore = (currentRate - mean) / stddev
    isSpike = abs zScore > 3.0 && currentRate > mean

Fix: Just check positive spikes:

let isSpike = zScore > 3.0  -- Only detect upward spikes

3. Variable Name Collision

Line 228: Using pid' when pid is in scope is confusing.

let spikeIds = V.fromList $ map (\(pid', _, _, _, _) -> pid') spikeData

Fix: Use descriptive names:

let spikeIds = V.fromList $ map (\(patternId, _, _, _, _) -> patternId) spikeData

📝 Code Quality & Succinctness

1. Long Parameter Lists Need Record Types

Multiple functions have 7-9 parameters. Use records:

-- Before (9 parameters!)
updateTreeWithLog :: DrainTree -> Int -> Text -> V.Vector Text -> Text -> Bool -> Text -> Text -> UTCTime -> DrainTree

-- After
data LogContext = LogContext
  { tokenCount :: Int
  , firstToken :: Text
  , tokens :: V.Vector Text
  , logId :: Text
  , isSample :: Bool
  , content :: Text
  , field :: Text
  , timestamp :: UTCTime
  }

updateTreeWithLog :: DrainTree -> LogContext -> DrainTree

2. Use RecordWildCards & NamedFieldPuns

Lines 1756-1761 can be simplified:

-- Enable RecordWildCards
case V.findIndex (\DrainLevelOne{tokenCount} -> tokenCount == targetCount) levelOnes of
  Just index ->
    let existingLevel@DrainLevelOne{nodes} = levelOnes V.! index

3. Repetitive Update Pattern

Lines 1760-1803: updateOrCreateLevelOne, updateOrCreateLevelTwo, and updateOrCreateLogGroup all follow the same pattern. Abstract into a higher-order function:

updateOrCreateInVector :: 
  (a -> Bool) ->           -- Find predicate
  (a -> (a, Bool)) ->      -- Update function
  (V.Vector a -> a) ->     -- Create function
  V.Vector a -> (V.Vector a, Bool)
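A possible implementation matching that signature (a sketch; the real update/create callbacks live in this PR's Drain code):

import qualified Data.Vector as V

updateOrCreateInVector
  :: (a -> Bool)        -- find predicate
  -> (a -> (a, Bool))   -- update function, returns the new element plus a flag
  -> (V.Vector a -> a)  -- create function, builds a fresh element
  -> V.Vector a
  -> (V.Vector a, Bool)
updateOrCreateInVector match update create vec =
  case V.findIndex match vec of
    Just i ->
      let (updated, flag) = update (vec V.! i)
       in (vec V.// [(i, updated)], flag)
    Nothing -> (V.snoc vec (create vec), False)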

4. Use Type Classes for Issue Prompts

Enhancement.hs lines 650-950 have repetitive pattern matching. Use type classes:

class IssueDataToPrompt a where
  toTitlePrompt :: a -> Text
  toDescriptionPrompt :: a -> Text

instance IssueDataToPrompt APIChangeData where ...
instance IssueDataToPrompt RuntimeExceptionData where ...

5. Inefficient Text Processing

Line 1903-1906: Character-by-character processing is slow. Use T.uncons:

-- Instead of: if T.head t == '"'
case T.uncons t of
  Just ('"', rest) -> ...
  Just ('[', rest) -> ...

🔍 SQL Optimization

1. SELECT * is Inefficient

Line 1319: Avoid SELECT *:

-- Replace with explicit column list
PG.query [sql| SELECT id, project_id, log_pattern, ... FROM apis.log_patterns WHERE id = ANY(?) |]

2. Missing Index for Common Query

Lines 1104-1115: Add composite index:

CREATE INDEX idx_log_patterns_project_last_seen 
  ON apis.log_patterns(project_id, last_seen_at DESC);

3. Filter Recently Active Patterns Only

Lines 1285-1311: Add filter to reduce unnecessary joins:

WHERE lp.project_id = ?
  AND lp.state != 'ignored' 
  AND lp.baseline_state = 'established'
  AND lp.last_seen_at > NOW() - INTERVAL '1 day'  -- Only check recently active

4. Data Migration for Constraint Change

Migration lines 2020-2093: Constraint changed but might have duplicates. Add deduplication:

-- Before changing constraint
DELETE FROM apis.log_patterns a USING apis.log_patterns b
WHERE a.id > b.id 
  AND a.project_id = b.project_id 
  AND a.log_level = b.log_level 
  AND a.field_path = b.field_path 
  AND a.pattern_hash = b.pattern_hash;

🔒 Security Concerns

1. Sample Messages May Contain PII

Line 257: Sample log messages might contain sensitive data.

Recommendations:

  • Truncate to first 200 chars
  • Add config flag to disable sample collection
  • Scrub common PII patterns (emails, API keys); a sketch follows below
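A naive sketch of the truncation plus a crude email mask (stand-ins only; real PII scrubbing needs proper patterns):

import qualified Data.Text as T

scrubSample :: T.Text -> T.Text
scrubSample = T.take 200 . maskEmails
  where
    maskEmails = T.unwords . map maskWord . T.words
    maskWord w
      | "@" `T.isInfixOf` w = "<email>"  -- crude placeholder mask
      | otherwise = w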

2. Validate Background Job Payloads

Line 2069: Ensure background job processor validates pattern hash format to prevent injection.


✅ What's Good

  • Excellent test coverage in test/unit/Pkg/DrainSpec.hs
  • Good use of batch queries to avoid N+1 (line 1316)
  • Proper use of quasi-quoters for SQL
  • Thoughtful use of median/MAD for robustness against outliers
  • Well-structured hierarchical clustering in Drain algorithm

🎯 Priority Actions

Must Fix Before Merge:

  1. ✅ Fix SQL injection in RequestDumps.hs
  2. ✅ Add pattern metadata extraction bug fix
  3. ✅ Add partial indexes for pattern extraction

Should Fix:
4. Optimize baseline calculation (batch or simplify stats)
5. Stagger background job creation
6. Add composite indexes for common queries
7. Use record types for long parameter lists
8. Fix z-score logic inconsistency

Nice to Have:
9. Extract higher-order update function
10. Use RecordWildCards/ViewPatterns
11. Add type classes for issue prompts
12. Use explicit SELECT columns


Overall, this is solid work! The main blockers are the SQL injection vulnerability and performance concerns at scale. Once those are addressed, this will be a great addition.

@claude

claude bot commented Jan 29, 2026

Code Review: PR #300 - Log Patterns Implementation

Executive Summary

This PR implements log pattern detection using the DRAIN algorithm with database persistence, UI presentation, and anomaly detection. The implementation is solid but has several areas for improvement regarding code quality, performance, and succinctness.

Overall Assessment: 7.5/10

  • ✅ Well-structured, type-safe code
  • ✅ Excellent test coverage
  • ✅ Proper SQL parameterization (security)
  • ⚠️ Debug code left in production
  • ⚠️ N+1 query patterns
  • ⚠️ Opportunities for more concise code

Critical Issues (Fix Before Merge)

1. Debug Code in Production 🚨

src/BackgroundJobs.hs:523-524, 274-275

traceShowM patternTxt  -- Remove
traceShowM ids         -- Remove

src/Models/Apis/LogPatterns.hs:274-275

traceShowM counts      -- Remove
traceShowM patterns    -- Remove

Fix: Remove all traceShowM calls or replace with proper structured logging:

Log.logDebug "Processing pattern" [("pattern", AE.toJSON patternTxt)]

2. Potential Index Error 🐛

src/Pkg/Drain.hs:164-177

leastRecentlyUsedIndex :: V.Vector LogGroup -> Int
leastRecentlyUsedIndex logGroups =
  V.ifoldl' ... Nothing logGroups
    & maybe 0 fst  -- Returns 0 for empty vector!

Issue: Returns index 0 for empty vectors, causing crashes.

Fix:

leastRecentlyUsedIndex :: V.Vector LogGroup -> Maybe Int
leastRecentlyUsedIndex logGroups
  | V.null logGroups = Nothing
  | otherwise = fst <$> V.ifoldl' ... Nothing logGroups

3. Commented Code

src/BackgroundJobs.hs:502

-- processPatterns "summary" "summary_pattern" ...  -- Remove entirely

Performance Issues

1. N+1 Query Pattern ⚡

src/BackgroundJobs.hs:1664-1697

patterns <- LogPatterns.getLogPatterns pid 1000 0      -- 1 query
forM_ patterns \lp -> do
  statsM <- LogPatterns.getPatternStats pid lp.logPattern 168  -- 1000 queries
  _ <- LogPatterns.updateBaseline pid ...                      -- 1000 queries

For 1000 patterns: 2001 total queries

Recommendation: Use a single query with window functions:

WITH hourly_stats AS (
  SELECT log_pattern,
         PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY hourly_count) as median,
         STDDEV(hourly_count) as stddev
  FROM (
    SELECT log_pattern, 
           date_trunc('hour', timestamp) as hour,
           COUNT(*) as hourly_count
    FROM otel_logs_and_spans
    WHERE project_id = ? 
      AND log_pattern = ANY(?)  -- Batch all patterns
      AND timestamp >= NOW() - INTERVAL '7 days'
    GROUP BY log_pattern, hour
  ) subq
  GROUP BY log_pattern
)
UPDATE apis.log_patterns lp
SET baseline_volume_hourly_mean = hs.median,
    baseline_volume_hourly_stddev = hs.stddev,
    baseline_updated_at = NOW()
FROM hourly_stats hs
WHERE lp.log_pattern = hs.log_pattern AND lp.project_id = ?

2. Multiple Queries Where One Would Suffice

src/Models/Apis/LogPatterns.hs:269-304

patterns <- PG.query patternsQuery (Only pid)     -- Query 1
counts <- PG.query countsQuery (Only pid)         -- Query 2
pure $ map (attachCount counts) patterns

Recommendation: Single JOIN query:

SELECT lp.*, COALESCE(c.count, 0)::INT as current_hour_count
FROM apis.log_patterns lp
LEFT JOIN (
  SELECT log_pattern, COUNT(*)::INT as count
  FROM otel_logs_and_spans
  WHERE project_id = ? AND timestamp >= NOW() - INTERVAL '1 hour'
  GROUP BY log_pattern
) c ON lp.log_pattern = c.log_pattern
WHERE lp.project_id = ? AND lp.state != 'ignored'

3. Inefficient Text Search

src/Pkg/Drain.hs:260-264

looksLikeJson t =
  ("{" `T.isInfixOf` t && "}" `T.isSuffixOf` t)  -- Scans entire string

Better:

looksLikeJson t = case T.uncons (T.strip t) of
  Just ('{', _) -> "}" `T.isSuffixOf` t
  Just ('[', _) -> "]" `T.isSuffixOf` t
  _ -> False

Code Succinctness (Leverage GHC Extensions)

Use LambdaCase

src/BackgroundJobs.hs:509-515

-- Current:
let (serviceName, logLevel, logTraceId) = case ids V.!? 0 of
      Just logId | logId /= "" ->
        case V.find (\(i, _, _, _, _) -> i == logId) events of
          Just (_, _, trId, sName, lvl) -> (sName, lvl, trId)
          Nothing -> (Nothing, Nothing, Nothing)
      _ -> (Nothing, Nothing, Nothing)

-- Better with LambdaCase:
let (serviceName, logLevel, logTraceId) = 
      ids V.!? 0 >>= \case
        logId | logId /= "" ->
          V.find (\(i,_,_,_,_) -> i == logId) events <&> 
            \(_,_,trId,sName,lvl) -> (sName, lvl, trId)
        _ -> Nothing
      & fromMaybe (Nothing, Nothing, Nothing)

Reduce Column Enumeration

src/Models/Apis/LogPatterns.hs:102-112

Manually lists all columns when could use SELECT *:

-- Current: 10 lines of column names
SELECT id, project_id, created_at, updated_at, log_pattern, ...

-- Better:
SELECT * FROM apis.log_patterns
WHERE project_id = ? ORDER BY last_seen_at DESC LIMIT ? OFFSET ?

Package Usage

The project has excellent packages available (lens, extra, safe) but could leverage them more:

Use lens operators

src/Models/Apis/LogPatterns.hs:306

-- Current: Manual record construction
attachCount counts (patId, projId, logPat, patHash, blState, blMean, blStddev) =
  LogPatternWithRate { patternId = patId, ... }

-- Could use lens for cleaner construction

Use extra functions

Could use groupOn and sumOn' from Data.List.Extra in several places.
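For example (illustrative types; the sort comes first because groupOn only groups adjacent equal keys):

import Data.List (sortOn)
import Data.List.Extra (groupOn, sumOn')

countsPerPattern :: [(Text, Int)] -> [(Text, Int)]
countsPerPattern rows =
  [ (fst (head grp), sumOn' snd grp)
  | grp <- groupOn fst (sortOn fst rows) ]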


Security ✅

  • ✅ All SQL queries properly use parameterized queries (? placeholders)
  • ✅ No SQL injection vulnerabilities found
  • ✅ Good database constraints and foreign keys
  • ✅ Proper unique constraints with composite keys

Note: Earlier diff versions had dynamic query construction, but current version correctly uses parameterization. Good fix!


Test Coverage ✅

test/unit/Pkg/DrainSpec.hs has excellent coverage:

  • ✅ Basic tokenization
  • ✅ Quoted strings, key-value patterns
  • ✅ Various log formats (HTTP, database, errors, timestamps)
  • ✅ Edge cases

Suggestion: Add property-based tests:

prop "tokenize preserves non-empty input" =
  forAll (listOf1 arbitrary) $ \words ->
    not . null $ tokenize (T.unwords words)

Positive Aspects 🌟

  1. Well-structured code - Clear separation of concerns
  2. Type safety - Good use of newtypes (LogPatternId, UUIDId)
  3. Comprehensive logging - Proper structured logging
  4. Good test coverage - Especially Drain algorithm
  5. Idiomatic Haskell - Proper use of Effectful
  6. Good migrations - Proper indexing, idempotent SQL
  7. Excellent refactoring - The mkIssue helper (Issues.hs:756) reduces duplication significantly

Recommendations Priority

Must Fix (Before Merge):

  1. ❌ Remove all traceShowM calls
  2. ❌ Fix leastRecentlyUsedIndex empty vector handling
  3. ❌ Delete commented code

Should Fix (High Value):

  1. ⚡ Optimize N+1 baseline calculation query
  2. ⚡ Combine getPatternsWithCurrentRates queries
  3. 📝 Extract magic numbers (3.0 z-score, 10000 threshold) to config

Nice to Have:

  1. 📚 Use more lens operators
  2. 📚 Apply LambdaCase/ViewPatterns for brevity
  3. 📚 Property-based tests

Line Count Reduction Opportunities

Current additions: 1327 lines
Estimated reduction with optimizations: ~150 lines (11% reduction)

  • Query consolidation: ~30 lines
  • Use of LambdaCase: ~40 lines
  • Lens operators: ~20 lines
  • Helper extraction: ~30 lines
  • SELECT * vs column enumeration: ~30 lines

Overall: Strong implementation of a complex feature. The DRAIN algorithm is well-implemented with good test coverage. Main issues are debug code and query optimization opportunities. Fix critical issues and this is ready to merge! 🚀

@claude

claude bot commented Jan 29, 2026

Code Review: Log Patterns Implementation

Great work implementing the log pattern detection and anomaly detection system! The architecture is solid. Here are suggestions for more concise and idiomatic code:

CRITICAL ISSUES:

  1. Debug Code in Production (BackgroundJobs.hs:529-530)

    • Remove traceShowM statements before merging
  2. Disabled Monitor (BackgroundJobs.hs:361)

    • QueryMonitorsCheck -> pass is commented out. Why?
  3. UNIQUE Constraint Mismatch (0026_log_patterns.sql)

    • Migration has: UNIQUE(project_id, pattern_hash)
    • Code expects: ON CONFLICT (project_id, log_level, field_path, pattern_hash)
    • This will cause runtime errors! Fix the migration.

CODE QUALITY:

Models/Apis/LogPatterns.hs:

  • Line 128-130: Remove V.null guard, PostgreSQL handles empty arrays
  • Line 273-283: Use Map.fromList for O(log n) lookups instead of O(n) lookup

Pkg/Drain.hs:

  • Line 104: Use point-free: ((== targetCount) . (.tokenCount))
  • Good test coverage overall

BackgroundJobs.hs:

  • Line 520-526: Use do notation in Maybe monad to simplify nesting
  • Consider batch upserts to reduce DB round-trips

PERFORMANCE:

  • Add index: CREATE INDEX idx_log_patterns_upsert_key ON apis.log_patterns(project_id, log_level, field_path, pattern_hash)
  • Monitor memory usage for large log batches

MINOR ISSUES:

  • Unused import: SeverityLevel (Pkg/Drain.hs:20)
  • Magic numbers: 1.4826, 287, 300 need comments/constants
  • trace_id field in migration but not in LogPattern record type
  • No rollback migration provided

SUMMARY:
Rating: 7.5/10

Must fix before merge:

  1. Remove debug traceShowM calls
  2. Fix UNIQUE constraint mismatch
  3. Address commented out monitor check

This is nearly production-ready with those fixes. Great Drain algorithm implementation!

@claude

claude bot commented Jan 31, 2026

Code Review: PR #300 - Log Pattern Detection System

Thank you for this comprehensive implementation of log pattern detection and anomaly tracking! The feature adds valuable functionality for identifying unusual log patterns and volume spikes. Below is a detailed review focusing on code quality, performance, and opportunities for making the code more succinct using available packages and GHC extensions.


High Priority Issues

1. Inefficient N+1 Query Pattern (src/Models/Apis/LogPatterns.hs:269-313)

The getPatternsWithCurrentRates function performs two separate queries and manually joins them in Haskell. This is inefficient for large datasets.

Current approach:

getPatternsWithCurrentRates pid = do
  patterns <- PG.query patternsQuery (Only pid)
  counts :: [(Text, Int)] <- PG.query countsQuery (Only pid)
  pure $ map (attachCount counts) patterns

Issue: The lookup in attachCount is O(n), making this O(n²) overall.

Recommended: Use a single SQL LEFT JOIN query:

SELECT lp.id, lp.project_id, lp.log_pattern, lp.pattern_hash,
       lp.baseline_state, lp.baseline_volume_hourly_mean, lp.baseline_volume_hourly_stddev,
       COALESCE(counts.cnt, 0)::INT
FROM apis.log_patterns lp
LEFT JOIN (
  SELECT log_pattern, COUNT(*)::INT as cnt
  FROM otel_logs_and_spans
  WHERE project_id = ?::text AND timestamp >= now() - interval '1 hour'
    AND kind = 'log' AND log_pattern IS NOT NULL
  GROUP BY log_pattern
) counts ON lp.log_pattern = counts.log_pattern
WHERE lp.project_id = ? AND lp.state != 'ignored' AND lp.baseline_state = 'established'

Alternatively, if keeping two queries, convert counts to a HashMap for O(1) lookups:

import qualified Data.HashMap.Strict as HM
countsMap = HM.fromList counts
-- then: HM.findWithDefault 0 logPat countsMap

2. Extract Magic Numbers to Named Constants (src/BackgroundJobs.hs)

Multiple hardcoded values should be named constants:

-- Line 234, 238, 259, 289
baselineHoursRequired :: Int = 168  -- 7 days
baselineMeanThreshold :: Double = 100
zScoreThreshold :: Double = 3.0
minPatternAgeDays :: Int = 1
spikeEventVolumeThreshold :: Int = 10000

3. Duplicated Issue Creation Logic (src/BackgroundJobs.hs:271-275, 295-298)

The pattern of creating an issue, inserting it, and triggering LLM enhancement is repeated:

Extract to:

createAndEnhanceIssue :: Issue -> ATBackgroundCtx ()
createAndEnhanceIssue issue = do
  authCtx <- ask @Config.AuthContext
  Issues.insertIssue issue
  liftIO $ withResource authCtx.jobsPool \conn ->
    void $ createJob conn "background_jobs" $ 
      EnhanceIssuesWithLLM issue.projectId (V.singleton issue.id)

Then use: createAndEnhanceIssue =<< liftIO (Issues.createLogPatternIssue ...)


4. Inefficient Vector Operations (src/BackgroundJobs.hs:174-177)

case V.find (\(i, _, _, _, _) -> i == logId) events of

This O(n) lookup happens inside a loop. Build a HashMap from events first:

import qualified Data.HashMap.Strict as HM
let eventsMap = HM.fromList [(logId, (trId, sName, lvl)) | (logId, _, trId, sName, lvl) <- V.toList events]

5. Use coerce More Effectively (src/Models/Apis/LogPatterns.hs:116)

getLogPatternTexts pid = coerce @[Only Text] @[Text] <$> PG.query q (Only pid)

This works but is verbose, and Coercible instances can't be derived, so a newtype wrapper doesn't help here. Simply use map fromOnly, or map (\(Only t) -> t), which is clearer.


Medium Priority Improvements

6. Leverage RecordWildCards More (src/Models/Apis/LogPatterns.hs:303-313)

attachCount counts (patId, projId, logPat, patHash, blState, blMean, blStddev) =
  LogPatternWithRate
    { patternId = patId
    , projectId = projId
    , logPattern = logPat
    , patternHash = patHash
    , baselineState = blState
    , baselineMean = blMean
    , baselineStddev = blStddev
    , currentHourCount = fromMaybe 0 $ lookup logPat counts
    }

With RecordWildCards (already enabled):

attachCount counts (patternId, projectId, logPattern, patternHash, baselineState, baselineMean, baselineStddev) =
  let currentHourCount = fromMaybe 0 $ lookup logPattern counts
  in LogPatternWithRate{..}

7. Simplify Nested Case Logic (src/BackgroundJobs.hs:174-177, src/Pages/Anomalies.hs:1465-1606)

BackgroundJobs.hs:

-- Current deeply nested pattern
let (serviceName, logLevel, logTraceId) = case ids V.!? 0 of
      Just logId | logId /= "" ->
        case V.find (\(i, _, _, _, _) -> i == logId) events of
          Just (_, _, trId, sName, lvl) -> (sName, lvl, trId)
          Nothing -> (Nothing, Nothing, Nothing)
      _ -> (Nothing, Nothing, Nothing)

-- Better with ViewPatterns and guards:
let (serviceName, logLevel, logTraceId) =
      fromMaybe (Nothing, Nothing, Nothing) $ do
        logId <- ids V.!? 0
        guard (logId /= "")
        (_, _, trId, sName, lvl) <- V.find (\(i, _, _, _, _) -> i == logId) events
        pure (sName, lvl, trId)

Pages/Anomalies.hs: Use pattern guards:

renderIssueStats issue
  | Issues.LogPatternSpike <- issue.issueType
  , AE.Success (logPatternData :: LogPatternSpikeData) <- AE.fromJSON (getAeson issue.issueData)
  = renderLogPatternSpike logPatternData
  | Issues.NewLogPattern <- issue.issueType
  , AE.Success (newPatternData :: NewLogPatternData) <- AE.fromJSON (getAeson issue.issueData)
  = renderNewLogPattern newPatternData
  | otherwise = defaultRender

8. Use Available Packages (src/Pkg/Drain.hs:1936-1980)

The tokenization logic could leverage:

  • attoparsec or megaparsec (check if in dependencies) for structured parsing instead of manual T.head/T.tail
  • text-builder for efficient text concatenation instead of ++ on lists

The recursive list building with go ... ++ ... is O(n²). Use:

import qualified Data.DList as DL  -- if available
go acc rest = ... DL.snoc acc token ...

Or accumulate with foldl':

foldl' (\acc c -> ...) [] (T.unpack content)
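A deliberately tiny sketch of the difference-list shape (DL.snoc appends in O(1), versus O(n) for list ++; assumes dlist is a dependency, and the real per-token logic is elided):

import qualified Data.DList as DL
import qualified Data.Text as T

-- Accumulate tokens with O(1) appends, materializing the list once at the end.
collectTokens :: [T.Text] -> [T.Text]
collectTokens = DL.toList . go DL.empty
  where
    go acc []       = acc
    go acc (t : ts) = go (DL.snoc acc t) ts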

9. Extract Service Name Fallback (src/Models/Apis/Issues/Enhancement.hs, src/Pages/Anomalies.hs)

Repeated pattern:

service = fromMaybe "unknown-service" issue.service

Add to a utility module or use:

serviceOrDefault :: Issue -> Text
serviceOrDefault = fromMaybe "unknown-service" . (.service)

10. Optimize Drain Pattern Matching (src/Pkg/Drain.hs)

The MAD constant 1.4826 should be documented:

-- Constant to convert MAD to standard deviation equivalent
-- 1.4826 ≈ 1/Φ⁻¹(3/4) where Φ is the standard normal CDF
madToStddevFactor :: Double
madToStddevFactor = 1.4826

Also consider using Numeric.Statistics from statistics package if available for robust stat calculations.


Database & Performance

11. Add Missing Indices (static/migrations/0026_log_patterns.sql)

-- For spike detection query performance
CREATE INDEX IF NOT EXISTS idx_log_patterns_spike_detection 
  ON apis.log_patterns(project_id, state, baseline_state) 
  WHERE baseline_state = 'established' AND state != 'ignored';

-- For hash lookups
CREATE INDEX IF NOT EXISTS idx_log_patterns_hash 
  ON apis.log_patterns(pattern_hash);

-- For recent patterns ordering
CREATE INDEX IF NOT EXISTS idx_log_patterns_recent 
  ON apis.log_patterns(project_id, last_seen_at DESC);

12. Consider Batch Insert Trigger Debouncing (static/migrations/0026_log_patterns.sql:47-55)

The trigger fires on every INSERT, creating a background job each time. For high-volume scenarios, consider:

  • Batching jobs (e.g., only fire if no job created in last N minutes)
  • Rate limiting with a check on jobs table
  • Moving to a scheduled check rather than trigger-based

Security & Error Handling

13. Add Error Handling for Issue Creation (src/BackgroundJobs.hs:271, 295)

issue <- liftIO $ Issues.createLogPatternIssue pid lp

If issue creation fails, it's silently ignored. Add:

result <- liftIO $ tryAny $ Issues.createLogPatternIssue pid lp
case result of
  Left err -> Log.logError [fmt|Failed to create log pattern issue: {err}|]
  Right issue -> createAndEnhanceIssue issue

14. Validate Pattern Hash (src/BackgroundJobs.hs:292)

Before querying with patternHash, validate it's non-empty and reasonable length.


Test Coverage

15. Good Test Updates (test/unit/Pkg/DrainSpec.hs)

The test file shows good coverage improvements. Consider adding:

  • Tests for edge cases: empty patterns, very long patterns
  • Tests for the MAD/median baseline calculation
  • Tests for spike detection with various z-scores

Code Succinctness Summary

Already well-used extensions: RecordDotSyntax, BlockArguments, DerivingVia

Could leverage more: MultiWayIf, LambdaCase, ViewPatterns, RecordWildCards

Packages to consider:

  • unordered-containers - for HashMap (should already be in deps via aeson)
  • vector-algorithms - already in deps, good
  • safe - already in deps for safe indexing

Summary

Excellent work on:

  • Comprehensive feature implementation with baseline tracking
  • Good use of robust statistics (median + MAD instead of mean + stddev)
  • Database migration with proper constraints
  • Test coverage updates

Key improvements needed:

  1. Combine queries in getPatternsWithCurrentRates for efficiency
  2. Extract magic numbers to named constants
  3. Reduce duplicated issue creation logic
  4. Add database indices for query performance
  5. Build lookup maps instead of repeated O(n) searches

LOC reduction estimate: ~50-80 lines could be removed by:

  • Using HashMap instead of lists (saves explicit lookups)
  • Extracting repeated patterns to helpers
  • Using RecordWildCards more consistently
  • Simplifying nested case/guards

This is a solid foundation that will benefit from these refinements!
