Fix the LLM based interactions on issues and alerts and slack conversations as well bot integration tests and refactor for code conciseness #307

Merged
tonyalaribe merged 14 commits into master from ai-bot-workflow-unification on Feb 8, 2026

tonyalaribe (Contributor) commented Feb 7, 2026

Summary

  • Add comprehensive integration test suites for Discord, Slack, and WhatsApp bot handlers
  • Add agentic chat and anomaly notification test coverage
  • Add golden test fixtures for validating bot response formats
  • Refactor code for conciseness using idiomatic Haskell patterns

Changes

New Test Files

  • AgenticSpec.hs - Tests for agentic chat workflows
  • DiscordSpec.hs - Discord bot handler tests
  • SlackSpec.hs - Slack bot handler tests
  • WhatsappSpec.hs - WhatsApp bot handler tests
  • AnomalyNotificationSpec.hs - Anomaly notification tests
  • BotFixtures.hs / BotTestHelpers.hs - Shared test utilities

Code Refactoring

  • Discord.hs: Replaced nested case statements with Either do-notation in verifyDiscordSignature
  • Utils.hs: Use lens-aeson (^?) for JSON access in parseReportStats, simplify timeParams with catMaybes
  • AI.hs: Replace large case statement with Map.lookup in parseVisualizationType

Test plan

  • Run integration tests: USE_EXTERNAL_DB=true cabal test integration-tests
  • Verify bot handlers work correctly in staging

tonyalaribe and others added 2 commits February 7, 2026 01:50
- Add integration test suites for Discord, Slack, WhatsApp bots
- Add agentic chat and anomaly notification tests
- Add golden test fixtures for bot responses
- Refactor verifyDiscordSignature to use Either do-notation
- Use lens-aeson for JSON access in parseReportStats
- Replace case statements with Map.lookup in parseVisualizationType
- Simplify timeParams using catMaybes pattern

claude bot commented Feb 7, 2026

Code Review - PR #307

Overall Assessment

Strong PR that adds comprehensive test coverage for bot handlers and includes well-executed refactoring using idiomatic Haskell patterns. The code changes reduce line count by 51 lines while improving clarity. Good use of enabled GHC extensions and ecosystem packages.


Code Quality & Best Practices

✅ Excellent Refactoring

Discord.hs (lines 422-428): The Either do-notation refactoring is exemplary:

  • Replaces 20 lines of nested case statements with 7 concise lines
  • Uses first show for error conversion (though see improvement below)
  • Leverages fromRight False as clean default handling
  • Much more readable and maintainable

Utils.hs (line 194): timeParams simplification is elegant:

  • catMaybes pattern is idiomatic and concise
  • Eliminates verbose foldMap with conditional logic
  • Clear intent: filter out Nothing values and format present ones

AI.hs (lines 119-138): Map.lookup pattern is good for:

  • O(log n) vs O(n) case statement matching
  • Easier to extend with new visualization types
  • However, consider using HashMap from unordered-containers (already a dependency) for O(1) lookup since this map is static

🔍 Potential Improvements

1. Discord.hs - Error Information Loss

The refactored verifyDiscordSignature uses first show to convert CryptoError to String, but the calling code discards this by using fromRight False. Consider:

-- Current (loses error info):
verifyDiscordSignature :: ByteString -> ByteString -> ByteString -> ByteString -> Bool
verifyDiscordSignature publicKey signatureHex timestamp rawBody = fromRight False $ do
  s <- first show $ Base16.decode signatureHex
  ...

-- Alternative (preserves error for logging):
verifyDiscordSignature :: ByteString -> ByteString -> ByteString -> ByteString -> Either Text Bool
-- Then at call site, log the error before defaulting to False

This would aid debugging failed signature verifications in production.
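One way to keep the failure reason is a small error type with one constructor per failure mode. The following is a minimal sketch under stated assumptions, not the PR's code: `VerifyError`, `decodeHex`, and `verifyDetailed` are hypothetical names, the hex decoder stands in for `Base16.decode`, and the signature check is passed in as a function where the PR uses `Ed25519.verify`.

```haskell
import Data.Bifunctor (first)
import Data.Char (digitToInt, isHexDigit)

-- Hypothetical error type: one constructor per failure mode.
data VerifyError
  = BadSignatureHex String
  | BadPublicKeyHex String
  | SignatureMismatch
  deriving (Eq, Show)

-- Minimal hex decoder standing in for Base16.decode (bytes as Ints).
decodeHex :: String -> Either String [Int]
decodeHex (a : b : rest)
  | isHexDigit a && isHexDigit b =
      ((digitToInt a * 16 + digitToInt b) :) <$> decodeHex rest
  | otherwise = Left ("non-hex characters: " ++ [a, b])
decodeHex [] = Right []
decodeHex [_] = Left "odd number of hex digits"

-- The check argument receives key bytes, the signed message, and
-- signature bytes; it is an assumption standing in for Ed25519.verify.
verifyDetailed
  :: ([Int] -> String -> [Int] -> Bool)
  -> String -> String -> String -> String
  -> Either VerifyError ()
verifyDetailed check publicKeyHex signatureHex timestamp rawBody = do
  sig <- first BadSignatureHex (decodeHex signatureHex)
  pk <- first BadPublicKeyHex (decodeHex publicKeyHex)
  if check pk (timestamp ++ rawBody) sig
    then Right ()
    else Left SignatureMismatch
```

A call site could log the `Left` value and then still treat every failure as a rejected request, so the external behavior stays the same while the reason becomes visible.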

2. Utils.hs - Lens-aeson Usage

The parseReportStats refactor (lines 86-88) is good but could be more concise with lens composition:

-- Current:
parseReportStats json = (getTotal "events", getTotal "errors")
  where
    getTotal k = fromMaybe 0 $ json ^? key k . key "total" . _Number <&> round

-- More concise using bimap or Bifunctor:
parseReportStats json = bimap getTotal getTotal ("events", "errors")
  where getTotal k = fromMaybe 0 $ json ^? key k . key "total" . _Number <&> round

Or even inline with both from lens if dealing with many keys.

3. AI.hs - Static Map Construction

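The vizTypeMap at lines 120-138 appears to be a local binding, so it may be rebuilt on every call unless GHC floats it out. Lift it to a top-level binding, which is evaluated at most once and shared; a NOINLINE pragma additionally keeps the definition from being duplicated at use sites: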

vizTypeMap :: Map Text Text
vizTypeMap = Map.fromList [...]
{-# NOINLINE vizTypeMap #-}

parseVisualizationType :: Text -> Maybe Text  
parseVisualizationType t = Map.lookup t vizTypeMap

Or use HashMap for O(1) lookup:

import Data.HashMap.Strict qualified as HM

vizTypeMap :: HM.HashMap Text Text
vizTypeMap = HM.fromList [...]
{-# NOINLINE vizTypeMap #-}
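For reference, a complete version of the top-level `Map` variant might look like the sketch below. The three entries are hypothetical placeholders, since the actual mappings are elided above; only the shape (a top-level constant plus a lookup) follows the suggestion.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as Map
import Data.Text (Text)

-- Top-level binding: evaluated at most once and shared by every caller.
-- The entries are hypothetical placeholders, not the PR's real mappings.
vizTypeMap :: Map.Map Text Text
vizTypeMap =
  Map.fromList
    [ ("bar", "timeseries")
    , ("line", "timeseries_line")
    , ("table", "logs")
    ]
{-# NOINLINE vizTypeMap #-}

-- Unknown names yield Nothing, as with the original case statement's fallback.
parseVisualizationType :: Text -> Maybe Text
parseVisualizationType t = Map.lookup t vizTypeMap
```

With a handful of entries the difference between `Map` and `HashMap` is unlikely to be measurable, so the choice can reasonably follow whichever import the module already has.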

Test Coverage

✅ Strengths

  • Comprehensive coverage across Discord, Slack, WhatsApp handlers
  • Good use of golden files for regression testing
  • Proper fixture extraction to BotFixtures.hs and BotTestHelpers.hs
  • Conditional tests for OpenAI API (using pendingWith)
  • Good use of lens operators in test assertions (^?, has, lengthOf)

🔍 Suggestions

1. Test Helpers - Reduce Duplication

BotTestHelpers.hs has some duplication in extraction functions (lines 746-768). Consider using lens prisms:

-- Instead of separate extract functions:
extractSlackBlocks :: AE.Value -> Maybe AE.Value
extractSlackBlocks val = val ^? key "blocks"

extractDiscordComponents :: AE.Value -> Maybe AE.Value  
extractDiscordComponents val = val ^? key "components"

-- Single parameterized function:
extractKey :: Text -> AE.Value -> Maybe AE.Value
extractKey k val = val ^? key k

Performance Considerations

✅ Good Patterns

  • Using catMaybes instead of foldMap with conditionals (reduces traversal overhead)
  • Map lookup for visualization types (though HashMap would be better)

⚠️ Minor Concerns

1. Utils.hs - URL Encoding

Line 192 encodes widget JSON on every call. If widgets are reused, consider avoiding double conversion:

let widgetJsonBS = toStrict $ AE.encode widget
    widgetJson = decodeUtf8 @Text widgetJsonBS  
    encodedJson = decodeUtf8 @Text $ urlEncode True widgetJsonBS

Security Concerns

✅ Good Practices

  • Discord signature verification properly uses Ed25519
  • Timestamp is included in the signed message (a prerequisite for replay protection; freshness must still be checked)
  • Test keys are clearly marked and deterministic (not production keys)

⚠️ Minor Concerns

1. Error Handling in Signature Verification

The refactored verifyDiscordSignature returns False for all errors, making it hard to distinguish:

  • Invalid signature
  • Malformed public key
  • Base16 decode failure
  • Ed25519 crypto failure

For security auditing, consider preserving error types or at minimum logging failed verification attempts with error details.

2. Test Key Exposure

BotTestHelpers.hs exports testDiscordSecretKey. While it's clearly test data, consider making it internal to the module or adding a comment warning against production use.


Package Utilization

✅ Good Use of Packages

  • lens and lens-aeson - properly leveraged for traversals
  • aeson-qq - clean test fixture generation
  • cryptonite - proper use for Ed25519 signatures
  • base16-bytestring - hex encoding/decoding
  • hspec - good test structure

🔍 Underutilized Packages

Given your dependencies, consider:

1. generic-lens: For accessing record fields by name with lenses
2. deriving-aeson: For more concise JSON instances
3. relude's utilities: whenJust, whenNothing, ordNub (viaNonEmpty is already used well!)


Code Conciseness - Specific Wins

✅ Achieving Line Reduction Goals

Total: -51 lines while adding 1093 lines of tests

Key savings:

  1. Discord.hs: -13 lines (from nested cases to Either do-notation)
  2. Utils.hs: -7 lines (from foldMap to catMaybes, lens-aeson refactor)
  3. AI.hs: -2 lines net (Map.lookup readability trade-off acceptable)

🔍 Additional Opportunities

  1. Test helper extraction functions could be 1 generic function with a key parameter
  2. Consider test utilities like shouldHaveKeyValue for common patterns
  3. Fixture builder pattern could reduce repetition

Summary

Strengths:

  • Excellent use of Either do-notation and lens-aeson
  • Comprehensive test coverage with golden files
  • Good refactoring that reduces LOC while improving clarity
  • Proper use of GHC extensions and existing packages

Recommendations (Priority Order):

  1. High: Use HashMap instead of Map for static lookup tables (AI.hs)
  2. High: Add {-# NOINLINE #-} pragma to vizTypeMap
  3. Medium: Consider returning Either Text Bool from verifyDiscordSignature for better error handling
  4. Medium: Parameterize test helper extraction functions to reduce duplication
  5. Low: Consider generic-lens for more concise record field access
  6. Low: Add more property-based tests with QuickCheck for bot payloads

Verdict: Approve with minor suggestions

This PR successfully achieves the goals of adding test coverage and improving code conciseness. The refactoring is well-executed and leverages Haskell's strengths effectively.

tonyalaribe and others added 2 commits February 8, 2026 02:50
- Rename ChatLLMResponse to LLMResponse for clarity
- Remove AIOutputType enum and fold functionality into LLMResponse
- Standardize field names across bot integrations (explanation, widgets, timeRange)
- Consolidate response parsing functions (parseLLMResponse, parseAgenticResponse)
- Update all bot handlers (Slack, Discord, WhatsApp) to use unified types
- Simplify anomaly investigation and log explorer AI interactions
- Net reduction of 33 lines through better code reuse

claude bot commented Feb 8, 2026

PR #307 Code Review: Bot Integration Tests & Refactoring

Executive Summary

Strong PR adding comprehensive bot test coverage (+1735/-739 lines) with excellent idiomatic Haskell refactorings. Recommend merge after addressing 2 high-priority security items.


✅ Excellent Refactorings

1. Discord.hs - verifyDiscordSignature (L430-436)

Replaced nested case statements with Either do-notation - much cleaner:

verifyDiscordSignature publicKey signatureHex timestamp rawBody = fromRight False $ do
  s <- first show $ Base16.decode signatureHex
  sig <- first show $ Crypto.eitherCryptoError $ Ed25519.signature s
  pkBytes <- first show $ Base16.decode publicKey
  pk <- first show $ Crypto.eitherCryptoError $ Ed25519.publicKey pkBytes
  pure $ Ed25519.verify pk (timestamp <> rawBody) sig

2. Utils.hs - parseReportStats (L438-442)

Excellent use of lens-aeson for concise JSON traversal:

parseReportStats json = (getTotal "events", getTotal "errors")
  where getTotal k = fromMaybe 0 $ json ^? key k . key "total" . _Number <&> round

3. AI.hs - parseVisualizationType (L192-212)

Map lookup replaces large case statement - more maintainable and extensible.

4. Utils.hs - timeParams building

Elegant use of catMaybes for optional URL parameters.


🔴 High Priority Issues

1. Security: Wrong HTTP Status Codes for Auth Failures

Discord.hs:222-224 & similar locations:

validateSignature _ _ _ _ = throwError err404{errBody = "Invalid signature"}

Problem: Returns 404 for invalid signatures instead of 401/403.
Impact: Misleading for clients and monitoring (404 suggests the endpoint doesn't exist rather than an authentication failure).
Fix: Use err401 or err403 for authentication/authorization failures.

2. Input Sanitization

Whatsapp.hs:149-164, Discord/Slack handlers:

  • Webhook payloads should be sanitized before processing
  • No visible validation against injection attacks
    Recommendation: Add validation/sanitization layer or document existing protections.

🟡 Medium Priority Suggestions

1. Error Handling in widgetPngUrl (Utils.hs:195-197)

if T.length url > 8000
  then Log.logAttention "Widget PNG URL too large" (...) >> pure ""
  else pure url

Issue: Silently returns empty string on failure - could lead to broken images without clear error to clients.
Suggestion: Return Either Text Text or throw proper error.

2. Extract Magic Numbers to Constants

  • 8000 char URL limit (Utils.hs:195)
  • 4000 char query limit (Anomalies.hs:450)
  • 2000 char preview limits
    Benefits: Easier to maintain, self-documenting.
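The two suggestions above (named limits, and an explicit failure from the URL length check) could be combined roughly as follows. This is a sketch with hypothetical names (`maxWidgetUrlLength`, `checkWidgetUrl`, `truncatePreview`); only the numeric limits come from the review.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- Limits quoted in the review; the constant names are hypothetical.
maxWidgetUrlLength, maxQueryLength, maxPreviewLength :: Int
maxWidgetUrlLength = 8000
maxQueryLength = 4000
maxPreviewLength = 2000

-- Reject over-long URLs explicitly instead of returning an empty string.
checkWidgetUrl :: Text -> Either Text Text
checkWidgetUrl url
  | T.length url > maxWidgetUrlLength =
      Left ("widget PNG URL too large: " <> T.pack (show (T.length url)) <> " chars")
  | otherwise = Right url

-- Cut preview text down to the preview limit.
truncatePreview :: Text -> Text
truncatePreview = T.take maxPreviewLength
```

The caller then decides what a `Left` means for the user (for example, omitting the image and saying so), rather than silently sending a broken embed.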

3. Large Function Decomposition

buildSystemPromptForIssue (Anomalies.hs:506-658) is 150+ lines.
Suggestion: Extract sub-sections (examples formatting, tool descriptions, etc.) into smaller functions.

4. PII in Logging

Be careful logging user queries:

Log.logTrace ("Slack interaction received" :: Text) $ AE.object ["text" AE..= interaction.text]

Recommendation: Review all logging to ensure no PII exposure.


🟢 Code Succinctness Opportunities

1. Text Building Performance (AI.hs:395-397)

vIntercalate uses <> for text concatenation:

vIntercalate sep = V.ifoldl' (\acc i x -> if i == 0 then x else acc <> sep <> x) ""

Current: Good use of strict fold.
Enhancement: For very large vectors, consider text-builder-linear (already in package.yaml:236).
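As an illustration of the builder approach, the sketch below uses `Data.Text.Lazy.Builder` from the `text` package rather than text-builder-linear, and works on a list rather than a `Vector` to stay dependency-free; `intercalateBuilder` is a hypothetical name.

```haskell
import Data.List (intersperse)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B

-- Join the pieces through a Builder so the result is materialized once,
-- instead of re-copying the accumulator on every (<>).
intercalateBuilder :: T.Text -> [T.Text] -> T.Text
intercalateBuilder sep =
  TL.toStrict
    . B.toLazyText
    . mconcat
    . intersperse (B.fromText sep)
    . map B.fromText
```

For a `Vector` input, converting with `V.toList` first would work, and `T.intercalate` on that list gives the same result without a custom helper.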

2. Use safe Package More

Available in package.yaml:183 but underutilized.

  • headMay, atMay for safer list operations
  • readMay for safe parsing

3. Already Great Package Usage

  • lens-aeson for JSON traversal
  • containers for Map.lookup
  • extra for utilities
  • ✅ Proper use of GHC extensions (LambdaCase, MultiWayIf, etc.)

✅ Test Coverage - Excellent!

Strong Points

  1. Comprehensive test infrastructure: 5 new spec files + helpers + fixtures
  2. Golden file testing for regression prevention
  3. Conditional live API tests with proper guards (OPENAI_API_KEY)
  4. Security testing: Deterministic Ed25519 keypair for Discord (BotTestHelpers.hs:143-158)
  5. Edge case coverage: Malformed JSON, code block stripping, etc.

Coverage Gaps (Low Priority)

  • Rate limiting scenarios
  • Concurrent bot message handling
  • Network timeout/retry logic
  • Consider property-based testing (QuickCheck) for:
    • Widget JSON roundtrips
    • URL encoding/decoding
    • Signature verification properties

🎯 Summary

Strengths

  • Idiomatic Haskell: Excellent use of lens, Either do-notation, Map lookup
  • Test infrastructure: Comprehensive with golden files and helpers
  • Code reduction: Net improvement in code quality and conciseness
  • Type safety: Strong typing throughout

Required Before Merge

  1. 🔴 Fix HTTP status codes (404 → 401/403 for auth failures)
  2. 🔴 Document/add input sanitization for webhook payloads

Recommended (Can Follow-Up)

  1. 🟡 Extract magic numbers to constants
  2. 🟡 Review PII logging
  3. 🟡 Break down buildSystemPromptForIssue
  4. 🟡 Error handling in widgetPngUrl

Verdict: ✅ Approve with Changes

High-quality PR that significantly improves codebase. Fix the 2 security items and this is ready to merge!


Net Changes: +1735/-739 lines (996 net addition, mostly tests)
Test Coverage: Excellent with golden files
Code Quality: Strong improvement through refactoring
Security: Good practices, 2 items need attention


claude bot commented Feb 8, 2026

PR Review: Bot Integration Tests and Code Refactoring

Overview

Strong PR with excellent test coverage and idiomatic Haskell refactoring. The use of Either monads, lens-aeson, and Map-based lookups significantly improves code quality. However, there are some critical issues to address.


🔴 Critical Issues

1. Race Condition in Discord Thread Migration (Discord.hs:325-331)

Problem: Multiple concurrent requests can insert duplicate messages

existingHistory <- Issues.selectChatHistory convId
when (null existingHistory) $ do
  msgs <- getThreadStarterMessage interaction envCfg.discordBotToken
  -- Multiple requests could reach here simultaneously

Fix: Use transaction with advisory lock or UPSERT pattern:

PG.executeOne [sql| 
  INSERT INTO chat_migration_lock (conversation_id) 
  VALUES (?) 
  ON CONFLICT DO NOTHING 
  RETURNING id
|] (Only convId) >>= \case
  Just _ -> -- proceed with migration
  Nothing -> pass

⚠️ High Priority Issues

2. Wrong HTTP Status for Invalid Signatures (Discord.hs:223)

Problem: Returns 404 for invalid signatures - should be 401/403

| otherwise = throwError err404{errBody = "Invalid signature"}

Fix: throwError err401{errBody = "Invalid or missing signature"}

3. Unsafe Empty String Default (Slack.hs:229)

Problem: fromMaybe "" passes empty string to DB query on missing guild_id

discordData <- getDiscordData (fromMaybe "" (maybe interaction.guild_id ...))

Fix: Handle Nothing explicitly with proper error response

4. Token Limit Overflow Risk in Token Buffer (AI.hs:783-787)

Problem: On error, returns full allMsgs which may exceed maxTokens

pure $ fromMaybe allMsgs (rightToMaybe result)

Fix: Add fallback trimming logic to ensure token limits
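A fallback could keep only the most recent messages that fit a token budget. The sketch below is an assumption-laden illustration: messages are plain strings ordered oldest first, and `estimateTokens` uses a rough four-characters-per-token guess rather than a real tokenizer.

```haskell
-- Rough token estimate: about four characters per token (an assumption,
-- not a real tokenizer).
estimateTokens :: String -> Int
estimateTokens s = (length s + 3) `div` 4

-- Keep the newest messages whose estimated cost fits within maxTokens.
-- Input and output are ordered oldest first.
trimToBudget :: Int -> [String] -> [String]
trimToBudget maxTokens msgs = reverse (go maxTokens (reverse msgs))
  where
    go _ [] = []
    go budget (m : rest)
      | cost <= budget = m : go (budget - cost) rest
      | otherwise = []
      where
        cost = estimateTokens m
```

Note that a single message larger than the whole budget yields an empty list here; a real fallback might prefer to truncate that message instead.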


🟡 Performance Optimizations

5. Sequential HTTP Calls (Discord.hs:465-466)

Two Discord API calls execute sequentially - use concurrently:

import Control.Concurrent.Async (concurrently)
(response, response') <- liftIO $ concurrently
  (getWith opts url)
  (getWith opts starterMessageUrl)

6. Redundant Text Conversions (Utils.hs:190-191)

-- Current
let widgetJson = decodeUtf8 @Text $ toStrict $ AE.encode widget
    encodedJson = decodeUtf8 @Text $ urlEncode True $ encodeUtf8 widgetJson

-- Better
let widgetJsonBS = toStrict $ AE.encode widget
    encodedJson = decodeUtf8 @Text $ urlEncode True widgetJsonBS

✅ Excellent Refactoring

verifyDiscordSignature - Either Monad (Discord.hs:429-435)

Perfect use of Either do-notation replacing nested case statements. Clean and readable.

parseReportStats - Lens Aeson (Utils.hs:451-454)

Excellent use of lens-aeson combinators. Idiomatic and concise.

parseVisualizationType - Map Lookup (AI.hs:217-237)

Great replacement of large case statement. Consider lifting vizTypeMap to top-level with {-# NOINLINE #-} for sharing.

timeParams - catMaybes Pattern (Utils.hs:194)

Clean functional approach with proper Maybe handling.


📊 Test Coverage

✅ Strengths

  • Golden file testing for bot responses
  • Signature verification testing with Ed25519
  • Fixture-based reusable test data
  • Good happy path coverage

⚠️ Gaps

  • No error path tests (invalid JSON, DB failures, LLM timeouts)
  • No concurrent access tests (for race condition above)
  • Missing edge cases (empty reports, malformed threads, 8000+ char URLs)

Recommended additions:

-- DiscordSpec.hs
it "handles concurrent thread access safely" \tr -> do
  -- Spawn 5 concurrent requests to same thread
  -- Verify no duplicate messages

it "rejects invalid JSON payloads" \tr -> do
  let (body, sig, ts) = signDiscordPayload "not json" "123"
  -- Should return error response

💡 Code Conciseness Opportunities

1. Use aeson-qq for JSON Construction

You already use it in BotFixtures.hs - consider wider adoption:

import Data.Aeson.QQ (aesonQQ)

hereSuccessResponse = [aesonQQ|{
  "type": 4,
  "data": {"flags": 32768, ...}
}|]

2. HashMap for Emoji Lookup (Utils.hs:47-57)

import Data.HashMap.Strict qualified as HM

botEmojiMap :: HM.HashMap Text Text
botEmojiMap = HM.fromList [("success", "🟢"), ...]
{-# NOINLINE botEmojiMap #-}

botEmoji = HM.lookupDefault "" ?? botEmojiMap

3. Use retry Package for LLM Calls

Instead of manual error handling, leverage existing ecosystem:

import Control.Retry

runAgenticQuery config userQuery apiKey =
  recovering (exponentialBackoff 1000000 <> limitRetries 3)
             [const shouldRetryError]  -- exception handlers, each a Handler m Bool
             $ \_ -> -- actual LLM call

4. PatternSynonyms for Discord Responses

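{-# LANGUAGE PatternSynonyms, ViewPatterns #-}
-- AE.object is a function, not a constructor, so match through (==) instead.
responseOfType :: Int -> AE.Value
responseOfType t = AE.object ["type" AE..= t]

pattern PongResponse :: AE.Value
pattern PongResponse <- ((== responseOfType 1) -> True) where PongResponse = responseOfType 1

pattern DeferredResponse :: AE.Value
pattern DeferredResponse <- ((== responseOfType 5) -> True) where DeferredResponse = responseOfType 5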

🔒 Security

Good:

  • Parameterized queries throughout (no SQL injection)
  • Sensitive values not logged in tests
  • URL length validation (8000 chars)
  • Ed25519 signature verification

⚠️ Consider: Add logging for failed signature verification attempts (security monitoring)


Summary

Category      Critical   High   Medium
Bugs          1          2      2
Performance   0          0      2
Tests         0          0      4

Recommendation: Fix the race condition and HTTP status issues before merging. The refactoring work is excellent and demonstrates strong understanding of idiomatic Haskell patterns. Test coverage is comprehensive for happy paths but needs error scenario coverage.

Overall: Approve with required changes

Great work on code conciseness - the Either monads, lens-aeson, and Map lookups are exactly the right patterns to use!


claude bot commented Feb 8, 2026

Review posted - See full details below

Summary

PR #307 adds comprehensive bot integration tests and refactors code for conciseness. APPROVE WITH CHANGES ⭐⭐⭐⭐☆

Test coverage improvements are excellent, refactoring demonstrates good use of packages. However, 4 critical issues in security-sensitive code must be addressed before merge.

🔴 CRITICAL - Must Fix Before Merge

1. Discord Signature Verification (Security) - src/Pages/Bots/Discord.hs:429-435
Using fromRight False silently swallows crypto errors. Return Either Text Bool with logging.

2. LLM Cache Key Collision Risk - src/Data/Effectful/LLM.hs:174-180
Truncating messages + 32/64-bit hash = high collision risk. Use SHA256 cryptographic hash.

3. Partial Functions - src/Data/Effectful/LLM.hs:136,204
Replace error with throwIO for proper error handling.

4. Unsafe Vector Head - src/Pages/Bots/Whatsapp.hs:177-178
Use safe indexing vals V.!? 0 instead of partial V.head.

⚠️ Should Fix

  1. Hardcoded model name - Extract to config
  2. Widget parsing - Use parseMaybe parseJSON
  3. Security logging - Log signature verification failures

✅ Excellent Practices

✅ Effect-based architecture with clean separation
✅ lens-aeson usage in parseReportStats
✅ Map.lookup replacing case statements
✅ BA.constEq for constant-time comparison
✅ Comprehensive test coverage with golden tests
✅ Code reduction: Web.Routes -28 net lines

Scores

Code Quality ⭐⭐⭐⭐☆ | Security ⭐⭐⭐⭐☆ | Performance ⭐⭐⭐⭐☆ | Tests ⭐⭐⭐⭐⭐ | Conciseness ⭐⭐⭐⭐⭐

🤖 Generated with Claude Code


claude bot commented Feb 8, 2026

Code Review - PR #307: Bot Integration Tests & Refactoring

Summary

This PR successfully refactors code for improved conciseness and adds comprehensive test coverage for bot handlers. The refactoring leverages Haskell best practices (Either do-notation, lens-aeson, Map.lookup) effectively. However, there are opportunities for further improvement.


✅ Strengths

1. Excellent Refactoring Patterns

  • Discord.hs:442-448 - Clean Either do-notation for signature verification (replacing nested case statements)
  • Utils.hs:448-451 - Proper lens-aeson usage with ^?, key, and _Number for JSON parsing
  • Utils.hs:190 - Elegant catMaybes pattern for URL parameter building
  • AI.hs:216-236 - Map.lookup replacing large case statement (maintainable & clear)

2. Security Practices

  • Correct Ed25519 signature verification for Discord webhooks
  • Timing-attack resistant constEq in widget signature validation
  • Proper URL encoding with urlEncode True

3. Test Coverage

  • Comprehensive integration tests for Discord, Slack, WhatsApp handlers
  • Golden test fixtures for response validation
  • Good use of deterministic keypair generation in test helpers

🔧 Issues & Recommendations

Priority 1: Test Coverage Gaps

  1. Missing parseReportStats tests (Utils.hs:448-451)

    • Need tests for: valid JSON, missing keys (should return 0), non-numeric values
    • This is a new lens-aeson refactor that should be validated
  2. Missing parseVisualizationType tests (AI.hs:216-236)

    • No tests validating all visualization type mappings
    • Recommendation: Add exhaustive or property-based tests for all map entries
  3. Missing widgetPngUrl parameter tests (Utils.hs:190)

    • Test different parameter combinations (only since, multiple params, empty)
    • Validate URL length limit behavior (>8000 chars)

Priority 2: Code Conciseness & Performance

  1. vizTypeMap reconstruction (AI.hs:219-236)

    -- Current: Map.fromList called on every parseVisualizationType call
    -- Recommendation: Move to module-level constant
    vizTypeMap :: Map.Map Text Text
    vizTypeMap = Map.fromList [...]
    {-# NOINLINE vizTypeMap #-}
    
    parseVisualizationType :: Text -> Maybe Text
    parseVisualizationType = flip Map.lookup vizTypeMap

    This saves repeated map construction and is more concise.

  2. Unused import (Discord.hs:22)

    import Control.Lens ((.~), (^.))  -- (.~) is unused

    Remove (.~) or use it somewhere. Project prioritizes code conciseness.

Priority 3: Observability

  1. Silent cryptographic failures (Discord.hs:443)

    • fromRight False swallows all signature verification errors
    • Recommendation: Add debug logging before the fallback for security auditing
    Left err -> logDebug ("Discord signature verification failed: " <> err) >> pure False
  2. Test error handling (BotTestHelpers.hs:180)

    Crypto.CryptoFailed _ -> error "Failed to create test secret key"

    Use fail or throwIO instead of error in test code for better error propagation.


📦 Package Usage

Good use of existing packages:

  • lens-aeson (already in dependencies)
  • containers for Map.lookup
  • cryptonite for Ed25519/HMAC
  • ✅ No new dependencies introduced

No reimplementations detected - all refactors leverage existing packages appropriately.


🔒 Security Assessment

  • ✅ Cryptographic operations are correct
  • ✅ Timing-attack resistance in HMAC comparison
  • ✅ Proper Base16 decoding validation
  • ⚠️ Consider logging signature verification failures for audit trails

📊 Metrics

Metric             Value
Lines Added        2,144
Lines Removed      832
Net Change         +1,312
Files Changed      35
Test Files Added   6

Code conciseness improvement: The refactoring reduces nesting and leverages GHC extensions well. Further gains possible with module-level constants and import cleanup.


🎯 Action Items Before Merge

Must-have:

  1. Add unit tests for parseReportStats (Utils.hs:448-451)
  2. Add tests for parseVisualizationType mappings (AI.hs:216-236)
  3. Remove unused (.~) import from Discord.hs:22

Nice-to-have:

  1. Move vizTypeMap to module-level with NOINLINE pragma
  2. Add debug logging to signature verification failures
  3. Replace error with fail in BotTestHelpers.hs:180

✨ Conclusion

Recommendation: Approve with minor changes

The refactoring successfully achieves the stated goals and improves code quality. The use of lens-aeson, Either do-notation, and Map.lookup are excellent examples of idiomatic Haskell that reduces LOC while maintaining clarity. Address the test coverage gaps and minor import cleanup to align with the project's code conciseness priorities.

Great work on the comprehensive test suite additions! 🚀

tonyalaribe changed the title from "Add bot integration tests and refactor for code conciseness" to "Fix the LLM based interactions on issues and alerts and slack conversations as well bot integration tests and refactor for code conciseness" on Feb 8, 2026

claude bot commented Feb 8, 2026

PR #307 Review: Bot Integration Tests & LLM Refactoring

Overview

This PR adds comprehensive test coverage for bot integrations and refactors code for conciseness. Overall structure is excellent, but several critical issues need attention before merge.


🔴 Critical Issues (Must Fix)

1. Empty Messages Vector in runAgenticQuery (src/Pkg/AI.hs:630)

params = OpenAIV1._CreateChatCompletion
  { OpenAIV1.model = Models.Model "gpt-4o-mini"
  , OpenAIV1.tools = Just $ V.fromList allToolDefs
  , OpenAIV1.messages = V.empty  -- ← PROBLEM!
  }
  • messages is empty but chatHistory passed separately - assumes LLMCore.chat merges them
  • Fragile design; should explicitly populate from chatHistory

2. Weak Golden File Cache Key (src/Data/Effectful/LLM.hs:172-178)

messageStrs = map (\m -> show (LLMCore.role m) <> ":" <> T.take 100 (LLMCore.content m)) messages
  • Only first 100 chars used in hash - loses important context
  • abs hashVal stays negative when hashVal is minBound (abs minBound == minBound for Int)
  • Fix: Use SHA256 of full message content

3. Crash-Prone Error Handling (src/Data/Effectful/LLM.hs:134)

Nothing -> error $ fromString $ "Failed to decode LLM response from file: " <> filePath
  • Uses error instead of throwIO - crashes program
  • Test suite could fail unexpectedly
  • Fix: Use proper IO exception handling
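A typed exception thrown with `throwIO` might look like the following sketch. `GoldenDecodeError`, `decodeOrThrow`, and `decodeEither` are hypothetical names; the `Maybe a` argument stands in for the result of decoding the golden file.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical exception type for a golden file that fails to decode.
newtype GoldenDecodeError = GoldenDecodeError FilePath
  deriving (Eq, Show)

instance Exception GoldenDecodeError

-- Throw a typed IO exception instead of calling `error`.
decodeOrThrow :: FilePath -> Maybe a -> IO a
decodeOrThrow filePath = maybe (throwIO (GoldenDecodeError filePath)) pure

-- Callers that want to recover can turn the failure back into a value.
decodeEither :: FilePath -> Maybe a -> IO (Either GoldenDecodeError a)
decodeEither filePath = try . decodeOrThrow filePath
```

Because the exception has its own type, a test can catch exactly this failure and report the offending file path without also swallowing unrelated errors.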

🟠 High Priority Issues

4. Missing Timeout on Tool Execution (src/Pkg/AI.hs:789-806)

  • Tool calls have no timeout protection
  • Long KQL queries could block indefinitely
  • Fix: Wrap executeToolCall with 30s timeout
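A timeout wrapper can be built on `System.Timeout.timeout` from base. In this sketch `withToolTimeout` is a hypothetical helper, and the error message text is a placeholder; the action passed in would be the `executeToolCall` invocation.

```haskell
import System.Timeout (timeout)

-- Run an action with a time limit given in seconds.
-- Left means the limit was reached before the action finished.
withToolTimeout :: Int -> IO a -> IO (Either String a)
withToolTimeout seconds action =
  maybe (Left ("tool call exceeded " ++ show seconds ++ "s")) Right
    <$> timeout (seconds * 1000000) action
```

`timeout` works by delivering an asynchronous exception, so it may not interrupt code that is blocked in a foreign call; a database-side statement timeout could be a useful complement for long queries.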

5. Rate Limiting Missing (src/Pages/Bots/Discord.hs, src/Web/Routes.hs)

  • /discord/interactions, /slack/events endpoints unprotected
  • sendJsonFollowupResponse calls not rate-limited
  • Could hit Discord API limits or cause DoS
  • Fix: Add per-guild rate limiting and backoff
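Per-guild limiting could start from something as simple as a fixed-window counter. The sketch below is pure and keeps its state in a `Map`; the names (`Window`, `allowRequest`) and the fixed-window policy are assumptions for illustration, and a production version would need shared state (for example an `MVar` or `TVar`) and backoff on top.

```haskell
import qualified Data.Map.Strict as Map

type GuildId = String

-- Start time of the current window (seconds) and requests seen in it.
data Window = Window { windowStart :: Integer, hitCount :: Int }
  deriving (Eq, Show)

-- Decide whether a request is allowed and return the updated state.
-- limit: max requests per window; windowSecs: window length in seconds.
allowRequest
  :: Int -> Integer -> Integer -> GuildId
  -> Map.Map GuildId Window
  -> (Bool, Map.Map GuildId Window)
allowRequest limit windowSecs now guild st =
  case Map.lookup guild st of
    Just (Window start n)
      | now - start < windowSecs ->
          if n < limit
            then (True, Map.insert guild (Window start (n + 1)) st)
            else (False, st)
    -- No window yet, or the old one expired: start a fresh window.
    _ -> (True, Map.insert guild (Window now 1) st)
```

Keeping the decision pure makes it straightforward to test window expiry without sleeping; the current time is just another argument.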

6. Potential Directory Traversal (src/Pages/Bots/Discord.hs:296-297)

  • Dashboard/widget IDs not validated before file access
  • Fix: Whitelist allowed dashboard names or validate path
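If IDs do reach the file system, an allow-list check on their characters is one option. This is a minimal sketch: `isSafeResourceId` is a hypothetical name, and the allowed alphabet and the 64-character bound are assumptions, not rules taken from the codebase.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- Accept only short IDs made of ASCII letters, digits, '-' and '_'.
-- This rules out path separators and ".." segments by construction.
isSafeResourceId :: String -> Bool
isSafeResourceId ident =
  not (null ident)
    && length ident <= 64
    && all allowed ident
  where
    allowed c = (isAscii c && isAlphaNum c) || c == '-' || c == '_'
```

Checking against a positive alphabet tends to be easier to reason about than trying to strip dangerous sequences after the fact.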

7. Missing Timestamp Validation (src/Pages/Bots/Discord.hs)

  • Discord signature validation doesn't check timestamp freshness
  • Vulnerable to replay attacks
  • Fix: Validate X-Signature-Timestamp within 5 minutes
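The freshness check itself is small. The sketch below assumes the header carries Unix seconds as a decimal string and takes the current time as an argument so it stays pure; `isFreshTimestamp` is a hypothetical name.

```haskell
import Text.Read (readMaybe)

-- True when the raw timestamp parses and lies within maxSkew seconds of now.
-- Assumes the header value is Unix time in seconds, written in decimal.
isFreshTimestamp :: Integer -> Integer -> String -> Bool
isFreshTimestamp maxSkew now raw =
  case readMaybe raw of
    Just ts -> abs (now - ts) <= maxSkew
    Nothing -> False
```

For the five-minute window mentioned above, `maxSkew` would be 300, and the check would run alongside signature verification rather than replace it.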

🟡 Medium Priority Issues

8. Silent Widget URL Failure (src/Pages/Bots/Utils.hs:193-195)

  • Logs warning but returns empty string silently
  • Causes broken embeds without user awareness
  • Fix: Add fallback URL shortening or explicit error

9. Widget Cache Strategy (src/Web/Routes.hs:697)

addHeader @"Cache-Control" (bool "public, max-age=300" "public, max-age=31536000, immutable" ...)
  • Assumes absolute date ranges never change
  • Could serve stale data if query definitions update
  • Fix: Include widget version/hash in cache key

10. Large History String Building (src/Pkg/AI.hs:315, 319)

T.intercalate "\n" $ map formatMessage msgs
  • No size limit on accumulated history
  • Could cause memory pressure in long conversations
  • Fix: Add explicit truncation or size limits

🟢 Minor Issues & Code Quality

11. Typo in Export (src/Pkg/AI.hs:19)

getNormalTupleReponse  -- should be "Response"

12. Dead Code (src/Pages/Bots/Discord.hs:304-306)

  • Three consecutive pass statements (last two unreachable)
  • Suggests incomplete refactoring

13. Missing Report Stats Validation (src/Pages/Bots/Utils.hs:449-452)

  • Uses ^? _Number without validating keys exist
  • Could fail silently returning 0
  • Fix: Add explicit error logging for malformed JSON

✅ Excellent Additions

  1. Test Infrastructure - Comprehensive coverage with proper fixtures and golden files
  2. LLM Effectful Module - Clean abstraction with good separation of concerns
  3. Security - Proper use of BA.constEq for constant-time comparison, Ed25519 validation
  4. Concurrency - Good use of ki-effectful for structured concurrency
  5. Code Refactoring - Discord.hs Either do-notation, Utils.hs lens-aeson usage

🎯 Recommendations for Code Conciseness

Given your priority on concise code and leveraging GHC extensions:

Use More Pattern Guards (Instead of nested lets)

Before:

case foo of
  Just x -> let y = transform x in process y
  Nothing -> pass

After:

case foo of
  Just x | y <- transform x -> process y
  Nothing -> pass

Consider RecordWildCards (src/Pages/Bots/Discord.hs)

Many places do interaction.guild_id, interaction.channel_id repeatedly

{-# LANGUAGE RecordWildCards #-}
handleInteraction interaction@Interaction{..} = do
  -- Now use guild_id, channel_id directly

Use ViewPatterns for Complex Parsing

Instead of parsing then matching, parse in pattern:

{-# LANGUAGE ViewPatterns #-}
handleCommand (parseCommand -> Just cmd) = ...
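A fuller, self-contained version of the view-pattern idea is sketched below. The `Command` type, the `/help` and `/ask` syntax, and `describe` are hypothetical and only illustrate parsing inside the pattern.

```haskell
{-# LANGUAGE ViewPatterns #-}
import Data.List (stripPrefix)

-- Hypothetical command type for illustration.
data Command = Ask String | Help
  deriving (Eq, Show)

-- Parse raw text into a command; stripPrefix is itself used as a view.
parseCommand :: String -> Maybe Command
parseCommand "/help" = Just Help
parseCommand (stripPrefix "/ask " -> Just q) = Just (Ask q)
parseCommand _ = Nothing

-- Parse and match in one step, clause by clause.
describe :: String -> String
describe (parseCommand -> Just Help) = "showing help"
describe (parseCommand -> Just (Ask q)) = "asking: " ++ q
describe _ = "unknown command"
```

When several clauses apply the same view, as in `describe`, a single `case parseCommand input of` may read more clearly; view patterns pay off most when each clause uses a different parser.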

📊 Test Coverage

Excellent: Comprehensive bot integration tests with golden files
Missing:

  • Negative test cases (malformed payloads, invalid signatures)
  • Concurrent request handling tests
  • Error recovery scenarios

🔒 Security Summary

Strong: Constant-time signature verification, proper HMAC usage
⚠️ Needs Work: Timestamp validation, rate limiting, API key sanitization in logs, input validation


📦 Dependencies & Performance

New Deps: All appropriate (typed-process, ki-effectful, cryptonite)
Performance Concerns:

  • Chart-CLI spawns new process per request (Routes.hs:692) - consider pooling
  • Parallel thread fetching without backpressure (Discord.hs:485-488)
  • Tool result JSON could hit token limits with large queries

🎬 Verdict

Conditional Approval - Excellent test infrastructure and valuable refactoring, but fix critical issues (1-3) before merge. High priority issues (4-7) should ideally be addressed but could be follow-up if time-constrained.

Test Plan: ✅ Run USE_EXTERNAL_DB=true cabal test integration-tests before merge

The 2160 line addition with 839 deletions is a net win for maintainability. Great work on the test coverage! 🎉

tonyalaribe merged commit f5bd812 into master on Feb 8, 2026
3 checks passed
tonyalaribe deleted the ai-bot-workflow-unification branch on February 8, 2026 14:28