feat(bridge): add Slack Socket Mode integration#1129
feat(bridge): add Slack Socket Mode integration#1129caco26i wants to merge 4 commits intoNVIDIA:mainfrom
Conversation
Signed-off-by: NemoClaw Bot <bot@example.com> Made-with: Cursor
📝 WalkthroughWalkthroughAdds a Slack Socket Mode bridge service, propagates persisted Slack credentials into sandbox environments, and updates service start/stop/status scripts to manage and conditionally run the new Changes
Sequence DiagramsequenceDiagram
actor User as Slack User
participant Slack as Slack API
participant Bridge as slack-bridge.js
participant SSH as SSH Client
participant Sandbox as Nemoclaw Sandbox
participant Agent as OpenClaw Agent
User->>Slack: send message / mention
Slack->>Bridge: Socket Mode event
Bridge->>Bridge: validate tokens, sandbox, cooldown/gate
Bridge->>SSH: create temp SSH config & open connection
SSH->>Sandbox: run `nemoclaw-start openclaw agent` with input
Sandbox->>Agent: deliver message input
Agent-->>Sandbox: produce stdout/stderr
Sandbox-->>SSH: return process output
SSH-->>Bridge: relay stdout/stderr
Bridge->>Bridge: filter/format, split into 3000-char chunks
Bridge->>Slack: chat.postMessage (threaded, chunked)
Slack->>User: display response
Bridge->>SSH: cleanup temp config & close
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
scripts/slack-bridge.js (5)
225-228: Reconnection logic lacks exponential backoff.The fixed 3-second delay on WebSocket close could cause rapid reconnection attempts if there's an authentication or server issue, potentially triggering rate limits.
♻️ Suggested improvement with exponential backoff
+let reconnectAttempts = 0; +const MAX_RECONNECT_DELAY = 60000; ws.addEventListener("close", () => { console.log("Socket Mode connection closed. Reconnecting..."); - setTimeout(connectSocketMode, 3000); + const delay = Math.min(3000 * Math.pow(2, reconnectAttempts), MAX_RECONNECT_DELAY); + reconnectAttempts++; + setTimeout(connectSocketMode, delay); }); +// Reset reconnect attempts on successful connection +ws.addEventListener("open", () => { + reconnectAttempts = 0; console.log("Connected to Slack Socket Mode."); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 225 - 228, The reconnection on WebSocket close currently uses a fixed 3s delay (ws.addEventListener("close" ...) calling setTimeout(connectSocketMode, 3000)), which can cause rapid retries and rate limits; replace the fixed timeout with an exponential backoff strategy inside the close handler that uses a backoff counter (e.g., reconnectAttempt or backoffMs) keyed to connectSocketMode, doubling the delay on each failure up to a maximum (and applying jitter), reset the counter on a successful connection (e.g., in the ws.open or successful handshake path), and ensure you cap the delay and optionally log attempts to avoid infinite tight-loop reconnects.
95-97: Inlinerequire("fs")calls — import once at the top.The file system module is required inline multiple times. Import it once at the top for consistency and minor performance improvement.
♻️ Proposed fix
const https = require("https"); +const fs = require("fs"); const { execFileSync, spawn } = require("child_process"); // Then replace inline requires: - const confDir = require("fs").mkdtempSync("/tmp/nemoclaw-slack-ssh-"); + const confDir = fs.mkdtempSync("/tmp/nemoclaw-slack-ssh-"); const confPath = `${confDir}/config`; - require("fs").writeFileSync(confPath, sshConfig, { mode: 0o600 }); + fs.writeFileSync(confPath, sshConfig, { mode: 0o600 }); // ... - try { require("fs").unlinkSync(confPath); require("fs").rmdirSync(confDir); } catch { /* ignored */ } + try { fs.unlinkSync(confPath); fs.rmdirSync(confDir); } catch { /* ignored */ }Also applies to: 114-114
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 95 - 97, Replace inline require("fs") calls by importing fs once at the top and using that reference throughout: add a single top-level binding (e.g., const fs = require('fs')) and update usages of require("fs").mkdtempSync and require("fs").writeFileSync to fs.mkdtempSync and fs.writeFileSync respectively (also update the other occurrence around the ssh-bridge code, e.g., the call referenced at line ~114). This keeps the unique symbols confDir and confPath unchanged while consolidating the fs import for consistency and minor performance improvement.
39-42:activeSessionsMap is unused — message history is not being tracked.The comment on line 39 states this is for
channelId → message history, but the Map is only cleared onresetcommand (line 191). No message history is ever stored, so multi-turn context is not preserved across agent calls.Either implement the session tracking or remove the dead code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 39 - 42, The activeSessions Map declared as activeSessions is never used so multi-turn context isn't preserved; either implement session tracking by storing and updating message history into activeSessions keyed by channelId whenever a message is received or sent (and include that history when calling the agent), and keep the existing reset logic that clears the Map, or remove the activeSessions declaration and related comment entirely; locate places where messages are handled (e.g., the message handler function that sends/receives to the agent and the reset command logic) and either push each incoming/outgoing message into activeSessions.get(channelId) (initializing an array if absent) and read that array when constructing agent calls, or delete activeSessions and its comment if session state is intentionally unsupported.
91-97:execFileSynccan throw — wrap in try/catch for better error handling.If
openshell sandbox ssh-configfails (e.g., sandbox doesn't exist), the unhandled exception will crash the promise chain without a meaningful error message to the user.🛡️ Proposed fix
function runAgentInSandbox(message, sessionId) { return new Promise((resolve) => { - const sshConfig = execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], { encoding: "utf-8" }); + let sshConfig; + try { + sshConfig = execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], { encoding: "utf-8" }); + } catch (err) { + resolve(`Failed to get SSH config for sandbox '${SANDBOX}': ${err.message}`); + return; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 91 - 97, The call to execFileSync in runAgentInSandbox can throw and is currently unhandled; change the Promise constructor to new Promise((resolve, reject)) and wrap the execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], ...) call (and the subsequent mkdtempSync/writeFileSync sequence) in a try/catch so any error is caught and the promise is rejected with the caught error (or a wrapped error with context). Update error handling to include the thrown error message and ensure any partially created temp files/dirs are cleaned up in the catch block; reference function runAgentInSandbox and the execFileSync/ mkdtempSync/ writeFileSync calls to locate the code to modify.
251-251: Hardcoded model name in banner may be misleading.The banner displays
nvidia/nemotron-3-super-120b-a12bregardless of the actual model configured for the sandbox. This could confuse users about which model is actually serving their requests.Consider either removing the model line or querying the actual configured model dynamically.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` at line 251, The banner contains a hardcoded model line (the console.log that prints " │ Model: nvidia/nemotron-3-super-120b-a12b │") which can be misleading; replace that static string with a dynamic value from the configured model (e.g., read the model name from the runtime config, env var, or the variable used to initialize the sandbox like modelName/config.model) or remove the line entirely; locate the console.log call in the banner-printing code (search for the exact hardcoded string) and change it to interpolate the actual model value with a sensible fallback such as "unknown" if not set.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/slack-bridge.js`:
- Around line 102-105: The child process created with spawn("ssh", ...) uses a
non-functional timeout option — modify the code around the spawn call to
implement a manual timeout: after creating proc, start a setTimeout that calls
proc.kill() (and logs an error) after the desired ms, and store the timer id;
also attach listeners on proc ('exit' or 'close' and 'error') to clear the timer
when the process ends and to handle cleanup; ensure you do not rely on the spawn
option "timeout" anymore and update any references to proc to handle termination
and potential partial stdout/stderr.
- Around line 172-173: The inner declaration const event = msg.payload.event
shadows the addEventListener callback's event parameter; rename that inner
variable (e.g., payloadEvent or slackEvent) and update all subsequent references
in the block to use the new name so the callback parameter remains unshadowed
and code clarity is preserved (locate the occurrence where msg.payload.event is
assigned inside the addEventListener handler and replace the identifier and its
uses).
---
Nitpick comments:
In `@scripts/slack-bridge.js`:
- Around line 225-228: The reconnection on WebSocket close currently uses a
fixed 3s delay (ws.addEventListener("close" ...) calling
setTimeout(connectSocketMode, 3000)), which can cause rapid retries and rate
limits; replace the fixed timeout with an exponential backoff strategy inside
the close handler that uses a backoff counter (e.g., reconnectAttempt or
backoffMs) keyed to connectSocketMode, doubling the delay on each failure up to
a maximum (and applying jitter), reset the counter on a successful connection
(e.g., in the ws.open or successful handshake path), and ensure you cap the
delay and optionally log attempts to avoid infinite tight-loop reconnects.
- Around line 95-97: Replace inline require("fs") calls by importing fs once at
the top and using that reference throughout: add a single top-level binding
(e.g., const fs = require('fs')) and update usages of require("fs").mkdtempSync
and require("fs").writeFileSync to fs.mkdtempSync and fs.writeFileSync
respectively (also update the other occurrence around the ssh-bridge code, e.g.,
the call referenced at line ~114). This keeps the unique symbols confDir and
confPath unchanged while consolidating the fs import for consistency and minor
performance improvement.
- Around line 39-42: The activeSessions Map declared as activeSessions is never
used so multi-turn context isn't preserved; either implement session tracking by
storing and updating message history into activeSessions keyed by channelId
whenever a message is received or sent (and include that history when calling
the agent), and keep the existing reset logic that clears the Map, or remove the
activeSessions declaration and related comment entirely; locate places where
messages are handled (e.g., the message handler function that sends/receives to
the agent and the reset command logic) and either push each incoming/outgoing
message into activeSessions.get(channelId) (initializing an array if absent) and
read that array when constructing agent calls, or delete activeSessions and its
comment if session state is intentionally unsupported.
- Around line 91-97: The call to execFileSync in runAgentInSandbox can throw and
is currently unhandled; change the Promise constructor to new Promise((resolve,
reject)) and wrap the execFileSync(OPENSHELL, ["sandbox", "ssh-config",
SANDBOX], ...) call (and the subsequent mkdtempSync/writeFileSync sequence) in a
try/catch so any error is caught and the promise is rejected with the caught
error (or a wrapped error with context). Update error handling to include the
thrown error message and ensure any partially created temp files/dirs are
cleaned up in the catch block; reference function runAgentInSandbox and the
execFileSync/ mkdtempSync/ writeFileSync calls to locate the code to modify.
- Line 251: The banner contains a hardcoded model line (the console.log that
prints " │ Model: nvidia/nemotron-3-super-120b-a12b │") which can be
misleading; replace that static string with a dynamic value from the configured
model (e.g., read the model name from the runtime config, env var, or the
variable used to initialize the sandbox like modelName/config.model) or remove
the line entirely; locate the console.log call in the banner-printing code
(search for the exact hardcoded string) and change it to interpolate the actual
model value with a sensible fallback such as "unknown" if not set.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a62adebc-50f8-47ce-9dda-57fa0bde1f6b
📒 Files selected for processing (4)
bin/lib/onboard.jsbin/nemoclaw.jsscripts/slack-bridge.jsscripts/start-services.sh
Made-with: Cursor
|
I have reviewed and applied all of CodeRabbit's comments:
The changes have been pushed to the PR branch. |
There was a problem hiding this comment.
🧹 Nitpick comments (4)
scripts/slack-bridge.js (4)
259-285: Consider adding an unhandled rejection handler for production robustness.The
main()call on line 285 handles its own errors viaprocess.exit(1), but unexpected rejections elsewhere (e.g., from the WebSocket message handler) could crash the process silently.♻️ Proposed addition before main()
+process.on("unhandledRejection", (err) => { + console.error("Unhandled rejection:", err); +}); + main();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 259 - 285, Add a global unhandled rejection handler to ensure unexpected Promise rejections are logged and handled: register process.on('unhandledRejection', ...) (and optionally process.on('uncaughtException', ...)) near the top-level before calling main(), log the error with context and either exit(1) or perform graceful cleanup; this will cover async failures coming from connectSocketMode(), main(), or any other async handlers that would otherwise crash silently.
176-179:reconnectAttemptsis referenced before its lexical definition.The variable
reconnectAttemptsis used on line 177 but declared on line 242. This works because theopencallback executes after line 242, but the source ordering is confusing and could lead to bugs if refactored.♻️ Proposed fix: move declaration before event listeners
const ws = new WebSocket(res.url); + + let reconnectAttempts = 0; + const MAX_RECONNECT_DELAY = 60000; ws.addEventListener("open", () => { reconnectAttempts = 0; console.log("Connected to Slack Socket Mode."); }); // ... rest of event listeners ... - let reconnectAttempts = 0; - const MAX_RECONNECT_DELAY = 60000; - ws.addEventListener("close", () => {Also applies to: 242-250
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 176 - 179, The variable reconnectAttempts is referenced in the ws.addEventListener("open") callback before its lexical declaration; move the reconnectAttempts declaration (and any related initialization) above the WebSocket event listener registrations (e.g., before ws.addEventListener("open") and other ws.addEventListener calls) so that reconnectAttempts is declared in scope prior to being used; update the file's top-of-scope area where reconnectAttempts (and its related reconnect/backoff variables around the current declaration in the reconnect logic) are defined to ensure event handlers like the "open" callback, "close", and reconnection logic read/write a declared variable.
75-87: Silent failures on message sending may leave users without responses.If
slackApi("chat.postMessage", ...)fails (e.g., rate limited, network error), the error is swallowed and the user may not see part or all of the response. Consider logging failures or throwing to allow the caller to handle them.♻️ Proposed improvement
async function sendMessage(channel, text, thread_ts) { const chunks = []; for (let i = 0; i < text.length; i += 3000) { chunks.push(text.slice(i, i + 3000)); } for (const chunk of chunks) { - await slackApi("chat.postMessage", { + const res = await slackApi("chat.postMessage", { channel, text: chunk, thread_ts, }, BOT_TOKEN); + if (!res.ok) { + console.error(`Failed to send message to ${channel}: ${res.error}`); + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 75 - 87, The sendMessage function currently awaits slackApi("chat.postMessage", ...) for each chunk but does not handle failures; wrap each slackApi call in a try/catch inside sendMessage, log the error with context (include channel, thread_ts and the chunk or chunk index) using the existing logger or console.error, and rethrow the error (or throw a new Error that preserves the original) so the caller can handle it; reference sendMessage, slackApi and BOT_TOKEN when locating the change.
181-183:JSON.parsemay throw on malformed WebSocket messages.If Slack sends a malformed message (unlikely but possible during protocol changes or network corruption), the unguarded
JSON.parsewill throw and the event won't be acknowledged, potentially causing message replay.♻️ Proposed fix
ws.addEventListener("message", async (event) => { - const msg = JSON.parse(event.data); + let msg; + try { + msg = JSON.parse(event.data); + } catch (err) { + console.error("Failed to parse WebSocket message:", err.message); + return; + } if (msg.type === "hello") return;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/slack-bridge.js` around lines 181 - 183, Wrap the unguarded JSON.parse(event.data) inside a try/catch in the ws.addEventListener("message", async (event) => { ... }) handler so malformed WebSocket messages don't throw; on catch, log the parse error (include event.data), optionally send any required nack/ack over ws if the Slack protocol expects it, and return early so downstream code that uses msg is not executed. Ensure you reference the parsed variable msg and any existing logger (or console.error) when recording the error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/slack-bridge.js`:
- Around line 259-285: Add a global unhandled rejection handler to ensure
unexpected Promise rejections are logged and handled: register
process.on('unhandledRejection', ...) (and optionally
process.on('uncaughtException', ...)) near the top-level before calling main(),
log the error with context and either exit(1) or perform graceful cleanup; this
will cover async failures coming from connectSocketMode(), main(), or any other
async handlers that would otherwise crash silently.
- Around line 176-179: The variable reconnectAttempts is referenced in the
ws.addEventListener("open") callback before its lexical declaration; move the
reconnectAttempts declaration (and any related initialization) above the
WebSocket event listener registrations (e.g., before ws.addEventListener("open")
and other ws.addEventListener calls) so that reconnectAttempts is declared in
scope prior to being used; update the file's top-of-scope area where
reconnectAttempts (and its related reconnect/backoff variables around the
current declaration in the reconnect logic) are defined to ensure event handlers
like the "open" callback, "close", and reconnection logic read/write a declared
variable.
- Around line 75-87: The sendMessage function currently awaits
slackApi("chat.postMessage", ...) for each chunk but does not handle failures;
wrap each slackApi call in a try/catch inside sendMessage, log the error with
context (include channel, thread_ts and the chunk or chunk index) using the
existing logger or console.error, and rethrow the error (or throw a new Error
that preserves the original) so the caller can handle it; reference sendMessage,
slackApi and BOT_TOKEN when locating the change.
- Around line 181-183: Wrap the unguarded JSON.parse(event.data) inside a
try/catch in the ws.addEventListener("message", async (event) => { ... })
handler so malformed WebSocket messages don't throw; on catch, log the parse
error (include event.data), optionally send any required nack/ack over ws if the
Slack protocol expects it, and return early so downstream code that uses msg is
not executed. Ensure you reference the parsed variable msg and any existing
logger (or console.error) when recording the error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a026fb97-ff23-4424-b6a1-a4c35515a34c
📒 Files selected for processing (1)
scripts/slack-bridge.js
Made-with: Cursor
8cc28e4 to
b9a305e
Compare
|
I have reviewed and applied the new nitpicks from CodeRabbit:
The changes have been pushed to the PR branch. |
|
For context, here is an analysis of other open PRs that are related to this Slack integration:
|
|
I noticed there are a couple of open issues related to credential handling in the bridge scripts that also apply to this Slack integration:
Currently, this Slack bridge mirrors the existing Telegram implementation, meaning it also passes the Should we adapt this PR to use the new provider system (perhaps aligning with the work in draft PR #1081), or is it preferable to merge this as-is to establish feature parity with Telegram and address the credential isolation refactoring globally in a separate PR? |
Summary
This PR adds support for a Slack Socket Mode integration, mirroring the existing Telegram bridge functionality. It allows users to interact with their OpenClaw sandbox directly from Slack.
Changes
scripts/slack-bridge.jsto handle Slack Socket Mode connections and forward messages to the sandbox.scripts/start-services.shto start and stop the Slack bridge alongside the Telegram bridge whenSLACK_BOT_TOKENandSLACK_APP_TOKENare present.bin/nemoclaw.jsto load credentials from~/.nemoclaw/credentials.jsonbefore starting services.bin/lib/onboard.jsto prevent credential exposure.Type of Change
Testing
npx prek run --all-filespasses (or equivalentlymake check).npm testpasses.make docsbuilds without warnings. (for doc-only changes)Checklist
General
Code Changes
npx prek run --all-filesauto-fixes formatting (ormake formatfor targeted runs).Summary by CodeRabbit
New Features
Improvements