Overview
Add Slack as a data source for Ei. Like Cursor/ClaudeCode, a single synthetic persona per workspace ingests messages and feeds the extraction pipeline. Unlike code tool integrations, Slack is human↔human conversation — higher signal for people/topics, different extraction unit model.
This ticket also establishes the two-speed polling architecture that Ticket #N will later port to the other integrations.
Depends on: #51 (sources[] field)
Auth
Two modes, both supported:
```ts
auth: {
  type: "xoxp" | "browser";
  token: string;   // xoxp (permanent) or xoxc (browser session, ~2 weeks)
  cookie?: string; // xoxd (browser mode only)
}
```
xoxp (recommended): User creates a Slack App at api.slack.com once, installs to their workspace, gets a non-expiring user token. Works on Enterprise Grid. No compliance risk.
browser (quick start): xoxc/xoxd extracted from browser DevTools (or auto-extracted from Slack's local LevelDB at ~/Library/Containers/com.tinyspeck.slackmacgap/Data/Library/Application Support/Slack/Local Storage/leveldb). Expires every ~2 weeks. Good for initial setup and testing.
OAuth redirect via existing https://ei.flare576.com/callback/ pattern (same as Spotify).
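A minimal sketch of how the two auth modes could translate into request headers. `buildAuthHeaders` and the `SlackAuth` shape are illustrative, not the shipped implementation; the only Slack-specific facts assumed are that xoxp/xoxc tokens go in the `Authorization` header and that browser-session (xoxc) tokens must be paired with the xoxd value in a `d=` cookie.

```typescript
// Hypothetical helper: build HTTP headers for either auth mode.
interface SlackAuth {
  type: "xoxp" | "browser";
  token: string;
  cookie?: string; // xoxd (browser mode only)
}

function buildAuthHeaders(auth: SlackAuth): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${auth.token}`,
  };
  if (auth.type === "browser") {
    if (!auth.cookie) {
      throw new Error("browser auth requires the xoxd cookie");
    }
    // xoxd values are typically already URL-encoded as extracted.
    headers.Cookie = `d=${auth.cookie}`;
  }
  return headers;
}
```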
Channel Tier Model
Extraction scope and backfill depth vary by channel type:
| Tier | Criteria | Backfill | Extraction scope |
|---|---|---|---|
| DM | D-prefix channel ID | 90 days | All messages |
| Private | G-prefix or private channel | 90 days | All messages |
| Public | C-prefix, member count < threshold | 30 days | Threads you posted in only |
| Broadcast | C-prefix, member count ≥ threshold (default: 100) | Skip | Skip |
User can override any channel's tier via `channel_overrides`.
Why "threads you engaged in" for public channels? Real-world data from R&P shows Jeremy posts in ~9% of public channel messages. Extracting ambient channel discussion he never participated in is LLM cost with minimal signal gain. The broadcast threshold isn't about member count as a quality proxy — it's about channels where you're a lurker by design.
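The tier rules in the table above can be sketched as a pure classification function. The `ChannelInfo` field names (`id`, `isPrivate`, `memberCount`) are illustrative, not Slack API response shapes:

```typescript
type ChannelTier = "dm" | "private" | "public" | "skip";

// Assumed channel summary shape for illustration only.
interface ChannelInfo {
  id: string;          // "D…", "G…", or "C…"
  isPrivate: boolean;
  memberCount: number;
}

function classifyChannel(
  ch: ChannelInfo,
  broadcastThreshold = 100,
  overrides: Record<string, ChannelTier> = {},
): ChannelTier {
  // User override always wins.
  if (overrides[ch.id]) return overrides[ch.id];
  if (ch.id.startsWith("D")) return "dm";
  if (ch.id.startsWith("G") || ch.isPrivate) return "private";
  // Broadcast channels (at or above threshold) are skipped entirely.
  if (ch.memberCount >= broadcastThreshold) return "skip";
  return "public";
}
```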
Extraction Model
The Two Units
Thread unit — A Slack thread (parent + replies):
- `messages_context`: thread root + all previously-extracted replies
- `messages_analyze`: new replies since `thread_last_extracted`
- Thread roots are NEVER in the spine's analyze set — they belong to the thread unit
- Threads with new activity are always re-queued, even if the parent is old
Spine unit — Unthreaded messages in the channel:
- `messages_context`: ~8h of prior unthreaded messages
- `messages_analyze`: new unthreaded messages in the extraction window
- Thread roots appear as `[Processed]` anchors showing where side-conversations branched
- Thread reply content is NEVER in the spine
Processing Order (per channel per day)
- Threads first — any thread with new replies since last sync, oldest first
- Spine second — unthreaded messages in the window, with thread roots as anchors
The Interleaved Spine Format
Current extraction prompts use two clean sections ("Earlier Conversation" / "Most Recent Messages"). The spine needs a third concept: processed thread roots interleaved with new unthreaded messages. This requires a new prompt variant, `buildSpineExtractionPrompt()`:
```
## Earlier Context
[mid:abc:human]              ← prior unthreaded messages

## Channel Messages
[Processed] [mid:def:Jeremy] ← thread root (side-conversation happened here)
[New] [mid:ghi:Tom]          ← unthreaded message to extract from
[New] [mid:jkl:Jeremy]       ← unthreaded message to extract from
[Processed] [mid:mno:Jeremy] ← thread root (another side-conversation)
[New] [mid:pqr:Tom]          ← unthreaded message to extract from
```
System prompt: "ONLY ANALYZE messages labeled [New]. Messages labeled [Processed] are thread roots — they show where side-conversations branched off. Do not extract from [Processed] messages."
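A sketch of what `buildSpineExtractionPrompt()` could look like. The `SpineMessage` shape and the exact formatting are assumptions for illustration; only the interleaved `[Processed]`/`[New]` labeling comes from the spec above:

```typescript
// Assumed message shape; the real importer's types may differ.
interface SpineMessage {
  mid: string;
  author: string;
  text: string;
  isThreadRoot: boolean; // already handled as a thread unit
}

function buildSpineExtractionPrompt(
  context: SpineMessage[], // ~8h of prior unthreaded messages
  window: SpineMessage[],  // new window: unthreaded messages + thread roots
): string {
  const ctx = context
    .map((m) => `[mid:${m.mid}:${m.author}] ${m.text}`)
    .join("\n");
  const body = window
    .map((m) => {
      // Thread roots become anchors; only [New] lines are extraction targets.
      const label = m.isThreadRoot ? "[Processed]" : "[New]";
      return `${label} [mid:${m.mid}:${m.author}] ${m.text}`;
    })
    .join("\n");
  return `## Earlier Context\n${ctx}\n\n## Channel Messages\n${body}`;
}
```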
One Day At A Time — Workspace-Wide Cursor
Unlike code tool integrations (session-at-a-time), Slack advances a workspace-level cursor 24h per run:
1. Read `extraction_point` (ISO timestamp — workspace cursor)
2. `extractionWindow = [extraction_point, extraction_point + 24h]`
3. If `extractionLimit` > 30 days ago: skip public channels (they only get 30 days backfill)
4. For each eligible channel with activity in window:
   a. Queue thread units (threads with new replies)
   b. Queue spine unit (unthreaded messages in window)
5. Advance `extraction_point` by 24h
Why this approach? Real-world R&P data shows ~500-800 messages/month across all channels. A 24h window is ~16-26 messages — 1-2 LLM extraction calls. The 90-day DM backfill is 90 micro-runs, identical cadence to OpenCode's session-at-a-time approach.
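The per-run cursor arithmetic is small enough to sketch directly. Function and constant names here (`nextWindow`, `skipPublicChannels`, `DAY_MS`) are illustrative, and the 30-day public backfill limit is hard-coded for clarity rather than read from `backfill_days`:

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// One run covers [extraction_point, extraction_point + 24h].
function nextWindow(extractionPoint: string): { start: string; end: string } {
  const start = new Date(extractionPoint);
  const end = new Date(start.getTime() + DAY_MS);
  return { start: start.toISOString(), end: end.toISOString() };
}

// Public channels only get 30 days of backfill, so windows older than
// that skip them entirely.
function skipPublicChannels(windowStart: string, now: Date): boolean {
  return now.getTime() - new Date(windowStart).getTime() > 30 * DAY_MS;
}
```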
Two-Speed Polling (New Architecture — Establish Pattern Here)
Current integrations use fixed 60s polling with no backoff, producing constant "nothing to do" log spam when caught up. Slack establishes the correct pattern:
```ts
type SyncResult = "imported" | "nothing_to_do";
```
Catch-up speed (default 60s): when `extraction_point` is more than 24h behind now.
Steady-state speed (default 3600s / 1h): when `extraction_point` is within 24h of now.
Processor uses `SyncResult` to switch speeds:
- `"imported"` → stay at catch-up speed (or reset to it)
- `"nothing_to_do"` → switch to steady-state speed
No user config needed. No log spam. Fast during backfill, quiet when current.
This pattern will be ported to OpenCode/Cursor/ClaudeCode in a follow-up ticket once validated on Slack.
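The speed switch reduces to a one-line decision; a sketch, with defaults matching `polling_interval_ms` and `steady_state_interval_ms` (the function name `nextIntervalMs` is hypothetical):

```typescript
type SyncResult = "imported" | "nothing_to_do";

// "imported" → keep (or return to) catch-up speed;
// "nothing_to_do" → back off to steady state.
function nextIntervalMs(
  result: SyncResult,
  catchUpMs = 60_000,       // polling_interval_ms default
  steadyStateMs = 3_600_000, // steady_state_interval_ms default
): number {
  return result === "imported" ? catchUpMs : steadyStateMs;
}
```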
Settings Shape
```ts
// src/integrations/slack/types.ts
export interface SlackSettings {
  integration?: boolean;
  polling_interval_ms?: number;      // catch-up speed, default: 60000
  steady_state_interval_ms?: number; // steady-state speed, default: 3600000
  extraction_model?: string;         // "Provider:model" override
  extraction_point?: string;         // ISO — workspace cursor, advances 24h/run
  last_sync?: string;                // ISO — last time integration ran
  auth: {
    type: "xoxp" | "browser";
    token: string;
    cookie?: string; // xoxd (browser mode only)
  };
  backfill_days: {
    dm: number;      // default: 90
    private: number; // default: 90
    public: number;  // default: 30
  };
  broadcast_threshold?: number; // member count above which = skip, default: 100
  channel_overrides?: Record<string, "dm" | "private" | "public" | "skip">;
  channels: Record<string, SlackChannelState>;
}

export interface SlackChannelState {
  spine_last_extracted?: string;   // ISO
  threads: Record<string, string>; // threadTs → ISO last extracted
}
```
Add `slack?: SlackSettings` to `HumanSettings` in `src/core/types/entities.ts`.
Sources Tagging (requires #51)
All extracted items get tagged with the source channel:
```ts
sources: [`slack:${channelId}`] // e.g. "slack:C09V5R90C0G"
```
This enables `ei topics --source slack "deployment"` and `ei topics --source "slack:C09V5R90C0G"` for channel-scoped queries.
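A sketch of the matching rule implied by those two commands, where a bare `slack` filter matches any channel tag and a fully-qualified `slack:<channelId>` filter matches only that channel (`matchesSource` is a hypothetical name):

```typescript
// Does an item's sources[] satisfy a --source filter?
// "slack"            → matches "slack:C09V5R90C0G" (prefix match)
// "slack:C09V5R90C0G" → matches only that exact tag
function matchesSource(sources: string[], filter: string): boolean {
  return sources.some((s) => s === filter || s.startsWith(`${filter}:`));
}
```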
Real-World Data (R&P Workspace)
Sampled to validate window size assumptions:
| Metric | Value |
|---|---|
| Total channel memberships | ~1,513 |
| Channels with recent activity | ~10-20 |
| Dead/archived channels (0 members) | ~200+ |
| Estimated messages/month (all channels) | ~500-800 |
| Messages in a 24h window | ~16-26 |
| Thread density | ~30% of top-level messages |
| Average thread depth | ~8 replies |
| Jeremy's share of active project channel | ~34% |
| Jeremy's share of public interest channel | ~9% |
What Changes
| File | Change |
|---|---|
| `src/integrations/slack/` | New directory: `types.ts`, `importer.ts`, `reader.ts` |
| `src/prompts/human/spine-scan.ts` | New prompt builder for interleaved `[Processed]`/`[New]` spine format |
| `src/core/types/entities.ts` | Add `slack?: SlackSettings` to `HumanSettings` |
| `src/core/processor.ts` | Add `checkAndSyncSlack()`, two-speed polling logic, `SyncResult` type |
| `src/core/tools/builtin/` | Slack OAuth auth helper (mirrors `spotify-auth.ts`) |
| `web/` | Settings UI: Slack connection, backfill config, channel tier overrides |
| `tui/` | `/auth slack` command, `/settings` Slack section |
| `CONTRACTS.md` | Document Slack persona naming convention, channel tier model |