Open
27 commits
e091b54
Enhance Teams Export with performance and UX improvements
claude Nov 7, 2025
b3f10b9
Add progress indicators and load all chats for complete visibility
claude Nov 7, 2025
24386ec
Add chat list caching and interactive search functionality
claude Nov 7, 2025
921e8b6
Fix chat sorting to match Teams desktop client behavior
claude Nov 7, 2025
9da8632
Fix Graph API message limit and document sorting limitation
claude Nov 7, 2025
f14da7e
Extend cache TTL to 24h and add interactive cache refresh
claude Nov 7, 2025
7288ac2
Switch from Jira Wiki Markup to standard Markdown with image support
claude Nov 7, 2025
6bf96d6
Refactor: extract chat loading with progress into reusable function
claude Nov 7, 2025
f5d79c3
Fix emoji alignment in interactive chat table
claude Nov 7, 2025
59f6ade
Add setup.py for pipx compatibility with dependency installation
claude Nov 7, 2025
f94426e
Add message sorting and interactive date range selection
claude Nov 7, 2025
a86c693
Fix image attachment display - don't show '[No content]' when attachm…
claude Nov 7, 2025
fa26234
Make 'Today (last 24 hours)' the default export period
claude Nov 7, 2025
d24783c
Fix AttributeError when message 'from' field is None (system messages)
claude Nov 7, 2025
49300bb
Extract inline images from HTML content in Teams messages
claude Nov 7, 2025
1080e7d
Add automatic image download functionality
claude Nov 7, 2025
da73859
Fix file extensions for downloaded images
claude Nov 7, 2025
9d2b30c
Add HTML export format with embedded base64 images
claude Nov 7, 2025
8451f84
Fix HTML export to embed images as base64
claude Nov 7, 2025
1b59e1b
Change default date range from 1 year to 24 hours
claude Nov 7, 2025
9be6586
Add JavaScript copy button for proper image clipboard handling
claude Nov 7, 2025
b6b6524
Simplify HTML copy functionality for better compatibility
claude Nov 7, 2025
e1e9894
Add Word document (docx) export format for Jira/Confluence
claude Nov 7, 2025
a76af9e
Add build/, dist/, and .DS_Store to .gitignore
claude Nov 7, 2025
2deb939
Fix docx formatter to use correct message field names
claude Nov 7, 2025
9219888
Enable image downloads for docx format
claude Nov 7, 2025
4af1d65
Add support for downloading all file types, not just images
claude Nov 8, 2025
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,3 +5,6 @@ __pycache__/
.exports/
exports/
.env
build/
dist/
.DS_Store
137 changes: 128 additions & 9 deletions README.md
@@ -35,29 +35,148 @@ Additional background lives in the internal wiki: [Arkadium IT Knowledge Base](h

## Usage

### Quick Start (Interactive Mode)

The simplest way to export a chat is to run without any arguments:

```bash
teams-export
```

This will:
1. Authenticate with Microsoft Graph
2. Show an interactive menu with your 20 most recent chats
3. Let you select the chat by number
4. Export today's messages in Jira-friendly format

### Export by User Email (1:1 chats)

```bash
teams-export --user "john.smith@company.com"
```

### Export by Chat Name (Group chats)

```bash
teams-export --chat "Project Alpha Team"
```

### Export with Date Range

```bash
# Specific dates
teams-export --user "john.smith@company.com" --from 2025-10-23 --to 2025-10-25

# Using keywords
teams-export --user "john.smith@company.com" --from "last week" --to "today"
```
teams-export --user "john.smith@company.com" --from 2025-10-23 --to 2025-10-23 --format json

### Export in Different Formats

```bash
# Markdown (default) - works in Jira, GitHub, Confluence, etc.
teams-export --user "john.smith@company.com" --format jira

# JSON for programmatic processing
teams-export --user "john.smith@company.com" --format json

# CSV for spreadsheet analysis
teams-export --user "john.smith@company.com" --format csv
```

- `--user` targets 1:1 chats by participant name or email.
- `--chat` targets group chats by display name.
- `--from` / `--to` accept `YYYY-MM-DD`, `today`, or `last week`.
- `--format` supports `json` (default) or `csv`.
The default Markdown format includes:
- Standard Markdown syntax (compatible with Jira, GitHub, Confluence)
- Clickable links for attachments
- Inline image rendering for shared images
- Message quotes and formatting preserved

### Other Options

- `--list` prints available chats with participants.
- `--all` exports every chat in the provided window.
- `--all` exports every chat in the provided window (uses parallel processing for speed).
- `--force-login` clears the cache and forces a new device code login.
- `--refresh-cache` forces refresh of chat list (bypasses 24-hour cache).
- `--output-dir` specifies where to save exports (default: `./exports/`).

**Interactive Menu Controls:**
- Enter number (1-20) to select a chat
- Press `s` to search across all chats
- Press `c` to refresh chat list from API
- Press `q` to quit

### Examples

```bash
# Interactive selection with custom date range
teams-export --from "2025-10-01" --to "2025-10-31"

Exports are saved under `./exports/` by default with filenames like `john_smith_2025-10-23.json`.
# Export all chats from last week in parallel
teams-export --all --from "last week" --format jira

## Token Cache
# List all available chats
teams-export --list

# Export specific user's chat for today
teams-export --user "jane.doe@company.com"
```

Exports are saved under `./exports/` by default with filenames like `john_smith_2025-10-23.md` (for Markdown/Jira format) or `john_smith_2025-10-23.json`.

## Caching

### Token Cache
MSAL token cache is stored at `~/.teams-exporter/token_cache.json`. The cache refreshes automatically; re-run with `--force-login` to regenerate the device flow.

### Chat List Cache
To speed up repeated operations, the chat list is cached locally for 24 hours at `~/.teams-exporter/cache/chats_cache.json`.

**First run:** Loads all chats from API (~30-60 seconds for 1000+ chats)
**Subsequent runs (within 24h):** Instant load from cache

To refresh the cache:
- **Interactive menu**: Press `c` during chat selection to refresh and reload
- **Command line**: Use `--refresh-cache` flag to force refresh before showing menu

**Note:** Chats are sorted by last message timestamp (using `lastMessagePreview`), matching the behavior of the Teams desktop client.

### Graph API Sorting Limitation

The Microsoft Graph API's `/me/chats` endpoint does **not** support the `$orderby` query parameter ([see official documentation](https://learn.microsoft.com/en-us/graph/api/chat-list?view=graph-rest-1.0&tabs=http#optional-query-parameters)). This means:

- Chats cannot be sorted server-side by last message time
- All chats must be loaded to achieve correct chronological sorting
- Client-side sorting is performed using `lastMessagePreview.createdDateTime`

This is why the initial load fetches all chats (with progress indication) rather than loading only the most recent N chats. The 24-hour cache ensures subsequent runs are instant.
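
The client-side sort described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the helper names `parse_graph_time`, `last_activity`, and `sort_chats` are hypothetical, and it assumes timestamps are UTC strings ending in `Z`, possibly with fractional seconds.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

# Sentinel for chats without a preview, so they sort last (assumption).
_EARLIEST = datetime.min.replace(tzinfo=timezone.utc)


def parse_graph_time(raw: str) -> datetime:
    """Parse a UTC timestamp such as '2025-11-07T08:30:15.5Z'."""
    head, _, frac = raw.rstrip("Z").partition(".")
    # Pad or truncate the fractional part to microseconds.
    micro = int((frac + "000000")[:6]) if frac else 0
    parsed = datetime.strptime(head, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micro, tzinfo=timezone.utc)


def last_activity(chat: Dict[str, Any]) -> datetime:
    """Return the last-message time, or the sentinel if it is missing."""
    preview = chat.get("lastMessagePreview") or {}
    raw = preview.get("createdDateTime")
    return parse_graph_time(raw) if raw else _EARLIEST


def sort_chats(chats: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort chats newest first by their last message preview."""
    return sorted(chats, key=last_activity, reverse=True)
```

Because the whole list has to be in memory before sorting, this only works after every page of `/me/chats` has been fetched, which matches the reasoning above.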

## Features

### Performance Optimizations
- **Chat list caching**: 24-hour local cache makes repeated runs instant
- **Parallel exports**: When using `--all`, exports multiple chats concurrently (up to 3 at once)
- **Automatic retry**: Handles API rate limiting (429) and server errors (5xx) with exponential backoff
- **Optimized pagination**: Fetches 50 messages per request (Graph API maximum)
- **Smart filtering**: Stops fetching when messages are outside the date range
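
The automatic retry item above can be illustrated with a small sketch. It assumes a hypothetical `send` callable that returns a `(status, body)` pair; the function name, the attempt count, and the base delay are illustrative choices, not values taken from this project.

```python
import time
from typing import Callable, Tuple

# Status codes treated as transient: 429 plus the whole 5xx range.
RETRYABLE = {429} | set(range(500, 600))


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE:
            break
        # Delay doubles each time: base, 2*base, 4*base, ...
        sleep(base_delay * (2 ** (attempt - 1)))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A production client would likely also honor the `Retry-After` header that Graph sends with 429 responses, which this sketch omits.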

### User Experience Improvements
- **Interactive chat selection**: Beautiful menu with chat names, types, and last activity
- **Multiple match handling**: If search finds multiple chats, shows menu instead of error
- **Markdown format**: Standard Markdown output that works in Jira, GitHub, Confluence, and other platforms
- Clean HTML conversion (removes tags, preserves formatting)
- Blockquote formatting (`>`) for message content
- Standard Markdown headers (`##`, `###`) and emphasis (`**bold**`, `*italic*`)
- Attachment support with clickable links
- **Image support**: Images from chat attachments rendered as `![name](url)`
- Reaction indicators
- Proper timestamp formatting
- **Smart defaults**: Defaults to today's date if not specified
- **Progress tracking**: Shows real-time progress for multi-chat exports
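
As an illustration of the attachment and image rendering listed above, the sketch below turns one attachment into Markdown. The function name and the extension-based image check are assumptions; the `name` and `contentUrl` keys follow the Graph `chatMessageAttachment` resource, but the project's formatter may read different fields.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# Extensions treated as images (illustrative list, not the project's).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def attachment_to_markdown(attachment: Dict[str, Optional[str]]) -> str:
    """Render an attachment as an inline image or a clickable link."""
    name = attachment.get("name") or "attachment"
    url = attachment.get("contentUrl") or ""
    if not url:
        # Nothing to link to, so fall back to the bare name.
        return name
    if PurePosixPath(name).suffix.lower() in IMAGE_SUFFIXES:
        return f"![{name}]({url})"
    return f"[{name}]({url})"
```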

## Limitations

- Requires delegated permissions for the signed-in user.
- Attachments are referenced in the output but not downloaded.
- Microsoft Graph API throttling is not yet handled with automatic retries.
- Parallel exports limited to 3 concurrent requests to avoid API throttling.
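
One way to cap concurrency at three exports is a bounded thread pool. The following is a sketch under that assumption; `export_all` and the `export_one` callable are hypothetical names, and the project may implement the limit differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable

MAX_WORKERS = 3  # matches the limit stated above


def export_all(
    chat_ids: Iterable[str],
    export_one: Callable[[str], str],
) -> Dict[str, str]:
    """Export chats concurrently, recording a result or an error per chat."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = {pool.submit(export_one, cid): cid for cid in chat_ids}
        for future in as_completed(futures):
            cid = futures[future]
            try:
                results[cid] = future.result()
            except Exception as exc:  # one failed chat should not stop the rest
                results[cid] = f"failed: {exc}"
    return results
```

Threads suit this workload because each export spends most of its time waiting on HTTP responses rather than using the CPU.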

## Security Notes

4 changes: 3 additions & 1 deletion pyproject.toml
@@ -15,7 +15,9 @@ dependencies = [
"typer>=0.12",
"requests>=2.32",
"msal>=1.28",
"python-dateutil>=2.9"
"python-dateutil>=2.9",
"wcwidth>=0.2",
"python-docx>=1.0"
]

[project.scripts]
9 changes: 9 additions & 0 deletions setup.py
@@ -0,0 +1,9 @@
"""Setup script for teams-export.

This file is for compatibility with older build tools.
The main configuration is in pyproject.toml.
"""
from setuptools import setup

# Configuration is in pyproject.toml
setup()
82 changes: 82 additions & 0 deletions src/teams_export/cache.py
@@ -0,0 +1,82 @@
"""Local caching for chat lists to speed up repeated operations."""

from __future__ import annotations

import json
import time
from pathlib import Path
from typing import List, Optional


DEFAULT_CACHE_DIR = Path("~/.teams-exporter/cache").expanduser()
CACHE_TTL_SECONDS = 86400 # 24 hours (1 day)


class ChatCache:
    """Simple file-based cache for chat lists."""

    def __init__(self, cache_dir: Path = DEFAULT_CACHE_DIR):
        self.cache_dir = cache_dir
        self.cache_file = cache_dir / "chats_cache.json"

    def get(self, user_id: str) -> Optional[List[dict]]:
        """Get cached chats for a user if still valid.

        Args:
            user_id: User identifier (from token claims or 'me')

        Returns:
            List of chats if cache is valid, None otherwise
        """
        if not self.cache_file.exists():
            return None

        try:
            with self.cache_file.open("r", encoding="utf-8") as f:
                cache_data = json.load(f)

            # Check if cache is for the same user
            if cache_data.get("user_id") != user_id:
                return None

            # Check if cache is still fresh
            cached_time = cache_data.get("timestamp", 0)
            age = time.time() - cached_time
            if age > CACHE_TTL_SECONDS:
                return None

            chats = cache_data.get("chats", [])
            return chats if chats else None

        except (json.JSONDecodeError, KeyError, OSError):
            return None

    def set(self, user_id: str, chats: List[dict]) -> None:
        """Cache chat list for a user.

        Args:
            user_id: User identifier
            chats: List of chat objects to cache
        """
        self.cache_dir.mkdir(parents=True, exist_ok=True)

        cache_data = {
            "user_id": user_id,
            "timestamp": time.time(),
            "chats": chats,
        }

        try:
            with self.cache_file.open("w", encoding="utf-8") as f:
                json.dump(cache_data, f, indent=2)
        except OSError:
            # Silently fail if can't write cache
            pass

    def clear(self) -> None:
        """Clear the cache file."""
        try:
            if self.cache_file.exists():
                self.cache_file.unlink()
        except OSError:
            pass
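
The validity rules in `ChatCache.get` (same user, age within the TTL, non-empty chat list) can be isolated into a pure function for testing. The sketch below is a hypothetical variant, not part of this change: `fresh_chats` and its `now` parameter are assumptions introduced so the clock can be injected.

```python
import time
from typing import Any, Dict, List, Optional

TTL_SECONDS = 86400  # mirrors CACHE_TTL_SECONDS in cache.py


def fresh_chats(
    entry: Dict[str, Any],
    user_id: str,
    now: Optional[float] = None,
) -> Optional[List[dict]]:
    """Return the cached chats if the entry is valid for this user, else None."""
    current = time.time() if now is None else now
    if entry.get("user_id") != user_id:
        return None  # cache belongs to a different account
    if current - entry.get("timestamp", 0) > TTL_SECONDS:
        return None  # older than 24 hours
    return entry.get("chats") or None  # an empty list counts as a miss
```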