
[bug] ChatGPT HTML bulk import failed. #181

@wxxb789

Description


Describe the bug

Three distinct issues encountered during a 1,537-conversation ChatGPT HTML bulk import:

Issue 1 — Import dialog silently dismissed on focus loss
While the first import was running (~800 records processed over an afternoon), clicking outside the import dialog caused it to disappear. No confirmation prompt, no way to recover or resume. The import process stopped.

Issue 2 — Re-import performance collapses
After re-importing the same file, the second run completed only 37 records in ~2.7 hours (progress.log shows timestamps from 03:44:30 to 06:23:30 local time (UTC+8), Apr 10). Before the interruption, the first import had been processing at a much higher rate.

Issue 3 — Server-wide degradation after re-import
After the re-import, the entire Nowledge Mem server became degraded. Ingesting data through any channel (Alma plugin, nmem CLI, MCP) started failing at a high rate.

To Reproduce

  1. Prepare a ChatGPT HTML export (~1,537 conversations, format: chatgpt_html_bulk)
  2. Start import in Nowledge Mem desktop app
  3. Let it run — first import proceeds at reasonable speed (~800 records in an afternoon)
  4. Click outside the import dialog → dialog disappears, import stops
  5. Re-import the same file
  6. Observe: import crawls (~37 records in 2.7 hours), server degrades for all ingest channels

Expected behavior

  • Import dialog should be a persistent modal — not dismissable by accidental click
  • Import should be resumable / support deduplication on re-import
  • A failed or repeated import should not corrupt the WAL or degrade the server
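The deduplication ask above could be as simple as keying each conversation on a stable content hash before ingest, so a re-run skips work already done. A minimal sketch — the `ingest` call and the persisted `seen` store are hypothetical, not part of Nowledge Mem's actual pipeline:

```python
import hashlib
import json

def conversation_key(conv: dict) -> str:
    """Stable key for a conversation: SHA-256 of its canonical JSON form."""
    canonical = json.dumps(conv, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def import_conversations(convs, seen: set) -> int:
    """Import only records not seen in a previous attempt, so a re-run
    resumes instead of re-processing all 1,537 conversations."""
    imported = 0
    for conv in convs:
        key = conversation_key(conv)
        if key in seen:
            continue          # already ingested in an earlier attempt
        # ingest(conv)        # hypothetical call into the import pipeline
        seen.add(key)
        imported += 1
    return imported
```

Persisting `seen` (e.g. in a side table keyed by job) would also make an interrupted import resumable.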

Screenshots

(Available on request)

Additional context

Environment

| Item | Detail |
| --- | --- |
| OS | Windows 11 (VM), build 10.0.26200.7985 |
| RAM | 64 GB |
| CPU | AMD EPYC 7763, 16 vCPU |
| Nowledge Mem version | 0.6.19 (confirmed via UI) |
| Embedding model | BAAI/bge-m3 (1024-dim, fastembed backend) |
| LLM model | gpt-5.3-codex |
| DB size | 331 MB (nowledge_graph_v2_v1.db) |
| Search index size | 319 MB (Lance format, 6 indices) |
| Total log volume | ~153 MB across 6 rotated log files |
| Import file | ChatGPT HTML export, 1,537 conversations |

Diagnostic evidence from local logs

5 separate import attempts logged (all for the same 1,537-thread file)

| # | Timestamp (local, UTC+8) | Job ID |
| --- | --- | --- |
| 1 | 2026-04-09 14:37 | ed4da345-6f26-4ea7-bd9d-d15cfb07809f |
| 2 | 2026-04-09 23:19 | 51e53486-e7f2-4745-91b9-4c86e1a91dff |
| 3 | 2026-04-10 01:54 | 28058c7e-0dd7-4733-9374-e05e65fcce2b |
| 4 | 2026-04-10 11:43 | 627b6e82-4621-4fc3-bdd5-f3acd72a528a |
| 5 | 2026-04-10 15:02 | 1850ac39-3d05-4df5-97cf-c14c4e5cf305 |

progress.log: only 37 records imported, with erratic timing

2026-04-10T03:44:30  (record 1)
2026-04-10T03:44:31  (record 2,  1s gap)
2026-04-10T03:51:47  (record 3,  7min gap)
2026-04-10T04:41:49  (record 4,  50min gap)
2026-04-10T05:44:00  (record 5,  62min gap)
2026-04-10T05:46:36 → 05:46:49  (records 6–18, ~1s each — burst)
2026-04-10T05:49:24 → 05:52:05  (records 19–25, mixed)
2026-04-10T05:54:41 → 06:02:47  (records 26–35, mixed)
2026-04-10T06:10:27  (record 36, 8min gap)
2026-04-10T06:23:30  (record 37, 13min gap)

Pattern: short bursts of ~1s/record interrupted by long stalls (up to 62 minutes), suggesting repeated contention or blocking.
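The burst-and-stall pattern can be quantified directly from progress.log by computing the gap between consecutive record timestamps — a small sketch, assuming the ISO timestamp format shown in the excerpt above:

```python
from datetime import datetime

def record_gaps(timestamps):
    """Parse ISO timestamps and return the gap in seconds between
    consecutive records, to separate ~1s bursts from multi-minute stalls."""
    parsed = [datetime.fromisoformat(t) for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]

sample = ["2026-04-10T03:44:30", "2026-04-10T03:44:31", "2026-04-10T03:51:47"]
print(record_gaps(sample))  # → [1.0, 436.0]
```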

WAL corruption detected twice

[2026-04-10 08:56:07] WARN: Detected corrupted WAL file, attempting recovery
  error: "Corrupted wal file. Read out invalid WAL record type."
  → Moved aside as *.wal.corrupt-20260410T005607Z (36 KB)

An earlier corruption was also recovered on Apr 7 (103 KB). Both timestamps correlate with import activity.

Checkpoint contention — error distribution across all logs

| Log file | Time range | Checkpoint retrying | Checkpoint ALL RETRIES EXHAUSTED | FTS search failed | Corrupted WAL |
| --- | --- | --- | --- | --- | --- |
| app.log.5 | Apr 8 01:36–02:20 | 0 | 0 | 66 | 0 |
| app.log.4 | Apr 8 08:54–22:01 | 11 | 3 | 278 | 0 |
| app.log.3 | Apr 8 22:01–Apr 9 01:54 | 8 | 1 | 12 | 13 |
| app.log.2 | Apr 9 01:54–Apr 10 00:25 | 212 | 16 | 9 | 4 |
| app.log.1 | Apr 10 00:25–08:54 | 66 | 21 | 3 | 0 |
| app.log | Apr 10 08:55–15:22 | 71 | 18 | 199 | 3 |

The recurring error message:

"Timeout waiting for active transactions to leave the system before checkpointing.
 If you have an open transaction, please close it and try again."

Checkpoint failures per hour on Apr 10 (showing peak contention midday):

09h: 4  |  10h: 1  |  11h: 6  |  12h: 23  |  13h: 24  |  14h: 23  |  15h: 9
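The per-hour tally above can be reproduced from the app logs with a short script — a sketch, assuming log lines begin with a bracketed local timestamp as in the excerpts quoted in this report:

```python
import re
from collections import Counter

def failures_per_hour(lines, needle="Timeout waiting for active transactions"):
    """Count error lines containing `needle` per hour, keyed by 'HHh'."""
    hours = Counter()
    stamp = re.compile(r"\[\d{4}-\d{2}-\d{2} (\d{2}):\d{2}:\d{2}\]")
    for line in lines:
        if needle not in line:
            continue
        m = stamp.match(line)
        if m:
            hours[m.group(1) + "h"] += 1
    return dict(hours)
```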

FTS index corruption

197 "FTS search failed on entities_index" errors, caused by missing Lance data files:

error: "Not found: .../entities_index.lance/data/<hash>.lance"
error: "FileDoesNotExist("meta.json")"

Other ingest channels affected

Alma plugin thread appends fail with "Thread not found" errors, confirming server-wide degradation:

[2026-04-10 09:00:05] ERROR: Invalid request for append — Thread not found: alma-<redacted>
[2026-04-10 14:23:27] ERROR: Invalid request for append — Thread not found: alma-<redacted>
[2026-04-10 15:12:06] ERROR: Invalid request for append — Thread not found: alma-<redacted>

Windows Defender / Exploit Guard investigation (ruled out)

We investigated Windows Defender Real-Time Protection (RTP) and Controlled Folder Access (CFA) as a possible cause of the WAL corruption and import performance issues. After thorough analysis, Defender has been ruled out as a contributing factor.

Key evidence

  • CFA mode = 3 (audit only), 0 events targeting the NowledgeGraph data directory. Audit mode does not block or delay file operations — it only logs what would be blocked in enforcement mode.
  • All 104 Nowledge-related CFA events target the exe file (nowledge-mem.exe), not the data directory. No CFA events reference any .db, .wal, or .lance files.
  • The NowledgeGraph data directory (%LOCALAPPDATA%\NowledgeGraph) is NOT in CFA's default protected folders. CFA protects user-profile folders like Documents, Desktop, and Pictures — not AppData\Local. Writes to the data directory are not subject to CFA at all.
  • Zero Defender events during the import stall windows (03:44–06:24 on Apr 10). The 50–62 minute stalls in progress.log have no corresponding Defender activity.
  • Write benchmarks show negligible RTP overhead: 1.1ms vs 1.0ms per 4KB write. Real-Time Protection scanning adds ~0.1ms per write — nowhere near enough to explain multi-minute stalls.
  • Zero Defender events mention any .db, .wal, or .lance files. Defender is simply not interacting with the Nowledge data files.

Conclusion

Windows Defender is not the root cause of the WAL corruption or import performance issues.


Summary

Core asks for the maintainers:

  • Make the import dialog a persistent modal and add resume capability — if the import is interrupted or the dialog is accidentally dismissed, there is currently no way to resume, and the partial state left behind leads to the errors documented above
  • Investigate WAL/checkpoint contention and state cleanup after interrupted or repeated imports — the WAL corruption and FTS index errors appear reproducible when an import is retried after a failure, which points to an application-level issue (lock contention or incomplete cleanup) rather than an environmental one
