Mosaic is a high-performance wordlist generator focused on OSINT and human profiling, heavily inspired by the legendary tool. Developed in Go, it replaces blind brute-force generation with behavioral heuristics, processed through a highly concurrent architecture (
and MPSC Channels).
Classic dictionary generators fail because they use a Naive Cartesian Product of all user inputs. If a target has a set of keywords 15_Alice, 07.Bob), wasting CPU time during generation and network time during attacks.
Mosaic implements a 3-Level Combinatorial Model based on human psychology. We define the Base Words set 123, !, years). The generation function creates the password set
This drastically prunes the search tree, ensuring that unrealistic combinations are never allocated in memory, focusing computational power on creating hyper-realistic targets like Aj_15901990 (where
-
Base Heuristic Generation: Let
$|B|$ be the number of bases and$|S|$ the number of suffixes. The Level 3 loop runs in$\mathcal{O}(|B|^2 \cdot |S|)$ . Since$|B|$ and$|S|$ are bounded by human profile inputs (usually$< 100$ ), this executes in milliseconds. -
O(1) Deduplication: We use a Go
map[string]struct{}. Checking and inserting unique strings occurs in$\mathcal{O}(1)$ average case. -
Leetspeak Mutation (Backtracking): The most computationally heavy task. For a password of length
$N$ with a maximum branching factor of leet options$L$ (e.g., 'a' -> 'a', '@', '4' means$L=3$ ), the complexity is$\mathcal{O}(L^N)$ . Thanks to concurrent Workers, this load is distributed across$C$ CPU cores, resulting in a real processing time of$\mathcal{O}(\frac{L^N}{C})$ .
- Base memory usage is dominated by the deduplication map, taking
$\mathcal{O}(U)$ , where$U$ is the number of unique base strings generated. - The Leetspeak algorithm was written as Zero-Allocation. We operate directly on a byte slice (
[]byte), mutating indexes in-place and reverting them (backtracking). Local space complexity:$\mathcal{O}(N)$ for the recursive call stack, generating zero garbage for the Go GC.
- Worker Pool: Mosaic automatically detects
runtime.NumCPU()and spawns symmetric mutation Workers matching your physical/logical cores. - MPSC (Multi-Producer, Single-Consumer): Multiple Workers process strings and inject them into the buffered
passChan. A single consumer Goroutine handles thebufio.Flush()directly to the disk, eliminating Race Conditions and I/O blocking.
To ensure Mosaic provides a tactical advantage in real-world scenarios, we benchmarked it against the industry standard (CUPP - Python 3) using a strict 1:1 heuristic generation test.
Note: The test was conducted without Leetspeak mutations to ensure a fair baseline comparison of the core generation engine.
| Metric | CUPP Original (Python 3) | Mosaic (Go 1.20+) | Advantage |
|---|---|---|---|
| Execution Time | 0.42 Seconds | 0.15 Seconds | ~2.8x Faster |
| Memory Peak (RAM) | 30.4 MB | 25.0 MB | 17% Less RAM |
| CPU Utilization | 99% (GIL Limit - 1 Core) | 312% (Multi-Core) | Perfect Scaling |
| Workload (I/O Writes) | 272 blocks written | 1,712 blocks written | Generated ~6.2x more permutations |
- Breaking the GIL: Python is limited by the Global Interpreter Lock (GIL), pinning execution to a single core. Mosaic uses Go's Goroutines to detect your
runtime.NumCPU()and distribute mathematical loads across all available physical and logical cores simultaneously. - Asynchronous I/O via MPSC: Disk writing is the ultimate bottleneck. Mosaic implements a Multi-Producer Single-Consumer channel architecture. CPU workers push passwords to memory buffers in microseconds, while a single, dedicated Goroutine safely flushes the buffer to disk without race conditions.
- Zero-Allocation Mutations: Deep string permutations stress the Garbage Collector. Mosaic performs in-place byte slice (
[]byte) manipulation with backtracking algorithms, generating near-zero garbage.
While Mosaic is highly optimized, there are theoretical boundaries that can be evolved:
-
Memory Management for Massive Dictionaries: Currently,
map[string]struct{}holds all permutations in RAM before sending them to Workers. For bulk keywords, a Bloom Filter could be implemented for streaming deduplication. -
Context-Aware Leetspeak: The current backtracking replaces all occurrences blindly. A heuristic entropy mutation would limit substitutions only to the first or last vowel, drastically cutting the
$\mathcal{O}(L^N)$ time for long passwords. - Regional Pattern Injection (Keyboard Walks): Humans frequently add "qwer" or "1q2w" as connectors. Adding keyboard walks to Level 2 joins would increase the hit probability against corporate targets.
Fast Build (Optimized):
go build -ldflags="-w -s" -o mosaic main.go