83 changes: 83 additions & 0 deletions docs/01-notes-node-js-fundamental.md
@@ -0,0 +1,83 @@
# 📔 Node.js Study Notes - Fundamentals (2026 Edition)

## 🏗️ 1. Foundations (Path & Process)

* **`process.cwd()`**: Returns the "Current Working Directory". It's the robot saying: "I am standing exactly here".
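* **`process.argv`**: The array of command-line arguments. Index 0 is the path to the `node` executable, index 1 is the path to the script, and whatever you type after that starts at index 2 (hence the common `process.argv.slice(2)`).
* **`path.resolve`**: Converts relative paths into absolute paths, resolving against the current working directory when no absolute segment is given.
* **`path.join`**: Joins path segments with the platform's separator (`/` vs `\`) and normalizes the result, which avoids hand-built path bugs.
* **`path.relative`**: Returns the path from one location to another (e.g., from the workspace to a file).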
* **`path.extname`**: Returns the file extension (e.g., `.txt`).
* **`path.dirname`**: Returns the directory portion of a path (everything before the last segment). Useful to know where a file is sitting.
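
To see these helpers side by side, here is a minimal sketch. The `workspace` folder and `notes.txt` file are hypothetical names used only for illustration; nothing is read from or written to disk.

```javascript
// CommonJS style (require) so it runs as a plain script; swap for import in an ES module.
const path = require("node:path");

// argv[0] is the node binary, argv[1] is the script; user arguments start at index 2.
const userArgs = process.argv.slice(2);

// Hypothetical layout built from the current working directory.
const workspace = path.resolve(process.cwd(), "workspace");
const file = path.join(workspace, "docs", "notes.txt");

console.log(userArgs);
console.log(path.relative(workspace, file)); // docs/notes.txt (docs\notes.txt on Windows)
console.log(path.extname(file)); // .txt
console.log(path.dirname(file) === path.join(workspace, "docs")); // true
```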

---

## 📂 2. File System (fs/promises)

* **`fs/promises`**: The promise-based API of the FS module, so we can use `await` instead of callbacks.
* **`readdir`**: Lists everything inside a folder.
* **`stat`**: Checks metadata (if it's a file, directory, and its size).
* **`readFile` / `writeFile`**: Basic commands to read content and create/overwrite files.
* **`mkdir`**: Creates a new directory. Used with `{ recursive: true }` to create nested folders (a/b/c) all at once.
* **`Buffer`**: A way to handle raw binary data. We used `Buffer.from(content, "base64")` to decode base64 text back into the original bytes before writing them to a file.
* **`.toString("base64")`**: Converts file content into a base64 string for easy storage in JSON.
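
The calls above can be combined into a small round trip. This is only a sketch: the scratch folder, file names, and the `{ name, content }` JSON shape are assumptions made for the example, not the layout used in the original exercises.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { mkdtemp, mkdir, writeFile, readFile, readdir, stat, rm } = require("node:fs/promises");
const path = require("node:path");
const os = require("node:os");

async function roundTrip() {
  // Hypothetical scratch folder under the OS temp directory.
  const root = await mkdtemp(path.join(os.tmpdir(), "notes-demo-"));
  const dir = path.join(root, "a", "b");
  await mkdir(dir, { recursive: true }); // creates every missing parent

  const original = path.join(dir, "hello.txt");
  await writeFile(original, "hello");

  // Encode the file's bytes as base64 so they can live inside JSON.
  const encoded = (await readFile(original)).toString("base64");
  const snapshot = JSON.stringify({ name: "hello.txt", content: encoded });

  // Decode the base64 text back into bytes and write a copy.
  const { content } = JSON.parse(snapshot);
  const copy = path.join(dir, "copy.txt");
  await writeFile(copy, Buffer.from(content, "base64"));

  const entries = (await readdir(dir)).sort();
  const info = await stat(copy);
  const result = { encoded, entries, isFile: info.isFile(), size: info.size };

  await rm(root, { recursive: true, force: true }); // clean up the scratch folder
  return result;
}

roundTrip().then(console.log);
```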

---

## ⌨️ 3. CLI & Terminal UI

* **`readline`**: A module used to read input from the terminal line by line. It’s the "listening ear" of the app.
* **`process.uptime()`**: Returns how many seconds the Node process has been running.
* **`.toISOString()`**: Returns a string in standard ISO format (YYYY-MM-DDTHH:mm:ss.sssZ).
* **`\r` (Carriage Return)**: Moves the cursor back to the start of the line without jumping to the next one. Essential for "in-place" updates like progress bars.
* **ANSI Escape Codes**: Special sequences (like `\x1b[38;2;...`) used to color and format terminal text.
* **Hex to RGB**: Converting `#RRGGBB` into three numbers (Red, Green, Blue) so the terminal can apply colors.
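
A minimal sketch of how the color and progress-bar pieces might fit together. The function names (`hexToRgb`, `colorize`, `progressFrame`) are made up for this example, and 24-bit color output assumes a terminal that supports it.

```javascript
// Convert "#RRGGBB" into [r, g, b] numbers (0-255).
function hexToRgb(hex) {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!match) throw new Error(`Invalid hex color: ${hex}`);
  const value = parseInt(match[1], 16);
  return [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff];
}

// Wrap text in a 24-bit foreground color sequence, then reset with \x1b[0m.
function colorize(text, hex) {
  const [r, g, b] = hexToRgb(hex);
  return `\x1b[38;2;${r};${g};${b}m${text}\x1b[0m`;
}

// Build one progress-bar frame; the leading \r redraws over the current line.
function progressFrame(percent, width = 20) {
  const filled = Math.round((percent / 100) * width);
  return `\r[${"#".repeat(filled)}${"-".repeat(width - filled)}] ${percent}%`;
}

process.stdout.write(progressFrame(50));
process.stdout.write(progressFrame(100) + "\n");
console.log(colorize("done", "#00ff88"));
```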

---

## 🧩 4. Dynamic Modules

* **`import()`**: A function-like expression that allows you to load a module asynchronously on the fly.
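* **`pathToFileURL`**: Converts a file system path into a `file:///...` URL. `import()` takes module specifiers rather than OS paths, so this is required for absolute paths on Windows (e.g., `C:\...`) and is the portable choice everywhere.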
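
As a rough sketch of the two working together: the helper name `loadPlugin` and the temporary `greet.mjs` module are invented for this example.

```javascript
// CommonJS style (require); import() itself works in both module systems.
const { writeFile, mkdtemp, rm } = require("node:fs/promises");
const { pathToFileURL } = require("node:url");
const path = require("node:path");
const os = require("node:os");

// Hypothetical helper: load a module from a file system path at runtime.
async function loadPlugin(filePath) {
  const url = pathToFileURL(path.resolve(filePath)).href;
  return import(url); // resolves to the module namespace object
}

async function demo() {
  const dir = await mkdtemp(path.join(os.tmpdir(), "plugin-"));
  const file = path.join(dir, "greet.mjs");
  await writeFile(file, "export const greet = (name) => `hi ${name}`;\n");
  const mod = await loadPlugin(file);
  await rm(dir, { recursive: true, force: true });
  return mod.greet("node");
}

demo().then(console.log); // hi node
```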

---

## 🛡️ 5. Hashing & Security

* **`createHash('sha256')`**: Creates a hash object that produces a "digital fingerprint" of the data: 256 bits, shown as 64 hexadecimal characters. If a single bit of the input changes, the hash changes completely.
* **`createReadStream`**: Opens a "river" of data. Instead of drinking the whole thing at once (avoiding memory crashes), we process it by "trickles" (chunks).
* **`pipeline`**: The "glue." It connects streams safely and handles errors automatically. It's the modern way to pipe data.
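
One way these three pieces can be combined to hash a file chunk by chunk. This is a sketch, not necessarily how the original exercise wired it: here `pipeline` ends in an async function that feeds each chunk to `hash.update()`, and the temp file is created only for the demo.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { createHash } = require("node:crypto");
const { createReadStream } = require("node:fs");
const { writeFile, mkdtemp, rm } = require("node:fs/promises");
const { pipeline } = require("node:stream/promises");
const path = require("node:path");
const os = require("node:os");

async function sha256File(filePath) {
  const hash = createHash("sha256");
  // The file is never fully in memory: each chunk is hashed as it arrives.
  await pipeline(createReadStream(filePath), async (source) => {
    for await (const chunk of source) hash.update(chunk);
  });
  return hash.digest("hex");
}

async function demo() {
  const dir = await mkdtemp(path.join(os.tmpdir(), "hash-"));
  const file = path.join(dir, "data.txt");
  await writeFile(file, "hello");
  const digest = await sha256File(file);
  await rm(dir, { recursive: true, force: true });
  return digest;
}

demo().then((digest) => console.log(digest.length, digest));
```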

---

## 🌊 6. Streams & Transformations

* **Transform Stream**: A duplex stream that modifies or transforms the data as it passes through (e.g., adding line numbers or filtering).
* **`process.stdin` / `process.stdout`**: The standard input (keyboard) and output (screen) streams of the process.
* **`chunk`**: A small piece of data (usually a Buffer) being processed in the stream pipeline.
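* **Backpressure**: A situation where data is produced faster than it can be consumed. Streams handle this automatically when connected with `pipe()` or `pipeline()`; with manual `write()` calls you have to wait for the `drain` event yourself.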
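
A small sketch of the "adding line numbers" transform mentioned above. The names `createLineNumberer` and `numberLines` are hypothetical; the in-memory source stands in for `process.stdin` so the example is repeatable.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { Transform, Readable } = require("node:stream");
const { pipeline } = require("node:stream/promises");
const { StringDecoder } = require("node:string_decoder");

// Prefix every line with its number; copes with lines split across chunks.
function createLineNumberer() {
  const decoder = new StringDecoder("utf8");
  let lineNo = 0;
  let leftover = "";
  return new Transform({
    transform(chunk, _encoding, callback) {
      const lines = (leftover + decoder.write(chunk)).split("\n");
      leftover = lines.pop(); // the last piece may be an incomplete line
      for (const line of lines) this.push(`${++lineNo}: ${line}\n`);
      callback();
    },
    flush(callback) {
      const rest = leftover + decoder.end();
      if (rest) this.push(`${++lineNo}: ${rest}\n`);
      callback();
    },
  });
}

async function numberLines(chunks) {
  let out = "";
  await pipeline(Readable.from(chunks), createLineNumberer(), async (source) => {
    for await (const piece of source) out += piece.toString();
  });
  return out;
}

// "beta" is deliberately split across two chunks.
numberLines(["alpha\nbe", "ta\ngamma"]).then((text) => process.stdout.write(text));
```

With real input, the same transform could sit between `process.stdin` and `process.stdout` in a `pipeline()` call.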

---

## 🤐 7. Compression (Zlib & Brotli)

* **`zlib`**: The built-in module for compression and decompression.
* **Brotli (`.br`)**: A modern compression algorithm from Google that typically achieves better ratios than Gzip on text assets, usually at the cost of slower compression at high quality settings.
* **`createBrotliCompress` / `createBrotliDecompress`**: Transform streams used to shrink or restore data.
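
A minimal in-memory round trip through both transform streams. The `collect` and `brotliRoundTrip` helpers are invented for this sketch; a real tool would more likely stream from and to files.

```javascript
// CommonJS style (require); swap for import in an ES module.
const zlib = require("node:zlib");
const { Readable } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Run a source through the given transforms and gather the output bytes.
async function collect(source, ...transforms) {
  const chunks = [];
  await pipeline(source, ...transforms, async (stream) => {
    for await (const chunk of stream) chunks.push(chunk);
  });
  return Buffer.concat(chunks);
}

async function brotliRoundTrip(text) {
  const input = Buffer.from(text);
  const compressed = await collect(Readable.from([input]), zlib.createBrotliCompress());
  const restored = await collect(Readable.from([compressed]), zlib.createBrotliDecompress());
  return {
    originalBytes: input.length,
    compressedBytes: compressed.length,
    restored: restored.toString(),
  };
}

brotliRoundTrip("abc".repeat(1000)).then((r) => {
  console.log(r.originalBytes, "->", r.compressedBytes, "bytes");
});
```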

---

## 🧵 8. Worker Threads (Parallelism)

* **Main Thread**: The primary execution path. It handles the event loop and delegates heavy CPU tasks.
* **Worker Thread**: An independent thread that runs alongside the main thread.
* **`parentPort`**: The communication channel (walkie-talkie) between the worker and the main thread.
* **Divide and Conquer**: Splitting a large problem into smaller chunks to be solved in parallel by multiple workers.
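
The ideas above can be sketched in a single file. To keep it self-contained, the worker source is passed as a string with `eval: true`, which is an assumption of this example; a real project would normally point `Worker` at a separate file. `sumInWorker` and `parallelSum` are hypothetical names.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { Worker } = require("node:worker_threads");

// Worker code kept inline so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  let sum = 0;
  for (const n of workerData) sum += n;
  parentPort.postMessage(sum); // send the partial result to the main thread
`;

function sumInWorker(numbers) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: numbers });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// Divide and conquer: slice the input, sum each slice in its own worker, merge.
async function parallelSum(numbers, workerCount = 2) {
  const size = Math.ceil(numbers.length / workerCount);
  const slices = [];
  for (let i = 0; i < numbers.length; i += size) slices.push(numbers.slice(i, i + size));
  const partials = await Promise.all(slices.map(sumInWorker));
  return partials.reduce((a, b) => a + b, 0);
}

parallelSum([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3).then(console.log); // 55
```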

---

## 👶 9. Child Processes

* **`spawn`**: Launches a new process (like `ls`, `date`, or any terminal command).
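* **`stdio: inherit / pipe`**: Controls how the child's input/output channels are wired. `inherit` shares the parent's terminal directly, while `pipe` (the default) exposes them to the parent as streams such as `child.stdout`.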
* **Exit Code**: A number returned by the process when it finishes. `0` means success; anything else indicates an error.
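
A short sketch of `spawn` with piped output and an exit code. The `run` wrapper is a made-up helper, and the child here is simply another `node` process (via `process.execPath`) so the example does not depend on platform commands like `ls`.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { spawn } = require("node:child_process");

// Hypothetical helper: run a command and resolve with its exit code and stdout.
function run(command, args = []) {
  return new Promise((resolve, reject) => {
    // stdin ignored, stdout piped to the parent, stderr shared with the parent's terminal.
    const child = spawn(command, args, { stdio: ["ignore", "pipe", "inherit"] });
    let stdout = "";
    child.stdout.on("data", (chunk) => (stdout += chunk));
    child.on("error", reject); // e.g. the command does not exist
    child.on("close", (code) => resolve({ code, stdout }));
  });
}

// The child prints a line and finishes with a non-zero exit code.
run(process.execPath, ["-e", "console.log('hi'); process.exitCode = 3"]).then(console.log);
```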
38 changes: 38 additions & 0 deletions docs/02-notes-data-processing-cli.md
@@ -0,0 +1,38 @@
# 🛠️ Data Processing CLI - Technical Implementation Notes

## 🎮 1. REPL & State Management
* **Persistent State**: The application maintains a `state` object (e.g., `{ currentDir: homedir() }`) that persists throughout the session. This allows commands like `cd` to update the global context of the app.
* **Readline Interface**: Used to create a continuous loop. It "listens" for user input, processes it through a dispatcher, and then prompts the user again.
* **Graceful Exit**: Implementing `.exit` and handling `SIGINT` (Ctrl+C) to ensure the terminal returns to its normal state properly.
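
A rough sketch of how the state object, dispatcher, and readline loop could fit together. The command set (`cd`, `pwd`, `.exit`) and the `dispatch` / `startRepl` names are assumptions for illustration; this `cd` does not check that the directory exists.

```javascript
// CommonJS style (require); swap for import in an ES module.
const readline = require("node:readline");
const os = require("node:os");
const path = require("node:path");

// Session state shared by every command.
const state = { currentDir: os.homedir() };

// Hypothetical dispatcher: maps one input line to a state update and a reply.
function dispatch(state, line) {
  const [command, ...args] = line.trim().split(/\s+/);
  switch (command) {
    case "cd":
      state.currentDir = path.resolve(state.currentDir, args[0] ?? ".");
      return `now in ${state.currentDir}`;
    case "pwd":
      return state.currentDir;
    case ".exit":
      return null; // tells the loop to stop
    default:
      return `unknown command: ${command}`;
  }
}

function startRepl() {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout, prompt: "> " });
  rl.prompt();
  rl.on("line", (line) => {
    const reply = dispatch(state, line);
    if (reply === null) return rl.close();
    console.log(reply);
    rl.prompt();
  });
  rl.on("SIGINT", () => rl.close()); // Ctrl+C
  rl.on("close", () => console.log("Bye!"));
}

// Only start prompting when attached to an interactive terminal.
if (process.stdin.isTTY) startRepl();
```

Keeping `dispatch` separate from the readline wiring means the command logic can be exercised without a terminal.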

---

## ⚙️ 2. Professional Argument Parsing
* **Flag Extraction**: We built a custom parser to transform a raw string array `["--input", "data.csv"]` into a clean object `{ input: "data.csv" }`.
* **Boolean Flags**: The parser identifies flags without values (like `--save`) and assigns them a `true` value automatically.
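
A minimal sketch of such a parser, under the assumption that every flag starts with `--` and takes at most one value. The function name `parseArgs` is chosen for this example and is not the original implementation.

```javascript
// Turn ["--input", "data.csv", "--save"] into { input: "data.csv", save: true }.
function parseArgs(argv) {
  const flags = {};
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (!token.startsWith("--")) continue; // positional values are ignored in this sketch
    const name = token.slice(2);
    const next = argv[i + 1];
    if (next === undefined || next.startsWith("--")) {
      flags[name] = true; // flag without a value
    } else {
      flags[name] = next;
      i++; // skip the value we just consumed
    }
  }
  return flags;
}

console.log(parseArgs(["--input", "data.csv", "--save"])); // { input: 'data.csv', save: true }
```

Node also ships a built-in `parseArgs` in `node:util`, which may be worth comparing against a hand-written parser.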

---

## 🛤️ 3. Relative & Absolute Path Resolution
* **Context-Aware Paths**: Every command uses a `resolvePath` utility. It combines the `currentDir` from the app state with the user's input string.
* **`path.resolve`**: Crucial for handling both relative paths (`./file.txt`) and absolute paths (`/Users/dian/file.txt`) seamlessly.
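
The utility can be very small. This is a guess at its shape, assuming the state object carries a `currentDir` field as described in section 1:

```javascript
// CommonJS style (require); swap for import in an ES module.
const path = require("node:path");
const os = require("node:os");

// Relative input is resolved against state.currentDir; absolute input wins as-is.
function resolvePath(state, input) {
  return path.resolve(state.currentDir, input);
}

const state = { currentDir: os.homedir() };
console.log(resolvePath(state, "./file.txt")); // <home>/file.txt
console.log(resolvePath(state, os.tmpdir())); // the temp directory, unchanged by currentDir
```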

---

## 🔒 4. Authenticated Encryption (AES-256-GCM)
* **GCM Mode**: Unlike basic encryption, GCM provides "authenticated encryption," ensuring the data wasn't tampered with.
* **The "Secret Sandwich"**: We learned to structure the output file as: `Salt (16 bytes)` + `IV (12 bytes)` + `Encrypted Data` + `AuthTag (16 bytes)`.
* **Scrypt**: Used for Key Derivation. It turns a simple password into a cryptographically strong 32-byte key using a random Salt.
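
A minimal in-memory sketch of that layout, assuming the sizes are in bytes and using `scryptSync` with its default cost parameters. The CLI itself presumably works on files; the `encrypt` / `decrypt` names are illustrative only.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { randomBytes, scryptSync, createCipheriv, createDecipheriv } = require("node:crypto");

// Output layout: salt (16) + iv (12) + encrypted data + auth tag (16).
function encrypt(plaintext, password) {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const key = scryptSync(password, salt, 32); // 32-byte key for AES-256
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const authTag = cipher.getAuthTag(); // available only after final()
  return Buffer.concat([salt, iv, encrypted, authTag]);
}

function decrypt(payload, password) {
  const salt = payload.subarray(0, 16);
  const iv = payload.subarray(16, 28);
  const encrypted = payload.subarray(28, payload.length - 16);
  const authTag = payload.subarray(payload.length - 16);
  const key = scryptSync(password, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(authTag);
  // final() throws if the data, tag, or password does not match.
  return Buffer.concat([decipher.update(encrypted), decipher.final()]);
}

const payload = encrypt("secret message", "correct horse");
console.log(payload.length); // 16 + 12 + 14 + 16 = 58
console.log(decrypt(payload, "correct horse").toString()); // secret message
```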

---

## 🏎️ 5. Parallel Processing with Workers
* **CPU Core Scaling**: Using `os.cpus().length` to determine how many Worker Threads to spawn, maximizing hardware performance.
* **Line-Boundary Splitting**: When processing large files, we split them into chunks but ensure we don't cut a log line in half (Divide and Conquer).
* **Stats Aggregation**: Each worker calculates "partial stats" and sends them back via `parentPort`. The main thread then "reduces" or merges these into the final result.
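
The splitting and merging steps can be sketched without any threads. Everything here is an assumption for illustration: the `{ lines, bytes }` stats shape, the function names, and the fact that partial stats are computed inline instead of inside workers.

```javascript
// Split a buffer into ranges of roughly equal size, each ending right after a "\n".
function splitOnLineBoundaries(buffer, parts) {
  const ranges = [];
  const target = Math.ceil(buffer.length / parts);
  let start = 0;
  while (start < buffer.length) {
    let end = Math.min(start + target, buffer.length);
    if (end < buffer.length) {
      // Move the cut forward to the next newline so no line is cut in half.
      const newline = buffer.indexOf(0x0a, end - 1);
      end = newline === -1 ? buffer.length : newline + 1;
    }
    ranges.push([start, end]);
    start = end;
  }
  return ranges;
}

// What each worker would compute for its own chunk.
function partialStats(chunk) {
  let lines = 0;
  for (const byte of chunk) if (byte === 0x0a) lines++;
  return { lines, bytes: chunk.length };
}

// What the main thread does with the partial results.
function mergeStats(partials) {
  return partials.reduce(
    (total, part) => ({ lines: total.lines + part.lines, bytes: total.bytes + part.bytes }),
    { lines: 0, bytes: 0 }
  );
}

const data = Buffer.from("a\nbb\nccc\ndddd\n");
const ranges = splitOnLineBoundaries(data, 2); // the notes use os.cpus().length here
const partials = ranges.map(([start, end]) => partialStats(data.subarray(start, end)));
console.log(ranges, mergeStats(partials)); // [ [ 0, 9 ], [ 9, 14 ] ] { lines: 4, bytes: 14 }
```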

---

## 🌊 6. Advanced Stream Pipelines
* **`pipeline()` from `node:stream/promises`**: The most robust way to connect a `ReadStream`, a `Transform` (like CSV-to-JSON), and a `WriteStream`. It automatically manages memory (backpressure) and closes all streams if an error occurs.
* **Memory Efficiency**: Because streams hold only a bounded buffer in memory at a time, the application can process files far larger than available RAM (e.g., 10GB) while memory use stays roughly constant.
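
A compact sketch of such a pipeline. The CSV handling is deliberately naive (first row is the header, no quoted commas) and emits one JSON object per line; `csvToJsonLines`, `convertFile`, and the temp file names are hypothetical. The middle step is written as an async generator, which `pipeline()` accepts in place of a `Transform`.

```javascript
// CommonJS style (require); swap for import in an ES module.
const { createReadStream, createWriteStream } = require("node:fs");
const { writeFile, readFile, mkdtemp, rm } = require("node:fs/promises");
const { pipeline } = require("node:stream/promises");
const path = require("node:path");
const os = require("node:os");

// Naive CSV-to-JSON-lines step: first row is the header, no quoted commas.
async function* csvToJsonLines(source) {
  let header = null;
  let leftover = "";
  const convert = (line) => {
    if (!line) return null;
    const cells = line.split(",");
    if (!header) {
      header = cells;
      return null;
    }
    return JSON.stringify(Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]))) + "\n";
  };
  for await (const chunk of source) {
    const lines = (leftover + chunk).split(/\r?\n/);
    leftover = lines.pop(); // keep the incomplete last line for the next chunk
    for (const line of lines) {
      const json = convert(line);
      if (json) yield json;
    }
  }
  const last = convert(leftover);
  if (last) yield last;
}

async function convertFile(input, output) {
  await pipeline(createReadStream(input, { encoding: "utf8" }), csvToJsonLines, createWriteStream(output));
}

async function demo() {
  const dir = await mkdtemp(path.join(os.tmpdir(), "csv-"));
  const input = path.join(dir, "data.csv");
  const output = path.join(dir, "data.jsonl");
  await writeFile(input, "name,age\nana,30\nbo,41\n");
  await convertFile(input, output);
  const result = await readFile(output, "utf8");
  await rm(dir, { recursive: true, force: true });
  return result;
}

demo().then((text) => process.stdout.write(text));
```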