
{"singlePage": [], "startSite": "", "filingNum": "", "onePageListNum": 15, "commentLabelColor": "#006b75", "yearColorList": ["#bc4c00", "#0969da", "#1f883d", "#A333D0"], "i18n": "CN", "themeMode": "manual", "dayTheme": "light", "nightTheme": "dark", "urlMode": "pinyin", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "style": "", "head": "", "indexScript": "", "indexStyle": "", "bottomText": "", "showPostSource": 1, "iconList": {}, "UTC": 8, "rssSplit": "sentence", "exlink": {}, "needComment": 1, "allHead": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekVercount.js'></script>", "title": "Lead Tech Interview", "subTitle": "Preparing for the Big Leap", "avatarUrl": "https://github.githubassets.com/favicons/favicon.svg", "GMEEK_VERSION": "last", "postListJson": {"P1": {"htmlDir": "docs/post/Simplified Back-of-the-Envelope Calculation Cheat Sheet.html", "labels": ["system design"], "postTitle": "Simplified Back-of-the-Envelope Calculation Cheat Sheet", "postUrl": "post/Simplified%20Back-of-the-Envelope%20Calculation%20Cheat%20Sheet.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/1", "commentNum": 0, "wordCount": 4755, "description": "### **Simplified Back-of-the-Envelope Calculation Cheat Sheet**\r\n\r\n| **Category**         | **Metric**                          | **Value**                     | **Notes**                                                                 |\r\n|-----------------------|-------------------------------------|-------------------------------|---------------------------------------------------------------------------|\r\n| **Time**             | 1 second                           | 1,000 milliseconds (ms)       | Useful for latency calculations.                                         |\r\n|                       | 1 day                              | ~100,000 seconds              | Rounded up for easier estimation.                                
        |\r\n| **Data Size**        | 1 kilobyte (KB)                    | 10<sup>3</sup> bytes          | ~1,000 bytes (thousand).                                                 |\r\n|                       | 1 megabyte (MB)                    | 10<sup>6</sup> bytes          | ~1,000 KB (million).                                                     |\r\n|                       | 1 gigabyte (GB)                    | 10<sup>9</sup> bytes          | ~1,000 MB (billion).                                                     |\r\n|                       | 1 terabyte (TB)                    | 10<sup>12</sup> bytes         | ~1,000 GB (trillion).                                                    |\r\n| **Network**          | Bandwidth of 1 Gbps                | 125 MB/s                      | 1 Gbps = 1,000 Mbps = 125 MB/s (divide by 8 to convert bits to bytes).   |\r\n|                       | Round-trip time (RTT)              | ~1 ms (within a region)       | Assumes low latency within a data center or region; cross-continent RTT is closer to ~100 ms. |\r\n| **Storage**          | SSD latency                        | ~0.1 ms (100 \u03bcs)              | Fast read/write times for SSDs.                                          |\r\n|                       | HDD latency                        | ~10 ms                        | Slower than SSDs but cheaper for bulk storage.                           |\r\n| **Throughput**       | Requests per second (RPS)          | ~1,000 RPS per server         | Depends on server capacity and workload.                                 |\r\n|                       | Queries per second (QPS)           | ~10,000 QPS per database      | Depends on database type and optimization.                               |\r\n| **Memory**           | RAM access time                    | ~100 ns                       | Much faster than disk access.                                            
|\r\n|                       | Cache access time (L1)             | ~1 ns                         | Extremely fast access for frequently used data.                          |\r\n| **Users**            | Daily Active Users (DAU)           | ~10% of total users           | Assumes 10% of users are active daily.                                   |\r\n|                       | Monthly Active Users (MAU)         | ~30% of total users           | Assumes 30% of users are active monthly.                                 |\r\n| **Traffic**          | Reads vs. Writes                   | ~90% reads, 10% writes        | Common for read-heavy systems (e.g., social media).                      |\r\n|                       | Peak traffic multiplier            | ~2x to 10x average traffic    | Plan for peak traffic spikes (e.g., Black Friday).                       |\r\n| **Miscellaneous**    | UUID size                          | 128 bits (16 bytes)           | Unique identifier size.                                                  |\r\n|                       | Compression ratio                  | ~2x to 10x                    | Depends on data type (e.g., text compresses better than images).         |\r\n\r\n---\r\n\r\n### **How to Use This Table**\r\n1. **Estimate Traffic**: Use DAU/MAU and peak traffic multipliers to estimate requests per second.\r\n2. **Calculate Bandwidth**: Convert between bits and bytes to estimate network throughput.\r\n3. **Compare Latencies**: Use SSD/HDD/RAM latencies to decide storage and caching strategies.\r\n4. **Size Data**: Use data size conversions to estimate storage requirements.\r\n5. 
**Plan for Scale**: Use RPS/QPS estimates to determine the number of servers or databases needed.\r\n\r\n---\r\n\r\n### **Example Calculation**\r\n- **Scenario**: You\u2019re designing a system with 1 million DAU, and each user makes 10 requests per day.\r\n  - Total requests per day = 1,000,000 \u00d7 10 = 10,000,000 requests/day.\r\n  - Requests per second (RPS) = 10,000,000 / 100,000 \u2248 100 RPS.\r\n  - Peak traffic = 100 \u00d7 5 (assume 5x multiplier) \u2248 500 RPS.\r\n  - If each server handles 1,000 RPS, you\u2019ll need ~1 server (with room for growth).\u3002", "top": 0, "createdAt": 1736388508, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-09", "dateLabelColor": "#0969da"}, "P2": {"htmlDir": "docs/post/Creating Back of Envelope Quizzes for Tech Interviews using AI.html", "labels": ["system design"], "postTitle": "Creating Back of Envelope Quizzes for Tech Interviews using AI", "postUrl": "post/Creating%20Back%20of%20Envelope%20Quizzes%20for%20Tech%20Interviews%20using%20AI.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/2", "commentNum": 0, "wordCount": 4416, "description": "# Prompt\r\n\r\nwould suggest use https://chat.deepseek.com, looks like it does a better job than chatGPT o1 mini on the math problems.\r\n\r\n> You are a quiz creator of highly diagnostic quizzes. You will look up how to develop low-stakes tests and diagnostics. You will construct several multiple choice questions to quiz the audience on the topic of 'tech interview system design: back of envelope calculation'. The questions should be highly relevant and go beyond just facts. Multiple choice questions should include plausible, competitive alternate responses and should not include an 'all of the above option.' 
At the end of the quiz, you will provide an answer key and explain the right answer.\r\n\r\n# Quizzes\r\n\r\nHere\u2019s a highly diagnostic quiz on the topic of **'Tech Interview System Design: Back-of-the-Envelope Calculations'**. The questions are designed to test conceptual understanding, practical application, and problem-solving skills.\r\n\r\n---\r\n\r\n### **Quiz: Back-of-the-Envelope Calculations in System Design**\r\n\r\n#### **Question 1: Estimating Request Throughput**\r\nYou are designing a social media platform and need to estimate the number of requests per second (RPS) the system must handle. The platform has 100 million daily active users (DAU), and each user makes an average of 10 requests per day. What is the approximate RPS the system must handle?\r\n\r\nA) 1,000 RPS  \r\nB) 10,000 RPS  \r\nC) 100,000 RPS  \r\nD) 1,000,000 RPS  \r\n\r\n---\r\n\r\n#### **Question 2: Storage Requirements for a Photo-Sharing App**\r\nA photo-sharing app allows users to upload 1 photo per day. Each photo is approximately 5 MB in size. If the app has 10 million users, how much storage is required per year?\r\n\r\nA) 18 TB  \r\nB) 180 TB  \r\nC) 1.8 PB  \r\nD) 18 PB  \r\n\r\n---\r\n\r\n#### **Question 3: Latency Calculation**\r\nA distributed system has a processing latency of 50 ms per request and a network latency of 100 ms per request. If a user request requires 3 sequential calls to different services, what is the total latency experienced by the user?\r\n\r\nA) 150 ms  \r\nB) 300 ms  \r\nC) 450 ms  \r\nD) 600 ms  \r\n\r\n---\r\n\r\n#### **Question 4: Cache Sizing for a News Feed**\r\nA news feed system serves 1 million requests per hour, and each request fetches 10 KB of data. If the cache hit rate is 50%, how much data is served from the cache per hour?\r\n\r\nA) 5 GB  \r\nB) 10 GB  \r\nC) 20 GB  \r\nD) 50 GB  \r\n\r\n---\r\n\r\n#### **Question 5: Bandwidth Calculation for Video Streaming**\r\nA video streaming platform serves 1 million concurrent users. 
Each user streams video at 5 Mbps. What is the total bandwidth required to support this load?\r\n\r\nA) 5 Gbps  \r\nB) 50 Gbps  \r\nC) 500 Gbps  \r\nD) 5 Tbps  \r\n\r\n---\r\n\r\n### **Answer Key and Explanations**\r\n\r\n#### **Question 1: Estimating Request Throughput**\r\n**Correct Answer: B) 10,000 RPS**  \r\nExplanation:  \r\n- Total requests per day = 100 million DAU * 10 requests/user = 1 billion requests/day.  \r\n- Requests per second = 1 billion / (24 * 60 * 60) \u2248 11,574 RPS.  \r\n- The closest option is 10,000 RPS.\r\n\r\n---\r\n\r\n#### **Question 2: Storage Requirements for a Photo-Sharing App**\r\n**Correct Answer: D) 18 PB**  \r\nExplanation:  \r\n- Total photos per year = 10 million users * 1 photo/user/day * 365 days = 3.65 billion photos.  \r\n- Total storage = 3.65 billion * 5 MB = 18.25 billion MB = 18.25 million GB = 18.25 PB.  \r\n- The closest option is 18 PB (this question tests careful unit conversion from MB through GB and TB to PB).\r\n\r\n---\r\n\r\n#### **Question 3: Latency Calculation**\r\n**Correct Answer: C) 450 ms**  \r\nExplanation:  \r\n- Total latency = (processing latency + network latency) * number of sequential calls.  \r\n- Total latency = (50 ms + 100 ms) * 3 = 450 ms.\r\n\r\n---\r\n\r\n#### **Question 4: Cache Sizing for a News Feed**\r\n**Correct Answer: A) 5 GB**  \r\nExplanation:  \r\n- Total data served per hour = 1 million requests * 10 KB = 10 GB.  \r\n- Cache hit rate is 50%, so 50% of the data is served from the cache.  \r\n- Data served from cache = 10 GB * 0.5 = 5 GB.\r\n\r\n---\r\n\r\n#### **Question 5: Bandwidth Calculation for Video Streaming**\r\n**Correct Answer: D) 5 Tbps**  \r\nExplanation:  \r\n- Total bandwidth = 1 million users * 5 Mbps = 5 million Mbps = 5 Tbps.  \r\n- This tests your ability to convert units (Mbps to Tbps).\r\n\r\n---\r\n\r\nThis quiz is designed to test your ability to perform back-of-the-envelope calculations, which are critical for system design interviews. 
Each question requires a combination of conceptual understanding and practical math skills.\u3002", "top": 0, "createdAt": 1736441909, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-10", "dateLabelColor": "#0969da"}, "P3": {"htmlDir": "docs/post/System Design Delivery Framework.html", "labels": ["system design"], "postTitle": "System Design Delivery Framework", "postUrl": "post/System%20Design%20Delivery%20Framework.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/3", "commentNum": 0, "wordCount": 2018, "description": "# Recommended Structure:\r\n\r\n1. Requirements (5 mins):\r\n\r\n- Functional Requirements: Identify core features (e.g., 'Users should be able to post tweets'). Prioritize 2-3 key features.\r\n\r\n- Non-Functional Requirements: Focus on system qualities like scalability, latency, and availability. Quantify where possible (e.g., 'render feeds in under 200ms').\r\n\r\n2. Capacity Estimation: Skip unnecessary calculations unless they directly impact the design (e.g., sharding in a TopK system).\r\n\r\n3. Core Entities (2 mins): Identify key entities (e.g., User, Tweet, Follow) to define the system's foundation.\r\n\r\n4. API/System Interface (5 mins): Define the contract between the system and users. Prefer RESTful APIs unless GraphQL is necessary.\r\n\r\n5. [Optional] Data Flow (5 mins): Describe high-level processes for data-heavy systems (e.g., web crawlers).\r\n\r\n6. High-Level Design (10-15 mins): Draw the system architecture, focusing on core components (e.g., servers, databases). Keep it simple and iterate based on API endpoints.\r\n\r\n7. Deep Dives (10 mins): Address non-functional requirements, edge cases, and bottlenecks. 
Proactively improve the design (e.g., scaling, caching, database sharding).\r\n\r\n![flow chart](https://www.mermaidchart.com/raw/580407c4-2ada-4e23-9a4b-fa8e09f3963a?theme=light&version=v0.1&format=svg)\r\n\r\n# Tips:\r\n\r\n1. Avoid overcomplicating the design early on.\r\n\r\n2. Communicate clearly with the interviewer, explaining your thought process and data flow.\r\n\r\n3. Focus on relevant fields in the data model, not every detail.\r\n\r\n4. Balance proactive discussion with listening to the interviewer\u2019s probes.\r\n\r\n# Example: Twitter System Design\r\n\r\n1. Functional Requirements: Post tweets, follow users, view feeds.\r\n\r\n2. Non-Functional Requirements: High availability, low latency (<200ms), scalability to 100M+ DAUs.\r\n\r\n3. Core Entities: User, Tweet, Follow.\r\n\r\n4. API Endpoints: POST /v1/tweet, GET /v1/feed, etc.\r\n\r\n5. Deep Dives: Discuss fanout-on-read vs. fanout-on-write, caching, and horizontal scaling.\u3002", "top": 0, "createdAt": 1736445405, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-10", "dateLabelColor": "#0969da"}, "P4": {"htmlDir": "docs/post/Avoid double booking.html", "labels": ["system design", "database"], "postTitle": "Avoid double booking", "postUrl": "post/Avoid%20double%20booking.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/4", "commentNum": 0, "wordCount": 10040, "description": "# Problem\r\n\r\nImagine two users try to buy the last ticket to a show at almost the same instant. 
Without a proper system, it's possible both users could be told they successfully bought the ticket, leading to overselling.\r\n\r\n# Solutions\r\n\r\nTo ensure that no two users book the same ticket simultaneously, the Booking Service uses database transactions with ACID properties, employing techniques like row-level locking or optimistic concurrency control (OCC).\r\n\r\n- **Row-Level Locking**: This is one technique to achieve isolation. When a user starts booking a ticket, the database places a 'lock' on the specific row in the database table that represents that ticket. This lock prevents any other transaction from modifying that row until the first transaction is finished (either committed or rolled back). Think of it like putting a 'reserved' sign on the ticket.\r\n\r\n- **Optimistic Concurrency Control (OCC)**: This is an alternative to locking. Instead of locking the row, the system checks if the data has been modified by another transaction before committing the current transaction. It typically does this by comparing a version number or timestamp. If the data has changed in the meantime, the transaction is aborted, and the user might be informed that the ticket is no longer available. This is 'optimistic' because it assumes conflicts are rare.\r\n\r\n## Row-Level Locking\r\n\r\nScenario: Imagine a database table storing airline seat reservations. Each row represents a seat on a specific flight.\r\n\r\n| Flight | Seat | Customer |\r\n|---|---|---|\r\n| AA123 | 1A | John Doe |\r\n| AA123 | 1B | Jane Smith |\r\n| AA123 | 2A |  |\r\n| AA123 | 2B |  |\r\n\r\nTwo users try to book seat 2A simultaneously:\r\n\r\n1. User 1 clicks to book seat 2A. The database transaction begins.\r\n2. The database places a lock on the row representing seat 2A.\r\n3. User 2, at the exact same moment, also clicks to book seat 2A. Their transaction also begins.\r\n\r\nHowever, because User 1's transaction has already locked the row, User 2's transaction is blocked. 
It has to wait.\r\nUser 1 completes their booking (payment goes through, etc.). The transaction commits, and the lock on seat 2A is released.\r\n\r\nNow, User 2's transaction can proceed. But when it tries to access the row for seat 2A, it sees that it's no longer available (User 1 has it). The system informs User 2 that the seat is taken.\r\n\r\n## Optimistic Concurrency Control (OCC)\r\n\r\nScenario: A similar airline seat reservation system. This time, instead of locks, each row has a version number.\r\n\r\n| Flight | Seat | Customer | Version |\r\n|---|---|---|---|\r\n| AA123 | 1A | John Doe | 1 |\r\n| AA123 | 1B | Jane Smith | 1 |\r\n| AA123 | 2A |  | 1 |\r\n| AA123 | 2B |  | 1 |\r\n\r\nTwo users try to book seat 2A simultaneously:\r\n\r\n1. User 1 starts the booking process for seat 2A. The system reads the row and notes the version number (1).\r\n2. User 2, at almost the same time, also starts booking seat 2A. Their system also reads the row and notes the version number (1).\r\n3. User 1 completes their booking. The system checks if the version number in the database is still 1. It is, so the system updates the row with User 1's information and increments the version number to 2.\r\n4. User 2 completes their booking a fraction of a second later. Their system also checks if the version number is still 1. But now, it's 2! This means the row has been modified by another transaction (User 1's).\r\n5. User 2's transaction is aborted. 
The system informs them that the seat is no longer available.\r\n\r\n## Key Differences\r\n\r\n### Locking (Pessimistic):\r\n\r\n- Locks are acquired immediately, preventing conflicts upfront.\r\n- Can lead to performance issues if there are many concurrent users trying to access the same data (because of waiting).\r\n- Better for situations where conflicts are likely (e.g., very popular events).\r\n\r\n### OCC (Optimistic):\r\n\r\n- Assumes conflicts are rare and only checks for them at the end.\r\n- Generally performs better when conflicts are rare, because transactions don't wait on locks.\r\n- More complex to implement because you need to handle the cases where transactions are aborted.\r\n\r\nBoth methods are used to ensure data integrity in concurrent environments, but they have different trade-offs in terms of performance and complexity.\r\n\r\n# MySQL examples\r\n\r\nHere are some MySQL examples demonstrating row-level locking. MySQL doesn't expose OCC as a built-in feature, so I'll also show how it can be implemented in SQL with a version column.\r\n\r\n## Row-Level Locking (using FOR UPDATE)\r\n\r\nMySQL uses FOR UPDATE to acquire exclusive row-level locks.\r\n\r\n1. Table Setup:\r\n\r\n```SQL\r\nCREATE TABLE tickets (\r\n    id INT PRIMARY KEY AUTO_INCREMENT,\r\n    event_name VARCHAR(255),\r\n    available_seats INT\r\n);\r\n\r\nINSERT INTO tickets (event_name, available_seats) VALUES ('Concert X', 10);\r\n```\r\n\r\n2. 
Booking Process (simulating two concurrent users):\r\n\r\n- User 1 (in one MySQL session):\r\n\r\n```SQL\r\nSTART TRANSACTION; -- Start a transaction\r\n\r\nSELECT available_seats FROM tickets WHERE id = 1 FOR UPDATE; -- Lock the row\r\n\r\n-- IF ... THEN only works inside stored programs, so the seat check\r\n-- is done in the WHERE clause of the UPDATE instead\r\nUPDATE tickets SET available_seats = available_seats - 1\r\nWHERE id = 1 AND available_seats > 0;\r\n\r\n-- ROW_COUNT() is 1 if a seat was taken, 0 if none were left\r\nSELECT IF(ROW_COUNT() > 0, 'Booking successful', 'No seats available') AS message;\r\n\r\nCOMMIT; -- Commit the transaction, releasing the lock\r\n```\r\n\r\n- User 2 (in a separate MySQL session, running at almost the same time):\r\n\r\n```SQL\r\nSTART TRANSACTION;\r\n\r\nSELECT available_seats FROM tickets WHERE id = 1 FOR UPDATE; -- This will block until User 1's transaction commits\r\n\r\n-- (Once User 1 commits, this continues)\r\nUPDATE tickets SET available_seats = available_seats - 1\r\nWHERE id = 1 AND available_seats > 0;\r\n\r\nSELECT IF(ROW_COUNT() > 0, 'Booking successful', 'No seats available') AS message;\r\n\r\nCOMMIT;\r\n```\r\n\r\n3. Explanation:\r\n\r\n>START TRANSACTION begins a transaction.\r\nSELECT ... FOR UPDATE acquires an exclusive lock on the selected row. This prevents other transactions from modifying the row until the current transaction is committed or rolled back.\r\nIf User 2 tries to execute the SELECT ... FOR UPDATE while User 1's transaction holds the lock, User 2's query will wait.\r\nThe `available_seats > 0` condition in the UPDATE performs the seat check, and ROW_COUNT() reports whether a seat was actually taken.\r\nCOMMIT makes the changes permanent and releases the lock.\r\nROLLBACK undoes any changes and releases the lock.\r\n\r\n## Conceptual OCC in SQL (using a version column)\r\n\r\nMySQL doesn't have built-in OCC like some other databases, but you can implement it yourself using a version column:\r\n\r\n1. 
Table Setup (with a version column):\r\n\r\n```SQL\r\nALTER TABLE tickets ADD COLUMN version INT UNSIGNED NOT NULL DEFAULT 0;\r\n```\r\n\r\n2. Booking Process:\r\n\r\n```SQL\r\nSTART TRANSACTION;\r\n\r\n-- Plain read: no lock is taken\r\nSELECT available_seats, version FROM tickets WHERE id = 1 INTO @seats, @version;\r\n\r\n-- IF ... THEN only works inside stored programs, so both checks go in the WHERE clause\r\nUPDATE tickets\r\nSET available_seats = available_seats - 1, version = version + 1\r\nWHERE id = 1 AND available_seats > 0 AND version = @version; -- Check the version here!\r\n\r\n-- ROW_COUNT() is 0 if no seats were left or the version changed\r\nSELECT CASE\r\n    WHEN ROW_COUNT() > 0 THEN 'Booking successful'\r\n    WHEN @seats > 0 THEN 'Booking failed (concurrent update)'\r\n    ELSE 'No seats available'\r\nEND AS message;\r\n\r\nCOMMIT;\r\n```\r\n\r\n3. Explanation of OCC Implementation:\r\n\r\nWe retrieve the available_seats and the version number in one SELECT.\r\nThe crucial part is the `WHERE id = 1 AND available_seats > 0 AND version = @version` clause in the UPDATE statement. This condition ensures that the update only happens if seats remain and the version number in the database still matches the version number we read earlier.\r\nROW_COUNT() checks how many rows were affected by the UPDATE. If it's 0 while seats were available at read time, it means the version has changed (another transaction modified the row), and the booking fails.\r\n\r\nThis is a simplified example. In a real application, you would need to handle the 'booking failed' case gracefully, perhaps by retrying the booking or informing the user.\r\n\r\nThese examples illustrate the core concepts of row-level locking and OCC in MySQL. They are simplified for clarity but demonstrate the key mechanisms involved. 
Remember that actual performance and behavior can be affected by factors like database configuration, indexes, and transaction isolation levels.\r\n\r\n> [!NOTE]\r\n>  The behavior of row-level locking without FOR UPDATE depends heavily on the transaction isolation level set for your MySQL session or globally for the server.\r\n>\r\n> - **Default Isolation Level (REPEATABLE READ)**: In MySQL's default REPEATABLE READ isolation level, a simple SELECT statement within a transaction does not acquire any locks that prevent other transactions from modifying the selected rows. This means that if you perform a SELECT and then later try to UPDATE based on the data you read, you could encounter a 'lost update' problem.\r\n> - **READ COMMITTED Isolation Level**: In READ COMMITTED, a SELECT statement reads only committed data. However, it still doesn't acquire locks that prevent other transactions from modifying the data after the SELECT has finished. So, the lost update problem can still occur.\r\n> - **SERIALIZABLE Isolation Level**: This is the highest isolation level. In SERIALIZABLE, even a simple SELECT statement acquires shared locks that prevent other transactions from modifying the selected rows. This prevents lost updates and other concurrency problems, but it can also significantly reduce concurrency and performance.\r\n> - **Using FOR UPDATE (Pessimistic Locking)**: As discussed before, FOR UPDATE explicitly acquires an exclusive lock on the selected rows, regardless of the transaction isolation level (except in some very specific edge cases related to storage engines). 
This is the most reliable way to prevent concurrency issues like lost updates when you need to update data based on a previous read.\u3002", "top": 0, "createdAt": 1736488069, "style": "<style>.markdown-alert{padding:0.5rem 1rem;margin-bottom:1rem;border-left:.25em solid var(--borderColor-default,var(--color-border-default));}.markdown-alert .markdown-alert-title {display:flex;font-weight:var(--base-text-weight-medium,500);align-items:center;line-height:1;}.markdown-alert>:first-child {margin-top:0;}.markdown-alert>:last-child {margin-bottom:0;}</style><style>.markdown-alert.markdown-alert-note {border-left-color:var(--borderColor-accent-emphasis, var(--color-accent-emphasis));background-color:var(--color-accent-subtle);}.markdown-alert.markdown-alert-note .markdown-alert-title {color: var(--fgColor-accent,var(--color-accent-fg));}</style>", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-10", "dateLabelColor": "#0969da"}, "P5": {"htmlDir": "docs/post/Change Data Capture (CDC).html", "labels": ["system design", "database"], "postTitle": "Change Data Capture (CDC)", "postUrl": "post/Change%20Data%20Capture%20%28CDC%29.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/5", "commentNum": 0, "wordCount": 4404, "description": "System design interviews often delve into the complexities of data synchronization and real-time data processing. One crucial technique that frequently surfaces is Change Data Capture (CDC). Understanding CDC is essential for designing robust and scalable systems. 
This post will introduce you to the core concepts of CDC and its importance in system design.\r\n\r\n# What is Change Data Capture (CDC)?\r\n\r\nChange Data Capture is a set of software design patterns used to determine and track data that has changed in a database so that action can be taken using the changed data. It's about efficiently capturing and propagating changes made to data in a source system (usually a database) to downstream systems in near real-time. This is crucial for various use cases, including: \u00a0 \r\n\r\n- Keeping search indexes up-to-date (e.g., Elasticsearch, Solr): Ensuring search results reflect the latest data.\r\n- Data warehousing and ETL (Extract, Transform, Load): Populating data warehouses with incremental changes.\r\n- Real-time analytics and dashboards: Providing up-to-the-minute insights.\r\n- Cache invalidation: Keeping caches consistent with the source data.\r\n- Auditing and compliance: Tracking data changes for regulatory purposes.\r\n\r\n# Why is CDC Important in System Design?\r\n\r\nTraditional methods of data synchronization, like batch processing or polling, can be inefficient and introduce significant latency. \r\n\r\nCDC offers several advantages:\r\n\r\n- Near Real-Time Updates: CDC captures changes as they happen, enabling near real-time data synchronization.\r\n- Reduced Database Load: Instead of constantly querying the database for changes, CDC reads change logs or uses other efficient mechanisms, minimizing the impact on database performance.\r\n- Data Integrity: CDC ensures that changes are captured accurately and in order, maintaining data consistency across different systems.\r\n- Decoupling: CDC decouples the source system from downstream systems, improving flexibility and scalability.\r\n\r\n# Common CDC Techniques:\r\n\r\nThere are several ways to implement CDC, each with its own trade-offs:\r\n\r\n- Database Logs (Log-Based CDC): This is the most efficient and preferred method. 
Databases maintain transaction logs (e.g., redo logs in Oracle, write-ahead logs (WAL) in PostgreSQL, binary logs in MySQL) that record every change made to the database. CDC tools can parse these logs and extract the change data.\r\n\r\nPros: Minimal impact on database performance, high throughput, reliable.\r\nCons: Requires access to database logs, can be complex to implement directly.\r\n\r\n- Triggers: Database triggers can be used to capture changes. A trigger is a stored procedure that automatically executes in response to certain events (e.g., INSERT, UPDATE, DELETE).\r\n\r\nPros: Relatively simple to implement.\r\nCons: Can add overhead to database operations, can be difficult to manage complex change capture logic, not suitable for high-volume changes.\r\n\r\n- Polling: This involves periodically querying the database for changes based on a timestamp column or a modified flag.\r\n\r\nPros: Simple to implement.\r\nCons: Inefficient, introduces latency, puts significant load on the database, can miss changes if the polling interval is too long.\r\n\r\n- Dual Writes: This involves writing data to both the database and the downstream system simultaneously.\r\n\r\nPros: Simple for basic use cases\r\nCons: Introduces tight coupling, difficult to ensure atomicity, prone to inconsistencies if one write fails.\r\n\r\n# Example Scenario: Keeping Elasticsearch Synchronized\r\n\r\nImagine a system with a relational database storing product information and Elasticsearch used for search. 
Using CDC, you would:\r\n\r\n- Use a tool like Debezium to capture changes from the database's transaction log.\r\n- Stream these changes through a message queue like Kafka.\r\n- Have a consumer application read from Kafka, transform the change data into a format suitable for Elasticsearch, and index it.\r\n\r\nThis ensures that Elasticsearch is always up-to-date with the latest product information without constantly querying the database.\r\n\r\nUnderstanding CDC is a valuable asset in system design interviews. It demonstrates your knowledge of data synchronization techniques and your ability to design efficient and scalable systems. By understanding the different methods and their trade-offs, you can effectively address related questions and showcase your expertise.\u3002", "top": 0, "createdAt": 1736494857, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-10", "dateLabelColor": "#0969da"}, "P6": {"htmlDir": "docs/post/System Design TicketMaster.html", "labels": ["system design"], "postTitle": "System Design TicketMaster", "postUrl": "post/System%20Design%20TicketMaster.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/6", "commentNum": 0, "wordCount": 4987, "description": "# Requirements\r\n\r\n## Functional \r\n\r\n1.  View events\r\n2.  Search events by keywords\r\n3.  Order tickets\r\n\r\n## Non Functional\r\n\r\n1. Support 100m users, read heavy, read/write ratio 100:1\r\n2. Search should be faster and return within 500 ms\r\n3. 
No double booking\r\n\r\n# Entities\r\n\r\n- Event\r\n- User\r\n- Ticket\r\n- Order\r\n\r\n# API\r\n\r\n- GET /events -> [event list (event name, date, available tickets)]\r\n- GET /event/{event_id} -> [event detail (name, desc, performer, date, location, tickets)]\r\n- GET /events/search?keyword={keyword}&start={start_date}&end={end_date}&page_size={page_size}&page_num={page_num} -> [event list]\r\n- POST /order/{event_id}\r\n```\r\n    {\r\n        tickets,\r\n        payments\r\n    }\r\n```\r\n\r\n# High Level Design\r\n\r\n## View event\r\n\r\n![image](https://github.com/user-attachments/assets/4b55d681-b0b9-404a-959f-94a2ec64e014)\r\n\r\n1. user makes a request to view an event\r\n2. API gateway forwards the request to the event server\r\n3. event server fetches the event detail and returns it to the client\r\n\r\n## Search for event\r\n\r\n![image](https://github.com/user-attachments/assets/47f78c5a-323a-4cd5-b45e-33d5e7eb9dda)\r\n\r\n1. user searches for events by keyword\r\n2. API gateway forwards the request to the search server\r\n3. search server queries the DB with a 'like' SQL query and returns the result to the client\r\n\r\n## Order tickets\r\n\r\n![image](https://github.com/user-attachments/assets/a5e12df6-5d1f-48f5-9a03-86114701a6bd)\r\n\r\nThe user already has the details of the event, such as the tickets available, and can then order tickets.\r\n\r\n1. user sends an order request (userid, tickets, event)\r\n2. API gateway forwards the request to the order server\r\n3.  order service integrates with a 3rd party payment service like Stripe for payment\r\n4. order server\r\n    -  queries the availability of the tickets, and books them\r\n    -  updates the ticket status to booked once payment is done\r\n    -  adds a new order record in the same transaction.\r\n\r\n# Deep dive\r\n\r\n## Bad user experience when booking\r\n\r\nA user finds available tickets and pays for them, only to discover that they have already been ordered by someone else; the problem is that we don't reserve the tickets.  
\r\n\r\nTo reserve the tickets, one way is to use a DB feature like row-level locking to lock the record. The problem is that the lock has no built-in timeout, so it's up to the application to handle the edge cases and avoid leaving the lock in an uncertain state. This helps avoid double booking, but it's not a good fit for reservation.\r\n\r\nAnother way is to add a 'reserved' state to the status field in the DB and use a cron job to check for expired reservations periodically; unlocking may be delayed depending on the interval of the cron job.\r\n\r\nA more preferred way is to use Redis, which works as follows:\r\n\r\n**Reservation**\r\n1. User selects an available seat and books a ticket\r\n2. Order service adds (ticket id, user id) to Redis as a distributed lock with a reservation timeout; the record is removed automatically once the TTL expires\r\n3. Order service creates an order record and returns the order id\r\n4. Other users need to check the availability in Redis before booking (available in DB and not reserved in Redis)\r\n\r\n**Payment**\r\n1. with the order id, the user can issue payment via the payment service\r\n2. the payment service calls back the order service with the order id once the payment succeeds\r\n3. order service writes the payment detail to the order record, and changes the ticket status to 'booked'\r\n\r\n## How to scale up to tens of millions of concurrent users during popular events\r\n\r\nTo cope with read-heavy event views of up to tens of millions:\r\n\r\n- we can utilize a cache, since event details are mostly unchanged\r\n-  event service is stateless, so it can easily scale horizontally, and we can put a load balancer in front of the event services\r\n\r\n## How to handle millions of concurrent bookings during popular events\r\n\r\nFirst, the user should be notified of ticket status changes immediately; polling won't work well in this case, but SSE (Server-Sent Events) does. 
But the system may still not be able to handle this kind of burst, so we need a mechanism to protect it. We can add a feature toggle that is enabled in this case and parks users in a queue: the client uses a WebSocket connection to ask the queuing service for a token, and once the order service dequeues a user from the queue, it issues a token for that user and notifies them via the WebSocket to send the order request.\r\n\r\n![image](https://github.com/user-attachments/assets/17098f97-99c0-4c2e-8476-8aec4e550da4)\r\n\r\n## How to improve search to meet low latency requirements?\r\n\r\nSQL 'like' will scan the whole table, which is not efficient; we can either build a full-text index in the DB or use ElasticSearch, kept in sync via CDC (Change Data Capture).\r\n\r\nThe search results can also be cached with memcached or Redis, but it could be more efficient to use the ElasticSearch cache b/c it's smarter about app logic (it can reuse the cache for aggregation, filtering, etc.). \r\n\r\n![image](https://github.com/user-attachments/assets/e68e60d6-1a22-48ba-b022-ca6553d29529)\r\n\u3002", "top": 0, "createdAt": 1736557963, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-11", "dateLabelColor": "#0969da"}, "P7": {"htmlDir": "docs/post/System Design Bit.ly.html", "labels": ["system design"], "postTitle": "System Design Bit.ly", "postUrl": "post/System%20Design%20Bit.ly.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/7", "commentNum": 0, "wordCount": 2205, "description": "# Requirements\r\n\r\n## Functional\r\n\r\n- given a long url, return a short url\r\n- short url redirects to long url\r\n- customize short url\r\n- set expiration date\r\n\r\n## NonFunctional\r\n\r\n- scalability 100M users, read/write ratio 10:1\r\n- redirect should be fast, less than 100 ms\r\n- uniqueness of 
short url for the same long url\r\n\r\n# Entities\r\n\r\n- original url\r\n- short url\r\n- user\r\n\r\n# API\r\n\r\n1. POST /url\r\ninput:\r\n```\r\n{\r\n   original url,\r\n   customized code (optional),\r\n   expiration date (optional)\r\n}\r\n```\r\n\r\noutput: \r\n```\r\n{\r\n   short url\r\n}\r\n```\r\n\r\n2. GET /{shortcode}\r\n\r\n302 redirect to original url\r\n\r\n# High level design\r\n\r\n![image](https://github.com/user-attachments/assets/5b22ba65-0f1b-4a7d-8d28-5a6b0d979fab)\r\n\r\n# Deep dive\r\n\r\n## Uniqueness of url\r\n\r\nUse a hash function or a random string. If we worry about collisions, we can either regenerate the random string or add a salt when generating the hash, or we can add the username to the url (like a namespace). \r\n\r\nFor the short url, we should use Base62 encoding, and truncate the generated string to a small number of characters.\r\n\r\nA global counter may work, but it adds an extra service to the system, which may fail and adds more complexity.\r\n\r\n## Fast redirect\r\n\r\nFirst, the DB query should be fast, so we can add an index on the short code. Besides, we can cache the short codes in memory with an LRU cache policy. 
We can also utilize a CDN; some CDN providers offer basic computing at the edge, so we can perform the redirect there.\r\n\r\n## scale 100M users\r\n\r\nAssume each user uses the redirect service 10 times per day, and creates 1 short url per day:\r\n\r\nQPS:\r\n- 100M writes per day, 100M / 100k = 1k QPS \r\n- 100M x 10 reads per day, 10k QPS\r\n\r\nSplit into read/write services\r\n\r\nrecord size:\r\n- original url (100 bytes)\r\n- short url (50 bytes)\r\n- expire date (8 bytes)\r\n- create date (8 bytes)\r\n- user id (8 bytes)\r\ntotal ~ 200 bytes\r\n\r\nstorage (1 year)\r\n- url table 200 bytes x 100M x 1 x 365 = 20 GB x 365 ~ 7,300 GB ~ 7.3 TB\r\n\r\nwe can use a NoSQL DB like DynamoDB, which:\r\n- easily handles ~1k writes/sec and ~10k reads/sec.\r\n- scales horizontally with sharding and replication.\r\n\r\n![image](https://github.com/user-attachments/assets/2b479030-1436-4fd8-b625-0fae2cfc446e)\r\n\r\n\r\n\u3002", "top": 0, "createdAt": 1736562990, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-11", "dateLabelColor": "#0969da"}, "P8": {"htmlDir": "docs/post/juan-ji-ceng-shu-chu-te-zheng-tu-de-wei-du-ji-suan.html", "labels": ["ML"], "postTitle": "\u5377\u79ef\u5c42\u8f93\u51fa\u7279\u5f81\u56fe\u7684\u7ef4\u5ea6\u8ba1\u7b97", "postUrl": "post/juan-ji-ceng-shu-chu-te-zheng-tu-de-wei-du-ji-suan.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/8", "commentNum": 0, "wordCount": 2483, "description": "# Formula\r\n\r\n `out = (in + 2 * padding - kernel) / stride + 1`\r\n\r\n- Input (in): the size of the input image (width or height).\u3002", "top": 0, "createdAt": 1736616657, "style": "<style>.markdown-alert{padding:0.5rem 1rem;margin-bottom:1rem;border-left:.25em solid 
var(--borderColor-default,var(--color-border-default));}.markdown-alert .markdown-alert-title {display:flex;font-weight:var(--base-text-weight-medium,500);align-items:center;line-height:1;}.markdown-alert>:first-child {margin-top:0;}.markdown-alert>:last-child {margin-bottom:0;}</style><style>.markdown-alert.markdown-alert-note {border-left-color:var(--borderColor-accent-emphasis, var(--color-accent-emphasis));background-color:var(--color-accent-subtle);}.markdown-alert.markdown-alert-note .markdown-alert-title {color: var(--fgColor-accent,var(--color-accent-fg));}</style>", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-12", "dateLabelColor": "#0969da"}, "P9": {"htmlDir": "docs/post/zhuan-zhi-juan-ji-\uff08transposed convolution\uff09.html", "labels": ["ML"], "postTitle": "\u8f6c\u7f6e\u5377\u79ef\uff08transposed convolution\uff09", "postUrl": "post/zhuan-zhi-juan-ji-%EF%BC%88transposed%20convolution%EF%BC%89.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/9", "commentNum": 0, "wordCount": 2745, "description": "Transposed convolution (also called deconvolution or upsampling convolution) is an operation used to **enlarge the feature map size**; it is commonly used in the upsampling stage of generator networks (e.g., GANs) or image segmentation models (e.g., U-Net).\u3002", "top": 0, "createdAt": 1736622284, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script><script>MathJax = {tex: {inlineMath: [[\"$\", \"$\"]]}};</script><script async src=\"https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\"></script>", "head": "", "ogImage": 
"https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-12", "dateLabelColor": "#0969da"}, "P10": {"htmlDir": "docs/post/Bilinear Interpolation.html", "labels": ["ML"], "postTitle": "Bilinear Interpolation", "postUrl": "post/Bilinear%20Interpolation.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/10", "commentNum": 0, "wordCount": 2535, "description": "Bilinear interpolation is a method used to estimate the value of a function at a point within a 2D grid based on the values of the function at the grid's surrounding points. It is a straightforward extension of linear interpolation to two dimensions.\r\n\r\n---\r\n\r\n### How It Works\r\nSuppose you have a rectangular grid with known values at four corners, and you want to find the interpolated value at a point inside this rectangle.\r\n\r\n1. **Grid and Points**:\r\n   - Let the four known points on the grid be $(x_1, y_1), (x_2, y_1), (x_1, y_2), (x_2, y_2)$, with values $Q_{11}, Q_{21}, Q_{12}, Q_{22}$, respectively.\r\n   - The point where you want to interpolate the value is $(x, y)$, where $x_1 \\leq x \\leq x_2 \\) and \\( y_1 \\leq y \\leq y_2$.\r\n\r\n2. 
**Two-Step Process**:\r\n\r\n   - **Step 1: Interpolate along the x-direction** at fixed $y_1$ and $y_2$:\r\n     - At $y_1$: Interpolate between $Q_{11}$ and $Q_{21}$:\r\n  \r\n$$\r\nQ_{x1} = \\frac{(x_2 - x)}{(x_2 - x_1)} Q_{11} + \\frac{(x - x_1)}{(x_2 - x_1)} Q_{21}\r\n$$\r\n\r\n     - At $y_2$: Interpolate between $Q_{12}$ and $Q_{22}$:\r\n\r\n$$\r\nQ_{x2} = \\frac{(x_2 - x)}{(x_2 - x_1)} Q_{12} + \\frac{(x - x_1)}{(x_2 - x_1)} Q_{22}\r\n$$\r\n\r\n   - **Step 2: Interpolate along the y-direction** between $Q_{x1}$ and $Q_{x2}$:\r\n\r\n$$\r\nQ(x, y) = \\frac{(y_2 - y)}{(y_2 - y_1)} Q_{x1} + \\frac{(y - y_1)}{(y_2 - y_1)} Q_{x2}\r\n$$\r\n\r\n---\r\n\r\n### Key Features\r\n\r\n- **Linear Interpolation in Two Directions**: The method performs linear interpolation first in one direction (x) and then in the other direction (y).\r\n- **Smooth Transition**: Bilinear interpolation gives a smooth result, as it is based on weighted averages of nearby points.\r\n- **Grid Constraints**: It assumes a rectangular grid structure and requires the function values at four corners of the rectangle enclosing the interpolation point.\r\n\r\n---\r\n\r\n### Example\r\n\r\nIf you have the grid:\r\n\r\n$$\r\n\\begin{array}{c|c|c}\r\n      & x_1 = 0 & x_2 = 1 \\\\\r\n\\hline\r\ny_1 = 0 & Q_{11} = 10 & Q_{21} = 20 \\\\\r\ny_2 = 1 & Q_{12} = 30 & Q_{22} = 40 \\\\\r\n\\end{array}\r\n$$\r\n\r\nTo find the value at $(x, y) = (0.5, 0.5)$:\r\n1. Interpolate along $x$ for $y = 0$: $Q_{x1} = (10 + 20)/2 = 15$.\r\n2. Interpolate along $x$ for $y = 1$: $Q_{x2} = (30 + 40)/2 = 35$.\r\n3. 
Interpolate along $y$: $Q(0.5, 0.5) = (15 + 35)/2 = 25$.\r\n\r\n---\r\n\r\nBilinear interpolation is widely used in image processing, computer graphics, and numerical simulations for tasks like resizing images or filling missing data.\u3002", "top": 0, "createdAt": 1736624125, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script><script>MathJax = {tex: {inlineMath: [[\"$\", \"$\"]]}};</script><script async src=\"https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\"></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-12", "dateLabelColor": "#0969da"}, "P11": {"htmlDir": "docs/post/tu-xiang-fen-ge-ren-wu-zhong-de-sun-shi-han-shu.html", "labels": ["ML"], "postTitle": "\u56fe\u50cf\u5206\u5272\u4efb\u52a1\u4e2d\u7684\u635f\u5931\u51fd\u6570", "postUrl": "post/tu-xiang-fen-ge-ren-wu-zhong-de-sun-shi-han-shu.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/11", "commentNum": 0, "wordCount": 11883, "description": "This post introduces several loss functions for image segmentation tasks, including **Dice Loss**, **BCE-Dice Loss**, **IoU Loss**, and **Focal Loss**.\u3002", "top": 0, "createdAt": 1736750462, "style": "<style>.markdown-alert{padding:0.5rem 1rem;margin-bottom:1rem;border-left:.25em solid var(--borderColor-default,var(--color-border-default));}.markdown-alert .markdown-alert-title {display:flex;font-weight:var(--base-text-weight-medium,500);align-items:center;line-height:1;}.markdown-alert>:first-child {margin-top:0;}.markdown-alert>:last-child {margin-bottom:0;}</style><style>.markdown-alert.markdown-alert-note {border-left-color:var(--borderColor-accent-emphasis, var(--color-accent-emphasis));background-color:var(--color-accent-subtle);}.markdown-alert.markdown-alert-note .markdown-alert-title {color: 
var(--fgColor-accent,var(--color-accent-fg));}</style>", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script><script>MathJax = {tex: {inlineMath: [[\"$\", \"$\"]]}};</script><script async src=\"https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\"></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-13", "dateLabelColor": "#0969da"}, "P12": {"htmlDir": "docs/post/System Design Yelp.html", "labels": ["system design"], "postTitle": "System Design Yelp", "postUrl": "post/System%20Design%20Yelp.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/12", "commentNum": 0, "wordCount": 4124, "description": "# Requirements (5 mins):\r\n\r\n## Functional Requirements\r\n\r\n> Identify core features (e.g., 'Users should be able to post tweets'). Prioritize 2-3 key features.\r\n\r\n- *Users can add/del/update a business (not in scope)*\r\n- Users can find nearby places by name, location, category\r\n- Users can view the business\r\n- Users can add reviews and ratings\r\n\r\n## Non-Functional Requirements\r\n\r\n> Focus on system qualities like scalability, latency, and availability. 
Quantify where possible (e.g., 'render feeds in under 200ms').\r\n\r\n- The search should be fast and return within 500 ms\r\n- The system should be highly available, eventual consistency is fine\r\n- The system should be scalable to support 100m DAU, 10m businesses\r\n- One user can leave only one review per business\r\n\r\n## Capacity Estimation\r\n\r\n> Skip unnecessary calculations unless they directly impact the design (e.g., sharding in a TopK system).\r\n\r\n- read heavy: 100m x 10 reads/day / 100,000 = 10k QPS\r\n- review storage: 10m businesses x 100 reviews x 1,000 bytes = 1 TB\r\n\r\n## Core Entities (2 mins)\r\n\r\n>  Identify key entities (e.g., User, Tweet, Follow) to define the system's foundation.\r\n\r\n- Business\r\n- Users\r\n- Reviews\r\n\r\n# API/System Interface (5 mins)\r\n\r\n> Define the contract between the system and users. Prefer RESTful APIs unless GraphQL is necessary.\r\n\r\n- search, GET /businesses?keyword&location&category&page -> [business]\r\n- view biz detail, GET /business/id -> business detail\r\n- view reviews, GET /business/reviews?page -> [reviews]\r\n- review, POST /business/review -> 200 OK, body { comments, rating }\r\n\r\n# [Optional] Data Flow (5 mins)\r\n\r\n> Describe high-level processes for data-heavy systems (e.g., web crawlers).\r\n\r\n# High-Level Design (10-15 mins)\r\n\r\n> Draw the system architecture, focusing on core components (e.g., servers, databases). Keep it simple and iterate based on API endpoints.\r\n\r\n![image](https://github.com/user-attachments/assets/ab66c559-967c-4e2b-98e5-2a222740b8d4)\r\n\r\n# Deep Dives (10 mins)\r\n\r\n> Address non-functional requirements, edge cases, and bottlenecks. Proactively improve the design (e.g., scaling, caching, database sharding).\r\n\r\n![image](https://github.com/user-attachments/assets/6ac15476-9a6d-43c5-a6c7-68eb56f13979)\r\n\r\n## search within 500 ms\r\n\r\nSQL range queries on lat/long are not efficient; we can use either a geo index or a quadtree. 
For keyword search, we could use an inverted index. ElasticSearch supports those features, and we can use the CDC (Change Data Capture) feature of the DB to sync the data through a queue/stream. However, this adds an extra service and introduces some complexity in the system.  Based on our estimation above, the database size is around 1 TB, so we can actually use Postgres with the PostGIS extension in this case: Postgres supports full-text search natively, and PostGIS adds geo indexing.\r\n\r\n## search by predefined names like city etc\r\n\r\nSearch within a radius won't work in this case b/c the shapes are polygons. We can download polygon data from [geoapify](https://www.geoapify.com/download-all-the-cities-towns-villages) and add a location table:\r\n\r\n- name\r\n- type (city/neighborhood etc)\r\n- polygons\r\n\r\nPostgres with the PostGIS extension and Elasticsearch both support polygon queries. However, running them on every search is not efficient, so we can precompute the location names for each business once, when the record is created, and store them in the DB to avoid computing them on each query.\r\n\r\n## how to update avg rating\r\n\r\nQuerying all review records for one business and recalculating the score each time is not optimal; we can instead update the average whenever a new review record is added, using an incremental formula, so basically $avg(n) = [r(n) + r(n-1) + ... 
+ r(1)] / n = r(n) / n + avg(n-1) \\times (n-1) / n$. However, there could be concurrent update issues, which could be handled via either a row-level lock or the optimistic concurrency control introduced in [Avoid double booking](https://leadtechinterview.github.io/post/Avoid%20double%20booking.html)\r\n\r\n## one review per user per biz\r\n\r\nApplication-level checking works, but it would be better to enforce the constraint at the DB layer\r\n\r\n```\r\nALTER TABLE reviews\r\nADD CONSTRAINT unique_user_business UNIQUE (user_id, business_id);\r\n```\r\n\u3002", "top": 0, "createdAt": 1736900434, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script><script>MathJax = {tex: {inlineMath: [[\"$\", \"$\"]]}};</script><script async src=\"https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\"></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-15", "dateLabelColor": "#0969da"}, "P13": {"htmlDir": "docs/post/System Design Scaling Cheat Sheet.html", "labels": ["system design"], "postTitle": "System Design Scaling Cheat Sheet", "postUrl": "post/System%20Design%20Scaling%20Cheat%20Sheet.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/13", "commentNum": 0, "wordCount": 5914, "description": "## 1. 
**Database Selection by QPS (Read vs Write)**\r\n\r\n| **QPS Range**         | **Type of Database**                  | **Examples**                      | **Scaling Techniques**             |\r\n|-----------------------|---------------------------------------|-----------------------------------|------------------------------------|\r\n| **Low (<1,000)**      | Relational / Document DB              | MySQL, PostgreSQL, MongoDB        | Vertical scaling, Caching         |\r\n| **Medium (1k\u201310k)**   | Relational (tuned) / Document DB      | MySQL + Replicas, DynamoDB        | Read replicas, Caching, Indexing  |\r\n| **High (10k\u20131M)**     | Distributed NoSQL                     | Cassandra, DynamoDB, Bigtable     | Sharding, Horizontal scaling      |\r\n| **Extreme (>1M)**     | Distributed In-memory / Advanced DB   | Redis, CockroachDB, Spanner       | Sharding, Partitioning, CDNs      |\r\n\r\n---\r\n\r\n## 2. **Mainstream Database QPS Capacity (Read vs Write)**\r\n\r\n| **Database Type**      | **Database**            | **Read QPS (Per Node)**            | **Write QPS (Per Node)**           |\r\n|------------------------|-------------------------|------------------------------------|------------------------------------|\r\n| **Relational DB**       | MySQL, PostgreSQL       | 2k\u201310k QPS (optimized for reads)  | 500\u20132k QPS (write-optimized)      |\r\n| **Document DB**         | MongoDB                 | 5k\u201320k QPS (tuned for reads)      | 1k\u20135k QPS (write-heavy)           |\r\n| **Wide-Column Store**   | Cassandra               | 10k\u201350k QPS (cluster optimized)   | 5k\u201320k QPS (write-optimized)      |\r\n| **Key-Value Store**     | Redis                   | 100k\u20131M QPS (in-memory optimized) | 100k\u20131M QPS (write-intensive)     |\r\n| **Time-Series DB**      | InfluxDB                | 50k\u2013500k QPS                      | 10k\u201350k QPS                       |\r\n| **Distributed SQL**     | CockroachDB             
| 10k\u201350k QPS                       | 2k\u201310k QPS                        |\r\n| **Search Engines**      | Elasticsearch           | 1k\u201320k QPS (query-dependent)      | 500\u20135k QPS (write-intense)        |\r\n\r\n---\r\n\r\n## 3. **Sharding and Scaling: When and Why**\r\n\r\n### **Key Indicators for Sharding**:\r\n\r\n| **Condition**              | **Indicators**                                                             | **Action**                               |\r\n|----------------------------|---------------------------------------------------------------------------|------------------------------------------|\r\n| **Data Volume**             | Storage exceeds capacity of a single node or disk.                        | Use **sharding** when storage exceeds ~500GB\u20131TB per node. |\r\n| **High Write Throughput**   | Write latency increases due to bottlenecks.                               | Shard by write-heavy keys (>5k\u201310k writes/sec). |\r\n| **Read/Write Latency**      | Latency exceeds acceptable thresholds.                                   | Partition data, shard for high loads.    |\r\n| **Data Hotspotting**        | Some partitions/shards receive disproportionate traffic.                 | Implement **sharding** to balance load across nodes. |\r\n| **Query Performance**       | Queries are slow due to large tables.                                     | Use **partitioning** to improve query performance. |\r\n\r\n---\r\n\r\n## 4. **Other Scaling Concepts to Consider**\r\n\r\n### **1. Load Balancing**\r\n\r\n| **Concept**               | **Description**                                                          | **Techniques**                      |\r\n|---------------------------|--------------------------------------------------------------------------|-------------------------------------|\r\n| **Load Balancing**         | Distribute incoming traffic across multiple servers to ensure reliability and prevent overload. 
| **Round-robin**, **Least Connections**, **Weighted balancing** using **NGINX**, **HAProxy**, **AWS ELB**. |\r\n\r\n### **2.  Fault Tolerance, CAP Theorem**\r\n\r\n| **Concept**               | **Description**                                                          | **Techniques**                      |\r\n|---------------------------|--------------------------------------------------------------------------|-------------------------------------|\r\n| **Fault Tolerance**        | Ensuring that the system remains operational even if some components fail. | **Replication**, **Consensus algorithms** (e.g., **Paxos**, **Raft**), **Failover** systems. |\r\n| **CAP Theorem**            | A distributed system can only guarantee **two** of the following: **Consistency**, **Availability**, or **Partition Tolerance**. | Choose between **eventual consistency** or **strong consistency** depending on system requirements. |\r\n| **Eventual Consistency**   | Allowing data to propagate slowly across nodes, often used in **NoSQL** databases. | Use **CRDTs**, **Event sourcing**, **CQRS**, **Tunable consistency**. |\r\n| **Data Replication**       | Duplication of data across nodes to ensure high availability and fault tolerance. | Use **Master-Slave replication**, **Multi-region replication**. |\r\n\r\n### **3. Queues and Asynchronous Processing**\r\n\r\n| **Concept**               | **Description**                                                          | **Techniques**                      |\r\n|---------------------------|--------------------------------------------------------------------------|-------------------------------------|\r\n| **Asynchronous Processing** | Offloading long-running tasks to background jobs for better system responsiveness. | **Message queues** like **RabbitMQ**, **Kafka**, **Amazon SQS**, **Job schedulers**. |\r\n| **Event-driven Architectures** | Architecting systems to react to events in real-time. 
| Use **Event-driven systems** with **publish-subscribe** models. |\r\n\r\n### **4. Auto-Scaling and Elastic Infrastructure**\r\n\r\n### **5. Global Distribution & Multi-Region Deployment**\r\n\u3002", "top": 0, "createdAt": 1736902748, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-15", "dateLabelColor": "#0969da"}, "P14": {"htmlDir": "docs/post/System Design Leetcode.html", "labels": ["system design"], "postTitle": "System Design Leetcode", "postUrl": "post/System%20Design%20Leetcode.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/14", "commentNum": 0, "wordCount": 3270, "description": "# Requirements (5 mins):\r\n\r\n## Functional Requirements\r\n\r\n> Identify core features (e.g., 'Users should be able to post tweets'). Prioritize 2-3 key features.\r\n\r\n1. users can browse lists of coding problems\r\n2. users can view a problem and code in different languages \r\n3. users can submit a solution to the coding problem and get the result\r\n4. users can view the leaderboard\r\n\r\n## Non-Functional Requirements\r\n\r\n> Focus on system qualities like scalability, latency, and availability. Quantify where possible (e.g., 'render feeds in under 200ms').\r\n\r\n1. scale 1m DAU\r\n2. availability >> consistency \r\n3. security, isolate env to run code\r\n4. 
user should be able to validate the solution within 1 second\r\n\r\n## Capacity Estimation\r\n\r\n> Skip unnecessary calculations unless they directly impact the design (e.g., sharding in a TopK system).\r\n\r\n100k users take part in a competition and each refreshes the leaderboard 6 times per minute, so QPS is 100k x 6 / 60 = 10k, a heavy read load\r\n\r\n## Core Entities (2 mins)\r\n\r\n>  Identify key entities (e.g., User, Tweet, Follow) to define the system's foundation.\r\n\r\n- problem\r\n- solution\r\n- leaderboard\r\n\r\n# API/System Interface (5 mins)\r\n\r\n> Define the contract between the system and users. Prefer RESTful APIs unless GraphQL is necessary.\r\n\r\n- GET /problems?page&company&category -> [problems]\r\n- GET /problem/id -> {desc, category, tags ...}\r\n- POST /solution/problem_id, body {lang, code} -> [pass/fail, timecost]\r\n- GET /leaderboard/problem_id?page -> [rankings]\r\n\r\n\r\n# [Optional] Data Flow (5 mins)\r\n\r\n> Describe high-level processes for data-heavy systems (e.g., web crawlers).\r\n\r\n# High-Level Design (10-15 mins)\r\n\r\n> Draw the system architecture, focusing on core components (e.g., servers, databases). Keep it simple and iterate based on API endpoints.\r\n\r\n![image](https://github.com/user-attachments/assets/6e5d88a2-1af9-4a68-a10f-b4378e717842)\r\n\r\n# Deep Dives (10 mins)\r\n\r\n> Address non-functional requirements, edge cases, and bottlenecks. 
Proactively improve the design (e.g., scaling, caching, database sharding).\r\n\r\n## how to achieve isolation and security\r\n\r\n- mount the code as read-only, and write any output to the /tmp directory\r\n- set limits on CPU/memory usage for the container\r\n- guard against infinite loops or long-running code: run it as a subprocess, monitor a timeout, and kill it if needed\r\n-  limited network access (VPC)\r\n- no unrestricted system calls: mock or restrict them\r\n\r\n## how to solve the heavy read load of querying the leaderboard\r\n\r\nWe can use a cache, but it won't be up to date. Instead, we can use a [Redis sorted set](https://redis.io/docs/latest/develop/data-types/sorted-sets/) to implement the leaderboard: update both Redis and the DB, but query only Redis to get the top N.\r\n\r\n## how to scale to support 100k concurrent users for competition\r\n\r\nSubmission handling is CPU intensive, so we can auto-scale the docker containers using a cloud service. In case of peak usage, we can add a queue to the system, but that will make the POST solution API async, and we need to add another API to query the result.\r\n\r\n## how to write test cases efficiently for all languages\r\n\r\nWe can define the test cases as test vectors, specifying the input and the expected output in JSON format.\r\n\r\n![image](https://github.com/user-attachments/assets/ba7f8e47-22f5-4890-b802-281cb4cbb4c1)\r\n\u3002", "top": 0, "createdAt": 1736919705, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-15", "dateLabelColor": "#0969da"}, "P15": {"htmlDir": "docs/post/Coding question- Valid Sudoku.html", "labels": ["coding"], "postTitle": "Coding question: Valid Sudoku", "postUrl": "post/Coding%20question-%20Valid%20Sudoku.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/15", "commentNum": 0, "wordCount": 2458, "description": "# Desc\r\n\r\nDetermine 
whether a Sudoku is valid.\r\nThe Sudoku board could be partially filled, where empty cells are filled with the character '.'.\r\n\r\n# Memo\r\n\r\n1. one loop is enough, we don't need 3 loops to check cols, rows and boxes\r\n2. `k = 3 * i//3 + j//3` is WRONG, and it's NOT the same as `k = i//3 * 3 + j//3`\r\n3. don't forget to ignore '.'\r\n\r\n# Solution\r\n\r\nLoop 3 times, not good\r\n\r\n```python\r\n    def is_valid_sudoku(self, board: List[List[str]]) -> bool:\r\n        # write your code here\r\n        m, n = len(board), len(board[0])\r\n        assert m == 9 and n == 9\r\n\r\n        for i in range(m):\r\n            cnt = [0 for _ in range(n)]\r\n            for j in range(n):\r\n                if board[i][j] == '.': continue\r\n                idx = ord(board[i][j]) - ord('0') - 1\r\n                cnt[idx] += 1\r\n                if cnt[idx] > 1:\r\n                    return False\r\n\r\n        for j in range(n):\r\n            cnt = [0 for _ in range(m)]\r\n            for i in range(m):\r\n                if board[i][j] == '.': continue\r\n                idx = ord(board[i][j]) - ord('0') - 1\r\n                cnt[idx] += 1\r\n                if cnt[idx] > 1:\r\n                    return False\r\n\r\n        cnt = [[0 for _ in range(9)] for _ in range(9)]\r\n        for i in range(m):\r\n            for j in range(n):\r\n                if board[i][j] == '.': continue\r\n                k = i//3 * 3 + j//3\r\n                idx = ord(board[i][j]) - ord('0') - 1\r\n                cnt[k][idx] += 1\r\n                if cnt[k][idx] > 1:\r\n                    return False\r\n\r\n        return True\r\n```\r\n\r\nMore elegant way:\r\n\r\n```python\r\n    def is_valid_sudoku(self, board: List[List[str]]) -> bool:\r\n        # write your code here\r\n        m, n = len(board), len(board[0])\r\n        assert m == 9 and n == 9\r\n\r\n        rows = [set() for _ in range(9)]\r\n        cols = [set() for _ in range(9)]\r\n        boxes = [set() for _ 
in range(9)]\r\n\r\n        for i in range(9):\r\n            for j in range(9):\r\n                val = board[i][j]\r\n                if val == '.': continue\r\n                \r\n                if val in rows[i]:\r\n                    return False\r\n                rows[i].add(val)\r\n\r\n                if val in cols[j]:\r\n                    return False\r\n                cols[j].add(val)\r\n\r\n                idx = i//3 * 3 + j//3\r\n                if val in boxes[idx]:\r\n                    return False\r\n                boxes[idx].add(val)\r\n\r\n        return True\r\n```\r\n\u3002", "top": 0, "createdAt": 1736969422, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-16", "dateLabelColor": "#0969da"}, "P16": {"htmlDir": "docs/post/Coding question- Happy Number.html", "labels": ["coding"], "postTitle": "Coding question: Happy Number", "postUrl": "post/Coding%20question-%20Happy%20Number.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/16", "commentNum": 0, "wordCount": 2185, "description": "# Desc\r\n\r\nWrite an algorithm to determine if a number is happy.\r\n\r\nA happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. 
Those numbers for which this process ends in 1 are happy numbers.\r\n\r\n```\r\nInput:19\r\nOutput:true\r\nExplanation:\r\n\r\n19 is a happy number\r\n\r\n    1^2 + 9^2 = 82\r\n    8^2 + 2^2 = 68\r\n    6^2 + 8^2 = 100\r\n    1^2 + 0^2 + 0^2 = 1\r\n\r\nInput:5\r\nOutput:false\r\nExplanation:\r\n\r\n5 is not a happy number\r\n\r\n25->29->85->89->145->42->20->4->16->37->58->89\r\n89 appears again.\r\n```\r\n\r\n# Memo\r\n\r\nWhat makes this problem interesting is why the sequence cannot grow without bound, so the process always either reaches 1 or falls into a cycle.\r\n\r\n## Some intuition:\r\n\r\n1. for a 3-digit number, the square sum is no larger than 243\r\n2. starting from 4-digit numbers, the square sum has fewer digits than the number itself\r\n3. if we keep going, the value drops to <= 3 digits, where only finitely many values exist, so the sequence must reach 1 or repeat\r\n\r\n|number | square sum |\r\n|------------|-------------------|\r\n| 9             | 81                 |\r\n| 99          | 162               |\r\n|999         |  243              |\r\n|9999       | 324              |\r\n|99999     | 405              |\r\n|999999   | 486              |\r\n\r\n## Math:\r\n\r\nAssume $a_1, a_2, \\dots a_m$ are the m digits of n. We can prove that for large enough m (m >= 4) the squared sum is smaller than n.\r\n\r\n$$\r\n\\begin{align}\r\n\\text{squaredSum}(n) &= \\sum_{i=1}^m a_i^2 \\le m \\cdot 9^2 \\\\\r\nn &= \\overline{a_1a_2 \\dots a_m} \\ge 10^{m-1} \\\\\r\n\\lim_{m \\to \\infty} \\frac{\\text{squaredSum}(n)}{n} &\\le \\lim_{m \\to \\infty} \\frac{81m}{10^{m-1}} = 0\r\n\\end{align}\r\n$$\r\n\r\n# Solution\r\n\r\nWe can use slow and fast pointers (Floyd's cycle detection) to find the loop:\r\n\r\n```java\r\npublic class Solution {\r\n    public int squareSum(int n) {\r\n        int sum = 0;\r\n        while(n > 0){\r\n            int digit = n % 10;\r\n            sum += digit * digit;\r\n            n /= 10;\r\n        }\r\n        return sum;\r\n    }\r\n\r\n    public boolean isHappy(int n) {\r\n        int slow = n, fast = squareSum(n);\r\n        while (slow != fast){\r\n            slow = squareSum(slow);\r\n            fast = squareSum(squareSum(fast));\r\n        }\r\n        return slow == 1;\r\n    
}\r\n}\r\n```\u3002", "top": 0, "createdAt": 1736973534, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script><script>MathJax = {tex: {inlineMath: [[\"$\", \"$\"]]}};</script><script async src=\"https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\"></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-16", "dateLabelColor": "#0969da"}, "P17": {"htmlDir": "docs/post/System Design Strava.html", "labels": ["system design"], "postTitle": "System Design Strava", "postUrl": "post/System%20Design%20Strava.html", "postSourceUrl": "https://github.com/LeadTechInterview/leadtechinterview.github.io/issues/17", "commentNum": 0, "wordCount": 2883, "description": "# Requirements (5 mins):\r\n\r\n## Functional Requirements\r\n\r\n> Identify core features (e.g., 'Users should be able to post tweets'). Prioritize 2-3 key features.\r\n\r\n1. users can start/stop/pause their activity (runs/rides)\r\n2. users should be able to check the activity data (route, distance, time, etc.) while running/riding\r\n3. users should be able to check history records, including their friends' records\r\n\r\n## Non-Functional Requirements\r\n\r\n> Focus on system qualities like scalability, latency, and availability, consistency, security, durability, fault tolerance. Quantify where possible (e.g., 'render feeds in under 200ms').\r\n\r\n1. the system should be highly available\r\n2. the app should work offline when there is no Internet connection\r\n3. the stats should be accurate\r\n4. 
should support 10m concurrent users\r\n\r\n## Capacity Estimation\r\n\r\n> Skip unnecessary calculations unless they directly impact the design (e.g., sharding in a TopK system).\r\n\r\nRoute GPS data estimation: 100m DAU, 10m concurrent users, one GPS record collected every 5 secs. A 30 min activity per user per day generates 30 x 60 / 5 = 360 records, so around 100m x 360 = 36,000m records per day, at around 40 bytes per record.\r\n\r\nStorage: 36,000m x 40 bytes = 1,440 GB per day, rounded up to 1,500 GB. Per year: 1,500 GB x 360 = 540,000 GB = 540 TB\r\n\r\nWrite QPS: 36,000m / 100k seconds per day = 360k\r\n\r\n## Core Entities (2 mins)\r\n\r\n>  Identify key entities (e.g., User, Tweet, Follow) to define the system's foundation.\r\n\r\n- user\r\n- activity\r\n- route\r\n- friend\r\n\r\n# API/System Interface (5 mins)\r\n\r\n> Define the contract between the system and users. Prefer RESTful APIs unless GraphQL is necessary.\r\n\r\n- POST /activity -> Activity, create an activity, body {type}\r\n- PATCH /activity/id, change status (start, stop, pause, complete), body {status}\r\n- POST /activity/id/route, update route geo tracking, body {geolocation}\r\n- GET /activities/page?mode=user|friend&page=\r\n\r\n# [Optional] Data Flow (5 mins)\r\n\r\n> Describe high-level processes for data-heavy systems (e.g., web crawlers).\r\n\r\n# High-Level Design (10-15 mins)\r\n\r\n> Draw the system architecture, focusing on core components (e.g., servers, databases). Keep it simple and iterate based on API endpoints.\r\n\r\n![image](https://github.com/user-attachments/assets/68ced8bb-4916-49f8-a3cd-8d56576a1688)\r\n\r\n\r\n# Deep Dives (10 mins)\r\n\r\n> Address non-functional requirements, edge cases, and bottlenecks. 
Proactively improve the design (e.g., scaling, caching, database sharding).\r\n\r\n## No internet connection\r\n\r\nWe can save the data in a local db on the device and sync it to the server later\r\n\r\n## Support 100m DAU, 10m concurrent users\r\n\r\nstorage: 540TB/year\r\n\r\n- purge old route data\r\n- shard data by time\r\n- use cold storage to save cost\r\n- compress the data\r\n\r\nQPS: around 360k writes (36,000m records per day / ~100k seconds per day)\r\n\r\n- use a NoSQL database, more specifically a time series database\r\n- clients aggregate the data and use longer intervals when not in realtime view\r\n- shard the data by user id and activity id to distribute the load across servers\r\n\r\n\u3002", "top": 0, "createdAt": 1736989322, "style": "", "script": "<script src='https://blog.meekdai.com/Gmeek/plugins/GmeekTOC.js'></script>", "head": "", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "createdDate": "2025-01-16", "dateLabelColor": "#0969da"}}, "singeListJson": {}, "labelColorDict": {"coding": "#006b75", "database": "#d4c5f9", "ML": "#d4c5f9", "system design": "#54E34A"}, "displayTitle": "Lead Tech Interview", "faviconUrl": "https://github.githubassets.com/favicons/favicon.svg", "ogImage": "https://github.githubassets.com/favicons/favicon.svg", "primerCSS": "<link href='https://mirrors.sustech.edu.cn/cdnjs/ajax/libs/Primer/21.0.7/primer.css' rel='stylesheet' />", "homeUrl": "https://leadtechinterview.github.io", "prevUrl": "/index.html", "nextUrl": "disabled"}