Angchua, Jerwyn
Paragas, Glenn
Vilar, Louise
* The repository is divided into three sections:
List Implementations
Tester Files
Miscellaneous
* The directories for each of the 4 list implementations.
Each contain their own header and source file. Example: d-linked-list.h d-linked-list.c
doubly-linked-list - Implementation of the Doubly Linked List.
dynamic-arr - Implementation of Dynamic Array.
skip-list - Implementation of Skip List.
tree-sequence - Implementation of Sequence of Trees.
H_global.h - The global header file containing LENGTH (size_t) and DATA (int64_t), for easier conventions across all list implementations.
* This directory contains all files relevant for testing.
* Please see Unit Test for more details.
PROOFS.md - Proof Sketches for the implementations.
DETAILS.md - Implementation details for each of the lists.
README.md - Me.
For the list implementations, we used the references and concepts provided by the MP1 document for each of the implementations to implement our own version of each Abstract Data Type of a list.
All lists are pointers of a struct, and any members for a struct are pointers to that struct as well, so that we only use the arrow -> sugar operations for all lists for consistency.
We also have the global types LENGTH (size_t) and DATA (int64_t) to represent values based on the length of the list and data values, respectively.
Lastly, we aimed for reverse operations.
DETAILS.md holds more details on each of the list's implementations.
For unit testing, we primarily used a controlled, semi-randomized test generator for testing edge cases and determining efficiency. For this reason, asserts on more complex checks were not used within the implementation (e.g Sequence of Trees increasing/decreasing
One thing to note is that the tester directly includes a source file based on the selected list implementation in its settings header, which will be discussed later on in this section. I believe it may be an unusual and risky way of source management, though it has helped us simplify things in the meantime to not worry about object file linking when compiling and running the tester's source file directly.
* All relevant tester files are located in mirror-flower
test_settings_m.py - Generator (settings)
mirror.py - Generator
test_settings_r.h - Tester (settings)
reflection.c - Tester
test_settings_r.h - Grapher (settings)
graph.c - Grapher Generator
graph.py - Grapher Display
inputs - Test inputs (operations)
outputs - Graph plot points
inputs/test_input_0.txt - For testing the tester itself
[PYTHON] Mirror (settings): test_settings_m.py
[Python-side.]
The settings for the test generator.
Imported as a Python module by the generator.
| SETTING | DATATYPE | DEFAULT |
|---|---|---|
| INPUT_DIRECTORY | string The test inputs' file directory. |
inputs |
| SEED | any supported by random.seed The randomizer seed. |
None |
[PYTHON] Mirror: mirror.py
[Python-side.]
The generator for the tests. It acts as the "mirror" for the list to appropriately match as its reflection.
It implements a working list in Python, and outputs test operations to the corresponding INPUT_DIRECTORY. These set of files are .txt files delimeted by a bar |, with its fields as follows:
OPERATION|ARG1|ARG2|RESULT
Where OPERATION is the name of the operation function, ARG1/ARG2 as the arguments for the function call, and RESULT being the correct resulting output that the target list needs to match to be considered correct.
Their text can be of the following:
| OPERATION | ARG1 | ARG2 | RESULT (can be "X" to disable correctness) |
|---|---|---|---|
| make | LENGTH n number |
DATA *seq "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300") |
raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300") |
| size | "None" | "None" | returned LENGTH number |
| empty | "None" | "None" | returned bool "f" / "t" |
| reverse | "None" | "None" | raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300") |
| get | LENGTH i number |
"None" | returned DATA number |
| set | LENGTH i number |
DATA v number |
returned DATA number |
| peek_left | "None" | "None" | returned DATA number |
| peek_right | "None" | "None" | returned DATA number |
| push_left | DATA v number |
"None" | raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300") |
| push_right | DATA v number |
"None" | raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300") |
| pop_left | "None" | "None" | returned bool with (raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300")) EXAMPLE: "f,EMPTY" "t,100,200,300" |
| pop_right | "None" | "None" | returned bool with (raw list sequence "EMPTY" OR number sequence, separated by comma (e.g. "100,200,300")) EXAMPLE: "f,EMPTY" "t,100,200,300" |
Note that an empty number sequence is represented by "EMPTY".
Note that make, reverse, push_*, and pop_* operations check for correctness on the entire list every time. This is to absolutely make sure that everything is working as expected within the actual list.
Note also that RESULT can be set to "X" to disable checking for correctness at that line's execution. This is mainly for operations that are provided large inputs, and where checking for correctness is too expensive and takes too long.
*Please see inputs/test_input_0.txt for a complete example.
Lastly, by specifying OPERATION to be "MSG", the tester can send custom messages to the output stream using the other three columns, such as telling the user the current set of test types currently being performed, as it reads the .txt file line-by-line, or the beginning and end of a Layer.
For the actual tests themselves, to test for each ADT's correctness and efficiency, we have a Layered Testing system. The tests are divided into Layers.
Each Layer tests a particular set of operations, edge cases, and concepts, with later layers potentially being harder to pass.
This is so that it's easier to catch bugs in the earlier layers that test a set of operations instead of everything all at once, as the later layers throw eveything they can at the list to absolutely make sure nothing is broken and no lists pass if they have unnoticed broken edge cases, while being a bit more confusing to debug for the implementer.
LAYER 0
INITIALIZATION TEST
- MAKE (0 -> 1000)
- MAKE (length of RANDOM_INTEGER(0, 1000))
- with DATA in ranges [-10^18, 10^18]
This layer tests for the make operation, as well as the list's ability to hold DATA in large ranges.
LAYER 1
>> BASIC OPERATIONS TEST
- GET (Random Index)
- SET (Random Index and Random Data)
- GET (Edge Indices)
- SET (Edge Indices and Random Data)
- PEEK_*
- SIZE
- EMPTY
- REVERSE
* Repeat all of the above but done for each MAKE (0 -> 1000)
This layer is especially important for testing all non-insertion and non-deletion operations to see if they work as intended.
Additionally, it repeats the basic operations test, but right after making a new list for each size make's correctness once more.
LAYER 2:
>> INSERTIONS/DELETIONS TEST
- PUSH_* (Random Data) (in sequence)
- POP_* (in sequence)
This is the start of basic insertion/deletion operations. It does each push and pop operation in sequence (left first, then right.) Lists that don't implement it correctly have a high chance to fail here.
LAYER 3:
>> BASIC OPERATIONS TEST [Harder]
- GET (Random Index)
- SET (Random Index and Random Data)
- PEEK_*
- occasional REVERSE
- SIZE, EMPTY
>> INSERTIONS/DELETIONS TEST [Harder]
- PUSH_* (Random Data)
- POP_*
- occasional REVERSE and SET (Random Index, Random Data)
- SIZE, EMPTY
Attempts to break the list by doing random operations.
It features more detailed basic operation tests, as well as more insertion/deletion operations. It may catch edge cases for some list implementations, and break them.
LAYER 4:
>> BREAKER TEST:
- PUSH_* (Random Data) for RANDOM_INTEGER(2000, 5000) times
- POP_* until empty
- Along with all other OPERATIONS throughout (to test for UB)
>> BREAKER TEST 2 (Random Operations)
- PUSH_* (Random Data)
- POP_*
- Along with all other OPERATIONS throughout (to test for UB)
Attempts to shatter the Reflection, one last time, with testing all operations alongside a continuous insertion/deletion operation, for a large amount of times, to test for edge cases and UB. It throws everything it can towards the list. If a list was not caught broken before, it could be now.
The most precise implemented lists with a couple of uncaught possible errors may have a difficult time passing this layer without catching any wrong edge cases.
LAYER 5:
> FINAL LAYER:
>> FINALE:
- PUSH_* (Random Data) for RANDOM_INTEGER(500, 1000) times first
- All OPERATIONS for RANDOM_INTEGER(15000, 30000) times
- bound size to <=4444
- After the loop, pops list until n == 0
The final Layer. The tester gives up and surrenders itself to probability due to its inability to break the list in Layer 4. Now, it only tries to break the list with random operations. It is maybe possible that the list still breaks in this layer due to some unforeseen edge cases.
[C] Reflection (settings): test_settings_r.h
[C-side].
The settings for the automatic tester.
Included as a C header by the tester.
| SETTING | VALUE | DEFAULT |
|---|---|---|
| IMPLEMENTATION | DOUBLY_LINKED_LIST / DYNAMIC_ARRAY / SKIP_LIST / TREE_SEQUENCE The desired list to test. |
DOUBLY_LINKED_LIST |
| LINE_DISPLAY | boolean Whether to display the current line executing. In place of a debugger, this is useful for segfaults where the tester abruptly stops and the faulty line is unknown. |
true |
| CHECK_FOR_EFFICIENCY | boolean Whether the automatic tester checks for efficiency (TLE). |
false |
| TLE_BOUND | double (milliseconds) Time boundary for throwing TLE. |
1000.0 |
| INPUT_DIRECTORY | string The test inputs' file directory. |
inputs |
[C] Reflection: reflection.c
[C-side.]
The automatic tester for all the generated test cases.
It first obtains each line of the .txt and stores it in an array.
Then, it sifts through each line. If the line's RESULT is not X, then it verifies for correctness and notifies the user if an operation's output failed to match RESULT. It also tests for its efficiency (if CHECK_FOR_EFFICIENCY setting is true.) Additionally, it verifies for the test operation TEST_internal to verify the internal tests, discussed below.
If all tests pass, it notifies the user that they have passed all Layers.
The tester also uses special test functions for its testing. These are global test operations that must be implemented for each list.
This is used by the Unit Tester to check for correctness against the true raw sequence of DATA values of the list (unaffected by reversal flags), WITHOUT using get or any other operation.
For this reason, the implementer must absolutely make sure that it works correctly for any n and seq, so that there is no confusion on if it's the executed operation that failed, or if it's TEST_elements that's the culprit.
This function is used in place of expensive assertions within the list implementations themselves.
It is used by the Unit Tester to check if any internal testing was successful. The implementer of the list can put any test within this function, as long as it returns either true or false to determine if the test was successful or not.
For example, the Sequence of Trees implementation tests for the required sequence of k's for each tree to be a concatenation of strictly increasing, and then strictly decreasing types. It will return false if it fails to satisfy this test at any point.
This is used by the Unit Tester to check if the list is reversed. The implementer must return their reversal flag in this function. Since all lists aim for reverse, this can apply to all lists.
[C] Grapher (settings): test_settings_g.h
The settings for the graph tester. Included as a C header by the tester.
| SETTING | VALUE | DEFAULT |
|---|---|---|
| IMPLEMENTATION | DOUBLY_LINKED_LIST / DYNAMIC_ARRAY / SKIP_LIST / TREE_SEQUENCE The desired list to test. |
DOUBLY_LINKED_LIST |
| OUTPUT_DIRECTORY | string The graph outputs' file directory. |
outputs |
[C] Grapher (Generator): graph.c
A specialized generator which tests the list against large inputs, without checking for correctness.
After the tests, it outputs plot points/deltatime benchmarks for each operation in their own .txt file in OUTPUT_DIRECTORY.
[PYTHON] Grapher (Display): graph.py
Displays the graphs using the outputs in OUTPUT_DIRECTORY for judging whether the graph of OPERATION is constant, linear, or logarithmic in nature.
With that, the steps are as follows:
- GENERATE NEW TEST CASES (Optional)
- Update test_settings_m.py settings to desired values
- Run mirror.py
- TEST THE LIST
- Update test_settings_r.h settings to desired values
- Run reflection.c (without linking the target list's source file, as it is managed by the settings already)
- If any bugs are caught, debug with given error info and look at the line the operation failed in the files in INPUT_DIRECTORY
- GRAPHS (Optional)
- Update test_settings_g.h settings to desired values
- Run graph.c to generate plot points
- Run graph.py to display the graphs
- Judge whether each operation is constant, linear, or logarithmic in nature. (This is only for analyzing, and not a true determiner of its time complexity.)
- Wikipedia - Binomial heap
- NOI.PH DS 3 Section 6.2.2
- Data Structures in Typescript #17 - Binomial Heap Introduction by j4orz