This is the final version of the insider threat detection system. It has the following features:
- Extracts more compact and discriminative features.
- Proposes a graph-based detection algorithm to improve performance.
Requirements:
- Apache Spark
- pip install wrapt
- pip install pgmpy
Data preparation:
- Download the CERT data r6.2.tar.bz2.
- Download answers.tar.bz2.
- Extract both r6.2.tar.bz2 and answers.tar.bz2, and place the extracted answers under the r6.2 folder.
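The extraction step can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `extract_archive` and the `data` directory are hypothetical, and it assumes that r6.2.tar.bz2 unpacks into a top-level `r6.2` folder, which should be checked against the actual archive.

```python
import tarfile
from pathlib import Path


def extract_archive(archive, dest):
    """Extract a .tar.bz2 archive into dest and return dest as a Path.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:bz2" opens the archive for reading with bzip2 decompression.
    with tarfile.open(str(archive), "r:bz2") as tar:
        tar.extractall(str(dest))
    return dest


# Example usage (assumes both archives are in the current directory and
# that r6.2.tar.bz2 creates a top-level "r6.2" folder when unpacked):
#
#   extract_archive("r6.2.tar.bz2", "data")
#   extract_archive("answers.tar.bz2", "data/r6.2")
```

With this layout, config.io.data_dir would point at `data/r6.2`, again assuming the archive structure described above.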
Configuration:
- SPARK_MASTER: the master address of the Spark cluster.
- config.io.data_dir: the root directory of the extracted r6.2 data.

Run:
- bash run.sh
Output:
- cache: all necessary intermediate results.
- result: scores of the baseline systems.
- CR scores: printed to the terminal, highlighted in color.
The metric used for evaluation is cumulative recall (CR), with bucket size 25.
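As a rough illustration of the metric, the sketch below computes CR as the sum of recall values at cutoffs 25, 50, ..., up to the given budget, after ranking instances by anomaly score. This definition is an assumption: it is consistent with the stated perfect scores (400 / 25 = 16 and 1000 / 25 = 40), but the function name `cumulative_recall` and its parameters are hypothetical and may not match the implementation in this repository.

```python
def cumulative_recall(scores, labels, budget, bucket=25):
    """Sum recall@k for k = bucket, 2*bucket, ..., budget.

    scores: anomaly scores, higher means more suspicious (assumed convention).
    labels: 1 for a true insider instance, 0 otherwise.
    """
    if budget % bucket != 0:
        raise ValueError("budget must be a multiple of bucket")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("labels must contain at least one positive")

    # Rank instances from most to least suspicious.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]

    cr = 0.0
    hits = 0
    for k in range(1, budget + 1):
        if k <= len(ranked_labels):
            hits += ranked_labels[k - 1]
        if k % bucket == 0:
            # Recall among the top-k ranked instances.
            cr += hits / total_positives
    return cr


# A ranking that places all 10 positives first reaches the perfect score.
scores = list(range(1000, 0, -1))
labels = [1] * 10 + [0] * 990
perfect = cumulative_recall(scores, labels, budget=400)
```

Under this definition, a detector that ranks every true positive within the first 25 instances obtains a recall of 1 in each of the 16 buckets, which gives the perfect score of 16 for a budget of 400.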
Table 1. CR for a budget of 400 (perfect score is 16)

| Algorithms | PCA | SVM | ISO-Forest | DNN |
|---|---|---|---|---|
| No GTM | 13.64 | 10.36 | 8.10 | 13.91 |
| GTM Enabled | 15.00 | 12.00 | 11.27 | 15.54 |

Table 2. CR for a budget of 1000 (perfect score is 40)

| Algorithms | PCA | SVM | ISO-Forest | DNN |
|---|---|---|---|---|
| No GTM | 37.18 | 34.36 | 32.10 | 36.45 |
| GTM Enabled | 39.00 | 35.73 | 35.27 | 39.54 |