AkshithMahalinga1-gep/Applicable-Control-Migration

ApplicableControl Migration: Neo4j → MongoDB

Overview

This migration pipeline extracts ApplicableControl data from Neo4j (via the LEO Storage DataService API), maps it to the MongoDB domain model, and generates a MongoDB update script.

Files

├── README.md
├── cypher/                              # Neo4j Cypher queries
│   ├── BulkGetApplicableControls.cypher
│   ├── CountRiskAssessments.cypher
│   └── GeApplicableControl.cypher
├── scripts/                             # Python & Shell scripts
│   ├── ExecuteNeo4jQuery.sh
│   ├── FetchApplicableControls.py
│   ├── TransformAndGenerateMongoScript.py
│   └── FetchAndGenerateMigrationScripts.py
├── schema/                              # Reference/schema files
│   ├── applicable-control-structure-in-domain-model.json
│   └── GetApplicableControl-Output.json
├── data/                                # Generated intermediate data
│   ├── neo4j-raw-applicable-controls.json
│   ├── mapped-applicable-controls.json
│   └── fetched-ae-data.json
└── output/                              # Generated migration scripts
    ├── MongoDBMigration.js              (legacy single-file script)
    └── migration-scripts/
        ├── migration-batch-1.js
        └── migration-batch-2.js
| File | Purpose |
|---|---|
| cypher/BulkGetApplicableControls.cypher | Bulk Cypher query with SKIP/LIMIT pagination |
| cypher/CountRiskAssessments.cypher | Count query used to determine the total number of batches |
| scripts/FetchApplicableControls.py | Python script that calls the API in batches and saves the raw JSON |
| scripts/TransformAndGenerateMongoScript.py | Python script that maps raw data to the domain model and generates the MongoDB script |
| scripts/FetchAndGenerateMigrationScripts.py | Python script that fetches ae* data from MongoDB and generates batched migration scripts |
| data/neo4j-raw-applicable-controls.json | (generated) Raw Neo4j output |
| data/mapped-applicable-controls.json | (generated) Data mapped to the MongoDB domain model |
| data/fetched-ae-data.json | (generated) ae* data fetched from MongoDB |
| output/MongoDBMigration.js | (generated) Legacy MongoDB bulk update script |
| output/migration-scripts/ | (generated) Batched MongoDB migration scripts (max 5,000 docs each) |

Prerequisites

  1. Python 3.7+ with requests library:

    pip install requests
  2. Register the Cypher queries in the LEO query resolver:

    • CountRiskAssessmentsWithApplicableControls → CountRiskAssessments.cypher
    • BulkGetApplicableControls → BulkGetApplicableControls.cypher
  3. MongoDB shell (mongosh) for running the migration script.

Step-by-Step

Step 1 – Fetch data from Neo4j

python scripts/FetchApplicableControls.py \
    --auth-token "Bearer <YOUR_JWT_TOKEN>" \
    --batch-size 500

This will:

  • Call the count query to determine total records
  • Fetch all records in batches of 500 (configurable)
  • Save raw Neo4j response to data/neo4j-raw-applicable-controls.json
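
The count-then-page flow above can be sketched as follows. This is a minimal illustration, not the contents of FetchApplicableControls.py: `fetch_all` and the `fetch_page` callback are hypothetical names, and the callback stands in for whatever HTTP call the real script makes against the LEO Storage DataService API.

```python
from typing import Callable, Dict, List

# Hypothetical callback type: takes (skip, limit) and returns one page of records.
FetchPage = Callable[[int, int], List[Dict]]


def fetch_all(total: int, batch_size: int, fetch_page: FetchPage) -> List[Dict]:
    """Collect every record by walking SKIP/LIMIT offsets up to `total`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    records: List[Dict] = []
    # Offsets 0, batch_size, 2 * batch_size, ... cover all `total` records.
    for skip in range(0, total, batch_size):
        records.extend(fetch_page(skip, batch_size))
    return records
```

With a total of 1,203 records and the default batch size of 500, this makes three calls with skip values 0, 500, and 1000.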

Step 2 – Transform and generate MongoDB script

python scripts/TransformAndGenerateMongoScript.py

This will:

  • Read the raw Neo4j data from data/neo4j-raw-applicable-controls.json
  • Map each record to the domain model structure (see schema/applicable-control-structure-in-domain-model.json)
  • Save mapped data to data/mapped-applicable-controls.json
  • Generate output/MongoDBMigration.js

Step 3 – Generate batched migration scripts (with ae* data)

pip install "pymongo[srv]"
python scripts/FetchAndGenerateMigrationScripts.py

This will:

  • Read mapped controls from data/mapped-applicable-controls.json
  • Fetch ae* data (aeEnitityId, aeUniqueIdValue, aeEntityCode) from QC MongoDB
  • Save fetched ae* data to data/fetched-ae-data.json
  • Generate batched scripts (max 5000 docs each) in output/migration-scripts/
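
A minimal sketch of the 5,000-document split, assuming the batch files are numbered from 1 as in the file tree above. The function names are hypothetical and the real script may organize this differently.

```python
from typing import Dict, List

MAX_DOCS_PER_SCRIPT = 5000  # limit stated in this README


def split_into_batches(docs: List[dict], max_docs: int = MAX_DOCS_PER_SCRIPT) -> List[List[dict]]:
    """Slice the document list into consecutive groups of at most `max_docs`."""
    return [docs[i:i + max_docs] for i in range(0, len(docs), max_docs)]


def plan_batch_files(docs: List[dict]) -> Dict[str, List[dict]]:
    """Map each output filename (migration-batch-N.js, N starting at 1) to its documents."""
    return {
        f"migration-batch-{index}.js": batch
        for index, batch in enumerate(split_into_batches(docs), start=1)
    }
```

For example, 7,500 documents would yield migration-batch-1.js with 5,000 documents and migration-batch-2.js with 2,500.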

Step 4 – Run the MongoDB migration

mongosh "<MONGODB_CONNECTION_STRING>" output/migration-scripts/migration-batch-1.js
mongosh "<MONGODB_CONNECTION_STRING>" output/migration-scripts/migration-batch-2.js
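
When there are many batch files, the commands can be generated instead of typed by hand. The sketch below only builds the argument lists; the helper names are hypothetical. It sorts by the numeric suffix so that migration-batch-10.js runs after migration-batch-2.js, on the assumption that a predictable order is wanted.

```python
import re
from typing import List


def batch_number(filename: str) -> int:
    """Extract N from a name ending in migration-batch-N.js."""
    match = re.search(r"migration-batch-(\d+)\.js$", filename)
    if match is None:
        raise ValueError(f"not a batch script: {filename}")
    return int(match.group(1))


def build_commands(connection_string: str, filenames: List[str]) -> List[List[str]]:
    """Return one mongosh argument list per batch file, in numeric order."""
    ordered = sorted(filenames, key=batch_number)
    return [["mongosh", connection_string, name] for name in ordered]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing batch stops the run.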

Domain Model Mapping

| Neo4j Property | MongoDB Field | Default |
|---|---|---|
| isMigrated | isMigrated | false |
| applicableControlId | applicableControlId | null |
| manuallyAdded | manuallyAdded | false |
| isMraLevelApplicable | isMraLevelApplicable | false |
| masterFormInternalDocId | masterFormInternalDocId | null |
| (not in Neo4j) | comments | null |
| transactionalFormInternalDocId (from query) | transactionalFormInternalDocId | null |
| triggeringSourceId | triggeringSourceId | 0 |
| triggeringActionTypeId | triggeringActionTypeId | "" |
| (not in Neo4j) | reasonCode | null |
| createdBy | createdBy | null |
| createdOn | createdOn | null |
| isDeleted | isDeleted | false |
| isEditable | isEditable | false |
| aeEnitityId | aeEnitityId | null |
| aeUniqueIdValue | aeUniqueIdValue | null |
| aeEntityCode | aeEntityCode | null |
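
The mapping above can be read as "copy the Neo4j value if present, otherwise use the default". A minimal sketch of that rule follows. It assumes that a missing or null Neo4j value falls back to the default and that the two fields marked "(not in Neo4j)" always take their default. `map_record` is a hypothetical name, not necessarily what TransformAndGenerateMongoScript.py uses.

```python
from typing import Any, Dict

# Field defaults copied from the mapping table (field names as spelled there).
DEFAULTS: Dict[str, Any] = {
    "isMigrated": False,
    "applicableControlId": None,
    "manuallyAdded": False,
    "isMraLevelApplicable": False,
    "masterFormInternalDocId": None,
    "comments": None,
    "transactionalFormInternalDocId": None,
    "triggeringSourceId": 0,
    "triggeringActionTypeId": "",
    "reasonCode": None,
    "createdBy": None,
    "createdOn": None,
    "isDeleted": False,
    "isEditable": False,
    "aeEnitityId": None,
    "aeUniqueIdValue": None,
    "aeEntityCode": None,
}

# Fields with no Neo4j source; assumed here to always receive their default.
NOT_IN_NEO4J = {"comments", "reasonCode"}


def map_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Build a domain-model dict from one raw Neo4j record."""
    mapped: Dict[str, Any] = {}
    for field, default in DEFAULTS.items():
        value = None if field in NOT_IN_NEO4J else raw.get(field)
        # Only a missing/null value triggers the default; False and 0 are kept.
        mapped[field] = default if value is None else value
    return mapped
```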

Batching Strategy

  • Count first: A count query determines the total number of RiskAssessments with ApplicableControls.
  • SKIP/LIMIT: The bulk Cypher query uses SKIP $skipCount LIMIT $batchSize for pagination.
  • Default batch size: 500 records per API call (configurable via --batch-size).
  • MongoDB bulkWrite: For datasets > 50 records, the migration script uses bulkWrite with chunks of 500 operations. For smaller datasets, individual updateOne calls are used for easier debugging.
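
The bulkWrite rule in the last bullet can be sketched as a small planning function. This illustrates the thresholds stated above (more than 50 records, chunks of 500) and is not the generator's actual code; `plan_writes` is a hypothetical name.

```python
from typing import List, Tuple

BULK_THRESHOLD = 50   # datasets larger than this use bulkWrite
BULK_CHUNK_SIZE = 500  # operations per bulkWrite call


def plan_writes(record_count: int) -> Tuple[str, List[int]]:
    """Return the write method and the number of operations in each call."""
    if record_count > BULK_THRESHOLD:
        sizes = [
            min(BULK_CHUNK_SIZE, record_count - start)
            for start in range(0, record_count, BULK_CHUNK_SIZE)
        ]
        return "bulkWrite", sizes
    # Small datasets: one updateOne call per record, easier to debug.
    return "updateOne", [1] * record_count
```

For 1,203 records this gives three bulkWrite calls of 500, 500, and 203 operations; for 50 records or fewer it gives one updateOne call per record.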

About

Migration pipeline to extract ApplicableControl data from Neo4j (via LEO Storage DataService API), map it to the MongoDB domain model, and generate batched MongoDB update scripts with $push/$each operations. Includes ae* field enrichment from assessableEntities.
