This migration pipeline extracts ApplicableControl data from Neo4j (via the LEO Storage DataService API), maps it to the MongoDB domain model, and generates a MongoDB update script.
├── README.md
├── cypher/ # Neo4j Cypher queries
│ ├── BulkGetApplicableControls.cypher
│ ├── CountRiskAssessments.cypher
│ └── GeApplicableControl.cypher
├── scripts/ # Python & Shell scripts
│ ├── ExecuteNeo4jQuery.sh
│ ├── FetchApplicableControls.py
│ ├── TransformAndGenerateMongoScript.py
│ └── FetchAndGenerateMigrationScripts.py
├── schema/ # Reference/schema files
│ ├── applicable-control-structure-in-domain-model.json
│ └── GetApplicableControl-Output.json
├── data/ # Generated intermediate data
│ ├── neo4j-raw-applicable-controls.json
│ ├── mapped-applicable-controls.json
│ └── fetched-ae-data.json
└── output/ # Generated migration scripts
    ├── MongoDBMigration.js (legacy single-file script)
    └── migration-scripts/
        ├── migration-batch-1.js
        └── migration-batch-2.js

| File | Purpose |
|---|---|
| `cypher/BulkGetApplicableControls.cypher` | Bulk Cypher query with SKIP/LIMIT pagination |
| `cypher/CountRiskAssessments.cypher` | Count query to determine total batches |
| `scripts/FetchApplicableControls.py` | Python script – calls the API in batches, saves raw JSON |
| `scripts/TransformAndGenerateMongoScript.py` | Python script – maps raw data to domain model, generates MongoDB script |
| `scripts/FetchAndGenerateMigrationScripts.py` | Python script – fetches ae* data from MongoDB, generates batched migration scripts |
| `data/neo4j-raw-applicable-controls.json` | (generated) Raw Neo4j output |
| `data/mapped-applicable-controls.json` | (generated) Data mapped to MongoDB domain model |
| `data/fetched-ae-data.json` | (generated) Fetched ae* data from MongoDB |
| `output/MongoDBMigration.js` | (generated) Legacy MongoDB bulk update script |
| `output/migration-scripts/` | (generated) Batched MongoDB migration scripts (max 5K docs each) |

- Python 3.7+ with the `requests` library: `pip install requests`
- Register the Cypher queries in the LEO query resolver:
  - `CountRiskAssessmentsWithApplicableControls` → `CountRiskAssessments.cypher`
  - `BulkGetApplicableControls` → `BulkGetApplicableControls.cypher`
- MongoDB shell (`mongosh`) for running the migration script.

    python scripts/FetchApplicableControls.py \
      --auth-token "Bearer <YOUR_JWT_TOKEN>" \
      --batch-size 500

This will:

- Call the count query to determine the total number of records
- Fetch all records in batches of 500 (configurable)
- Save the raw Neo4j response to `data/neo4j-raw-applicable-controls.json`

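
The count-then-batch steps above can be sketched as a small planning helper. This is only an illustration of the pagination arithmetic, not the code in `FetchApplicableControls.py`: the function name `plan_batches` is hypothetical, and the actual API call is left out because its request format is not described here.

```python
from typing import Iterator, Tuple


def plan_batches(total_records: int, batch_size: int = 500) -> Iterator[Tuple[int, int]]:
    """Yield (skip_count, batch_size) pairs that together cover total_records.

    total_records would come from the count query; each pair corresponds to
    one paginated call (hypothetical helper, for illustration only).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for skip_count in range(0, total_records, batch_size):
        yield skip_count, batch_size


if __name__ == "__main__":
    # Example: 1234 records at the default batch size need three calls.
    for skip_count, limit in plan_batches(1234):
        print({"skipCount": skip_count, "batchSize": limit})
```

The last batch may return fewer records than the limit, so the caller does not need to special-case it.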
    python scripts/TransformAndGenerateMongoScript.py

This will:

- Read the raw Neo4j data from `data/neo4j-raw-applicable-controls.json`
- Map each record to the domain model structure (see `schema/applicable-control-structure-in-domain-model.json`)
- Save mapped data to `data/mapped-applicable-controls.json`
- Generate `output/MongoDBMigration.js`

    pip install "pymongo[srv]"
    python scripts/FetchAndGenerateMigrationScripts.py

This will:

- Read mapped controls from `data/mapped-applicable-controls.json`
- Fetch ae* data (`aeEnitityId`, `aeUniqueIdValue`, `aeEntityCode`) from QC MongoDB
- Save fetched ae* data to `data/fetched-ae-data.json`
- Generate batched scripts (max 5000 docs each) in `output/migration-scripts/`

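
As a rough sketch of the 5000-document cap, the snippet below splits a list of documents into per-script batches and derives file names in the `migration-batch-N.js` pattern shown in the directory layout. The helper name `split_into_scripts` is hypothetical, and the real script may group documents differently.

```python
from typing import Any, Iterator, List, Tuple

MAX_DOCS_PER_SCRIPT = 5000  # cap stated in this README


def split_into_scripts(
    docs: List[Any], max_docs: int = MAX_DOCS_PER_SCRIPT
) -> Iterator[Tuple[str, List[Any]]]:
    """Yield (file_name, docs_for_that_file) pairs, at most max_docs per file."""
    if max_docs <= 0:
        raise ValueError("max_docs must be positive")
    for start in range(0, len(docs), max_docs):
        batch_number = start // max_docs + 1  # file numbering starts at 1
        yield f"migration-batch-{batch_number}.js", docs[start:start + max_docs]


if __name__ == "__main__":
    # Example: 7200 placeholder documents end up in two script files.
    for file_name, batch in split_into_scripts(list(range(7200))):
        print(file_name, len(batch))
```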
    mongosh "<MONGODB_CONNECTION_STRING>" output/migration-scripts/migration-batch-1.js
    mongosh "<MONGODB_CONNECTION_STRING>" output/migration-scripts/migration-batch-2.js

| Neo4j Property | MongoDB Field | Default |
|---|---|---|
| `isMigrated` | `isMigrated` | `false` |
| `applicableControlId` | `applicableControlId` | `null` |
| `manuallyAdded` | `manuallyAdded` | `false` |
| `isMraLevelApplicable` | `isMraLevelApplicable` | `false` |
| `masterFormInternalDocId` | `masterFormInternalDocId` | `null` |
| (not in Neo4j) | `comments` | `null` |
| `transactionalFormInternalDocId` (from query) | `transactionalFormInternalDocId` | `null` |
| `triggeringSourceId` | `triggeringSourceId` | `0` |
| `triggeringActionTypeId` | `triggeringActionTypeId` | `""` |
| (not in Neo4j) | `reasonCode` | `null` |
| `createdBy` | `createdBy` | `null` |
| `createdOn` | `createdOn` | `null` |
| `isDeleted` | `isDeleted` | `false` |
| `isEditable` | `isEditable` | `false` |
| `aeEnitityId` | `aeEnitityId` | `null` |
| `aeUniqueIdValue` | `aeUniqueIdValue` | `null` |
| `aeEntityCode` | `aeEntityCode` | `null` |

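
The mapping table above could be applied along the following lines. This is a minimal sketch, not the logic of `TransformAndGenerateMongoScript.py`: the function name `map_record` is hypothetical, and it assumes that a missing or null Neo4j property falls back to the listed default and that the two fields marked "(not in Neo4j)" always take their default.

```python
from typing import Any, Dict

# Field names and defaults copied from the mapping table.
DEFAULTS: Dict[str, Any] = {
    "isMigrated": False,
    "applicableControlId": None,
    "manuallyAdded": False,
    "isMraLevelApplicable": False,
    "masterFormInternalDocId": None,
    "comments": None,
    "transactionalFormInternalDocId": None,
    "triggeringSourceId": 0,
    "triggeringActionTypeId": "",
    "reasonCode": None,
    "createdBy": None,
    "createdOn": None,
    "isDeleted": False,
    "isEditable": False,
    "aeEnitityId": None,
    "aeUniqueIdValue": None,
    "aeEntityCode": None,
}

# Fields the table marks as "(not in Neo4j)".
NOT_IN_NEO4J = {"comments", "reasonCode"}


def map_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw record to the target field set, filling defaults."""
    mapped: Dict[str, Any] = {}
    for field, default in DEFAULTS.items():
        # Assumption: a null or absent source value means "use the default".
        value = None if field in NOT_IN_NEO4J else raw.get(field)
        mapped[field] = default if value is None else value
    return mapped


if __name__ == "__main__":
    print(map_record({"applicableControlId": "AC-1", "isDeleted": True}))
```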

- Count first: A count query determines the total number of RiskAssessments with ApplicableControls.
- SKIP/LIMIT: The bulk Cypher query uses `SKIP $skipCount LIMIT $batchSize` for pagination.
- Default batch size: 500 records per API call (configurable via `--batch-size`).
- MongoDB bulkWrite: For datasets > 50 records, the migration script uses `bulkWrite` with chunks of 500 operations. For smaller datasets, individual `updateOne` calls are used for easier debugging.
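
The choice between `bulkWrite` and individual `updateOne` calls might look roughly like this when generating the script text. It is a sketch under stated assumptions: the collection name `riskAssessments`, the shape of each update (`filter` and `update` keys), and the function name `build_script` are all hypothetical, and values are serialized with plain JSON, so types such as dates would need extra handling.

```python
import json
from typing import Any, Dict, List

BULK_THRESHOLD = 50  # bulkWrite is used only above this many records
CHUNK_SIZE = 500     # operations per bulkWrite call


def build_script(updates: List[Dict[str, Any]], collection: str = "riskAssessments") -> str:
    """Return mongosh statements for updates shaped like {"filter": ..., "update": ...}."""
    target = f"db.getCollection({json.dumps(collection)})"
    lines: List[str] = []
    if len(updates) > BULK_THRESHOLD:
        # Large dataset: one bulkWrite statement per chunk of operations.
        for start in range(0, len(updates), CHUNK_SIZE):
            chunk = updates[start:start + CHUNK_SIZE]
            ops = [{"updateOne": {"filter": u["filter"], "update": u["update"]}} for u in chunk]
            lines.append(f"{target}.bulkWrite({json.dumps(ops)});")
    else:
        # Small dataset: one statement per document, easier to debug.
        for u in updates:
            lines.append(
                f"{target}.updateOne({json.dumps(u['filter'])}, {json.dumps(u['update'])});"
            )
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{"filter": {"_id": i}, "update": {"$set": {"isMigrated": True}}} for i in range(2)]
    print(build_script(sample))
```

With 1200 updates this produces three `bulkWrite` statements (500, 500, and 200 operations); with 50 or fewer it emits one `updateOne` statement per update.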