Only submit changes/deltas when processing bulk requests#191
Open
Conversation
| GitGuardian id | Secret | Commit | Filename |
|---|---|---|---|
| 6465341 | Generic High Entropy Secret | edb25ac | docker-compose.yml |
| 6465341 | Generic High Entropy Secret | edb25ac | hub3.toml |
| 6465341 | Generic High Entropy Secret | b8ae4ce | hub3.toml |
| 6753863 | Redis Server Password | 81ac24d | docker-compose.yml |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking these secrets by investigating where they are used in your code.
- Replace and store your secrets safely. Learn the best practices here.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributors' workflows, and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Do our GitHub checks need improvement? Share your feedback!
- feat(go.mod): add oklog/ulid v1.3.1 dependency
- chore(hub3.toml): change minio endpoint port from 9000 to 9010
- refactor(hub3/fragments/graph.go): modify Reader method to return the length of the byte array
- refactor(rdfstream.go): reorganize import statements
- feat(rdfstream.go): add HubID and OrgID to the fragment metadata in IndexFragments() function
- refactor(resource.go): fix typo in CreateDateRange error message
- refactor(resource.go): rename year variable to date in padYears function
- refactor(resource.go): rename formattedDate variable in padYears function
- refactor(resource.go): fix typo in hyphenateDate error message
- refactor(resource.go): rename splitDate function to splitPeriod for clarity
- refactor(resource.go): improve error messages in SetContextLevels and NewResourceMap functions
- fix(resource.go): fix typo in error message in SetContextLevels function
- refactor(sparql.go): add omitempty to SparqlUpdate struct fields
- feat(config/bulk.go): add StoreRequests field to Bulk struct
- feat(config/elasticsearch.go): add LogRequests field to BulkConfig struct
- feat(handle_upload.go): add GetGraph method to Service struct
- feat(options.go): add SetLogRequests option to Service struct
- feat(parser.go): add support for logging raw requests in bulk parser service
- feat(parser.go): add support for storing bulk request to disk for debugging
- feat(parser.go): add support for storing graphs to MinIO
- fix(parser.go): fix variable naming inconsistency in setDataSet function
- refactor(parser.go): remove unused code and comments
- feat(parser.go): add HubID and OrgID to RDF bulk request
- fix(parser.go): use IterTriplesOrdered instead of IterTriples to serialize triples in order
- refactor(service.go): reformat code for better readability
- feat(service.go): add logRequests boolean option to NewService function
- refactor(config): remove unused SparqlUsername and SparqlPassword fields from RDF configuration
- feat(config): enable storing only changed triples in the triple store
- feat(bulk): add support for storing RDF data in Redis for delta updates
- feat(bulk): add support for finding and dropping orphaned graphs in Redis for delta updates
- refactor(bulk): remove unused SetDBPath option
- refactor(parser.go): remove unused imports and variables
- feat(parser.go): add setUpdateDataset and dataset methods to safely access and modify the dataset
- feat(parser.go): add storeGraphDeltasOld method to be removed later
- feat(parser.go): add storeGraphDeltas method to store graph deltas in redis and S3
- feat(parser.go): add dropGraphOrphans method to drop orphan graphs from redis and triple store
- feat(parser.go): add incrementRevision method to increment the revision of the dataset
- feat(parser.go): add process method to process requests and increment revisions
- feat(parser.go): add stats field to Stats struct to track graphs stored
- refactor(service.go): remove unused imports and variables
- feat(service.go): add Redis support to the bulk service to store and retrieve data
- … project up to date and reduce clutter
- chore(go.sum): update dependencies
Kudos, SonarCloud Quality Gate passed!
The load on downstream services such as the Elasticsearch index and the Fuseki triple store is heavy when identical data is reprocessed. This pull request changes the bulk indexing workflow to first store the incoming data in blob storage (MinIO) and to track the records and their changes in an embedded database (DuckDB). After the final call to drop-orphans, the deltas are calculated and only the changed records are submitted for processing.
For the SPARQL inserts and updates, we remove the 'drop' statement for each named graph and replace it with a delta update statement that deletes the stale triples and inserts the new ones.