147 commits
0ea7d6d
added pattern mode to the config
dude-with-a-mug Aug 18, 2025
ab349c4
added pattern mode properties to the container and view
dude-with-a-mug Aug 18, 2025
db1ee7c
pattern mode job creation
dude-with-a-mug Aug 18, 2025
8c9e0cf
function parameter change
dude-with-a-mug Aug 18, 2025
58873f2
run pattern mode in process batch
dude-with-a-mug Aug 18, 2025
373e9be
separated pattern samples to files and assets
dude-with-a-mug Aug 18, 2025
6bd4598
added a file category property to the entities to support pattern mode
dude-with-a-mug Aug 18, 2025
e02c575
adjustment in comments
dude-with-a-mug Aug 18, 2025
a61ec9e
Corrected ICacheService implementation
dude-with-a-mug Aug 18, 2025
579941c
matched changes from launch to finalize
dude-with-a-mug Aug 18, 2025
5b3d46e
removed pattern mode status from data model and incorporated bits of …
dude-with-a-mug Aug 18, 2025
c0f061b
no reason to not keep job_id
dude-with-a-mug Aug 18, 2025
1ea2bcc
added pattern table to ep config and code to update pattern table
dude-with-a-mug Aug 18, 2025
f50a16b
removed patternModeJobId whenever annotation state fails
dude-with-a-mug Aug 18, 2025
c6b2e72
efficiency from one time retrieval of file node
dude-with-a-mug Aug 18, 2025
83444b0
made function signature same for implementation and interface
dude-with-a-mug Aug 18, 2025
20546d0
moved generate sample log to make more sense
dude-with-a-mug Aug 19, 2025
0ea20b9
adjusted container and view
dude-with-a-mug Aug 19, 2025
3558e9c
added indexes
dude-with-a-mug Aug 19, 2025
2e4528a
adjusted local_setup
dude-with-a-mug Aug 19, 2025
d201cf8
fixed var name and made search_property always a list
dude-with-a-mug Aug 19, 2025
b71965a
removed the post processing status
dude-with-a-mug Aug 19, 2025
0cd2908
working finalize function
dude-with-a-mug Aug 19, 2025
8949f5c
improved wording
dude-with-a-mug Aug 19, 2025
18a004b
improved logging for launch
dude-with-a-mug Aug 19, 2025
619eccc
changed log var access
dude-with-a-mug Aug 19, 2025
b046a99
changed log var access
dude-with-a-mug Aug 19, 2025
0d28807
added logging of pettern mode job id
dude-with-a-mug Aug 19, 2025
228798c
improved logging
dude-with-a-mug Aug 19, 2025
321fcde
added to README
dude-with-a-mug Aug 24, 2025
70c5aef
storing results of pattern mode to be more in line with referenced_in…
dude-with-a-mug Aug 28, 2025
4a1ec2d
improved streamlit experience for multiple extraction pipelines
dude-with-a-mug Aug 28, 2025
c04079d
Refactored apply service
dude-with-a-mug Aug 28, 2025
33c30d3
added annotation quality page to streamlit module
dude-with-a-mug Aug 30, 2025
5696065
changed the color of pattern mode results to green
dude-with-a-mug Sep 2, 2025
9ade999
added manual pattern mode module, integrated to cache service, update…
dude-with-a-mug Sep 2, 2025
0c9ecf9
improved annotation percentage calculations
dude-with-a-mug Sep 3, 2025
8acda8d
made per-file-annotation a header instead of subheader
dude-with-a-mug Sep 3, 2025
55515aa
updated annotation quality report for better filtering of large data …
dude-with-a-mug Sep 5, 2025
cbba680
added to the default columns displayed
dude-with-a-mug Sep 8, 2025
9d5e5b9
refactored pattern mode logging to accurately reflect total sample pa…
Sep 8, 2025
bc8ed62
handled potential type error from the function_id_str and call_id_str
dude-with-a-mug Sep 8, 2025
c3552b2
Merge pull request #42 from cognitedata/LaunchServices_bugfix
htnguyen19 Sep 8, 2025
8c20695
added new properties to the data model
dude-with-a-mug Sep 8, 2025
31ecef5
adding function call info to the annotation state
dude-with-a-mug Sep 9, 2025
ebe05b6
Improved streamlit dashboard to account for new logging properties
dude-with-a-mug Sep 11, 2025
4a20ed4
improved the UI of the pipeline health dashboard
dude-with-a-mug Sep 11, 2025
fae4bef
Merge pull request #44 from cognitedata/logging-improvements
dude-with-a-mug Sep 11, 2025
0af0e7e
improved the UI of the annotation quality dashboard
dude-with-a-mug Sep 11, 2025
f4422a5
Merge pull request #45 from cognitedata/annotation-quality-UI
dude-with-a-mug Sep 11, 2025
ddeb478
fixed bug where multi-page documents that have clean old annotations …
dude-with-a-mug Sep 11, 2025
e41a18e
log deleted annotations
dude-with-a-mug Sep 11, 2025
d75a291
Merge pull request #46 from cognitedata/clean-old-annotations-bug
dude-with-a-mug Sep 11, 2025
651a060
skip empty and unhashable columns in pipeline health dashboard
dude-with-a-mug Sep 12, 2025
fe47955
Making canvas URL generation dynamic and environment-aware
lucasguimaraes-rdx Sep 25, 2025
ec3627d
Removing default domain value when generating file canvas
lucasguimaraes-rdx Sep 26, 2025
9cc494c
Merge pull request #49 from cognitedata/fix/issue-48
dude-with-a-mug Sep 26, 2025
0fffed8
added instance space in default config
dude-with-a-mug Sep 26, 2025
0025b30
deleted auto creation of annotation state instance space
dude-with-a-mug Sep 26, 2025
0ac7dc2
normalized results of diagram detect and pattern detect such that com…
dude-with-a-mug Sep 29, 2025
44070bb
remove leading zero before comparison
dude-with-a-mug Sep 30, 2025
9a05fec
always turn remove leading zero off for pattern detect
dude-with-a-mug Sep 30, 2025
05c26ab
Merge pull request #52 from cognitedata/pattern-remove-leading-zero
dude-with-a-mug Sep 30, 2025
4ed052b
Merge pull request #51 from cognitedata/normalize-detect-results
dude-with-a-mug Oct 1, 2025
ccd6b42
Merge pull request #50 from cognitedata/scoped-instance-config
dude-with-a-mug Oct 1, 2025
73c6817
create a CogniteSolutionTag instance used for canvas labels
dude-with-a-mug Oct 1, 2025
d1793a7
add the file annotation label to the generated canvas
dude-with-a-mug Oct 1, 2025
2fd47b3
Merge pull request #53 from cognitedata/add-solution-tag
dude-with-a-mug Oct 1, 2025
aed58ce
updated cache storage to support creation of edges
dude-with-a-mug Oct 6, 2025
c536f07
updated finalize function to support creation of edges
dude-with-a-mug Oct 6, 2025
20a9e46
added new instance space for holding cogniteDiagramAnnotation instances
dude-with-a-mug Oct 6, 2025
0eef9d8
only capture non-duplicate edges between pattern mode and regular dia…
dude-with-a-mug Oct 6, 2025
59fb057
updated comments
dude-with-a-mug Oct 7, 2025
4f8dbb8
updated streamlit module
dude-with-a-mug Oct 7, 2025
99d2597
fixed pattern generation
dude-with-a-mug Oct 7, 2025
c510e3e
added default annotation type for manual patterns in case it's null
dude-with-a-mug Oct 8, 2025
41c7a7b
resolve duplicate external ids for overlapping annotation edges from …
dude-with-a-mug Oct 8, 2025
8140e39
Including more views in fetch extraction pipeline config method
lucasguimaraes-rdx Oct 8, 2025
98c288d
Including annotation_type field for patterns
lucasguimaraes-rdx Oct 8, 2025
6451d73
Refactoring canvas to fetch existing canvas and create canvas annotat…
lucasguimaraes-rdx Oct 8, 2025
8f566a4
Removing unnecessary second call to fetch extraction pipeline config
lucasguimaraes-rdx Oct 8, 2025
39ed2c7
Creating annotation_type selectbox when fetching manual patterns
lucasguimaraes-rdx Oct 8, 2025
3a31037
Creating component to connect tags with entities through patterns
lucasguimaraes-rdx Oct 8, 2025
a8cfe60
refactored the list annotations function in apply service
dude-with-a-mug Oct 10, 2025
2b5624d
Including annotation type reference when saving manual patterns
Oct 10, 2025
162c283
Implementing "Customize Table Columns" multiselect box when fetching …
Oct 10, 2025
88b702e
Ensuring sink node space for created edges and fetching all target en…
Oct 10, 2025
ce8862a
Merge pull request #57 from cognitedata/remove-end-node-filter
dude-with-a-mug Oct 13, 2025
25fbfd0
Adding fallback for resourceType using entity view when promoting edges
lucasguimaraes-rdx Oct 14, 2025
e83d355
Merge pull request #55 from cognitedata/streamlit-improvements-for-pa…
dude-with-a-mug Oct 14, 2025
4c8c8ed
added random delay and corrected retrieval of oldest job first
dude-with-a-mug Oct 14, 2025
59c830d
renamed file containing nodes to deploy
dude-with-a-mug Oct 14, 2025
95c0b6e
removed the report service section in the extraction pipeline config
dude-with-a-mug Oct 14, 2025
48259fe
Merge pull request #58 from cognitedata/remove-report-service
dude-with-a-mug Oct 14, 2025
8c811e3
added doc strings to each method in google format
dude-with-a-mug Oct 14, 2025
9db710d
broke out the README file into separate more manageable readme files
dude-with-a-mug Sep 11, 2025
88ec5cb
updated readme and added supporting markdown files
dude-with-a-mug Oct 14, 2025
ff2cb7b
filter overly aggressive numeric patterns and special character handling
dude-with-a-mug Oct 14, 2025
568221b
Merge pull request #59 from cognitedata/improve-pattern-generation
dude-with-a-mug Oct 14, 2025
1f51ed5
formatted line length to 120 characters
dude-with-a-mug Oct 14, 2025
0fea0e0
misc changes
dude-with-a-mug Oct 14, 2025
677049e
updated mermaid diagrams for launch and finalize phase
dude-with-a-mug Oct 14, 2025
c939766
updated detailed_guides
dude-with-a-mug Oct 14, 2025
cdd7683
updated the quickstart_setup to create assets from the equipment
dude-with-a-mug Oct 14, 2025
0f0a5f7
extraction pipeline now deploys with up to date yaml guide
dude-with-a-mug Oct 14, 2025
952b9f3
Merge pull request #60 from cognitedata/update-documentation
dude-with-a-mug Oct 14, 2025
a93d09e
edited length of ep documentation
dude-with-a-mug Oct 14, 2025
0b8722c
Merge pull request #54 from cognitedata/store-pattern-results-edges
dude-with-a-mug Oct 14, 2025
9b2dc9a
initial commit - added the repair service
dude-with-a-mug Oct 13, 2025
649c006
changed repair service to promote service
dude-with-a-mug Oct 15, 2025
018621e
built out cache mechanism and query plan for the text
dude-with-a-mug Oct 15, 2025
33be3f4
added a sourceUser for rows uploaded to promote_text cache
dude-with-a-mug Oct 16, 2025
254956f
updated docstring and type hints
dude-with-a-mug Oct 16, 2025
c264943
improved logging
dude-with-a-mug Oct 16, 2025
b0938db
added a promoteFunction config section
dude-with-a-mug Oct 16, 2025
2de30be
implemented the promote service flags
dude-with-a-mug Oct 16, 2025
ef2b746
refactored the existing_annotation and global_entity functions
dude-with-a-mug Oct 16, 2025
db9fadf
provide the correct viewId to use as source when global entity search
dude-with-a-mug Oct 16, 2025
f31d3c5
retired find_from_existing_annotations() function
dude-with-a-mug Oct 16, 2025
7ee95b0
updated the README with promote phase
dude-with-a-mug Oct 16, 2025
b2b77d1
updated supporting modules configs
dude-with-a-mug Oct 17, 2025
373afb8
refactored the handler to run independent of 7min timeout
dude-with-a-mug Oct 17, 2025
850fbea
added error handling to the promote service
dude-with-a-mug Oct 17, 2025
c7466f6
updated configs and added new flags to promote function
dude-with-a-mug Oct 17, 2025
ddb543a
delete edges based off flags
dude-with-a-mug Oct 17, 2025
43c3db8
made flags camel case in ep_config
dude-with-a-mug Oct 17, 2025
066685a
clearer logging
dude-with-a-mug Oct 17, 2025
7f13188
Merge pull request #70 from cognitedata/delete-non-approved-edges
dude-with-a-mug Oct 17, 2025
9c2a244
Merge pull request #65 from cognitedata/automate-pattern-connections
dude-with-a-mug Oct 17, 2025
7f85984
created a launch finalize and promote workflow version
dude-with-a-mug Oct 18, 2025
a5ee502
added a workflow trigger for each workflow version
dude-with-a-mug Oct 18, 2025
ed7ea8c
made different versions and triggers an array in yaml
dude-with-a-mug Oct 18, 2025
70d4308
removed the prepare function from launch service
dude-with-a-mug Oct 19, 2025
3886f2c
created separate prepare function
dude-with-a-mug Oct 19, 2025
ef26cbe
created workflow version and trigger for the prepare function
dude-with-a-mug Oct 19, 2025
e60b8f1
Merge pull request #71 from cognitedata/separate-prepare-from-launch
dude-with-a-mug Oct 19, 2025
062ca83
Merge pull request #72 from cognitedata/separate-workflow-versions
dude-with-a-mug Oct 19, 2025
16c3dee
dm workflow trigger for prepare, launch, and finalize
dude-with-a-mug Oct 19, 2025
072eef8
resolved conflicts
dude-with-a-mug Oct 19, 2025
1cf5b03
only pick up annotation state instances with a linkedFile
dude-with-a-mug Oct 19, 2025
6179bc6
no longer need to worry about cluttering extraction pipeline run hist…
dude-with-a-mug Oct 19, 2025
fc29a0a
logging indentation change
dude-with-a-mug Oct 19, 2025
7715567
small ep variable value adjustment
dude-with-a-mug Oct 19, 2025
de646e1
added instance space to the data model triggers
dude-with-a-mug Oct 19, 2025
b805ba0
added a promote function trigger
dude-with-a-mug Oct 19, 2025
a5dfaa7
misc changes needed for promote function
dude-with-a-mug Oct 19, 2025
34a16ba
consider reducing finalize tasks to 1
dude-with-a-mug Oct 19, 2025
196 changes: 196 additions & 0 deletions modules/contextualization/cdf_file_annotation/CONTRIBUTING.md
@@ -0,0 +1,196 @@
# Contributing to CDF File Annotation Module

Thank you for your interest in contributing to the CDF File Annotation Module! This document outlines the process for contributing to this project.

## Contribution Workflow

All contributions to this project must follow this workflow:

### 1. Create a GitHub Issue

Before making any changes, please create a GitHub issue to discuss:

- **Bug Reports**: Describe the bug, steps to reproduce, expected vs. actual behavior, and your environment
- **Feature Requests**: Describe the feature, its use case, and how it would benefit the project
- **Documentation Improvements**: Describe what documentation is missing or needs clarification
- **Code Improvements**: Describe the refactoring or optimization you'd like to make

**Why create an issue first?**

- Ensures alignment on the problem and proposed solution
- Prevents duplicate work
- Allows for discussion before investing time in implementation
- Provides context for the eventual pull request

### 2. Create a Pull Request

Once the issue has been discussed and you're ready to contribute:

1. **Fork the repository** to your GitHub account
2. **Create a feature branch** from `main`:

   ```bash
   git checkout -b feature/issue-123-short-description
   ```

   or

   ```bash
   git checkout -b fix/issue-456-short-description
   ```

3. **Make your changes** following the code standards below

4. **Commit your changes** with clear, descriptive commit messages:

   ```bash
   git commit -m "Fix: Resolve cache invalidation issue (#123)

   - Updated cache validation logic to handle edge cases
   - Added unit tests for cache service
   - Updated documentation"
   ```

5. **Push to your fork**:

   ```bash
   git push origin feature/issue-123-short-description
   ```

6. **Create a Pull Request** on GitHub:
   - Reference the related issue in the PR description (e.g., "Closes #123" or "Fixes #456")
   - Provide a clear description of what changed and why
   - Include any relevant testing details or screenshots
   - Add `@dude-with-a-mug` as a reviewer (or the current maintainer)

### 3. Code Review and Approval

- **All PRs require approval** from the project maintainer (@dude-with-a-mug or designated reviewer) before merging
- The maintainer will review your code for:
  - Code quality and adherence to project standards
  - Test coverage
  - Documentation updates
  - Breaking changes or backward compatibility
  - Performance implications

- Address any feedback or requested changes
- Once approved, the maintainer will merge your PR

**Note**: PRs will not be merged without maintainer approval, even if all automated checks pass.

## Code Standards

### Python Code Style

- Use type hints for all function parameters and return values
- Maximum line length: 120 characters (as configured in the project)
- Use meaningful variable and function names

### Documentation

- **All functions must include Google-style docstrings** with:
  - Brief description
  - `Args`: Parameter descriptions
  - `Returns`: Return value description
  - `Raises`: Exception descriptions (if applicable)
- Update README.md or relevant documentation if your changes affect user-facing behavior
- Add inline comments for complex logic or non-obvious decisions

### Example Docstring Format

```python
def process_annotations(
    self,
    file_node: Node,
    regular_item: dict | None,
    pattern_item: dict | None
) -> tuple[str, str]:
    """
    Processes diagram detection results and applies annotations to a file.

    Handles both regular entity matching and pattern mode results, applying
    confidence thresholds and deduplication logic.

    Args:
        file_node: The file node instance to annotate.
        regular_item: Dictionary containing regular diagram detect results.
        pattern_item: Dictionary containing pattern mode results.

    Returns:
        A tuple containing:
            - Summary message of regular annotations applied
            - Summary message of pattern annotations created

    Raises:
        CogniteAPIError: If the API calls to apply annotations fail.
        ValueError: If the file node is missing required properties.
    """
    # Implementation...
```

### Testing

- Add tests for new functionality where applicable
- Ensure existing tests pass before submitting your PR
- Test locally using the VSCode debugger setup (see [DEPLOYMENT.md](DEPLOYMENT.md))
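
As a rough illustration of what a small unit test for new functionality might look like, the sketch below uses the standard-library `unittest` module. The `normalize_tag` helper is hypothetical and is not part of this module; it only stands in for whatever function your change introduces.

```python
import unittest


def normalize_tag(tag: str) -> str:
    """Hypothetical helper: trims whitespace and upper-cases a tag.

    Args:
        tag: Raw tag text.

    Returns:
        The normalized tag.

    Raises:
        ValueError: If the tag is empty after trimming.
    """
    cleaned = tag.strip().upper()
    if not cleaned:
        raise ValueError("tag must not be empty")
    return cleaned


class NormalizeTagTest(unittest.TestCase):
    """Covers the normal path and one edge case of the hypothetical helper."""

    def test_trims_and_upper_cases(self) -> None:
        self.assertEqual(normalize_tag("  p-101a "), "P-101A")

    def test_rejects_blank_input(self) -> None:
        with self.assertRaises(ValueError):
            normalize_tag("   ")
```

Saved as a test file, a class like this can be run with `python -m unittest`. Covering at least one edge case alongside the normal path keeps the test in line with the "production-ready" expectation in this guide.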

### Configuration Changes

- If you modify the configuration structure (`ep_file_annotation.config.yaml`), ensure:
  - Pydantic models are updated accordingly
  - Documentation in `detailed_guides/CONFIG.md` is updated
  - Backward compatibility is maintained or a migration path is provided
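
One common way to keep a configuration change backward compatible is to give every new option a default, so older config files still load. The sketch below shows the idea with standard-library dataclasses to stay dependency-free; the project itself uses Pydantic models, and `LaunchConfig`, `batch_size`, and `pattern_mode` are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class LaunchConfig:
    """Hypothetical config section; field names are illustrative only."""

    batch_size: int
    # Option added in a later release. The default keeps older files valid.
    pattern_mode: bool = False


def load_launch_config(raw: dict[str, Any]) -> LaunchConfig:
    """Builds a LaunchConfig from a parsed YAML mapping.

    Args:
        raw: Mapping parsed from the configuration file.

    Returns:
        The populated configuration object.

    Raises:
        ValueError: If the mapping contains keys the model does not know.
        TypeError: If a required key such as ``batch_size`` is missing.
    """
    known = {f.name for f in fields(LaunchConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    return LaunchConfig(**raw)
```

With this shape, `load_launch_config({"batch_size": 10})` still works for a file written before `pattern_mode` existed. A change that cannot be expressed as an optional field with a default is a sign that a migration path needs to be documented instead.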

## What We're Looking For

Contributions that align with the project's philosophy:

- **Configuration-driven**: Prefer adding configuration options over hardcoded behavior
- **Interface-based**: Extend functionality through interfaces rather than modifying core logic
- **Well-documented**: Code should be self-explanatory with clear documentation
- **Production-ready**: Code should handle edge cases, errors, and scale considerations
- **Backward compatible**: Avoid breaking changes unless absolutely necessary
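
To make the configuration-driven and interface-based points concrete, here is a minimal sketch of the pattern. `ITagMatcher`, its two implementations, and the `matcher` config key are all hypothetical and do not exist in this module; the point is only that new behavior arrives as a new implementation selected by configuration, not as an edit to the calling code.

```python
from abc import ABC, abstractmethod


class ITagMatcher(ABC):
    """Hypothetical interface: decides whether detected text matches a tag."""

    @abstractmethod
    def matches(self, detected: str, tag: str) -> bool:
        """Returns True when the detected text refers to the tag."""


class ExactTagMatcher(ITagMatcher):
    """Default behavior: the strings must be identical."""

    def matches(self, detected: str, tag: str) -> bool:
        return detected == tag


class CaseInsensitiveTagMatcher(ITagMatcher):
    """Alternative behavior added without touching ExactTagMatcher."""

    def matches(self, detected: str, tag: str) -> bool:
        return detected.casefold() == tag.casefold()


def build_matcher(config: dict[str, str]) -> ITagMatcher:
    """Selects an implementation from configuration.

    Args:
        config: Mapping with an optional ``matcher`` key (hypothetical).

    Returns:
        The matcher implementation named in the configuration.

    Raises:
        ValueError: If the configured matcher name is not registered.
    """
    matchers: dict[str, type[ITagMatcher]] = {
        "exact": ExactTagMatcher,
        "case_insensitive": CaseInsensitiveTagMatcher,
    }
    name = config.get("matcher", "exact")
    if name not in matchers:
        raise ValueError(f"Unknown matcher: {name!r}")
    return matchers[name]()
```

Because the default is `"exact"`, an existing configuration without the `matcher` key keeps its current behavior, which also satisfies the backward-compatibility point.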

## Types of Contributions We Welcome

- **Bug fixes**: Resolve issues, fix edge cases, improve error handling
- **Performance improvements**: Optimize queries, caching, or processing logic
- **Documentation**: Improve guides, add examples, clarify confusing sections
- **New configuration options**: Add flexibility through new config parameters
- **New service implementations**: Create alternative implementations of existing interfaces
- **Test coverage**: Add unit tests, integration tests, or test utilities
- **Examples**: Add example configurations or use cases

## Types of Changes Requiring Extra Discussion

These types of changes require significant discussion in the GitHub issue before proceeding:

- Breaking changes to the configuration format
- Changes to the core architecture or interfaces
- New external dependencies
- Changes affecting the data model structure
- Performance changes that trade off memory/CPU/network differently

## Questions or Need Help?

- Create a GitHub issue with your question
- Tag it with the "question" label
- The maintainer will respond as soon as possible

## Code of Conduct

- Be respectful and constructive in all interactions
- Provide thoughtful, actionable feedback during code reviews
- Assume good intentions from all contributors
- Focus on the code and ideas, not the person

## License

By contributing to this project, you agree that your contributions will be licensed under the same license as the project (see LICENSE file).

---

Thank you for contributing to making this project better! 🚀

Return to [Main README](README.md)