feat: add darkness filtering capability to dataloader#150
feat: add darkness filtering capability to dataloader#150maharajamihir wants to merge 3 commits intomainfrom
Conversation
maharajamihir
commented
Aug 12, 2025
- Introduced a DarknessFilter class to filter out sequences with images that are too dark based on a specified threshold.
- Added a darkness_threshold parameter to Args and the get_dataloader function to support this feature.
- DarknessFilter is optional. A threshold value greater than 0.0 activates it.
- Introduced a DarknessFilter class to filter out sequences with images that are too dark based on a specified threshold. - Added a darkness_threshold parameter to Args and the get_dataloader function to support this feature.
There was a problem hiding this comment.
Pull Request Overview
This PR adds darkness filtering capability to the dataloader to filter out video sequences containing images that are too dark based on a configurable threshold.
- Introduced a
DarknessFilterclass that computes average brightness of image sequences and filters them based on a threshold - Added
darkness_thresholdparameter to theArgsclass andget_dataloaderfunction with a default value of 0.0 - Integrated the darkness filter into the data processing pipeline between episode processing and batching
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| utils/dataloader.py | Implements DarknessFilter class and adds darkness filtering to the dataloader pipeline |
| train_dynamics.py | Adds darkness_threshold parameter to configuration and passes it to the dataloader |
| darkness_threshold=darkness_threshold | ||
| ), | ||
| grain.transforms.Batch(batch_size=per_process_batch_size, drop_remainder=True), | ||
| ] |
There was a problem hiding this comment.
The DarknessFilter is always instantiated even when darkness_threshold is 0.0 (disabled). Consider conditionally adding the filter only when darkness_threshold > 0.0 to avoid unnecessary processing overhead.
| ] | |
| ] | |
| if darkness_threshold > 0.0: | |
| operations.append( | |
| DarknessFilter( | |
| darkness_threshold=darkness_threshold | |
| ) | |
| ) | |
| operations.append( | |
| grain.transforms.Batch(batch_size=per_process_batch_size, drop_remainder=True) | |
| ) |
| element_greyscale = np.dot(element[...,:3], [0.2989, 0.5870, 0.1140]) | ||
| average_brightness = np.mean(element_greyscale) | ||
| if average_brightness < self.darkness_threshold: | ||
| print( |
There was a problem hiding this comment.
Using print() statements for logging in production code is not recommended. Consider using a proper logging framework like Python's logging module to allow for better control over log levels and output destinations.
| print( | |
| logging.info( |
utils/dataloader.py
Outdated
| True if the sequence is not too dark, False otherwise. | ||
| """ | ||
| # Convert the RGB image to grayscale using numpy | ||
| element_greyscale = np.dot(element[...,:3], [0.2989, 0.5870, 0.1140]) |
There was a problem hiding this comment.
The RGB to grayscale conversion coefficients [0.2989, 0.5870, 0.1140] are magic numbers. Consider defining these as named constants (e.g., RGB_TO_GRAYSCALE_WEIGHTS) to improve code readability and maintainability.
| element_greyscale = np.dot(element[...,:3], [0.2989, 0.5870, 0.1140]) | |
| element_greyscale = np.dot(element[...,:3], RGB_TO_GRAYSCALE_WEIGHTS) |