Conversation

@hlee9212

Hi @nielsleadholm, here are the two files for the 2D Sensor Module that can currently extract edges. Let me know if you'd also like to see any experimental configs. :)

@nielsleadholm

Great, thanks Hojae! Just FYI, @vkakerbeck has offered to help review this, given my earlier-than-expected paternity leave.

@nielsleadholm nielsleadholm requested review from vkakerbeck and removed request for nielsleadholm October 23, 2025 09:54

@vkakerbeck vkakerbeck left a comment


Nice, this is very useful to read through. I left a couple of comments and questions, mostly around readability and variable names. On a higher level, it would be nice if we don't have to introduce opencv as a new dependency, especially if it is only used for things we are already doing with other libraries in Monty (like Gaussian smoothing and convolution).

Some things that would be nice to test, though they don't have to be part of this PR: keeping the color information instead of throwing it away, and comparing Gabor filters for more precise angle detection.
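On the opencv point, here is a minimal sketch of how the three `cv2.GaussianBlur` calls could be replaced with `scipy.ndimage.gaussian_filter`, assuming scipy is already available in Monty (the function name `smooth_tensor_components` is my own; `mode="mirror"` matches cv2's default reflect-101 border handling):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_tensor_components(Jxx, Jyy, Jxy, win_sigma):
    """Smooth the structure-tensor components without an opencv dependency.

    gaussian_filter truncates its kernel at `truncate * sigma` (default
    truncate=4.0), so an explicit kernel size can be reproduced by setting
    truncate = (ksize // 2) / sigma if exact parity with cv2 is needed.
    """
    return (
        gaussian_filter(Jxx, win_sigma, mode="mirror"),
        gaussian_filter(Jyy, win_sigma, mode="mirror"),
        gaussian_filter(Jxy, win_sigma, mode="mirror"),
    )
```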

(default: DEFAULT_WINDOW_SIGMA from edge_detection_utils)
- kernel_size: Kernel size for Gaussian blur
(default: DEFAULT_KERNEL_SIZE from edge_detection_utils)
- edge_threshold: Minimum edge strength threshold (default: 0.1)

What is the range for this value? What does 0.1 mean?

- kernel_size: Kernel size for Gaussian blur
(default: DEFAULT_KERNEL_SIZE from edge_detection_utils)
- edge_threshold: Minimum edge strength threshold (default: 0.1)
- coherence_threshold: Minimum coherence threshold (default: 0.05)

Can you elaborate a bit more here on what the coherence_threshold is? From the description I'm not getting much more information than the variable name already gives.

observed_state, telemetry = self._habitat_observation_processor.process(data)

if observed_state.use_state and observed_state.get_on_object():
observed_state = self.extract_2d_edge(

Why would this not be part of the _habitat_observation_processor? I am not sure if you are changing anything else in the SM, but if not, you could just implement a new observation processor instead of a new SM.


state.morphological_features["pose_vectors"] = np.vstack(
[
surface_normal,

If we get the pose from the surface texture, we don't want to include the surface normal here. The point of the 2D SM is that it sends the same information up for the logo on a flat surface as for a logo on a cylinder. This would not be the case if we include the surface normal. Surface normals encode information about the 3D structure of the object.
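To illustrate, here is a sketch of building the pose purely from the in-plane edge direction, with a fixed patch viewing axis in place of the surface normal (the function and argument names are hypothetical, and Monty's actual pose-vector convention may differ):

```python
import numpy as np

def texture_pose_vectors(edge_direction: np.ndarray) -> np.ndarray:
    """Build a 3x3 pose from the 2D edge direction alone (no surface normal).

    `edge_direction` is assumed to be a unit vector in the patch's image
    plane, e.g. (cos(theta), sin(theta), 0). The third axis is the fixed
    patch viewing axis, so the pose carries no information about the 3D
    structure of the surface the texture sits on.
    """
    view_axis = np.array([0.0, 0.0, 1.0])        # patch axis, not the surface normal
    ortho = np.cross(view_axis, edge_direction)  # in-plane vector orthogonal to the edge
    return np.vstack([edge_direction, ortho, view_axis])
```

With this, a logo on a flat surface and the same logo on a cylinder would send up identical pose vectors, which is the invariance the 2D SM is after.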

if "edge_strength" in self.features:
state.non_morphological_features["edge_strength"] = edge_strength
if "coherence" in self.features:
state.non_morphological_features["coherence"] = coherence

What is coherence? Maybe add some more info on that. It could also help to give it a more informative name like "2d_edge_coherence"

Comment on lines +113 to +115
Jxx = cv2.GaussianBlur(Jxx, (ksize, ksize), win_sigma) # noqa: N806
Jyy = cv2.GaussianBlur(Jyy, (ksize, ksize), win_sigma) # noqa: N806
Jxy = cv2.GaussianBlur(Jxy, (ksize, ksize), win_sigma) # noqa: N806

If we are only interested in the center pixel, do we need to Gaussian-blur the entire image? Couldn't we just look at a window around the center pixel and extract the most consistent edge direction there?
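To sketch what I mean (a box-weighted average over a window around the center, in place of blurring the full patch; Gaussian weights could be applied to the slice instead if needed, and the names here are my own):

```python
import numpy as np

def center_tensor_from_window(Ix, Iy, win_radius=7):
    """Average the structure tensor over a window around the patch center,
    instead of smoothing the full image.

    Ix, Iy: image gradients for the whole patch.
    Returns the smoothed (jxx, jyy, jxy) at the center pixel only.
    """
    r, c = Ix.shape[0] // 2, Ix.shape[1] // 2
    window = (slice(r - win_radius, r + win_radius + 1),
              slice(c - win_radius, c + win_radius + 1))
    wx, wy = Ix[window], Iy[window]
    return float((wx * wx).mean()), float((wy * wy).mean()), float((wx * wy).mean())
```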

r, c = get_patch_center(*gray.shape)
jxx, jyy, jxy = float(Jxx[r, c]), float(Jyy[r, c]), float(Jxy[r, c])

disc = np.sqrt((jxx - jyy) ** 2 + 4.0 * (jxy**2))

What does `disc` stand for? What is happening here?
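For reference, `disc` appears to be the discriminant of the structure tensor's characteristic polynomial; a sketch of the closed-form eigenvalue computation it feeds into (variable names beyond the PR's `disc`, `lam1`, `lam2` are my own):

```python
import numpy as np

def structure_tensor_eigenvalues(jxx, jyy, jxy):
    """Closed-form eigenvalues of the 2x2 symmetric structure tensor
    [[jxx, jxy], [jxy, jyy]].

    `disc` is the discriminant of its characteristic polynomial; a name
    like `eigval_discriminant` would make the intent clearer.
    """
    disc = np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2)
    lam1 = 0.5 * (jxx + jyy + disc)  # gradient energy across the edge
    lam2 = 0.5 * (jxx + jyy - disc)  # gradient energy along the edge
    return lam1, lam2
```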


edge_strength = np.sqrt(max(lam1, 0.0))

coherence = (lam1 - lam2) / (lam1 + lam2 + EPSILON)

I still don't quite understand what coherence is after reading this. Can you try more expressive variable names, or add comments explaining what is being calculated here?
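To make the suggestion concrete, a sketch with more expressive documentation, assuming `lam1`/`lam2` are the larger/smaller structure-tensor eigenvalues and `EPSILON` a small constant as in the PR:

```python
EPSILON = 1e-12  # guards against division by zero on flat patches

def edge_coherence(lam1, lam2):
    """Anisotropy of the local gradient field, in [0, 1].

    lam1, lam2: larger and smaller structure-tensor eigenvalues.
    ~1: gradients all share one orientation (a clean, unambiguous edge).
    ~0: gradients point in many directions (texture, a corner, or noise),
        so the estimated edge direction is unreliable.
    """
    return (lam1 - lam2) / (lam1 + lam2 + EPSILON)
```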

Patch with annotations drawn on it.
"""
patch_with_pose = patch.copy()
center_y, center_x = patch.shape[0] // 2, patch.shape[1] // 2

If you want to add the get_patch_center function, you should use it here too.
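For reference, a hypothetical re-implementation consistent with how get_patch_center is used above (assuming it returns `(rows // 2, cols // 2)`; the real helper lives in the PR's edge_detection_utils):

```python
import numpy as np

# Hypothetical stand-in for the PR's helper, assumed to return the
# integer center of a patch of shape (rows, cols, ...).
def get_patch_center(rows: int, cols: int) -> tuple:
    return rows // 2, cols // 2

patch = np.zeros((64, 64, 3))
center_y, center_x = get_patch_center(*patch.shape[:2])
```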


from typing import Tuple

import cv2
