Question about Semantic-guided Spatial-temporal Consistency Regularization

Thanks for the great work!
I have three questions about Semantic-guided Spatial-temporal Consistency Regularization.
1. Why is the complete stitched point cloud divided into regular grids, rather than using the short-term temporal information directly?
2. What does the symbol * represent in Equation 3? Does it indicate a cross product operation?
3. It is stated that the image is matched to the first frame of the point cloud $P_1$ using pixel-point correspondences $\{\hat{x}_i^1, \hat{p}_i^1\}_{i=1}^{\hat{M}}$. This implies that for values of $k$ ranging from 1 to $K$, we have $t_{\hat{i}}^k = t_{\hat{i}}^1$ and $\hat{x}_{\hat{i}}^k = \hat{x}_{\hat{i}}^1$. However, in Equation 4, the text embeddings are denoted as $t_{\hat{i}}^1$, while the image embeddings are denoted as $\hat{x}_{\hat{i}}^{\hat{k}}$. Why is this the case?

Question about Semantic-guided Spatial-temporal Consistency Regularization #10
