
getting distances #15

@Armand1

I am computing Euclidean distances on a subset of the 821 vases to see if I can cluster them into smaller groups. I have done this before, but this time, because I am dealing with a real heterogeneity of shapes, I noticed something disturbing.

I standardize my vases by size and position: a kind of pseudo-registration. For each vase, I have a bunch of y values; the particular y values differ between vases (since they originally came from images of very different sizes). But to make a distance matrix you need to compare the same variables (y values) across your vases. To do that I bin the y values. But that doesn't quite work, because there are big "gaps" in the y values. See below:

[Image: vasepoints]

The points are the actual y values; the lines connect them. You can see that, for many vases, there are long stretches where there is a line but no points. That's where you chopped off a handle or something and just drew a line. That's fine, but when I "bin" the y values, I get a string of "NA"s there, and the distance matrix function does not like that. I am not sure how to fix it: interpolate?
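To make the interpolation idea concrete, here is a minimal sketch in Python (not the code used in this project). It assumes each vase is a list of `(y, x)` points with y already standardized to [0, 1] and x being the outline value at that height; the function name `interpolate_profile`, the example vases, and the 5-point grid are all hypothetical. Instead of binning, each outline is linearly resampled onto one shared y grid, so a gap bridged by a straight line yields values along that line rather than NAs.

```python
from bisect import bisect_left
from math import dist


def interpolate_profile(points, grid):
    """Linearly interpolate x at every y in `grid` from (y, x) points.

    Grid values outside the observed y range take the nearest end value.
    """
    pts = sorted(points)
    ys = [p[0] for p in pts]
    xs = [p[1] for p in pts]
    out = []
    for g in grid:
        if g <= ys[0]:
            out.append(xs[0])
        elif g >= ys[-1]:
            out.append(xs[-1])
        else:
            i = bisect_left(ys, g)  # guarantees ys[i-1] < g <= ys[i]
            t = (g - ys[i - 1]) / (ys[i] - ys[i - 1])
            out.append(xs[i - 1] + t * (xs[i] - xs[i - 1]))
    return out


# Hypothetical data: vase_a has no points between y = 0.2 and y = 0.8,
# like an outline where a handle was cut off and replaced by a line.
vase_a = [(0.0, 0.2), (0.1, 0.3), (0.2, 0.4), (0.8, 0.4), (1.0, 0.1)]
vase_b = [(0.0, 0.2), (0.25, 0.5), (0.5, 0.6), (0.75, 0.5), (1.0, 0.2)]

n = 5  # number of shared grid points; an arbitrary choice for illustration
grid = [k / (n - 1) for k in range(n)]

a = interpolate_profile(vase_a, grid)
b = interpolate_profile(vase_b, grid)

# Every grid position now has a value, so the distance is well defined.
d_ab = dist(a, b)
print(a, b, d_ab)
```

Repeating this for every vase gives equal-length vectors that can be passed to any distance matrix routine. Whether a straight line across the gap is a fair stand-in for the missing outline is a separate judgment call.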

How do you deal with this when estimating your distances? Or does the problem simply not arise, since you're working with SRVs or whatever?
