
Names can refer to classes of humans/locations/organizations #4

@vikmary


Thank you for publishing the preprocessing pipeline! I successfully ran it on Linux. It is convenient that the preprocessed entries are dumped as the run proceeds, so you can monitor its progress.

After looking at some of the resulting entries in the combined/ folder, I see that the labels "PERSON", "ORGANIZATION" and "LOCATION" are looser than I expected (my misunderstanding here).
Besides taking instances of "Q5" (persons), "Q82794" (locations) and "Q43229" (organizations), paranames/io/wikidata_dump_transliterations.py takes instances of their subclasses as well. As a result, "Hispanic and Latino-American teenage boys" and "Government secretaries of Policies for Women of the State of Bahia" are classified as persons along with ["Samuel Hamington"](https://www.wikidata.org/wiki/Q111165240), while I expected only "Samuel Hamington" to be included.
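
For reference, the stricter behaviour I had in mind could look roughly like the sketch below. It is not taken from paranames. It assumes the standard Wikidata JSON dump layout (`claims` → `P31` → `mainsnak` → `datavalue` → `value` → `id`), and the names `direct_instance_of` and `strict_types` are hypothetical. It keeps an entity only if one of the three type IDs appears directly under P31 (instance of), without expanding subclasses.

```python
from typing import Any, Dict, Set

# Type IDs quoted in this issue, mapped to the labels used in combined/.
TARGET_TYPES = {"Q5": "PERSON", "Q82794": "LOCATION", "Q43229": "ORGANIZATION"}


def direct_instance_of(entity: Dict[str, Any]) -> Set[str]:
    """Return the QIDs listed directly under P31 (instance of) for one entity."""
    qids: Set[str] = set()
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        # "novalue"/"somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") != "value":
            continue
        qids.add(snak["datavalue"]["value"]["id"])
    return qids


def strict_types(entity: Dict[str, Any]) -> Set[str]:
    """Labels for which the entity is a direct instance; subclasses are ignored."""
    return {TARGET_TYPES[q] for q in direct_instance_of(entity) if q in TARGET_TYPES}


def make_entity(qid: str, p31_target: str) -> Dict[str, Any]:
    """Build a minimal dump-shaped entity with a single P31 statement (illustration only)."""
    snak = {"snaktype": "value", "datavalue": {"value": {"id": p31_target}}}
    return {"id": qid, "claims": {"P31": [{"mainsnak": snak}]}}


# Q111165240 is the "Samuel Hamington" item linked above.
person = make_entity("Q111165240", "Q5")
# Hypothetical IDs standing in for an item typed only via a subclass of Q5.
group = make_entity("Q_HYPOTHETICAL_ITEM", "Q_HYPOTHETICAL_SUBCLASS_OF_Q5")

print(strict_types(person))
print(strict_types(group))
```

With a filter like this the second item is dropped. The trade-off is that real people typed only through a subclass of Q5 would be dropped as well, so an allow-list of accepted subclasses might be a better middle ground.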

Maybe it would be good to have data samples in the git repo or the original paper =)
