Skip to content
This repository was archived by the owner on Oct 29, 2023. It is now read-only.
This repository was archived by the owner on Oct 29, 2023. It is now read-only.

add support for STRICT intra-shard boundary #201

@deflaux

Description

@deflaux

When the genomic region we want to examine is divided into shards, we use a STRICT shard boundary to remove duplicate data that would occur at the end of the current shard and also at the beginning of the next shard.

  • This works fine when we are working over an entire chromosome.
  • BUT when we want to shard a subset of a chromosome, we are filtering out the records at the beginning of the very first shard even though they would not be duplicated in any other shards.
    • Some times we do want to make use of those records that overlap the beginning of the shard boundary.
    • We need a way to use OVERLAPS for the first shard and STRICT for all subsequent shards.

Confirm this functionality with a JoinNonVariantSegmentsWithVariants integration test that operates over a small genomic region specified by both normal sharding and SitesToShards.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions