Hi Borzoi team,
Thank you for releasing Borzoi and the analysis code.
I have a question about converting Borzoi ATAC-seq coverage predictions into peak-level accessibility scores. In the paper, gene-level RNA expression is computed as the sum of predicted coverage over exonic bins. For ATAC peaks, would you also recommend summing predicted ATAC coverage over bins overlapping each peak?
I am wondering whether using the sum may introduce a dependence on peak length. Would it be better to use the mean over peak-overlapping bins, or a fixed-width peak-centered window, when comparing predictions to a peak-by-cell-type accessibility matrix?
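To make the three options concrete, here is a minimal sketch of what I have in mind. It is not taken from the Borzoi code: the function name `peak_scores`, the 32 bp bin width, and the 512 bp fixed window are my own assumptions for illustration, and `coverage` stands in for a single predicted ATAC track.

```python
import numpy as np

BIN_SIZE = 32  # assumed output bin width in bp; adjust to the model's resolution


def peak_scores(coverage, window_start, peak_start, peak_end, fixed_width=512):
    """Score one peak three ways (hypothetical helper, for illustration only).

    coverage:     1D array of predicted coverage per bin for one track.
    window_start: genomic coordinate of the left edge of bin 0.
    peak_start, peak_end: half-open peak interval [start, end) in bp.
    fixed_width:  width in bp of the peak-centered window (arbitrary choice).
    """
    n_bins = len(coverage)

    # Bins overlapping the peak: floor the start, ceil the end.
    first = max((peak_start - window_start) // BIN_SIZE, 0)
    last = min(-(-(peak_end - window_start) // BIN_SIZE), n_bins)
    if last <= first:
        raise ValueError("peak does not overlap the predicted window")
    overlapping = coverage[first:last]

    # Fixed-width window centered on the peak midpoint.
    center_bin = ((peak_start + peak_end) // 2 - window_start) // BIN_SIZE
    n_fixed = fixed_width // BIN_SIZE
    f_first = max(center_bin - n_fixed // 2, 0)
    f_last = min(f_first + n_fixed, n_bins)
    fixed = coverage[f_first:f_last]

    return {
        "sum": float(overlapping.sum()),        # grows with peak length
        "mean": float(overlapping.mean()),      # length-normalized
        "fixed_window_sum": float(fixed.sum()),  # same bin count for every peak
    }


# Toy example: uniform coverage, two peaks of different length.
coverage = np.ones(100)
short_peak = peak_scores(coverage, window_start=0, peak_start=320, peak_end=640)
long_peak = peak_scores(coverage, window_start=0, peak_start=320, peak_end=960)
```

With uniform coverage, the sum doubles when the peak length doubles, while the mean and the fixed-window sum stay the same. That is the length dependence I am asking about.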
Is there an aggregation strategy you would recommend for ATAC peak-level evaluation?
Best,
Amber