Skip to content

Conversation

@trvrb
Copy link
Member

@trvrb trvrb commented Feb 3, 2025

Instead of "scaffolding" across independently run timepoints, instead run a hierarchical model where each timepoint is its own "location", ie

location	variant	date	sequences
USA_2020	19B	2020-01-19	7
USA_2020	19B	2020-01-22	2
...
USA_2021	20H	2021-01-30	2
USA_2021	20H	2021-02-01	8

It should be okay that variants don't fully span locations (aka timepoints) in the hierarchical model as long as there is a pivot bridge between locations.

This combines sequence counts from different timepoints into a single aggregated sequence count file. In this aggregation, the "location" field is suffixed by timepoint. Hierarchical MLR is run on the aggregated timepoints.

trvrb added 3 commits February 3, 2025 12:36
Go from collapsed_sequence_counts.tsv to annotated_sequence_counts.tsv. This converts location of "USA" to "USA_2023-24", etc... by appending timepoint
If "other" is the same label across timepoints, estimating a single fitness for "other" will be wonky
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants