Distribute course data via synapse #55

gaow · 2024-04-17T00:25:57Z

gaow
Apr 17, 2024
Maintainer

@lamanda0227 currently our data is distributed through our S3 bucket, which works for the week of the course but is not a good idea in the long run for people who still need to access the data. In addition, we currently distribute annovar via the S3 bucket on the fly, but we should not distribute it because of license issues.

Also, since the data is >20GB, we should not download all of it at start-up for people running the image on their local machine. We should give people an easy-to-boot interface first, and then let them download the data at their own discretion. That way they will realize the challenge of working with such a large dataset and potentially just rely on cloud solutions.

I made this change to achieve both goals:

cumc/handson-tutorials@c560c55

The idea is simple: we now put all our data on Synapse and provide a command to download it. However, the command will only be triggered if we can successfully deploy the annovar software, which is an indication that we are running on mmcloud. Otherwise, it throws an error message and prompts users to download the data on their own.

To download the data separately, users have to start the server and then run the get-data command to download everything.
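The gating logic described above can be sketched roughly as follows. This is only an illustration and not the code in setup/course_entrypoint.sh: the function name fetch_data_if_annovar_ok and its two arguments are hypothetical, and the real deploy and download steps are passed in as commands rather than spelled out here.

```shell
#!/bin/sh
# Hypothetical sketch: the function and variable names below are illustrative
# and are not taken from setup/course_entrypoint.sh.

# Run the data download only when the annovar deployment step succeeds.
#   $1: command (or shell function name) that deploys annovar
#   $2: command (or shell function name) that downloads the course data
fetch_data_if_annovar_ok() {
    deploy_annovar="$1"
    download_data="$2"
    if "$deploy_annovar" >/dev/null 2>&1; then
        # annovar deployed, so we are presumably running on mmcloud
        "$download_data"
    else
        # Not on mmcloud (or deployment failed): report and leave data to the user
        echo "Could not deploy annovar; skipping the automatic data download." >&2
        echo "Start the server and run get-data to download the data yourself." >&2
        return 1
    fi
}
```

In an entrypoint, the second argument would be whatever wraps the Synapse download (for example, a function that calls the Synapse command-line client with the project ID). Returning a non-zero status instead of exiting lets the caller decide whether the server should still start without data.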

@lamanda0227 I think this is going to work. Could you upload all the data to Synapse and update this line with the Synapse project ID:

https://github.com/cumc/handson-tutorials/blob/main/setup/course_entrypoint.sh#L32

The data should have the exact same structure on Synapse. I suggest you test it locally first: clone the code repository, then check whether this command adds the data on top of the code repo structure. I think it is going to work.

Once this is done, let @yiweizh-memverge know so we can try rebuilding the image to use this new way of getting the data. That means our image will be more or less independent of S3.
