
How to launch course tutorials

gaow edited this page Sep 16, 2025 · 75 revisions

Our course tutorials are available in two ways: on a pre-configured cloud server that you can access directly, or in your own Linux computing environment after you configure it by following our software setup instructions. In both cases, you will be able to launch and work with all exercises using JupyterLab.

Running on a pre-configured cloud server

At the beginning of the course, you will be assigned an active JupyterLab server hosted on the cloud. Unless otherwise announced, the link to your server is

https://statgenetics.github.io/statgen-courses/<your_name>

where you replace <your_name> with the name the course organizers provide. It is based on the name used in your course registration form, in lower case, most likely firstname_lastname. Adjust this URL and enter it in your web browser. You should find the code and data needed to run the tutorials in the left sidebar of the JupyterLab GUI. The tutorial commands are available in their respective folders, in one of two formats: Jupyter Notebooks or text files. Each format is explained in the following sections.
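As a minimal sketch of the naming rule described above, the following shell snippet builds the server URL from a first and last name. The helper name server_url and the example name are hypothetical, and the firstname_lastname pattern is only the most likely form; use whatever name the organizers actually give you.

```shell
# Hypothetical helper: build the course server URL from a registered name.
# Assumption: the user name is first and last name, lower-cased, joined by "_".
server_url() {
  local first="$1"
  local last="$2"
  local user
  user="$(printf '%s_%s' "$first" "$last" | tr '[:upper:]' '[:lower:]')"
  printf 'https://statgenetics.github.io/statgen-courses/%s\n' "$user"
}

# Example with a made-up name:
server_url "Jane" "Doe"
# prints https://statgenetics.github.io/statgen-courses/jane_doe
```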

Once you open the URL, you will see the JupyterLab interface:

image

On the left you see a list of folders for the exercises that we support for the course. Exercises in the archive folder were used in courses from previous years and are no longer being actively maintained.

On the right you may find a launcher window where you can choose to launch a Notebook or a Console of different kernels, or a command Terminal.

Tutorial commands available in IPython notebook

Currently available tutorials based on IPython Notebooks:

Exercise / folder name IPython Notebook name
finemapping finemapping.ipynb, finemapping_answers.ipynb
ldpred2 ldpred2_example.ipynb
multivariate_finemapping multivariate_finemapping.ipynb
ngs_qc_annotation NGS_QC_Annotation.ipynb
plink plink_Data_QC.ipynb, plink_Substurcture.ipynb
regenie regenie_example.ipynb
statgen_basic statgen_equations.ipynb
twas mr_mash.ipynb, twas_test.ipynb

Tutorial commands available in text files

Currently available tutorials with commands provided in text files:

Exercise / folder name Text file name
plink_r_nothnagel plink_r_commands.txt
epistasis epistasis_commands.txt
fastlmm_gcta FASTLMM_GCTA_commands.txt
mendelian_randomization MR_exercise_TwoSampleMR.R
pleiotropy pleiotropy_commands.txt
popgen_nothnagel popgen_drift.R, popgen_selection.R, genepi_popgen.q
regression_nothnagel regression.R, multifactorial_script.txt

Since commands for most of the exercises are written in a mix of bash and R, we recommend running them in a SoS Notebook. SoS Notebook lets you change the kernel for each code chunk, which avoids the confusion of keeping multiple notebooks or consoles open.

To launch a new SoS Notebook, select SoS under the Notebook section in the launcher window. You can then select the kernel for each code chunk in the top right corner:

image

Troubleshooting

Epistasis

If you are running the exercise in a Jupyter notebook, the plink commands will get stuck partway through the analysis. This is because the command

more plink.assoc

is meant for reading the results interactively in a console or terminal. You can either run this line separately in a console or terminal, or change it to

head plink.assoc

to read the beginning of the file.
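A small sketch of the same idea, wrapped in a reusable function: it prints the first few lines of a results file without starting a pager, so it does not wait for keyboard input inside a notebook. The function name preview and the default of 10 lines are assumptions made for illustration.

```shell
# Hypothetical helper: show the first N lines of a file without a pager.
# Usage: preview FILE [N]   (N defaults to 10)
preview() {
  local file="$1"
  local n="${2:-10}"
  head -n "$n" "$file"
}

# Example (run inside the exercise folder once plink.assoc exists):
#   preview plink.assoc 20
```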

Regenie

When running the exercise, you may receive the following error:

ERROR: regenie_qc (id=387a6c10b5d599d9) returns an error.
ERROR: [regenie_qc (regenie_qc)]: [0]: Failed to obtain lock /tmp/jovyan/.sos/ae7fddb7f73ac3ee.lock for input regenie_statgen_mwe/1000G.EUR.mwe.pruned.bed and output /home/jovyan/handson-tutorials/contents/regenie/output_vc/cache/1000G.EUR.mwe.pruned.qc_pass.id /home/jovyan/handson-tutorials/contents/regenie/output_vc/cache/1000G.EUR.mwe.pruned.qc_pass.snplist. It is likely that these files are protected by another SoS process or concurrant task that is generating the same set of files. Please manually remove the lockfile if you are certain that no other process is using the lock.

This may happen when you run code chunks containing sos run pipeline/regenie.ipynb. It is sometimes caused by disk latency on the cloud server, which creates unnecessary file locks. Simply re-run the chunk where you received this error, or run rm -rf /tmp/ubuntu/.sos/* in your command line terminal to remove all lock files. Note that the lock file in the example error above is under /tmp/jovyan/.sos/, so adjust the path to match the directory shown in your own error message.
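If you prefer a more conservative cleanup than removing everything under the .sos directory, the sketch below deletes only files ending in .lock and prints each one it removes. This is a narrower variant of the command above, not the method used by the course; the function name clear_sos_locks is hypothetical, and the directory must be taken from your own error message.

```shell
# Hypothetical helper: remove only *.lock files directly inside a directory.
# Usage: clear_sos_locks DIR
clear_sos_locks() {
  local lock_dir="$1"
  if [ ! -d "$lock_dir" ]; then
    echo "no such directory: $lock_dir" >&2
    return 1
  fi
  # Print each lock file, then delete it; other files are left untouched.
  find "$lock_dir" -maxdepth 1 -type f -name '*.lock' -print -exec rm -f {} +
}

# Example, using the directory from the error message shown above:
#   clear_sos_locks /tmp/jovyan/.sos
```

Only run this when you are certain that no other SoS process is still using the locks.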

Running on local computer with software installed

Please follow the setup here to install all the software needed and launch the exercises. The instructions work for Linux, MacOS, and Windows with WSL configured.

Note that even though the same instructions work on MacOS, some software packages used in the exercises do not support MacOS. These include plink.multivariate and fastlmm. Also, regenie does not support MacOS on Apple Silicon chips. We therefore recommend using your institute's HPC Linux system to configure these local environments.
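The platform limitations above can be checked before you start the setup. The following is a rough sketch, assuming the standard uname values ("Darwin" for MacOS, "arm64" for Apple Silicon); the function name platform_warnings is hypothetical, and the optional arguments exist only so the check can be tried for another platform.

```shell
# Hypothetical helper: warn about exercises that will not run on this platform.
# Usage: platform_warnings [OS] [ARCH]   (defaults come from uname)
platform_warnings() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  if [ "$os" = "Darwin" ]; then
    echo "plink.multivariate and fastlmm do not support MacOS"
    if [ "$arch" = "arm64" ]; then
      echo "regenie does not support MacOS on Apple Silicon"
    fi
  fi
}

# Check the current machine; prints nothing when no known limitation applies.
platform_warnings
```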

Data download

All data used for the exercises are available on synapse.org, with Synapse ID syn18700992. You can either download the entire dataset to your computer or, depending on which exercise you run, download only the corresponding sub-folder. Please follow these instructions to download from synapse.org.
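For reference, one possible way to fetch the data from a terminal is the Synapse command line client (installed with pip install synapseclient). The sketch below assumes the client is installed and that your Synapse login is already configured. The function name fetch_synapse_folder, the DRY_RUN switch, and the destination folder name are hypothetical. The linked instructions remain the authoritative procedure.

```shell
# Hypothetical helper: recursively download a Synapse folder or project.
# Usage: fetch_synapse_folder SYNAPSE_ID [DEST_DIR]
# Set DRY_RUN=1 to print the command instead of running it.
fetch_synapse_folder() {
  local syn_id="$1"
  local dest="${2:-.}"
  set -- synapse get -r "$syn_id" --downloadLocation "$dest"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
    return 0
  fi
  if ! command -v synapse >/dev/null 2>&1; then
    echo "synapse client not found; install it with: pip install synapseclient" >&2
    return 1
  fi
  "$@"
}

# Show the command for the full dataset without downloading anything:
DRY_RUN=1 fetch_synapse_folder syn18700992 statgen_data
# prints synapse get -r syn18700992 --downloadLocation statgen_data
```

To download a single exercise, pass the Synapse ID of that sub-folder, as shown on synapse.org, in place of syn18700992.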

Get help

Course participants should reach out to the on-site instructors and/or remote teaching assistants for help via Slack. If you have not participated in the course but are still interested in setting it up for self-study, please open a ticket in our issue tracker if you have any difficulty setting up your system: https://github.com/statgenetics/statgen-courses/issues. We will help you troubleshoot.