diff --git a/source/access.rst b/source/access.rst index e5e1eb1..45c75bf 100644 --- a/source/access.rst +++ b/source/access.rst @@ -1,9 +1,9 @@ -Access to subMIT +Access to SubMIT ---------------- .. tags:: JupyterHub, VSCode, GPU -You have several options to connect to subMIT, view and edit your files, and do your work. +You have several options to connect to SubMIT, view and edit your files, and do your work. 1. **ssh** is the simplest way to connect to the login nodes, see `the starting guide `_. 2. **`JupyterHub _`** provides another easy alternative to connect to the cluster. You can log in using your Kerberos ID, and get access to an interactive graphical interface, terminal, text editor, and more. @@ -18,7 +18,7 @@ Jupyterhub SubMIT has a `custom installation of JupyterHub `_. -This is set up through the subMIT machines meaning that you have access to all of your files and data. You will have access to basic python3 configurations. In addition, if you need a more complex environment, you can run your notebooks in any conda environment that you have set up. You can check the name and location of your environments using the command ``jupyter kernelspec list``. This allows you to create the exact environment you need for your projects. An example on how to set up a conda environment is shown above, and how it is implemented in jupyter is described below. +This is set up through the SubMIT machines meaning that you have access to all of your files and data. You will have access to basic python3 configurations. In addition, if you need a more complex environment, you can run your notebooks in any conda environment that you have set up. You can check the name and location of your environments using the command ``jupyter kernelspec list``. This allows you to create the exact environment you need for your projects. An example on how to set up a conda environment is shown above, and how it is implemented in jupyter is described below. 
A few examples of simple Jupyter notebooks can be found in the `Github jupyter examples `_. Several other intro notebooks can be found in the link below: `JupyterHub_examples `_ @@ -160,7 +160,7 @@ Visual Studio Code (VSCode) is a free, versatile source-code editor that support * **Integrated file browser:** easily navigate and manage your files within the editor. -One of the capabilities of VSCode is its client-server mode for `remote development `_ on subMIT. This functionality allows you to edit, run, and debug code on the subMIT servers directly from your personal computer. This setup provides the ease of a GUI-based development environment on your local machine while executing the code on subMIT's infrastructure. +One of the capabilities of VSCode is its client-server mode for `remote development `_ on SubMIT. This functionality allows you to edit, run, and debug code on the SubMIT servers directly from your personal computer. This setup provides the ease of a GUI-based development environment on your local machine while executing the code on SubMIT's infrastructure. For `most languages `_, VScode enhances your coding experience with features like: @@ -175,18 +175,18 @@ For `most languages `_, V * **Accessibility features:** `learn about accessibility in VSCode `_. -Getting Started with VSCode on subMIT +Getting Started with VSCode on SubMIT ..................................... Microsoft provides some handy `videos `_ for getting started with VSCode, as well as detailed information on `remote connection `_. #. **Install VSCode:** `download and install instructions `_ -#. **SSH Configuration:** Follow the `general configuration guide `_ in the subMIT User's Guide. Also have a look at the `VSCode configuration guide `_ due to a recent VSCode upgrade which removed the compatibility with CentOS 7. +#. **SSH Configuration:** Follow the `general configuration guide `_ in the SubMIT User's Guide. 
Also have a look at the `VSCode configuration guide `_ due to a recent VSCode upgrade which removed the compatibility with CentOS 7. #. **Remote-SSH Extension:** Available in the VSCode Extensions tab or on the `VSCode website `_. -#. **Connect to subMIT:** Click the green "Open a Remote Window" button in the lower-left of the VSCode window. Select "submit" from the menu (VSCode automatically reads your ssh config file). Then, simply "open" a folder or workspace. Opening a folder is typically more convenient than opening a single code file. Remember: VSCode is now connected to subMIT, so you are looking at and navigating your files on the subMIT servers, not on your laptop/desktop. +#. **Connect to SubMIT:** Click the green "Open a Remote Window" button in the lower-left of the VSCode window. Select "submit" from the menu (VSCode automatically reads your ssh config file). Then, simply "open" a folder or workspace. Opening a folder is typically more convenient than opening a single code file. Remember: VSCode is now connected to SubMIT, so you are looking at and navigating your files on the SubMIT servers, not on your laptop/desktop. #. **Note:** Only run *light* calculations in VSCode; VSCode is intended for editing/debugging, not production runs. If the execution of your code will consume significant resources (time, memory, processors, ...) then please run it outside VSCode using `Slurm or HTCondor `_. For example, you can debug using a smaller subset of data than a production run. diff --git a/source/acknowledging.rst b/source/acknowledging.rst index 7beb803..6566310 100644 --- a/source/acknowledging.rst +++ b/source/acknowledging.rst @@ -1,18 +1,18 @@ -Acknowledging subMIT +Acknowledging SubMIT -------------------- It is very helpful for us to have a record of publications that made use -of subMIT resources (be those the physical hardware, or the human support +of SubMIT resources (be those the physical hardware, or the human support that we provide). 
If possible, please send an e-mail letting us know you've published a preprint or paper (to submit-help@mit.edu) that made use of our resources. -We would also appreciate it if you could acknowledge subMIT in your +We would also appreciate it if you could acknowledge SubMIT in your publications, where appropriate, with the following language: -*This work made use of resources provided by subMIT at MIT Physics.* +*This work made use of resources provided by SubMIT at MIT Physics.* In addition, you can cite the following preprint:: @@ -27,4 +27,4 @@ In addition, you can cite the following preprint:: } We will also periodically send around e-mails to the users lists reminding -users to let us know about any recently published papers. \ No newline at end of file +users to let us know about any recently published papers. diff --git a/source/backup.rst b/source/backup.rst index 766c483..421783f 100644 --- a/source/backup.rst +++ b/source/backup.rst @@ -1,7 +1,7 @@ Data backup ----------- -In this section we will to discuss the backup policy of subMIT. In short, the only space that has a conventional backup is the home directory for users (/home/submit). The directories under work (/work/submit) and data (/ceph/submit/data) have intrinsic resilience by raiding and erasure coding but are not backed up. The subMIT team is making its best effort to keep data safe but due to the size a full backup is not feasible. +In this section we will discuss the backup policy of SubMIT. In short, the only space that has a conventional backup is the home directory for users (/home/submit). The directories under work (/work/submit) and data (/ceph/submit/data) have intrinsic resilience through RAID and erasure coding but are not backed up. The SubMIT team makes its best effort to keep data safe, but due to the size of the storage a full backup is not feasible. If there is a particular emergency situation involving backups please contact submit-help@mit.edu. 
diff --git a/source/conda.rst b/source/conda.rst index a2c3a4a..3071908 100644 --- a/source/conda.rst +++ b/source/conda.rst @@ -70,7 +70,7 @@ You can then write your code, let's say in a file called ``example.jl``, and run Conda for C++ ============= -Natively, subMIT currently has a C++ compiler, ``g++``. While Conda doesn’t directly install C++ as a standalone compiler, it can install related tools (like GCC [GNU Compiler Collection] or Clang) and libraries for building C++ projects, e.g. +Natively, SubMIT currently has a C++ compiler, ``g++``. While Conda doesn’t directly install C++ as a standalone compiler, it can install related tools (like GCC [GNU Compiler Collection] or Clang) and libraries for building C++ projects, e.g. .. code-block:: sh @@ -88,7 +88,7 @@ You can then write your code, let's say in a file called ``example.cpp``, and co Conda for FORTRAN ================= -Natively, subMIT currently has a FORTRAN compiler, ``gfortran``. Similarly to C++, Conda can install FORTRAN compilers, such as a specific version of ``gfortran``, through the command: +Natively, SubMIT currently has a FORTRAN compiler, ``gfortran``. Similarly to C++, Conda can install FORTRAN compilers, such as a specific version of ``gfortran``, through the command: .. code-block:: sh @@ -132,7 +132,7 @@ To run a script called ``example.R`` in R, use ``Rscript example.R``. Conda for Java ============== -Java is also natively installed on subMIT. If you wish a different version, you can for example install it using +Java is also natively installed on SubMIT. If you wish a different version, you can for example install it using .. code-block:: sh @@ -149,7 +149,7 @@ Some, but not all, Java-related libraries are available via Conda, e.g. Conda for Perl ============== -Perl is also natively installed on subMIT. If you wish a different version, you can for example install it using +Perl is also natively installed on SubMIT. 
If you wish a different version, you can for example install it using .. code-block:: sh @@ -166,7 +166,7 @@ To import Perl libraries, such as ``perl-dbi``, run Conda for Ruby ============== -Ruby is not natively installed on subMIT. You can install it through +Ruby is not natively installed on SubMIT. You can install it through .. code-block:: sh diff --git a/source/intro.rst b/source/intro.rst index 29bfd16..a5f5935 100644 --- a/source/intro.rst +++ b/source/intro.rst @@ -1,19 +1,19 @@ Introduction and creating an account ------------------------------------ -Welcome to the subMIT users guide! This will guide you on how to make an account to access the submit machines and how to create basic and more advanced workflows for applications. We have examples and tutorials to help you get going. +Welcome to the SubMIT users guide! This will guide you on how to make an account to access the submit machines and how to create basic and more advanced workflows for applications. We have examples and tutorials to help you get going. Introduction ~~~~~~~~~~~~ -The subMIT login pool is designed to let users login safely prepare and tests their research computing tasks and submit them to the large computing resources of their choice. There are for now a limited number of resources connected but we are working on quickly expanding the available resources. +The SubMIT login pool is designed to let users log in safely, prepare and test their research computing tasks, and submit them to the large computing resources of their choice. For now only a limited number of resources are connected, but we are working on quickly expanding the available resources. We have HTcondor connection to the Tier-2 computing cluster, the Tier-2 integration cluster (aka Tier-3) in building 24, the engaging cluster and the OSG. For CMS users the global CMS queue is also seamlessly integrated. What do I need for an account if I have an MIT kerberos? 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To make login convenient and secure we allow login to the subMIT pool using ssh keys only. Go to our `submit portal `_ to upload your ssh key (or a number of keys). Upload of ssh keys is secured through MIT touchstone authentication. +To make login convenient and secure we allow login to the SubMIT pool using ssh keys only. Go to our `submit portal `_ to upload your ssh key (or a number of keys). Upload of ssh keys is secured through MIT touchstone authentication. What do I need for an account if I don't have an MIT kerberos? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -23,4 +23,4 @@ You will need to obtain an MIT guest account. A sponsor -- usually a faculty you How to get help? ~~~~~~~~~~~~~~~~ -If you have any trouble, you can contact submit-help via email at submit-help@mit.edu. +If you have any trouble, you can contact the help team via email at submit-help@mit.edu. diff --git a/source/monit.rst b/source/monit.rst index ac9cd32..0842302 100644 --- a/source/monit.rst +++ b/source/monit.rst @@ -1,19 +1,19 @@ -Monitoring at submit +Monitoring at SubMIT -------------------- .. tags:: Slurm, Condor -This section will detail the monitoring available at submit. Here we will detail how you can keep track of the submit machines as you work as well as monitor your condor jobs. +This section will detail the monitoring available at SubMIT. Here we will detail how you can keep track of the submit machines as you work as well as monitor your condor jobs. -The main submit page +The main SubMIT page ~~~~~~~~~~~~~~~~~~~~ -On the main `submit page `_ you can find interesting links useful for monitoring. Most of these links are explained in more detail below. +On the main `SubMIT page `_ you can find interesting links useful for monitoring. Most of these links are explained in more detail below. 
-Ganglia Monitoring for submit +Ganglia Monitoring for SubMIT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Ganglia is a distributed monitoring system for high-performance computing systems such as submit. The Ganglia monitoring can be found through a link on the main submit page or can be found directly `here `_. Information on the individual servers can be found at the bottom of the page or through the following `link to servers `_. +Ganglia is a distributed monitoring system for high-performance computing systems such as SubMIT. The Ganglia monitoring can be found through a link on the main SubMIT page or can be found directly `here `_. Information on the individual servers can be found at the bottom of the page or through the following `link to servers `_. CondorMon ~~~~~~~~~ @@ -30,7 +30,7 @@ To monitor your slurm jobs you can use the slurm monitoring `SlurmMon `_. +There are additional summary plots to help keep track of the growth and health of the SubMIT system `SubMIT Monitoring Tools `_. Monitoring for the T3 ~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/program.rst b/source/program.rst index 779bbba..260ec30 100644 --- a/source/program.rst +++ b/source/program.rst @@ -3,7 +3,7 @@ Available software .. tags:: Julia, Mathematica, Conda, Slurm, VSCode, Containers, JupyterHub -This section briefly describes several options in which to set up your environment for working on subMIT. This section is not exhaustive but introduces several tools which can help you set up your code. +This section briefly describes several options in which to set up your environment for working on SubMIT. This section is not exhaustive but introduces several tools which can help you set up your code. You have several options available for either using installed software, or installing your own: @@ -18,12 +18,12 @@ You have several options available for either using installed software, or insta 4. **CVFMS** is provided by CERN, and has many environments. 
Find below more details about each of your choices. -Feel free to reach out to us for how to best set up your software or environments on subMIT! +Feel free to reach out to us for how to best set up your software or environments on SubMIT! Native system ~~~~~~~~~~~~~ -All of the subMIT machines come with several tools to help you get started with your work. Under `/usr/bin` you will find: +All of the SubMIT machines come with several tools to help you get started with your work. Under `/usr/bin` you will find: - python - gcc @@ -119,7 +119,7 @@ If you wish to get an interface similar to a Mathematica notebook (.nb file), yo Now that the kernel is installed, you want to use jupyterhub on ``submit00``. Here's how to do this: -Go to the submit website and open jupyterhub. Choose the job profile to "Slurm for Wolfram Mathematica - submit00 - 1 CPU, 500 MB". The server should start. If you get the error message "Spawn failed: Timeout", it means the CPUs are already busy with other jobs and cannot be used at the moment. You can still use the method below. +Go to the SubMIT website and open jupyterhub. Choose the job profile to "Slurm for Wolfram Mathematica - submit00 - 1 CPU, 500 MB". The server should start. If you get the error message "Spawn failed: Timeout", it means the CPUs are already busy with other jobs and cannot be used at the moment. You can still use the method below. You can make sure that you are on submit00 by opening a terminal within the webpage, which should show ``username@submit00.mit.edu``. You can now open a jupyter notebook (.ipynb file), make sure you are using the Wolfram kernel (choose the kernel in the top right of the screen), and use Wolfram syntax as you would in a Wolfram notebook. The outputs will even have the Wolfram fonts! @@ -150,7 +150,7 @@ Conda Conda is an open source package management system and environment management system. 
We can use this to set up consistent environments and manage the package dependencies for various applications. Below is an example to set up a python environment as well as a different gcc compiler. -Important Notes for Using Conda on submit +Important Notes for Using Conda on SubMIT ......................................... Please note that downloading many conda packages takes a large amount of space which can very quickly use up the quota in your home. If you plan to use conda heavily **it is suggested to download and configure it in your work directory** where there is much more space. @@ -291,9 +291,9 @@ How to use your container in your jobs There are a couple of options for this. -**If your jobs are running only on subMIT and you have a singularity image built**, your singularity image can be placed on some commonly-readable directory from any of the compute nodes (/ceph), so you can access it directly from any of your jobs. +**If your jobs are running only on SubMIT and you have a singularity image built**, your singularity image can be placed on some commonly-readable directory from any of the compute nodes (/ceph), so you can access it directly from any of your jobs. -**If your jobs are running on subMIT, MIT T3, MIT T2, OSG, or anywhere on the grid**, you can mirror your Docker container as a Singularity container to CVMFS. You can upload it to DockerHub with ``podman push`` and then add it to ``/cvmfs/singularity.opensciencegrid.org/``. This can be done by making a pull request to add the container to the `master list of docker images `_, which is a simple .txt file which controls the synchronization of images on the CVMFS. Your container will then appear as a singularity image in ``/cvmfs/singularity.opensciencegrid.org/``, which is mounted on all the machines of the aforementioned systems. +**If your jobs are running on SubMIT, MIT T3, MIT T2, OSG, or anywhere on the grid**, you can mirror your Docker container as a Singularity container to CVMFS. 
You can upload it to DockerHub with ``podman push`` and then add it to ``/cvmfs/singularity.opensciencegrid.org/``. This can be done by making a pull request to add the container to the `master list of docker images `_, which is a simple .txt file which controls the synchronization of images on the CVMFS. Your container will then appear as a singularity image in ``/cvmfs/singularity.opensciencegrid.org/``, which is mounted on all the machines of the aforementioned systems. **If you need this available on worker nodes on the MIT T3 and T2**, you can add them to a space in your work directory. You will then need to email Max (Kerberos ID: maxi) or submit-help@mit.edu to create this CVMFs area for you. diff --git a/source/running.rst b/source/running.rst index 2541b45..5c95bd0 100644 --- a/source/running.rst +++ b/source/running.rst @@ -9,18 +9,18 @@ Batch computing .. tags:: Slurm, Condor, GPU -This section will give you a quick guide on how to submit batch jobs at subMIT. +This section will give you a quick guide on how to submit batch jobs at SubMIT. There will be a couple of simple examples to help get you started. You have three options: 1. **Running locally**: limited to the interactive usage of CPUs in the login nodes. Ideal for developing, not for running jobs. -2. **Slurm**: medium-sized pool of CPUs and some GPUs available on subMIT worker-nodes. Slurm is set up as a federation with all of the subMIT machines as clusters. This means that Slurm submissions will have access to the /home, /work, and /ceph directories. -3. **HTCondor**: large pools of CPUs and some GPUs are available in clusters at MIT and around the world. Ideal for large scale processing. Worker nodes in HTCondor do not have access to your subMIT directories: this means that any input files and software that you need must be passed into the submission, or already be on the worker node. Several tools are available to achieve this, read below. +2. 
**Slurm**: medium-sized pool of CPUs and some GPUs available on SubMIT worker-nodes. Slurm is set up as a federation with all of the SubMIT machines as clusters. This means that Slurm submissions will have access to the /home, /work, and /ceph directories. +3. **HTCondor**: large pools of CPUs and some GPUs are available in clusters at MIT and around the world. Ideal for large scale processing. Worker nodes in HTCondor do not have access to your SubMIT directories: this means that any input files and software that you need must be passed into the submission, or already be on the worker node. Several tools are available to achieve this, read below. Running locally ~~~~~~~~~~~~~~~ -The subMIT login machines are powerful servers which can be used for local testing. +The SubMIT login machines are powerful servers which can be used for local testing. This allows users to thoroughly test their code before expanding to batch submission. When you are ready to scale up your framework, you can study the guide below to start submitting to HTCondor or Slurm. @@ -168,7 +168,7 @@ Slurm also has the sacct command to help you to look at information from past jo HTCondor ~~~~~~~~ -The subMIT machines have access to several clusters with thousands of available cores via HTCondor. +The SubMIT machines have access to several clusters with thousands of available cores via HTCondor. These following sections describe which clusters are available to run on, a brief description of what is available on each cluster, and what is needed in your submission script in order to send your HTCondor jobs to each cluster. 
Available clusters @@ -304,7 +304,7 @@ Here is an example sample list of sites you can use, +DESIRED_Sites = "T2_AT_Vienna,T2_BE_IIHE,T2_BE_UCL,T2_BR_SPRACE,T2_BR_UERJ,T2_CH_CERN,T2_CH_CERN_AI,T2_CH_CERN_HLT,T2_CH_CERN_Wigner,T2_CH_CSCS,T2_CH_CSCS_HPC,T2_CN_Beijing,T2_DE_DESY,T2_DE_RWTH,T2_EE_Estonia,T2_ES_CIEMAT,T2_ES_IFCA,T2_FI_HIP,T2_FR_CCIN2P3,T2_FR_GRIF_IRFU,T2_FR_GRIF_LLR,T2_FR_IPHC,T2_GR_Ioannina,T2_HU_Budapest,T2_IN_TIFR,T2_IT_Bari,T2_IT_Legnaro,T2_IT_Pisa,T2_IT_Rome,T2_KR_KISTI,T2_MY_SIFIR,T2_MY_UPM_BIRUNI,T2_PK_NCP,T2_PL_Swierk,T2_PL_Warsaw,T2_PT_NCG_Lisbon,T2_RU_IHEP,T2_RU_INR,T2_RU_ITEP,T2_RU_JINR,T2_RU_PNPI,T2_RU_SINP,T2_TH_CUNSTDA,T2_TR_METU,T2_TW_NCHC,T2_UA_KIPT,T2_UK_London_IC,T2_UK_SGrid_Bristol,T2_UK_SGrid_RALPP,T2_US_Caltech,T2_US_Florida,T2_US_MIT,T2_US_Nebraska,T2_US_Purdue,T2_US_UCSD,T2_US_Vanderbilt,T2_US_Wisconsin,T3_CH_CERN_CAF,T3_CH_CERN_DOMA,T3_CH_CERN_HelixNebula,T3_CH_CERN_HelixNebula_REHA,T3_CH_CMSAtHome,T3_CH_Volunteer,T3_US_HEPCloud,T3_US_NERSC,T3_US_OSG,T3_US_PSC,T3_US_SDSC" In order to use the CMS global pool, you will need to add a few additional lines to your submission script. -The lines below, with the proper ID and username (uid and id from subMIT), are necessary in order to get into the global pool: +The lines below, with the proper ID and username (uid and id from SubMIT), are necessary in order to get into the global pool: .. code-block:: sh @@ -332,7 +332,7 @@ General Tips for HTCondor Jobs Transferring Input Scripts and Data *********************************** -Since HTCondor jobs are running on external computing resources, your subMIT storage areas (``/home``, ``/work``, ``/ceph``) are not accessible on the worker nodes. +Since HTCondor jobs are running on external computing resources, your SubMIT storage areas (``/home``, ``/work``, ``/ceph``) are not accessible on the worker nodes. 
Thus, you either need to transfer your input and output files through your submission script, or use XRootD to transfer files via the network. via the submission script @@ -362,14 +362,14 @@ Once that is set up, in your bash script that is executed in the worker-node, yo Transferring Outputs ******************** -If your code produces an output you want to bring back to subMIT, you have the same two options as for the input files. +If your code produces an output you want to bring back to SubMIT, you have the same two options as for the input files. You can either let the job copy it back, or transfer the output via XRootD. The same considerations apply here: for larger files and more control, use XRootD. via the submission script ************************* -Adding the following to your submission script will copy the outputs of your job back to subMIT automatically. +Adding the following to your submission script will copy the outputs of your job back to SubMIT automatically. .. code-block:: sh @@ -386,7 +386,7 @@ Here is a simple example that writes the ``out.out`` file produced in the HTCond via XRootD ********** -You can add something like the following in your script that gets executed on the worker-node to copy your output back to the subMIT ceph space, +You can add something like the following in your script that gets executed on the worker-node to copy your output back to the SubMIT ceph space, .. code-block:: sh @@ -395,16 +395,16 @@ You can add something like the following in your script that gets executed on th Distributing Software to Worker Nodes ************************************* -Again since the HTCondor nodes don't have access to the subMIT storage areas, you need to distribute your software to the worker-node. +Again since the HTCondor nodes don't have access to the SubMIT storage areas, you need to distribute your software to the worker-node. This is further complicated that the OS on each worker-node or cluster may be different. 
Your best options are either to make your software available as a singularity image on CVMFS, or transfer it by hand. via CVMFS ********* -`CVMFS `_ is mounted on subMIT and all clusters connected to subMIT via HTCondor, and supports the distribution of containers. +`CVMFS `_ is mounted on SubMIT and all clusters connected to SubMIT via HTCondor, and supports the distribution of containers. -Several pre-built containers are available already that may meet your needs. Check our the ``/cvmfs`` space on subMIT. +Several pre-built containers are available already that may meet your needs. Check out the ``/cvmfs`` space on SubMIT. Please see the relevant `Available Software `_ section of the User's Guide for how to distribute your custom container. @@ -431,7 +431,7 @@ It may be useful for you to impose on the HTCondor job some specific OS and set For some clusters, you can do this via the ``requirements`` in the submission script: see sections pertaining to each cluster for more information on this, and check their documentation. -For all clusters supported by subMIT, as well as subMIT itself, you can also use CVMFS, which has many pre-built images of OSs that can be accessed: see `this section `_ of the guide for more information. See the above section for how to use singularity in your jobs. For example, to use rocky9, you can add the following to your submission script, +For all clusters supported by SubMIT, as well as SubMIT itself, you can also use CVMFS, which has many pre-built images of OSs that can be accessed: see `this section `_ of the guide for more information. See the above section for how to use singularity in your jobs. For example, to use rocky9, you can add the following to your submission script, .. 
code-block:: sh diff --git a/source/starting.rst b/source/starting.rst index 0213443..13ac2e6 100644 --- a/source/starting.rst +++ b/source/starting.rst @@ -15,12 +15,12 @@ Getting started --------------- -We allow login to the subMIT pool using ssh keys with authentication done through LDAP. Once, you have uploaded your ssh keys, you will also be given a home and work directory in which you can directly start working. This section will guide you on how to set up your ssh keys and upload them to the submit portal to allow login as well as describe the initial resources available to you. +We allow login to the SubMIT pool using ssh keys with authentication done through LDAP. Once you have uploaded your ssh keys, you will also be given a home and work directory in which you can directly start working. This section will guide you on how to set up your ssh keys and upload them to the submit portal to allow login, and describes the initial resources available to you. How to get an account ~~~~~~~~~~~~~~~~~~~~~ -If you already have a general MIT account then getting access to subMIT is easy. You only need to upload your ssh key to the `submit portal `_. +If you already have a general MIT account then getting access to SubMIT is easy. You only need to upload your ssh key to the `submit portal `_. You might be prompted for not being authorized to access the portal. Please, follow the instructions on the screen. @@ -75,7 +75,7 @@ This should create both a private and a public key (``id_ed25519``, the private :black:`Or simply open the public key file in your favorite text editor and highlight & copy the text.` -Simply paste the contents of the public key (``id_ed25519.pub``) into the submit portal link above and you are ready. The private key is like your password and should never be exposed to anybody. Please do not paste this into the subMIT website; if you do you should re-create your keys by running the ``ssh-keygen`` command again. 
+Simply paste the contents of the public key (``id_ed25519.pub``) into the submit portal link above and you are ready. The private key is like your password and should never be exposed to anybody. Please do not paste this into the SubMIT website; if you do you should re-create your keys by running the ``ssh-keygen`` command again. We recommend that you use the standard name (as prompted by ``ssh-keygen``) for the keys, as this will make the process easier. Some advanced users may want to create differently named keys within their ``.ssh`` directory, as they may wish to keep separate keys for separate machines. If you do this, please remember to either create the appropriate configuration within ``.ssh/config``, or log in with ``ssh -i /path/to/identity/file``. @@ -182,13 +182,13 @@ You can customize the appearance and content of your webpage for example by addi Tips when coming from another cluster ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Each cluster is a bit different and subMIT is no exception. Here are a few things about subMIT which may be different from a cluster you previously used: +Each cluster is a bit different and SubMIT is no exception. Here are a few things about SubMIT which may be different from a cluster you previously used: -* In addition to the SLURM-managed subMIT nodes, subMIT is a login pool which also connects to other resources +* In addition to the SLURM-managed SubMIT nodes, SubMIT is a login pool which also connects to other resources -* subMIT does not provide software through ``module avail``. Instead, we prefer users to install the specific toolbox they need for their workflow. We give you examples, for instance on `how to setup conda `_ to manage environments and packages for Python and more, or `Mathematica `_. When you install software, make sure they are installed in your ``/work`` directory. +* SubMIT does not provide software through ``module avail``. 
Instead, we prefer users to install the specific toolbox they need for their workflow. We give you examples, for instance on `how to setup conda `_ to manage environments and packages for Python and more, or `Mathematica `_. When you install software, make sure they are installed in your ``/work`` directory. -* On subMIT, SLURM is not set to reserve entire nodes by default; SLURM will request the resources (cores & memory) you request for your job. On subMIT, it is best to think in units of cores, not nodes when making SLURM requests. The subMIT SLURM cluster contains several 'standard' nodes as well as high-density nodes with a large number of cores and memory on a single node. Given this heterogeneous nature, it is important to think how many cores your jobs need and request number of cores explicitly in your batch scripts. One high-density node can do the work of several standard nodes, and you will likely wait a long time (and end up with more cores than you need) if you request a full high-density node. +* On SubMIT, SLURM is not set to reserve entire nodes by default; SLURM will request the resources (cores & memory) you request for your job. On SubMIT, it is best to think in units of cores, not nodes when making SLURM requests. The SubMIT SLURM cluster contains several 'standard' nodes as well as high-density nodes with a large number of cores and memory on a single node. Given this heterogeneous nature, it is important to think how many cores your jobs need and request number of cores explicitly in your batch scripts. One high-density node can do the work of several standard nodes, and you will likely wait a long time (and end up with more cores than you need) if you request a full high-density node. 
The rules for an account ~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/storage.rst b/source/storage.rst index 5f95130..8b330a1 100644 --- a/source/storage.rst +++ b/source/storage.rst @@ -22,7 +22,7 @@ The storage filesystem Users also have a larger quota in storage filesystem, under the directory ``/ceph/submit/data/user//`` with quota 1TB. SubMIT uses ceph to form the filesystem. -The filesystem is accessible from all subMIT nodes (e.g. any node you can log in to, and any node connected via Slurm) directly via the ``/ceph`` mount. +The filesystem is accessible from all SubMIT nodes (e.g. any node you can log in to, and any node connected via Slurm) directly via the ``/ceph`` mount. :red: Keep in mind that filesystem is optimized for large files, therefore it is not recommended to save large numbers of small files in the filesystem, for example, 100k+ small log files. This will seriously hinder the performance of the filesystem for all users. @@ -107,7 +107,7 @@ The storage on Tier2 ~~~~~~~~~~~~~~~~~~~~ Upon request, users may also have some storage on MIT Tier2 sites. Note that tier2 is external computing resources and users can only use xrootd to transfer the files. In other words, to use storage in tier2, users must have x509 certificate. The details of how to get such certificates are above. -Group storage at submit +Group storage at SubMIT ~~~~~~~~~~~~~~~~~~~~~~~ Upon request, we can create user group storage spaces on /ceph at ``/ceph/submit/data/`` to easily share files. Unless specified otherwise, this group space has between 1 and 10 TB of storage, although we are flexible to create larger spaces if necessary. Upon request we can also create backed up group storage space in ``/home/submit/`` with a 10GB quota that can be extended if needed. By default, all members of the group, and only them, can access, modify, and execute the contents of the group storage space. 
A ``public_html`` can be added in ``/home/submit/`` to create a group webpage in order to view or share your files in the same way as possible for users (see ``_). To create this group space, please email submit-help@mit.edu with the requested group name, amount of storage, if a ``/home/submit`` space is needed, and email address or Kerberos ID of the users who should have access to the resources. diff --git a/source/working.rst b/source/working.rst index 4b3884b..9b892df 100644 --- a/source/working.rst +++ b/source/working.rst @@ -3,12 +3,12 @@ Best practices .. tags:: Slurm -Submit is a shared tool. As such, you are responsible for setting up your work to properly use the resources available to you through submit. This section covers a few examples of avoidable problems. +SubMIT is a shared tool. As such, you are responsible for setting up your work to properly use the resources available to you through SubMIT. This section covers a few examples of avoidable problems. Use batch submission systems to scale up your workflow ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The submit machines are powerful servers. However, if your jobs will take longer than approximately 15 minutes, then it is better to submit them through a batch system. Additionally, if you want to analyze many files, batch systems should be used. On submit, we provide use for both `HTCondor `_ and `Slurm `_. Setting up these tools will allow you to scale out your tools and will also prevent clutter on the submit machines. There are simple examples on how to use these batch submission systems later in this guide. +The submit machines are powerful servers. However, if your jobs will take longer than approximately 15 minutes, then it is better to submit them through a batch system. Additionally, if you want to analyze many files, batch systems should be used. On SubMIT, we provide use for both `HTCondor `_ and `Slurm `_. 
Setting up these tools will allow you to scale out your tools and will also prevent clutter on the submit machines. There are simple examples on how to use these batch submission systems later in this guide. Avoid massive parallel access of a single file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -35,9 +35,9 @@ If you are using xrootd for the first time, you will need to be added to the map Be aware of your own data ~~~~~~~~~~~~~~~~~~~~~~~~~ -You are given storage spaces on submit that you are in charge of. Make sure to keep these areas clean and remove data that is no longer needed. Keep in mind that the hadoop storage is scratch for submit users. +You are given storage spaces on SubMIT that you are in charge of. Make sure to keep these areas clean and remove data that is no longer needed. Keep in mind that the hadoop storage is scratch for SubMIT users. Software environments ~~~~~~~~~~~~~~~~~~~~~ -Submit provides several tools in order to help you set up and configure your software environments to suit your needs. If possible, it is better to set up your environments through these tools rather than installing things on your own. There is a section later in this guide describing some of these tools. +SubMIT provides several tools in order to help you set up and configure your software environments to suit your needs. If possible, it is better to set up your environments through these tools rather than installing things on your own. There is a section later in this guide describing some of these tools.
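
As an illustration of the advice in the changed text (request cores and memory explicitly rather than whole nodes), a minimal Slurm batch script might look like the sketch below. The job name and the resource values are placeholders chosen for this example, not SubMIT defaults, and no partition is named because the diff does not specify one. ``SLURM_CPUS_PER_TASK`` is the standard variable Slurm sets when ``--cpus-per-task`` is given; the fallback to 1 only exists so the script also runs outside Slurm.

```shell
#!/bin/bash
# Hypothetical example: all values below are placeholders, adjust to your job.
#SBATCH --job-name=example-job
#SBATCH --ntasks=1            # one task ...
#SBATCH --cpus-per-task=4     # ... using 4 cores (think in cores, not nodes)
#SBATCH --mem=8G              # total memory for the job
#SBATCH --time=00:30:00       # wall-clock limit (HH:MM:SS)

# Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is set;
# fall back to 1 so the script still works when run by hand.
ncores="${SLURM_CPUS_PER_TASK:-1}"

echo "Running with ${ncores} core(s)"
```

Submitted with ``sbatch``, a script like this asks only for the cores and memory it names, which avoids waiting for (and occupying) a full high-density node.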