Getting started
Computing is performed on Yale's Bouchet cluster. This page contains instructions on how to set up Julia on Bouchet.
Getting started with Bouchet
Make sure you have an account on Bouchet! You can request an account here.
Once an account is created you can SSH into Bouchet with your net ID as the username:
ssh netid@bouchet.ycrc.yale.edu
Example Login
For example, my net ID is ljg48 so I would type the following into the terminal:
ssh ljg48@bouchet.ycrc.yale.edu
If this is your first time using Bouchet I strongly encourage you to look over Bouchet's Getting Started pages. If you have any questions, either reach out to YCRC or ask Elizabeth or Luke.
YCRC uses the Duo multi-factor authentication (MFA), the same that is used elsewhere at Yale. To get set up with Duo, follow these instructions.
Yale's clusters can only be accessed from the Yale network, so you will need to connect to Yale's VPN before SSH'ing into Bouchet. See the ITS webpage for more details. YCRC has some recommendations for VPN software here.
Setting up Julia
- SSH into Bouchet with `ssh netid@bouchet.ycrc.yale.edu`
- Once connected, log into the `devel` partition: `salloc -c4 -p devel`
- Load miniconda: `module load miniconda`
- Create a new conda environment (useful if you use OOD): `conda create --name climaocean python jupyter jupyterlab`. Here I named it `climaocean`, but you can choose anything you like; I'm adding python, jupyter, and jupyterlab
- Make the environment a kernel so you can use it with notebooks with the following command: `ycrc_conda_env.sh update`
- Activate your new environment: `conda activate climaocean`
- Load Julia: `module load Julia/1.11.4-linux-x86_64`
- Start the Julia REPL: `julia`
- Once in the REPL, type `]` to go into the `Pkg` environment
- From here, type `add IJulia` to add the IJulia kernel. This is so you can use Julia with Jupyter
- Now, this next part is very important: if you start an OOD session, make sure to include `Julia/1.11.4-linux-x86_64` under "additional modules" and make sure to activate the environment you just created.
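Put together, the steps above can be sketched as a single session on the cluster (the environment name `climaocean` is just the example from above; these commands only make sense on Bouchet itself):

```shell
# Run from a Bouchet login node; salloc drops you onto a devel compute node
ssh netid@bouchet.ycrc.yale.edu
salloc -c4 -p devel

# Build the conda environment and register it as a Jupyter kernel
module load miniconda
conda create --name climaocean python jupyter jupyterlab
ycrc_conda_env.sh update
conda activate climaocean

# Load Julia and install the IJulia kernel (non-interactive equivalent
# of typing ] and `add IJulia` in the REPL)
module load Julia/1.11.4-linux-x86_64
julia -e 'using Pkg; Pkg.add("IJulia")'
```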
Julia Depot Path
By default, `~/.julia/` is the Julia depot path, the directory where all packages, configurations, and other Julia-related files are stored. This is set by the `JULIA_DEPOT_PATH` environment variable. Since there is limited space in your home directory on Bouchet, it is best to move the depot to a larger filesystem such as your project or scratch directory.
DEPOT_PATH=/home/${USER}/project/JULIA_DEPOT
mkdir -p ${DEPOT_PATH}
export JULIA_DEPOT_PATH=${DEPOT_PATH}
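Note that `export` only affects the current shell session. One way to make the change permanent is to append the same line to your `~/.bashrc` so every new login shell picks it up (a sketch; adjust the path to wherever you created the depot):

```shell
# Persist the depot override across logins
cat >> ~/.bashrc <<'EOF'
export JULIA_DEPOT_PATH=/home/${USER}/project/JULIA_DEPOT
EOF
```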
Alternatively, use a symlink:

- Move `.julia` to scratch: `mv /home/${USER}/.julia /home/${USER}/scratch/.julia`
- Symlink it back into your home directory: `ln -s /home/${USER}/scratch/.julia /home/${USER}/.julia`
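To sanity-check the move-and-symlink pattern, here is a self-contained sketch using throwaway directories standing in for your home and scratch directories (nothing in your real home directory is touched):

```shell
# Stand-in directories for $HOME and scratch
demo="$(mktemp -d)"
mkdir -p "${demo}/home/.julia" "${demo}/scratch"

# Move the depot onto the larger filesystem, then link it back
mv "${demo}/home/.julia" "${demo}/scratch/.julia"
ln -s "${demo}/scratch/.julia" "${demo}/home/.julia"

# readlink shows where the symlink points
readlink "${demo}/home/.julia"
```

Julia follows the symlink transparently, so tools that expect `~/.julia` keep working while the data lives on the larger filesystem.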
Create a Julia environment
- Navigate to where you want your environment to live. I suggest a folder in your project directory
- Type `julia` to start the REPL
- Then type `]` to start the package manager
- Type `activate .` to set the current working directory as the active environment. This creates a `Project.toml` and a `Manifest.toml`, two files that record dependencies, versions, package names, UUIDs, etc.
- Add packages to the environment: `add ClimaOcean OceanBioME`
- Once that is complete, press backspace (delete) to go back to the Julia REPL, then type `exit()` to close out the REPL
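The same environment steps can also be run non-interactively from the shell, which is handy inside batch scripts (a sketch, using the package names from above; this needs Julia loaded and network access):

```shell
# From the directory where the environment should live:
# --project=. activates (and creates) the environment in the current
# directory, and Pkg.add writes Project.toml and Manifest.toml just as
# the REPL steps do
julia --project=. -e 'using Pkg; Pkg.add(["ClimaOcean", "OceanBioME"])'
```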
Note
When you add packages, the code is stored in the directory defined by `JULIA_DEPOT_PATH`.
On Bouchet this should be the scratch directory since it can get large.
The downside is that this directory is purged after 90 days.
Running a simple model
Now that everything is set up, let's go through the steps of setting up a simple simulation and submitting the run to the cluster.
Setup simulation
Submit to cluster
Open the note below to show the submit.sh script used to submit a job to the cluster. Copy the code and store it in your project directory. If you are new to Slurm, I suggest looking at the YCRC documentation.
submission script
#!/bin/bash
#SBATCH --job-name=clima
#SBATCH --ntasks=1
#SBATCH --time=3:00:00
#SBATCH --account eisaman
#SBATCH --nodes 1
#SBATCH --mem 10G
#SBATCH --partition gpu
#SBATCH --gpus=a100:1
module purge
module load miniconda/24.9.2
module load Julia/1.11.4-linux-x86_64
###---------------------------------------------------------
### if true, this will instantiate the project
### set to true only if the project is not instantiated yet
### this will only need to be done once
###---------------------------------------------------------
INSTANTIATE=false
###---------------------------------------------------------
### path to simulation you want to run
###---------------------------------------------------------
SIMULATION=/home/${USER}/project/repos/oceananigans-coupled-global/simulations/global-simulation-dev.jl
###---------------------------------------------------------
### this is the path to the directory containing Project.toml
### note: do not put Project.toml at the end of the path
###---------------------------------------------------------
PROJECT=/home/${USER}/project/repos/oceananigans-coupled-global/
###---------------------------------------------------------
### this is where all downloaded files will live
### will make scratch directory if does not already exist
###---------------------------------------------------------
DEPOT_PATH=/home/${USER}/project/JULIA_DEPOT
mkdir -p ${DEPOT_PATH}
export JULIA_DEPOT_PATH=${DEPOT_PATH}
###-------------------------------------------
### this contains ECCO credentials
### your ~/.ecco-drive
### should contain only these two lines:
###
### export ECCO_USERNAME=your-username
### export ECCO_PASSWORD=your-password
###-------------------------------------------
source /home/${USER}/.ecco-drive
###-------------------------------------------
### instantiates packages
### should only have to run this once
###-------------------------------------------
if ${INSTANTIATE}; then
julia --project="${PROJECT}" -e "using Pkg; Pkg.instantiate()"
fi
wait
###-------------------------------------------
### runs the actual simulation
###-------------------------------------------
julia --project=${PROJECT} ${SIMULATION}
submit.sh is a Slurm script that tells the cluster the resources needed to run the job and what you want it to run. It is good practice to run jobs on a compute node and save unprocessed output to scratch.
Submit Job To Cluster
sbatch submit.sh
While a job is running, output logs are saved to `slurm-<jobID>.out`. You can `tail -f` this file to check up on the running simulation.
Check Status of a Job
squeue --me
Efficiency Report of Completed Job
seff jobID
Note
The efficiency report can help you use resources more efficiently in future runs. Remember, Bouchet is a shared cluster. If you request resources that you do not utilize, nobody else can use them until your job is complete. Be mindful of others.