Bath HPC
Balena
Balena is a Linux-based High Performance Computing system specifically designed for running parallel applications.
The Balena HPC cluster can only be accessed from within the campus network; it is not available from external connections.
Unix/Mac users can access Balena from their local terminal using SSH:
ssh [user_name]@balena.bath.ac.uk
Windows users should start the Xming service before opening a session in order to view graphical output.
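Unix/Mac users can forward graphical output over SSH instead; a minimal sketch using OpenSSH's -X (X11 forwarding) flag:
# Enable X11 forwarding so graphical programs on Balena display on your desktop
ssh -X [user_name]@balena.bath.ac.uk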
To connect to Balena from outside the campus network, use either of the options below.
SSH to the linux.bath service
ssh abc123@linux.bath.ac.uk
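If you prefer a single command, recent OpenSSH versions can hop through linux.bath automatically; a sketch assuming the same username on both systems:
# Jump via linux.bath.ac.uk and land on a Balena login node (requires OpenSSH 7.3+ for -J)
ssh -J abc123@linux.bath.ac.uk abc123@balena.bath.ac.uk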
Unix/Mac users can use scp, rsync or sftp to transfer data to and from Balena.
Graphical tools like FileZilla can also be used.
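For example (the file and directory names here are placeholders):
# Copy a local file to your home directory on Balena
scp results.tar.gz abc123@balena.bath.ac.uk:~/
# Mirror a local directory to Balena, preserving timestamps and permissions
rsync -av ./mydata/ abc123@balena.bath.ac.uk:~/mydata/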
The Balena cluster is a shared research resource available to all researchers in the University. There may be many users logged on at the same time accessing the filesystem, hundreds of jobs may be running on the compute nodes, and hundreds more may be queued up waiting for resources. All users must respect the shared file systems and follow the guidelines below.
Do not run workloads on the login nodes
The two login nodes are shared among all users of the service. A single user running computational workloads will negatively impact performance and responsiveness for other users. Should an administrator discover users using the login nodes in this way, they will kill these jobs without notice. Instead, submit batch jobs or request interactive sessions on the compute nodes as detailed below.
Fair use of the Interactive Test and Development nodes
Kindly refrain from running production workloads on the ITD nodes; this impacts fair access for other users to the limited resources available there. The purpose of the ITD nodes is to serve the testing and development needs of our community.
Respect the shared file systems
Do not run jobs from your $HOME directory; they will suffer in performance and may impact the responsiveness of the filesystem for other users. Run jobs in $SCRATCH or $DATA.
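For example, stage your working files on the scratch filesystem before submitting a job such as the example script below (the project directory name is illustrative):
# Create a working directory on the scratch filesystem and submit jobs from there
mkdir -p $SCRATCH/myproject
cd $SCRATCH/myproject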
#!/bin/bash
# set the account to be used for the job
#SBATCH --account=free
# set name of job
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
# set the number of nodes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
# set max wallclock time
#SBATCH --time=04:00:00
# mail alert at start, end and abortion of execution
#SBATCH --mail-type=END
# send mail to this address
#SBATCH --mail-user=user123@bath.ac.uk
# run the application
./myparallelscript
To submit a job to Balena, use the sbatch command with the job script file:
sbatch jobscript.slurm
To check the status of your jobs, use squeue:
squeue --user=user123
Balena offers a visualisation service that allows users to utilise the high-end graphics cards available on Balena to visualise large and complex graphical models from their remote desktop.
For more information about how VirtualGL works, see: http://www.virtualgl.org/
The default settings should be sufficient. The table below describes the various options available.
Option | Default | Description |
---|---|---|
Walltime | 1 hour | Set the time of the visualisation session. By default the visualisation service uses the free service, making the maximum session time 6 hours. |
Shared | no | Will allow selected users to access the VNC session. |
Allowed users | none selected | Select the users you would like to share your VNC session with. You can select multiple users by holding down the Ctrl button. |
Nodes | 1 | Recommend leaving this set to 1, unless needed for MPI-based visualisation workloads. |
Cores | 1 | Recommend leaving this set to 1. The visualisation nodes are fully shared to allow multiple users and do not restrict the size of jobs being run on the node. |
CUDA offload devices | 0 | Will reserve a dedicated GPU card. This is only required if you are running an intensive visualisation. |
Resolution | current desktop window size | Set the size of the resolution. By default it will use your current window size. |
Bandwidth | auto | The auto default setting should be sufficient, but if you experience responsiveness issues then try adjusting the bandwidth setting. Lowering the bandwidth will reduce the quality of the pixel rendering to increase the responsiveness of the interactions. |
Only Java Applet | no | To only allow visualisation access via the Java Applet. |
Extra parameters | | Additional parameters to pass to sbatch; see SBATCH options. To run a visualisation job under a premium project use "--account=project-code --time=8:00:00". |
Using the Java Applet VNC session will launch a Java version of the VNC client from the Balena servers. This is particularly useful if you do not have a VNC client on your desktop; however, the performance of the visualisation will be slightly lower.
To launch the Java Applet VNC session:
This session opens up a direct connection to the VNC service on the Visualisation nodes. You will need a VNC client/viewer installed on your desktop to be able to connect. The recommended version is TurboVNC, but you can also use TigerVNC or TightVNC, which have similar 3D performance capabilities, although other VNC clients will also work. If you do not have Administrator/root access on your local machine to install the VNC client, then you can use the VNC viewers available under the download section on the userportal home page.
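Purely as an illustration, once the session is running the user portal reports a host and display number to connect to; connecting from a local terminal might look like this (host name and display number are placeholders):
# Point the TurboVNC viewer at the visualisation node and display reported by the portal
vncviewer vis-node.balena.bath.ac.uk:1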
To launch the Local VNC Client session:
To end a visualisation session:
The instructions below use the VMD visualisation tool to load example data and allow you to interact with the molecule.
`module list` should show that you have the esm module loaded; this is done by default on the visualisation nodes.
module load vmd
`vglrun` is needed before any graphical command to capture the OpenGL instructions and process them on the local graphics card.
vglrun vmd /beegfs/scratch/group/training/vis/vmd-example.pdb
module load cuda/toolkit
nvidia-smi
Application | Version(s) | Description | Supported Hardware | License | Last Modified |
---|---|---|---|---|---|
IGV | 2.4.3 | The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations. | | MIT License | |
VMD | 1.9.2 | Molecular visualization program | | UIUC Open Source License | |
Some of the settings in your ~/.java directory will need resetting or the cache cleared out. To clean out the cache use `javaws -viewer`; the cache settings can be found from the Settings button under the General tab.
Failing that, the next simplest method would be to remove the ~/.java directory.
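For example (note this deletes all of your per-user Java settings, not just the cache):
# Remove the per-user Java settings directory; it will be recreated with defaults
rm -rf ~/.java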
The environment path of vglrun is set up when the esm module is loaded. If `module list` does not show esm in the list, then you will need to load it with `module load esm`. It could be that in your ~/.bashrc or ~/.bash_profile you have purged the modules.
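A quick check that the environment is in place:
# Confirm esm appears in the loaded modules and that vglrun is on your PATH
module list
which vglrun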
Start an Interactive Test and Development session
[rtm25@balena-01 mpi]$ sinteractive --time=15:00
salloc: Granted job allocation 18683
srun: Job step created
[rtm25@itd-ngpu-01 mpi]$
Source code
// www.mpitutorial.com
// An intro MPI hello world program that uses MPI_Init, MPI_Comm_size,
// MPI_Comm_rank, MPI_Finalize, and MPI_Get_processor_name.
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(NULL, NULL);
// Get the number of processes
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
// Get the rank of the process
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
// Get the name of the processor
char processor_name[MPI_MAX_PROCESSOR_NAME];
int name_len;
MPI_Get_processor_name(processor_name, &name_len);
// Print off a hello world message
printf("Hello world from processor %s, rank %d"
" out of %d processors\n",
processor_name, world_rank, world_size);
// Finalize the MPI environment.
MPI_Finalize();
}
Compiling your code
[rtm25@itd-ngpu-01 mpi]$ module load intel/compiler
[rtm25@itd-ngpu-01 mpi]$ module load intel/mpi
[rtm25@itd-ngpu-01 mpi]$ mpicc helloworld.c -o helloworld.exe
Create the slurm job script
#!/bin/bash
# set the account to be used for the job
#SBATCH --account=prj-cc001
# set name of job
#SBATCH --job-name=helloworld
#SBATCH --output=helloworld.%j.o
#SBATCH --error=helloworld.%j.e
# set the number of nodes and partition
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --partition=batch-acc
# set max wallclock time
#SBATCH --time=04:00:00
# mail alert at start, end and abortion of execution
#SBATCH --mail-type=END
# send mail to this address
#SBATCH --mail-user=rtm25@bath.ac.uk
# Load dependant modules
module load intel/mpi
# run the application
mpirun -np $SLURM_NTASKS ./helloworld.exe
Submit job to the scheduler
[rtm25@itd-ngpu-01 mpi]$ sbatch jobscript.slm
Submitted batch job 18685
Check job
[rtm25@itd-ngpu-01 mpi]$ squeue -u rtm25
JOBID NAME USER ACCOUNT PARTITION ST NODES CPUS MIN_MEMORY START_TIME TIME_LEFT PRIORITY NODELIST(REASON)
18685 helloworld rtm25 prj-cc001 batch-acc R 2 32 62K 2015-07-23T12:08:57 3:59:57 13537 node-as-agpu-001,node-as-ngpu-005
18683 interactiv rtm25 free itd R 1 1 0 2015-07-23T12:04:58 10:58 89 itd-ngpu-01
Job Output
[rtm25@itd-ngpu-01 mpi]$ ls
helloworld.18685.e helloworld.18685.o helloworld.c helloworld.exe jobscript.slm
[rtm25@itd-ngpu-01 mpi]$ cat helloworld.18685.o
Hello world from processor node-as-agpu-001, rank 1 out of 32 processors
Hello world from processor node-as-agpu-001, rank 2 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 17 out of 32 processors
Hello world from processor node-as-agpu-001, rank 3 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 21 out of 32 processors
Hello world from processor node-as-agpu-001, rank 4 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 16 out of 32 processors
Hello world from processor node-as-agpu-001, rank 5 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 18 out of 32 processors
Hello world from processor node-as-agpu-001, rank 6 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 19 out of 32 processors
Hello world from processor node-as-agpu-001, rank 7 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 20 out of 32 processors
Hello world from processor node-as-agpu-001, rank 0 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 22 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 23 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 24 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 25 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 26 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 27 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 28 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 29 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 30 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 31 out of 32 processors
Hello world from processor node-as-agpu-001, rank 8 out of 32 processors
Hello world from processor node-as-agpu-001, rank 9 out of 32 processors
Hello world from processor node-as-agpu-001, rank 10 out of 32 processors
Hello world from processor node-as-agpu-001, rank 11 out of 32 processors
Hello world from processor node-as-agpu-001, rank 12 out of 32 processors
Hello world from processor node-as-agpu-001, rank 13 out of 32 processors
Hello world from processor node-as-agpu-001, rank 14 out of 32 processors
Hello world from processor node-as-agpu-001, rank 15 out of 32 processors
1 Comment
Unknown User (jmf45)
LCPU: You can automate the logging in via LCPU with a few extra lines in your ssh config. I have it set up so that 'ssh balena' works directly within Bath (or on the VPN), whereas 'ssh balena-tunnel' will bounce in via LCPU from any internet connection in the world.
It's a bit messy, but it should be clear what's going on:-
https://github.com/jarvist/filthy-dotfiles/blob/69033a9d43412e4e9c9eab3add554469c658c8c6/ssh-config#L52-L105
There are also other bits in that ssh config that make the SSH connections more resistant to being dropped on poor-quality WiFi, and which multiplex SSH connections via the server setup (avoids hammering the login nodes when you are using automated HPC drivers such as USPEX).