
Introduction

Balena is a Linux-based High Performance Computing (HPC) system specifically designed for running parallel applications.


 

How do I log into Balena?

The Balena HPC cluster can only be accessed directly from within the campus network; it is not reachable from external connections (see 'Outside the campus' below for the alternatives).

Inside the campus

Unix/Mac users

Unix/Mac users can access Balena from their local terminal using SSH:

ssh [user_name]@balena.bath.ac.uk
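
To view graphical output over SSH, X11 forwarding can be enabled with the standard OpenSSH -X flag:

ssh -X [user_name]@balena.bath.ac.uk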

Windows users

Windows users will need to use terminal emulator software such as PuTTY or KiTTY to access Balena.

Standard session to Balena

Session with X11 Forwarding (graphics)

Users should start the Xming service on their Windows system before opening a session in order to view graphical output.

Starting Xming



Outside the campus

To connect to Balena from outside the campus network, use either of the options below:

  • SSH to the linux.bath service 

    ssh abc123@linux.bath.ac.uk
  • Use a VPN connection to the campus network; once connected, you can SSH directly to Balena (see instructions above). The two-hop login via linux.bath can also be automated, as sketched below.
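
As the comment at the bottom of this page describes, the bounce through linux.bath can be scripted in your ~/.ssh/config. A minimal sketch using OpenSSH's ProxyJump option (the host alias and the username abc123 are illustrative):

# ~/.ssh/config: 'ssh balena-tunnel' bounces via linux.bath from anywhere
Host balena-tunnel
    HostName balena.bath.ac.uk
    User abc123
    ProxyJump abc123@linux.bath.ac.uk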

 


How do I transfer data to and from Balena?

Unix/Mac users

Unix/Mac users can use scp, rsync or sftp to transfer data to and from Balena.

Graphical tools like FileZilla can also be used. 
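
Typical command-line transfers look like the following (usernames and paths are illustrative):

# copy a single file from your machine to Balena
scp results.tar.gz user123@balena.bath.ac.uk:/path/on/balena/
# synchronise a directory; rsync only re-sends files that have changed
rsync -av mydata/ user123@balena.bath.ac.uk:mydata/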

 

Windows users

Windows users can transfer files between a Windows system and Balena using an SFTP client such as FileZilla or WinSCP.

 


Basic Linux and Editors

Reference Guide

Linux Quick Reference Guide

Text Editors

  • Emacs
  • ViM
  • Nano
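
Each editor is started the same way from the command line, for example:

nano jobscript.slurm
vim jobscript.slurm
emacs jobscript.slurm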

 


Being a good user

The Balena cluster is a shared research resource available to all researchers in the University. Many users may be logged on at the same time accessing the file system, and hundreds of jobs may be running on the compute nodes, with a hundred more queued up waiting for resources. All users must:

Respect the shared file systems

  • Avoid running jobs in your $HOME directory; they will suffer poor performance and may impact the responsiveness of the file system for everyone. Run jobs in $SCRATCH or $DATA instead, as sketched below.
  • Avoid too many simultaneous file transfers.
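
A minimal sketch of the $SCRATCH pattern mentioned above (the project directory name is illustrative):

# stage your work in scratch rather than $HOME, and submit from there
mkdir -p $SCRATCH/myproject
cd $SCRATCH/myproject
sbatch jobscript.slurm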

Do not run workloads on the login nodes

The two login nodes are shared among all users of the service. A single user running computational workloads will negatively impact performance and responsiveness for other users. Should an administrator discover users using the login nodes in this way, they will kill these jobs without notice. 

Instead, submit batch jobs or request interactive sessions on the compute nodes as detailed below.

Fair use of Interactive Test and Development nodes

Kindly refrain from running production workloads on the ITD nodes; doing so denies other users fair access to the limited resources available on them.

The purpose of the ITD nodes is to serve the testing and development needs of our community.
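
For example, a short test run belongs in an interactive session on the ITD nodes rather than on a login node; the sinteractive command used in the MPI example later on this page is the way to request one:

sinteractive --time=15:00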

 


Submitting jobs to the compute nodes

 

Create a Job Submission Script

Example
#!/bin/bash

# set the account to be used for the job
#SBATCH --account=free

# set name of job 
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err

# set the number of nodes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16

# set max wallclock time
#SBATCH --time=04:00:00

# mail alert at the end of execution
#SBATCH --mail-type=END

# send mail to this address
#SBATCH --mail-user=user123@bath.ac.uk

# run the application
./myparallelscript

Submit your Job to the Queue

To submit a job to Balena use the sbatch command along with the job-script file.

sbatch jobscript.slurm
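
sbatch prints the ID assigned to the job, along the lines of (the job number is illustrative):

Submitted batch job 12345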

Monitoring your Job

squeue --user=user123
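
Other standard Slurm commands are useful alongside squeue; for example, to cancel a job using the ID reported by sbatch:

scancel 12345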


 

 

Visualisation Guide

 

Balena offers a visualisation service that allows users to utilise the high-end graphics cards available on Balena to visualise large and complex graphical models from their remote desktop.

For more information about how VirtualGL works, see: http://www.virtualgl.org/

Some things to be aware of

  • To be able to use these tools you will need an active account on the HPC service.
  • The visualisation tools can only be accessed when your machine is on the campus network. If you are off campus you will need to use the VPN service to open up a network tunnel from your machine to the campus network.
  • These instructions are a rough guide and are written for use on Windows with the Firefox web browser. Connecting to the visualisation service with other operating systems and browsers will differ slightly, but the main steps should be similar. We have tested Internet Explorer, Firefox and Chrome on Windows, Mac and Linux clients, and have found that Chrome is not the
  • By default all visualisation jobs use the free service, with run times limited to 6 hours, after which your session will be terminated, just as a batch job is when it runs longer than its requested walltime.

Step-by-step guide

Starting a visualisation session

  1. Open up a browser and navigate to http://balena.bath.ac.uk and log in with your University username and password. 
  2. You will be presented with the Bright Cluster Manager userportal, from here select the VISUALISATION tab.
  3. The default settings should be sufficient. The available options are described below.

    • Walltime (default: 1 hour): Sets the length of the visualisation session. By default the visualisation service uses the free service, making the maximum session time 6 hours.
    • Shared (default: no): Allows selected users to access the VNC session:
      • shared = all selected users have full access
      • viewonly = selected users can only view your session
    • Allowed users (default: none selected): Select the users you would like to share your VNC session with. You can select multiple users by holding down the Ctrl key.
    • Nodes (default: 1): Recommend leaving this set to 1, unless needed for MPI-based visualisation workloads.
    • Cores (default: 1): Recommend leaving this set to 1. The visualisation nodes are fully shared to allow multiple users and do not restrict the size of jobs being run on the node.
    • CUDA offload devices (default: 0): Reserves a dedicated GPU card. This is only required if you are running an intensive visualisation.
    • Resolution (default: current desktop window size): Sets the resolution of the session. By default it will use your current window size.
    • Bandwidth (default: auto): The auto setting should be sufficient, but if you experience responsiveness issues then try adjusting the bandwidth. Lowering the bandwidth reduces the quality of the pixel rendering to increase the responsiveness of the interactions.
    • Only Java Applet (default: no): Only allow visualisation access via the Java Applet.
    • Extra parameters (default: none): Additional parameters to pass to sbatch, see SBATCH options. To run a visualisation job under a premium project use "--account=project-code --time=8:00:00".
  4. To launch a visualisation session click on the blue 'Submit Job' button.  This will submit a job to the scheduler to start a VNC server on the visualisation nodes.  
  5. It takes a few seconds for the connection to be set up, and the browser will auto-refresh the page. Once ready you will see the job appear, with green buttons underneath the VNC Session title.
  6. From here you will have two options to launch the VNC session, these are described in the next two sections. 

Launching a JavaApplet VNC session

Using the Java Applet VNC session will launch a Java version of the VNC client from the Balena servers. This is particularly useful if you do not have a VNC client on your desktop; however, the performance of the visualisation will be slightly lower.

To launch the Java Applet VNC session:

  1. Click on the green 'Java Applet' button; this will open up a new browser window.
  2. You may be prompted by Java to run the TightVNC Viewer; click Run.
  3. After a couple of seconds you will have a Linux Desktop displayed in the window. 

Launching a Local VNC Client session

This session opens up a direct connection to the VNC service on the visualisation nodes. You will need a VNC client/viewer installed on your desktop to be able to connect. The recommended client is TurboVNC, but you can also use TigerVNC or TightVNC, which have similar 3D performance capabilities; other VNC clients will also work. If you do not have Administrator/root access on your local machine to install a VNC client, you can use the VNC viewers available under the download section on the userportal home page.

To launch the Local VNC Client session:

  1. Click on the green 'Local VNC Client' button. This will open up a new window with the session connection details for the VNC client.  The passwords are single use One Time Passwords (OTP) and will expire when an attempt has been made to login. 
  2. These connection details can be copied directly into the VNC client, or you can use the 'Launch local VNC' button at the bottom of the window.
  3. In Firefox, if this is the first time your browser has tried to open VNC files, you will be asked what you would like to do with them. Select 'Open With', browse to and select your vncviewer executable, and ask Firefox to remember these settings.
  4. After clicking 'OK' this will open the VNC client using your session connection details and you should see a Linux desktop.
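
You can also start the viewer by hand and type in the details from step 1; a sketch (the hostname and display number are illustrative, use the values shown in the portal and enter the one-time password when prompted):

vncviewer vis-node-01:1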

Ending a visualisation session

To end a visualisation session:

  1. Log out of the desktop session by clicking on 'Log Out' under the System drop-down menu at the top of the screen. 
  2. Cancel the visualisation session in the userportal, by clicking on the red 'Delete' button under the Manage Job section. 

Visualisation 3D performance with VirtualGL

The instructions below use the VMD visualisation tool to load example data and will allow you to interact with the molecule.

  1. Launch a terminal session, either by:
    1. clicking on the Terminal icon on the top menu bar, or
    2. right-clicking on the desktop and selecting 'Open in Terminal'.
  2. `module list` should show that you have the esm module loaded; this is done by default on the visualisation nodes.
  3. Load the VMD software environment: `module load vmd`
  4. The command `vglrun` is needed before any graphical command to capture the OpenGL instructions and process them on the local graphics card.
  5. To run the VMD tools use the command: `vglrun vmd /beegfs/scratch/group/training/vis/vmd-example.pdb`
  6. To take this a step further, the `nvidia-smi` tool will show the utilisation of the GPU card whilst you are using the visualisation tools:
    1. `module load cuda/toolkit`
    2. `nvidia-smi`

Visualisation software available on Balena

FAQ

When I try to connect from a Linux OS using the Java Applet the window is all greyed out.

Some of the settings in your ~/.java directory will need resetting, or the cache clearing out. To clear the cache use `javaws -viewer`; the cache settings can be found from the Settings button under the General tab.

Failing that, the next simplest method would be to remove the ~/.java directory.

 

'vglrun: No such file or directory'

The environment path for vglrun is set up when the esm module is loaded. If `module list` does not show esm in the list, you will need to load it with `module load esm`. It could be that your ~/.bashrc or ~/.bash_profile purges the modules.


 

Example - MPI Hello World

Start an Interactive Test and Development session

[rtm25@balena-01 mpi]$ sinteractive --time=15:00
salloc: Granted job allocation 18683
srun: Job step created
[rtm25@itd-ngpu-01 mpi]$

Source code

helloworld.c
// www.mpitutorial.com
// An intro MPI hello world program that uses MPI_Init, MPI_Comm_size,
// MPI_Comm_rank, MPI_Finalize, and MPI_Get_processor_name.

#include <stdio.h>
#include <mpi.h>


int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d"
           " out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();

    return 0;
}

Compiling your code

[rtm25@itd-ngpu-01 mpi]$ module load intel/compiler
[rtm25@itd-ngpu-01 mpi]$ module load intel/mpi
[rtm25@itd-ngpu-01 mpi]$ mpicc helloworld.c -o helloworld.exe
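
Before queueing the job, the binary can be given a quick sanity check inside the interactive session (a small process count keeps the ITD node fair for other users):

[rtm25@itd-ngpu-01 mpi]$ mpirun -np 4 ./helloworld.exe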

Create the slurm job script

#!/bin/bash

# set the account to be used for the job
#SBATCH --account=prj-cc001

# set name of job
#SBATCH --job-name=helloworld
#SBATCH --output=helloworld.%j.o
#SBATCH --error=helloworld.%j.e

# set the number of nodes and partition
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --partition=batch-acc

# set max wallclock time
#SBATCH --time=04:00:00

# mail alert at the end of execution
#SBATCH --mail-type=END

# send mail to this address
#SBATCH --mail-user=rtm25@bath.ac.uk

# Load dependent modules
module load intel/mpi

# run the application
mpirun -np $SLURM_NTASKS ./helloworld.exe

Submit job to the scheduler

[rtm25@itd-ngpu-01 mpi]$ sbatch jobscript.slm 
Submitted batch job 18685

Check job

[rtm25@itd-ngpu-01 mpi]$ squeue -u rtm25
     JOBID       NAME       USER    ACCOUNT  PARTITION    ST NODES  CPUS  MIN_MEMORY           START_TIME     TIME_LEFT  PRIORITY NODELIST(REASON)
     18685 helloworld      rtm25  prj-cc001  batch-acc     R     2    32         62K  2015-07-23T12:08:57       3:59:57     13537 node-as-agpu-001,node-as-ngpu-005
     18683 interactiv      rtm25       free        itd     R     1     1           0  2015-07-23T12:04:58         10:58        89 itd-ngpu-01

Job Output

[rtm25@itd-ngpu-01 mpi]$ ls
helloworld.18685.e helloworld.18685.o helloworld.c helloworld.exe jobscript.slm
[rtm25@itd-ngpu-01 mpi]$ cat helloworld.18685.o
Hello world from processor node-as-agpu-001, rank 1 out of 32 processors
Hello world from processor node-as-agpu-001, rank 2 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 17 out of 32 processors
Hello world from processor node-as-agpu-001, rank 3 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 21 out of 32 processors
Hello world from processor node-as-agpu-001, rank 4 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 16 out of 32 processors
Hello world from processor node-as-agpu-001, rank 5 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 18 out of 32 processors
Hello world from processor node-as-agpu-001, rank 6 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 19 out of 32 processors
Hello world from processor node-as-agpu-001, rank 7 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 20 out of 32 processors
Hello world from processor node-as-agpu-001, rank 0 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 22 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 23 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 24 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 25 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 26 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 27 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 28 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 29 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 30 out of 32 processors
Hello world from processor node-as-ngpu-005, rank 31 out of 32 processors
Hello world from processor node-as-agpu-001, rank 8 out of 32 processors
Hello world from processor node-as-agpu-001, rank 9 out of 32 processors
Hello world from processor node-as-agpu-001, rank 10 out of 32 processors
Hello world from processor node-as-agpu-001, rank 11 out of 32 processors
Hello world from processor node-as-agpu-001, rank 12 out of 32 processors
Hello world from processor node-as-agpu-001, rank 13 out of 32 processors
Hello world from processor node-as-agpu-001, rank 14 out of 32 processors
Hello world from processor node-as-agpu-001, rank 15 out of 32 processors

1 Comment

  1. Unknown User (jmf45)

LCPU: You can automate the logging in via LCPU with a few extra lines in your ssh-config. I have it set up so that 'ssh balena' works directly within Bath (or on the VPN), whereas 'ssh balena-tunnel' will bounce in via LCPU from any internet connection in the world.

    It's a bit messy, but it should be clear what's going on:-

    https://github.com/jarvist/filthy-dotfiles/blob/69033a9d43412e4e9c9eab3add554469c658c8c6/ssh-config#L52-L105

There are also other bits in that ssh-config that make the SSH connections more resistant to being dropped on poor-quality WiFi, and that multiplex SSH connections via the server setup (this avoids hammering the login nodes when you are using automated HPC drivers such as USPEX).