Job execution

Computing infrastructures and their associated storage are shared among all users. This sharing is handled by a resource manager, which distributes users' computations across the machine as efficiently as possible, taking into account the needs of each computation (number of cores, number of nodes, GPUs, memory, computation time, etc.).

Such a system is essential on any computing machine: you cannot run your computations outside of the resource manager.
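
As a purely illustrative sketch, a job is submitted to OAR with the oarsub command, stating the resources it needs; the resource values and the script name below are placeholders to adapt to your own case and to the cluster documentation.

```bash
# Illustrative only: request 1 node with 4 cores for 2 hours and run a
# (hypothetical) script; depending on the site configuration an additional
# option such as --project may also be required.
oarsub -l /nodes=1/core=4,walltime=02:00:00 ./my_computation.sh

# List your submitted jobs and their state:
oarstat -u $USER
```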

Cluster usage rules

Computing infrastructures are co-financed by dedicated collective projects and by scientific projects that agree to pool their resources. Their use is therefore governed by common rules, plus additional rules covering the share co-financed by specific projects.

Common rules

  • Resource sharing is handled by the OAR job manager, which implements a FIFO queue with fairsharing. Fairsharing aims to ensure fair use among users by giving priority to those who have computed the least over a sliding time window of 3 months. The fairsharing index is expressed as a karma value, displayed by the oarstat command: the higher a user's karma, the more computing hours that user has consumed and the lower their priority compared to a user with a lower karma.

  • The absolute maximum time of jobs is limited. There are two reasons for this:

    • It keeps fairsharing effective by preventing the machine from being blocked by a few users over too long a period.
    • Since the compute nodes are not on a backed-up power supply, jobs should not run for too long, to limit the loss of computation results in case of a power failure.

The CPU time of a job is its number of cores multiplied by its total duration (walltime): cpu-time = number_of_cores * walltime

Dahu cluster case: the absolute maximum time for jobs on Dahu is 2 days.
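
As a worked example with illustrative numbers, a job using 32 cores for the full 48-hour limit consumes cpu-time = 32 * 48 = 1536 CPU-hours. The walltime is set at submission time and must stay within this limit, for instance:

```bash
# Illustrative request: 32 cores with the maximum 48 h walltime on Dahu
# (cpu-time = 32 cores * 48 h = 1536 CPU-hours).
oarsub -l /core=32,walltime=48:00:00 ./my_computation.sh
```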

  • The number of waiting jobs is limited to 50 unless they are started from the CIGRI computing grid.

  • The maximum number of simultaneous active jobs is 200.

  • Interactive jobs are limited to 12 hours maximum (a submission sketch follows this list).

  • Development jobs on dedicated nodes are limited to half an hour.
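
As a sketch of how the last two limits translate into submissions (the job type name used for the development nodes is an assumption to check against the cluster documentation):

```bash
# Interactive job: opens a shell on a compute node; keep the walltime <= 12 h.
oarsub -I -l /core=1,walltime=02:00:00

# Development job on the dedicated nodes; "devel" is an assumed type name and
# the walltime must stay <= 30 minutes.
oarsub -t devel -l /core=4,walltime=00:30:00 ./quick_test.sh
```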

Dahu is a parallel computing machine equipped with a powerful (and expensive!) computing network, and it is primarily intended for parallel computing. Typical jobs use 64 to 1024 cores and run for more than 30 minutes. Small sequential jobs (using fewer than 32 cores or a single node) are tolerated, but they must not be too short or too numerous: too many short jobs (under 10 minutes) will overload the resource manager and degrade cluster efficiency. For sequential jobs, it is strongly preferable to use the CIGRI computing grid.
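
A parallel job in this typical range could be requested with a hierarchical resource expression such as the following sketch (node and core counts are illustrative and must be adapted to the actual hardware):

```bash
# Illustrative parallel request: 4 nodes with 32 cores each (128 cores) for 12 hours.
oarsub -l /nodes=4/core=32,walltime=12:00:00 ./run_parallel_job.sh
```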

Rules specific to certain projects

  • The karma (fairsharing index) of certain users is weighted when they belong to a project that contributed to the purchase of resources. The idea is to grant these projects an advantage in computing hours commensurate with their financial contribution (a sketch for checking your own karma follows this list).

  • Luke cluster case: specific nodes have been funded by certain projects or teams. In this case, a special queue is created for the members of these teams or projects so that they always have priority on these particular nodes.
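
To see where you stand in the fairsharing, your karma can be read from oarstat output; one possible way (the exact output format depends on the OAR version) is:

```bash
# List your jobs, then inspect one of them in full: the detailed output
# contains a Karma field (format may vary with the OAR version).
oarstat -u $USER
oarstat -fj <job_id> | grep -i karma
```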

Challenges

In some cases, it may be necessary to run much larger jobs. Since the computing machines, and Dahu in particular, are shared by many users, such use is considered a “challenge” and must be approved by the GRICAD Users Committee so that it can be scheduled and all users notified. If you think your case is a “challenge”, please describe your project in an email to sos-calcul-gricad@univ-grenoble-alpes.fr.

Proper use of computing and storage infrastructures

  • Do not overload the head nodes; test your jobs on the sandbox nodes.
  • Delete unused files, especially in scratch space.
  • Do not request an excessively large walltime: the more accurately the walltime is estimated, the more efficiently jobs can be scheduled.
  • Do not reserve a complete node if a few cores, with the corresponding proportional amount of memory, are enough (see the sketch after this list).
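
As a sketch of the last point (memory is generally allocated in proportion to the requested cores; check the cluster documentation for the exact policy):

```bash
# Prefer requesting only the cores you need, e.g. 4 cores for 1 hour...
oarsub -l /core=4,walltime=01:00:00 ./small_job.sh

# ...rather than reserving a whole node that will be mostly idle:
# oarsub -l /nodes=1,walltime=01:00:00 ./small_job.sh
```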

It is strictly forbidden to run jobs directly on the login nodes (head or front nodes). You must submit your jobs via the OAR Resource Manager so that they are executed on the compute nodes.