crctool
crctool is a command that displays statistics about the compute nodes you have access to and your storage usage.
Partitions
--------------------------------- Partitions -----------------------------------
| **(Nodes) Cores / Memory / (G)PU or (P)HI -- Features** |
| partition1 |
| (2) 24 / 128GB -- avx2,ib |
| (1) 24 / 512GB -- avx2,ib |
| (3) 48 / 256GB -- avx2,avx512,ib |
| |
| (216) - Cores |
| (1536 GB) - Memory |
| (6) - Nodes |
| partition2 |
| (1) 48 / 192GB -- avx2,avx512,noib |
| (1) 48 / 128GB / 4G -- avx2,avx512,noib,a40,single |
| |
| (96) - Cores |
| (320 GB) - Memory |
| (4) - GPUs |
| (2) - Nodes |
The first part of the output displays the partitions you have access to and the nodes in each partition.
- Parentheses "()" give the quantity of that type of node in the partition
- The number of cores in each node comes next, before the slash "/"
- After the slash "/" is the amount of memory each node has
- After the "--" are the features for that node
- "G" stands for GPU and denotes how many GPUs are in that node
The totals for all resources are given at the end.
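The feature tags can be used with Slurm's --constraint option to target a specific node type. Below is a minimal sketch of a submit script; the resource numbers come from the example output above, while the job name and executable are illustrative:

```
#!/bin/bash
#SBATCH --partition=partition1   # partition name from the example output above
#SBATCH --constraint=avx512      # only schedule on nodes tagged with the avx512 feature
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=48       # the avx512 nodes above have 48 cores each
#SBATCH --mem=64G
#SBATCH --time=1:00:00

./my_program                     # hypothetical executable
```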
Storage Variables
------------------------------- Storage Variables ------------------------------
| Variable Path |
| $HOME /home/r557e636 |
| $WORK /kuhpc/work/crc/r557e636 |
| $SCRATCH /panfs/pfs.local/scratch/crc/r557e636 |
--------------------------------------------------------------------------------
These are variables automatically assigned to you based on your primary group. They can be used in your submit script.
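For example, a minimal sketch of a submit script that stages data through these variables (input.dat, output.dat, and my_program are hypothetical):

```
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=1:00:00

# Stage input from $WORK to $SCRATCH, run there, and copy results back.
cp $WORK/input.dat $SCRATCH/
cd $SCRATCH
./my_program input.dat > output.dat   # hypothetical program and files
cp $SCRATCH/output.dat $WORK/
```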
Storage Groups
------------------------------- Storage Groups ---------------------------------
| Name Paths |
| crc (Primary) ----------- /kuhpc/work/crc |
| /kuhpc/scratch/crc |
| bigjay ------------------ /kuhpc/work/bigjay |
| /kuhpc/scratch/bigjay |
--------------------------------------------------------------------------------
A list of all the storage groups you belong to and the corresponding paths you have access to for those groups.
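As a quick check from the command line, the standard groups command lists the Unix groups you belong to, which should match the names shown here:

```
groups                    # e.g. prints: crc bigjay
ls $WORK                  # your primary group's work directory
ls /kuhpc/work/bigjay     # a secondary group's work directory, per the paths above
```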
Your Usage
--------------------------------- Your Usage -----------------------------------
| Volume Used Limit Files Limit |
| /home/r557e636 47.21GB 100GB 67051 300000 |
| /kuhpc/work/crc 36.66GB 0 28816 0 |
| /kuhpc/work/bigjay 9.70MB 0 298 0 |
| /kuhpc/scratch 0B 0 16 0 |
--------------------------------------------------------------------------------
This is how much you personally are using in each storage volume, not the total usage of the volume. There is a quota limit on your $HOME directory, but by default no quota is applied to you in the $WORK or $SCRATCH directories.
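If you are approaching the $HOME quota, a common way to find what is taking the space (or the files) is, for example:

```
# Size of each top-level directory in $HOME, largest last.
du -h --max-depth=1 $HOME 2>/dev/null | sort -h

# Total number of files and directories, to compare against the file limit.
find $HOME | wc -l
```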
Volume Usage
-------------------------------- Volume Usage ----------------------------------
| Volume Used Size Files Limit |
| /kuhpc/work/crc 1.02TB 2TB 268544 0 |
| /kuhpc/work/bigjay 76.37TB 235TB 13338827 0 |
| /kuhpc/scratch 104.2TB 972.8TB 19029057 0 |
--------------------------------------------------------------------------------
Total volume usage: the combined usage of every member of that group.
crctool -p par1
Displays all jobs, including sixhour jobs, running on the specified partition.
JOBID PARTITION USER STATE TIME TIME_LIMIT NODES CPUS MIN_CPUS MIN_MEMORY NODELIST(REASON)
r05r18n03 - 257415 MB - CPU(48/0/0/48)
57435688 sixhour c123h419 RUNNING 2:40:10 6:00:00 10 512 48 4G r05r18n[02-04],r05r20n[01-02],r20r20n04,r21r07n01,r21r08n01,r21r09n01,r21r10n01
r05r18n04 - 257415 MB - CPU(48/0/0/48)
57435688 sixhour c123h419 RUNNING 2:40:10 6:00:00 10 512 48 4G r05r18n[02-04],r05r20n[01-02],r20r20n04,r21r07n01,r21r08n01,r21r09n01,r21r10n01
r05r20n03 - 257415 MB - CPU(2/46/0/48)
57069939 par1 y123a214 RUNNING 2-04:54:26 12-12:00:00 1 1 1 6G r05r20n03
57137254 par1 y123a214 RUNNING 2-04:54:26 12-12:00:00 1 1 1 150G r05r20n03
r10r08n01 - 128000 MB - CPU(0/24/0/24)
r10r08n02 - 128000 MB - CPU(0/24/0/24)
r10r12n04 - 515072 MB - CPU(0/24/0/24)
JOBID PARTITION USER STATE TIME TIME_LIMIT NODES CPUS MIN_CPUS MIN_MEMORY NODELIST(REASON)
As seen above, the par1 partition has a sixhour job running across 2 of its nodes. A user in the par1 group may wonder why their job is not starting, since squeue -p par1 shows that there are nodes available. What squeue doesn't show are the sixhour jobs running on the nodes in the par1 partition.
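To see this yourself, compare the two commands:

```
squeue -p par1     # shows only jobs submitted to par1; the nodes can look free
crctool -p par1    # also shows sixhour jobs occupying those same nodes
```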
### Header
- JOBID - Job ID
- PARTITION - Partition the job is running in
- USER - User running the job
- STATE - Current state of the job
- TIME - How long the job has been running
- TIME_LIMIT - Requested time limit for the job
- NODES - Number of nodes the job is running on
- CPUS - Total number of CPUs requested
- MIN_CPUS - Number of CPUs requested for each node
- MIN_MEMORY - Memory per CPU
- NODELIST(REASON) - Nodes the job is running across
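These columns map onto standard squeue format fields, so a similar view of just the par1-submitted jobs can be sketched with squeue's -o option (field codes per the squeue man page):

```
# %i JOBID, %P PARTITION, %u USER, %T STATE, %M TIME, %l TIME_LIMIT,
# %D NODES, %C CPUS, %c MIN_CPUS, %m MIN_MEMORY, %R NODELIST(REASON)
squeue -p par1 -o "%.10i %.10P %.9u %.8T %.11M %.12l %.6D %.5C %.8c %.10m %R"
```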
### Partition Line
Partition: atmo CPUs (98/118/0/216) allocated/idle/other/total Nodes (3/3/0/6) GPUs (0/0) allocated/total MICs (0/0)

### Node Line
Node Name - Amount of memory on node - (Active CPUs / Idle CPUs / Offline CPUs / Total CPUs)
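Because the node lines follow this fixed format, the output is easy to filter. For example, a small sketch that lists the nodes with idle CPUs (assuming the format shown above):

```
# Split node lines on "(", "/", and ")": field 2 is active CPUs, field 3 is idle.
# Print the node name whenever idle CPUs remain.
crctool -p par1 | awk -F'[(/)]' '/- CPU\(/ && $3 > 0 { split($1, a, " "); print a[1], "idle:", $3 }'
```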