Storage

KUHPC Storage

KUHPC storage uses the same storage system as ResFS. The only difference is that KUHPC is managed by the CRC team, while ResFS is managed by your department's TSC.

There are three types of storage directly connected to the cluster. All directories are mounted on both submit nodes and all compute nodes. Each directory is assigned a variable when you log in that you may use in your scripts.

To see any of the storage information below, run the command crctool.

| Name | Quota | Variable | Purpose |
|------|-------|----------|---------|
| Home | 50GB, 300,000 files | $HOME | Personal storage assigned to every user. Backed up. |
| Work | Up to 15TB free | $WORK | Storage shared within a research group for collaboration. Store raw data sets. Backed up. |
| Scratch | 1PB shared | $SCRATCH | Temporary storage used to process raw data sets. Subject to 120-day purge with notice. Not backed up. |

You can use $HOME, $WORK, and $SCRATCH in your submit scripts to make it easier to get around the file systems.
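As a minimal sketch of how these variables might appear in a script, the helpers below copy a data set from $WORK to $SCRATCH for processing and copy results back to $WORK afterward, since scratch is purged and not backed up. The function names and the directory layout under $WORK and $SCRATCH are hypothetical and only illustrate the pattern.

```shell
# Hypothetical helpers; they rely on $WORK and $SCRATCH being set at login.

# Copy a data set directory from group storage to scratch for processing.
# $1 = name of a data set directory under $WORK (assumed layout)
stage_to_scratch() {
    mkdir -p "$SCRATCH/$1" && cp -r "$WORK/$1/." "$SCRATCH/$1/"
}

# Copy a results directory from scratch back to backed-up work storage.
# $1 = name of a results directory under $SCRATCH (assumed layout)
save_results() {
    mkdir -p "$WORK/results/$1" && cp -r "$SCRATCH/$1/." "$WORK/results/$1/"
}
```

A script could call `stage_to_scratch dataset1` before processing and `save_results out` once the job has written its output to `$SCRATCH/out`.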

Purchase

All owner groups receive up to 15TB of $WORK storage for free. Additional $WORK space can be purchased for $50 per TB per year. Invoices for storage are sent quarterly for the previous quarter.
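The pricing above can be worked through with a small calculation. This sketch assumes allocations are counted in whole TB and that only space beyond the free 15TB is billed; how the yearly amount is divided across the quarterly invoices is not stated here, so the function returns the yearly figure only. The function and variable names are hypothetical.

```shell
FREE_TB=15             # free $WORK allocation per owner group
PRICE_PER_TB_YEAR=50   # dollars per additional TB per year

# Print the yearly cost in dollars for a total $WORK allocation.
# $1 = total allocation in whole TB (assumption: whole-TB increments)
work_cost_per_year() {
    local extra=$(( $1 - $FREE_TB ))
    if [ "$extra" -lt 0 ]; then
        extra=0        # allocations at or under the free tier cost nothing
    fi
    echo $(( extra * $PRICE_PER_TB_YEAR ))
}
```

For example, `work_cost_per_year 20` covers 5TB beyond the free tier and prints 250.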

Paths

| Storage | Path |
|---------|------|
| KUHPC | /kuhpc |
| ResFS | /resfs/GROUPS |

Map Network Drive

Mapping a network drive lets you access the KUHPC and ResFS file systems from your desktop computer and transfer files back and forth.

Folders

KUHPC: \\kuhpc.home.ku.edu:\kuhpc
ResFS: \\resfs.home.ku.edu:\resfs\GROUPS\

Research File Storage (ResFS) on Cluster

Info

C1 and HIPAA shares cannot be mounted on the KU Community Cluster.

ResFS shares can be mounted on the cluster. To request a mount, send an email to crchelp@ku.edu with the share name.

ResFS Permissions on Cluster

Permissions of files and folders in a ResFS share must be handled through a CIFS connection. This will likely be your Windows or Mac computer. You are unable to run chown or chmod on ResFS shares on the cluster. Contact your Technology Support Center for assistance mounting ResFS on your computer.

It is possible to convert your ResFS share to Linux permissions. We only recommend this if you use your share primarily on the cluster and not your desktop. You must be the owner of the ResFS share to make this request. After conversion, any permission issues are handled by CRC rather than your TSC.

Transfer Data

You can access files on KUHPC and ResFS through the Data Transfer Node (dtn.ku.edu). You can also transfer files via a mapped network drive as described above.

SCP

Note

Use SCP to transfer small data sets to and from the cluster. If the source or destination is off campus, you must use KU Anywhere. The connection must stay open for the duration of the transfer.

SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH protocol. SCP is available as a protocol choice in some graphical file transfer programs and also as a command line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, but will also recursively copy directory contents if given a directory name.

Host: dtn.ku.edu

Terminal

From desktop to cluster
scp sourcefilename KUOnlineID@dtn.ku.edu:somedirectory/destinationfilename

From cluster to desktop
scp KUOnlineID@dtn.ku.edu:somedirectory/sourcefilename destinationfilename

Recursive directory copy to cluster
scp -r sourcedirectory/ KUOnlineID@dtn.ku.edu:somedirectory/
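The commands above can be wrapped in a small helper that picks the recursive form for directories. This is only a sketch: push_to_cluster and the DRY_RUN switch are hypothetical names, not tools provided by CRC, and the only scp option used is -r, as in the commands above. With DRY_RUN=1 the helper prints the command instead of running it, which is a convenient way to check the paths first.

```shell
# Data Transfer Node host named in this guide.
DTN_HOST="dtn.ku.edu"

# Hypothetical helper: copy a local file or directory to the cluster.
# $1 = KU Online ID, $2 = local file or directory, $3 = remote directory
push_to_cluster() {
    local target="$1@$DTN_HOST:$3"
    local src="$2"
    if [ -d "$src" ]; then
        set -- scp -r "$src" "$target"   # directories need a recursive copy
    else
        set -- scp "$src" "$target"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"                        # show the command without running it
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 push_to_cluster KUOnlineID sourcedirectory somedirectory/` prints the scp command that would be run.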

Globus

Note

Use Globus to transfer large data sets between KU storage and outside locations, and to share data sets that anyone can download or upload to. Globus runs as a web application and makes it easy to transfer data outside of KU. The destination must run Globus software.

Globus is a mechanism to transfer files between managed endpoints and personal endpoints. It does not store any data. Globus works through your web browser, so it works on Windows, Linux, and macOS.

Globus Guides

A full guide written by KU Research TSC can be found at Globus Guides.

Topics include:

Guides for Faculty and Staff
  • Globus Account and Client Setup
  • Globus Sharing Files from ResFS
  • Retrieving Collaborator Data with Globus
Guides for External Collaborators
  • Globus Account and Client Setup
  • Retrieving KU Data with Globus
  • Using Globus to Share Data with KU

Quotas

Each type of storage has an enforced quota. To see how much you are using and how much is used in total on each volume, log in to the cluster and run crctool.

This will produce output similar to the following:

Info

Information is updated every 30 minutes

------------------------------- Storage Variables ------------------------------
| Variable     Path                                                            |
| $HOME        /home/r557e636                                                  |
| $WORK        /kuhpc/work/crc/r557e636                                        |
| $SCRATCH     /panfs/pfs.local/scratch/crc/r557e636                           |
--------------------------------------------------------------------------------

------------------------------- Storage Groups ---------------------------------
| Name                      Paths                                              |
| crc (Primary) ----------- /kuhpc/work/crc                                    |
|                           /kuhpc/scratch/crc                                 |
| bigjay ------------------ /kuhpc/work/bigjay                                 |
|                           /kuhpc/scratch/bigjay                              |
--------------------------------------------------------------------------------

--------------------------------- Your Usage -----------------------------------
| Volume                            Used       Limit        Files        Limit |
| /home/r557e636                 47.21GB       100GB        67044       300000 |
| /kuhpc/work/crc                36.66GB           0        28816            0 |
| /kuhpc/work/bigjay              9.70MB           0          298            0 |
| /kuhpc/scratch                      0B           0           16            0 |
--------------------------------------------------------------------------------

-------------------------------- Volume Usage ----------------------------------
| Volume                            Used        Size        Files        Limit |
| /kuhpc/work/crc                 1.02TB         2TB       268544            0 |
| /kuhpc/work/bigjay             76.25TB       235TB     13357333            0 |
| /kuhpc/scratch                 104.0TB     972.8TB     19009582            0 |
--------------------------------------------------------------------------------
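Because the crctool figures are refreshed periodically, it can be useful to check a directory directly between updates. The sketch below uses the standard find and du commands; count_files and usage_kb are hypothetical helper names. It counts regular files only, and whether the storage system also counts directories or other entries toward the file limit is not stated here, so treat the numbers as an approximation of what crctool reports.

```shell
# Hypothetical helpers built on standard find and du.

# Print the number of regular files under a directory.
# $1 = directory to scan
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Print the total size of a directory in kilobytes.
# $1 = directory to measure
usage_kb() {
    du -sk "$1" | cut -f1
}
```

For example, `count_files "$HOME"` gives a file count to compare against the Home file limit. Scanning a very large directory tree this way can take a while.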

Violation

Users will not be able to write to any directory that has hit the hard quota limit. Groups will receive an email when a quota for a work volume has been exceeded (Hard Quota) or is about to be exceeded (Soft Quota).

Soft Quota

A warning email is sent to the user that the specified resource is about to exceed the size or file limit quota. Work soft quotas are set at 97% usage.

Hard Quota

The maximum allotted space has been reached for that volume. This can happen for any of the locations above. No further writes are allowed, and you must remove files before creating any new ones.
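The two thresholds can be expressed as a simple check. This sketch assumes that used and limit are whole numbers in the same unit (for example GB or a file count) and applies the 97% soft threshold that is stated above for work volumes; whether other volumes use the same percentage is not stated. quota_status is a hypothetical name.

```shell
# Hypothetical helper: classify usage against a quota.
# $1 = amount used, $2 = limit (whole numbers, same unit)
# Prints "hard" at or above the limit, "soft" at or above 97%, else "ok".
quota_status() {
    if [ "$1" -ge "$2" ]; then
        echo hard
    elif [ $(( $1 * 100 )) -ge $(( $2 * 97 )) ]; then
        echo soft      # integer form of used/limit >= 0.97
    else
        echo ok
    fi
}
```

For a 1500GB limit, 1455GB is exactly 97%, so `quota_status 1455 1500` prints soft and `quota_status 1500 1500` prints hard.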

Recovering your files

Info

Snapshots are taken at 12:10AM Central every day

One feature of the storage system is snapshots. A snapshot is a daily capture of the files in a given directory. All snapshots are user accessible, but only for volumes owned by a group the user is part of. Snapshots are read-only. If you accidentally delete a file, you can retrieve it from a snapshot for up to 30 days afterward.

Snapshots are stored in the .snapshot directory in the root of your work or home directory. This directory is hidden and won't be displayed in listings (ls) of that directory. Snapshots are captured for $HOME and $WORK directories but not $SCRATCH.

Suppose you accidentally delete a file with the path /kuhpc/work/group1/username/oops.txt. To restore it from a previous snapshot, navigate to the .snapshot directory for your group's work volume, where you will find directories containing snapshots from the past 30 days. Each of these directories contains a file structure similar to that of /kuhpc/work/group1 and holds the files as they were when that snapshot was taken. Navigate into those directories and copy the file(s) you accidentally deleted back to your work directory.

cd /kuhpc/work/group1/.snapshot
ls
cd daily.date-of-snapshot
cd username
cp oops.txt /kuhpc/work/group1/username

If you edited the file after 12:10AM, the snapshot of that file will not have those changes but only what the file was at 12:10AM.
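When it is unclear which day's snapshot still holds a file, a short loop can list the candidates. This is a sketch that assumes snapshot directories are named daily.something, as in the steps above; the exact date format in those names is not specified here, and list_snapshots_with is a hypothetical helper name.

```shell
# Hypothetical helper: list each snapshot that still contains a given file.
# $1 = snapshot root, e.g. /kuhpc/work/group1/.snapshot
# $2 = path relative to the volume root, e.g. username/oops.txt
list_snapshots_with() {
    local snap
    for snap in "$1"/daily.*; do
        if [ -e "$snap/$2" ]; then
            echo "$snap"
        fi
    done
}
```

For example, `list_snapshots_with /kuhpc/work/group1/.snapshot username/oops.txt` prints one snapshot directory per line. You can then copy the file out of whichever snapshot you choose with cp, as shown in the steps above.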

Snapshots of home directories can also be found in:

/home/.snapshot/daily.date-of-snapshot/username

Snapshots are on a rolling 30-day purge, so if you accidentally delete a file, you must restore it within 30 days or it will be gone forever.