Computing: Data File Storage (Aurora) (GB)

Quick Specs

  • Designed for safe data storage
  • No cost
  • Petabytes available
  • Research groups own their own data
  • Access is easy

Aurora, EMSL's scientific data archive, is a dedicated computer system specifically designed for long-term storage of data collected by EMSL instruments. It is available at no cost to EMSL users who are part of an active EMSL user proposal.

Aurora is safe.

Aurora is free.

Aurora is easy to use.

ABOUT THE AURORA FILE SYSTEM (/ARCHIVE)

Aurora is based on IBM's High Performance Storage System (HPSS) with customizations specific to EMSL. The EMSL customizations provide simple access methods for users who are not accustomed to HPSS.

HPSS combines high-performance disk storage with high-capacity tape storage, which gives a good tradeoff between performance and cost for high-capacity storage systems. New files are written first to disk and then copied to tape shortly thereafter. After a period of time, the disk copy is deleted to save space, leaving the data on tape.

HOW TO TRANSFER FILES TO AURORA

For users outside of PNNL, use FTP, SFTP, or SCP to transfer data to or from aurora.emsl.pnl.gov. SecurID credentials are required to log in.
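As a rough illustration of the SCP route, the sketch below only assembles the command line for a standard OpenSSH scp client; it does not run it. The user name and the remote path under /archive are hypothetical placeholders, and the exact directory layout on Aurora is an assumption, so check it against your account details.

```python
import shlex

# Host name as given on this page.
AURORA_HOST = "aurora.emsl.pnl.gov"


def build_scp_command(user, local_path, remote_path, upload=True):
    """Return the argument list for an scp transfer to or from Aurora.

    user and remote_path are placeholders supplied by the caller;
    nothing here is checked against the real system.
    """
    remote = f"{user}@{AURORA_HOST}:{remote_path}"
    if upload:
        # Local file first, remote destination second.
        return ["scp", local_path, remote]
    # Download: remote source first, local destination second.
    return ["scp", remote, local_path]


def as_shell_string(command):
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(command)
```

For example, as_shell_string(build_scp_command("jdoe", "run01.tar.gz", "/archive/jdoe/run01.tar.gz")) yields a line that can be pasted into a terminal. Running it interactively is probably the simplest choice, since the login prompts for credentials.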

For users inside PNNL, there are several transfer methods:

GETTING AN AURORA ACCOUNT

All EMSL users who are part of an active EMSL user proposal are allowed access to Aurora. Request an account by contacting your PI (or by requesting an account using IOPS if you are an internal user).

Aurora is a permanent file system that is backed up daily. If you expect to need more than 500 GB of space for storing your files, include this information in your request.

PERFORMANCE TIPS

Recently created or recently accessed files in the archive are likely to be stored on disk, and access to them will be fast. Files that are particularly large or have not been accessed in some time may be stored only on tape. In that case, it may take up to a minute to access the file contents because Aurora's tape robot must first retrieve the correct tape and begin transferring the data.

Avoid searching for data across large numbers of files while they are in the archive; tape access times make such operations slow. Instead, copy the files to a local file system with good performance and perform the operation(s) there.
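The copy-first pattern described here might look like the following minimal sketch. It assumes the archive is visible as an ordinary directory (for example, somewhere under /archive); the function name, the file pattern, and the scratch directory are hypothetical.

```python
import shutil
from pathlib import Path


def stage_and_search(archive_dir, pattern, needle, scratch_dir):
    """Copy matching files to local scratch space, then search the copies.

    Each archived file is read exactly once (the copy); all searching
    happens on the local file system. Returns the names of files that
    contain the search string.
    """
    scratch = Path(scratch_dir)
    scratch.mkdir(parents=True, exist_ok=True)
    hits = []
    for src in sorted(Path(archive_dir).glob(pattern)):
        if not src.is_file():
            continue
        local = scratch / src.name
        # Single sequential read from the archive.
        shutil.copy2(src, local)
        # Search the local copy, not the archived original.
        with open(local, "r", errors="replace") as handle:
            if any(needle in line for line in handle):
                hits.append(src.name)
    return hits
```

Once the copies are local, repeated searches cost nothing extra on the archive side, which is the main reason to stage first.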

When storing data in Aurora, store fewer, larger files rather than many smaller files. Users are encouraged to use utilities such as ZIP and TAR to bundle their data before transferring it to Aurora for long-term storage.
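One way to do this bundling, sketched with Python's standard tarfile module rather than the command-line TAR utility, is shown below. The function name and directory names are hypothetical; gzip compression is an assumption and can be dropped by changing the mode to "w".

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack every regular file under source_dir into one .tar.gz file.

    Returns the number of files added.
    """
    source = Path(source_dir)
    # Collect the file list before creating the archive, so the archive
    # itself is never picked up if it is written inside source_dir.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Keep paths relative to the top-level folder name so the
            # bundle unpacks into a single directory.
            arcname = Path(source.name) / path.relative_to(source)
            tar.add(path, arcname=str(arcname))
    return len(files)
```

Calling bundle_directory("run01", "run01.tar.gz") would turn a directory of many small files into a single file, which is the shape the archive handles best.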

For more information, see How To: Use EMSL's Aurora File System, which discusses Aurora in the context of Chinook use, and Aurora Policies.
