The National Institute for Computational Sciences

Data Transfer


Data Transfer Node (DTN) Servers

updated Oct 20, 2017

The ACF provides several ways to transfer files to and from the NFS home directories, NFS project directories, Lustre project directories, and Lustre scratch directories. Both the ACF and Newton provide Data Transfer Nodes (DTNs) for this purpose. Currently, two DTNs are part of the Newton cluster in Knoxville at the KPB building, and six DTNs are part of the ACF cluster in Oak Ridge at the University of Tennessee JICS building on the Oak Ridge National Laboratory campus. The table below lists the Newton and ACF DTNs and the relevant information for each.

| Data Transfer Node | IP Address | Authentication Supported | File Transfer Protocol Supported | File System Access |
|---|---|---|---|---|
| dtn1.newton.utk.edu | 160.36.136.10 | NetID, x.509 certificate | bbcp, gsiscp, Globus GridFTP | Home, /lustre, /gamma (gpfs) |
| dtn2.newton.utk.edu | 160.36.136.9 | NetID, x.509 certificate | bbcp, gsiscp, Globus GridFTP | Home, /lustre, /gamma (gpfs) |
| datamover1.nics.utk.edu | 192.249.6.163 | NetID+Duo, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover2.nics.utk.edu | 192.249.6.164 | NetID+Duo, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover3.nics.utk.edu | 192.249.6.165 | NetID+Duo, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover4.nics.utk.edu | 192.249.6.166 | NetID+Duo, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover6.nics.utk.edu | 192.249.6.169 | NetID+RSA, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover7.nics.utk.edu | 192.249.6.170 | NICS RSA, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |
| datamover8.nics.utk.edu | 192.249.6.171 | NICS RSA, x.509 certificate | bbcp, scp, sftp, gsiscp, Globus GridFTP | Home, /lustre/medusa |

Both Newton DTNs support direct logins with NetID authentication (NetID username and password) and x.509 authentication, so users can set up scripts and automated file transfer programs as necessary.

Three ACF DTNs (datamover1.nics.utk.edu, datamover2.nics.utk.edu, and datamover3.nics.utk.edu) are set up for NetID authentication (NetID username and password plus Duo two-factor authentication) and x.509 authentication, so users can log in to these nodes, perform data transfers, and set up automated data transfer scripts as necessary.

Three ACF DTNs (datamover6.nics.utk.edu, datamover7.nics.utk.edu, and datamover8.nics.utk.edu) are set up for NICS RSA authentication (NICS username plus RSA two-factor authentication) and x.509 authentication, so users can log in to these nodes, perform data transfers, and set up automated data transfer scripts as necessary.

To connect to a NICS ACF "RSA DTN" with a Secure Shell client do the following:
ssh {nics-username}@datamover8.nics.utk.edu

To connect to a NICS ACF "Duo DTN" with a Secure Shell client do the following:
ssh {UT-NetID}@datamover1.nics.utk.edu

To connect to one of the Newton DTNs with a Secure Shell client, do the following:
ssh {UT-NetID}@dtn1.newton.utk.edu
or
ssh {UT-NetID}@dtn2.newton.utk.edu
and log in with your NetID and password.

NICS will be adding more DTNs on the NICS ACF side, so check periodically for updates to this documentation.

Data Transfer Protocols

The ACF support team provides support on the DTNs for the following file transfer capabilities: SCP, SFTP, GSISCP, bbcp and GridFTP.

Performance note: The SCP and SFTP utilities are available for transferring files but are usually slower than GSISCP and GridFTP. GSISCP with data encryption turned off and GridFTP are usually the fastest file transfer methods because of their high-performance networking (HPN) support.

| Transfer | File size | Source | Destination | Transfer performance (MB/s) |
|---|---|---|---|---|
| SCP - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/gamma (gpfs) | datamover1.nics.utk.edu:/lustre/medusa | 111 |
| GSISCP - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/gamma (gpfs) | datamover1.nics.utk.edu:/lustre/medusa | 123 |
| GSISCP with no data encryption - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/gamma (gpfs) | datamover1.nics.utk.edu:/lustre/medusa | 306 |
| GSISCP - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/lustre | datamover1.nics.utk.edu:/lustre/medusa | 123 |
| GSISCP with no data encryption - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/lustre | datamover1.nics.utk.edu:/lustre/medusa | 240 |
| Globus GridFTP - Newton to ACF | 1 GB | dtn1.newton.utk.edu:/gamma (gpfs) | datamover1.nics.utk.edu:/lustre/medusa | 167 |
| GSISCP script with no data encryption - Newton to ACF | 1000 1 GB files | dtn1.newton.utk.edu:/lustre | datamover1.nics.utk.edu:/lustre/medusa | 236 |
| Globus GridFTP - Newton to ACF | 1000 1 GB files | dtn1.newton.utk.edu:/gamma (gpfs) | datamover1.nics.utk.edu:/lustre/medusa | 550 |

SCP, SFTP, GSISCP

The DTNs support file transfer with the OpenSSH file transfer utilities SCP and SFTP, which are installed and available on most Linux/Unix machines. To transfer a file from a Newton DTN to an ACF DTN, use a command of the form
dtn1$ scp {Newton file} {ACF-DTN}:{ACF file}
for example
dtn1$ scp /lustre/scratch/victor/onegigabytefile datamover8.nics.utk.edu:/lustre/medusa/victor/files/onegigabytefile

The DTNs also support file transfer with GSI-OpenSSH file transfer utility GSISCP. GSI-OpenSSH is a variant of OpenSSH developed by the Globus project that supports high performance file transfer and authentication with Grid Security Infrastructure (GSI). GSISCP is installed on Newton and ACF DTNs and the software is available from https://globus.org. GSISCP will work with GSI user certificate authentication (x.509 and XSEDE MyProxy certificates) and with JICS RSA and UT NetID with Duo. GSISCP will attempt GSI certificate authentication first and then RSA authentication on the "RSA DTN" and GSI authentication first and then NetID with Duo authentication on the "Duo DTN".

GSISCP, just like SCP, uses encryption to protect both the authentication and the data being transferred. The GSI-OpenSSH build with high-performance networking (HPN) modifications adds a cipher for GSISCP called "NONE". This cipher turns off data payload encryption and can lead to significant speedups over regular SCP. Note that both ends must run GSI-OpenSSH with the HPN modifications to use the NONE cipher.
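
As a rough sketch of how the NONE cipher is typically requested, the helper below builds a gsiscp command line with the HPN-SSH client options NoneEnabled and NoneSwitch. Those option names come from the HPN-SSH patch set and are an assumption about the GSI-OpenSSH build installed on the DTNs; the function name gsiscp_none and the DRY_RUN variable are hypothetical, and the paths simply reuse the SCP example. By default the helper only prints the command.

```shell
# Hypothetical helper: copy one file with gsiscp and data payload encryption off.
# Assumes both ends run GSI-OpenSSH with the HPN modifications, which provide
# the NoneEnabled/NoneSwitch client options (an assumption, not from this page).
gsiscp_none() {
    src=$1
    dest=$2
    set -- gsiscp -oNoneEnabled=yes -oNoneSwitch=yes "$src" "$dest"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # dry run: print the command instead of running it
    else
        "$@"                 # DRY_RUN=0: actually run the transfer
    fi
}

# Dry run: shows the command that would be executed.
gsiscp_none /lustre/scratch/victor/onegigabytefile \
    datamover1.nics.utk.edu:/lustre/medusa/victor/files/onegigabytefile
```

Setting DRY_RUN=0 before the call would run the transfer for real; whether the server accepts the NONE cipher depends on its configuration.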

BBCP

The DTNs provide another tool for copying data securely and quickly from source to target. The BBCP utility can split a file transfer into multiple simultaneous streams, thereby transferring data faster than single-stream utilities such as SCP and SFTP. Check for local availability; if it is not available, the source code can be obtained from its homepage. Several examples of how to use it can be found on the dedicated BBCP page.
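
A minimal sketch of a multi-stream BBCP invocation might look like the following. The stream count, window size, and progress interval are illustrative values rather than settings recommended for the DTNs, and the function name bbcp_copy and the DRY_RUN variable are hypothetical. By default the helper only prints the command.

```shell
# Hypothetical helper: multi-stream copy with bbcp.
#   -s N   number of parallel network streams
#   -w SZ  TCP window size
#   -P N   print a progress message every N seconds
bbcp_copy() {
    src=$1
    dest=$2
    streams=${3:-8}          # illustrative default, not a tuned value
    set -- bbcp -P 10 -s "$streams" -w 8m "$src" "$dest"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # dry run: print the command instead of running it
    else
        "$@"
    fi
}

# Dry run with 16 streams, reusing the paths from the SCP example.
bbcp_copy /lustre/scratch/victor/onegigabytefile \
    datamover1.nics.utk.edu:/lustre/medusa/victor/files/onegigabytefile 16
```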


GridFTP

GridFTP, a file transfer server component of the Globus Toolkit, is one of the two highest-performance file transfer methods the ACF supports. ACF DTNs provide GridFTP services along with the GridFTP client commands. Many other computing centers support GridFTP for file transfer. It can be used through a web interface or at the command line. For more detailed information about how to use GridFTP, see our dedicated GridFTP page.

At the command line, access GridFTP with the Globus Toolkit command globus-url-copy, using as the source or destination address either
gsiftp://datamover8.nics.utk.edu:2811
or
gsiftp://datamover1.nics.utk.edu:2811
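
As a sketch of a command-line upload, the helper below builds a globus-url-copy command from a local absolute path and a remote absolute path. It assumes a valid proxy credential already exists; the parallelism value is illustrative, and the function name gridftp_put and the DRY_RUN variable are hypothetical. By default it only prints the command.

```shell
# Hypothetical helper: upload one file to a DTN with globus-url-copy.
#   -vb   show transfer performance while copying
#   -p N  use N parallel data connections
gridftp_put() {
    local_path=$1                        # absolute path on the local machine
    remote_path=$2                       # absolute path on the DTN
    host=${3:-datamover1.nics.utk.edu}   # DTN to contact on port 2811
    set -- globus-url-copy -vb -p 4 \
        "file://$local_path" "gsiftp://$host:2811$remote_path"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"               # dry run: print the command only
    else
        "$@"
    fi
}

# Dry run, reusing the paths from the SCP example.
gridftp_put /lustre/scratch/victor/onegigabytefile \
    /lustre/medusa/victor/files/onegigabytefile
```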

Globus Web-based Data Transfer

ACF users can use the web-based Globus file transfer interface to perform data transfers to and from ACF-supported resources. The visual interface makes it easy to move, back up, or restore relevant data. To get started, visit the Globus website and consult the Getting Started guide. There is excellent documentation on this capability in the Globus How-To documentation.


The Globus GUI for file transfer between ACF DTNs and Newton DTNs

Globus Endpoints

The Globus endpoints to access ACF and Newton DTN resources are the following:

  • nics#datamover1
  • nics#datamover2
  • nics#datamover3
  • nics#datamover5
  • nics#datamover6
  • nics#datamover7
  • nics#datamover8
  • UTK OIT Newton DTN1
  • UTK OIT Newton DTN2

One of the latest features of Globus is Globus Connect Personal. Globus Connect Personal turns your personal computer into a Globus endpoint so you can share and transfer files to/from a local machine - campus server, desktop computer or laptop.

Setting up x.509 authentication

To use the GSISCP and Globus GridFTP transfer services, each user needs to do two things:

  1. In the NICS portal, associate their NetID with their NICS account, and
  2. In the NICS portal, set up their X.509 user certificate by associating their InCommon credential with their NICS account.

Both of these are shown in the image below. To start, log in to the NICS portal at https://portal.nics.utk.edu, click on "To associate your UTK or UTHSC NetID with your NICS account", and follow the prompts. Then click on the button to associate your InCommon credential with the NICS infrastructure. The buttons are shown in the example portal view below:

To set up this credential, select "University of Tennessee" as the identity provider and log in with your University of Tennessee NetID username and password when prompted by the InCommon CILogon interface. You will set a password for your X.509 credential. Please note and remember this password, as you will use it when setting up Globus or GSISCP with X.509 credentials. Once you go through the CILogon process, the Distinguished Name (DN) of your X.509 credential will be associated with the NICS ACF infrastructure and will be available for use. Screenshots of the step-by-step process are shown below.

Step 0: Log in to the Newton login node so that you can save the credential you are about to create in Step 4

Step 1: select University of Tennessee as the Identity Provider

Step 2: Authenticate with your UT NetID and Password

Step 3: enter a password for your new InCommon credential (and remember this!)

Step 4: You will get a screen with a link to download your certificate. Click to download it and save it locally. You could also use wget with this URL from Newton to save it to your Newton home directory. Access to the certificate is time-limited, so you may have to move quickly to download it.

This X.509 distinguished name (DN) information is added to /etc/grid-security/grid-mapfile on the ACF DTNs. The update runs every hour, so you may have to wait up to an hour before you can use this authentication method. Once your credential is in /etc/grid-security/grid-mapfile on the ACF DTNs, you are ready to start using Globus for data transfers. If you want to use GSISCP, follow the instructions in the next paragraph to set that up. The ACF DTNs are configured to use CILogon OAuth credentials. For example, the nics#datamover1 Globus endpoint is set up to use your CILogon credential, so just log in to Globus, select the nics#datamover1 endpoint, and authenticate with your CILogon password. No other authentication method will work for the ACF DTNs with the Globus and GSISCP protocols (one cannot use NetID and password, for example).

To use your new X.509 credential with GSISCP, you will need to obtain a credential pem file and put it in your home directory on Newton. The file specifically needs to go into ~/.globus/usercert.pem with permissions 600. If you didn't save the credential following the instructions above, you can get a new credential pem file by going back to the https://cilogon.org/ page and going through the process again to generate a new certificate. You will be prompted for a credential password, so go ahead and type one in. Again, be sure to remember this password for future reference. The CILogon page will give you a link to download the certificate, as shown below.

Once you have this credential in the ~/.globus/usercred.pem file, log in to one of the Newton DTNs (dtn1.newton.utk.edu or dtn2.newton.utk.edu) and run grid-proxy-init, which will prompt you for your CILogon credential password. This creates a proxy credential that can be used with GSISCP. Once the proxy exists, you can run gsiscp without having to type a username or password. The default credential lifetime is 12 hours.
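
The proxy workflow described above can be sketched as a pair of shell functions. This is an illustration under stated assumptions, not a record of an actual session: the function names cred_ready and proxy_and_copy are hypothetical, and the credential path is passed in as an argument so you can use the location of your own pem file under ~/.globus. Nothing runs until proxy_and_copy is called.

```shell
# Succeeds only if the credential file exists with permissions 600.
cred_ready() {
    f=$1
    [ -f "$f" ] && [ "$(ls -ld "$f" | cut -c1-10)" = "-rw-------" ]
}

# Hypothetical wrapper: create a proxy, report its lifetime, then copy a file.
proxy_and_copy() {
    cred=$1
    src=$2
    dest=$3
    cred_ready "$cred" || {
        echo "credential $cred is missing or not mode 600" >&2
        return 1
    }
    grid-proxy-init             # prompts for the CILogon credential password
    grid-proxy-info -timeleft   # seconds of proxy lifetime remaining
    gsiscp "$src" "$dest"       # no username or password prompt once the proxy exists
}

# Example call on a Newton DTN (CRED is the path of your credential pem file):
# proxy_and_copy "$CRED" /lustre/scratch/victor/onegigabytefile \
#     datamover1.nics.utk.edu:/lustre/medusa/victor/files/onegigabytefile
```

Checking the file mode first gives a clear error message before any password prompt appears.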

Globus Web-based File Transfer Performance Example

As an example of the power and utility of the Globus web-based file transfer capability, one thousand one-gigabyte files were created in /gamma/victor (Newton GPFS file system) and transferred to /lustre/medusa/victor on datamover1.nics.utk.edu using the Globus web-based file transfer tool. After logging in to https://www.globus.org and setting up the Globus transfer tool with the nics#datamover1 endpoint on the left and UTK OIT Newton DTN1 on the right, all the one-gigabyte files were selected on the right and the transfer button at the top was clicked.

Globus managed the file transfer and in many cases performed several file transfers in parallel. Using this method, the 1000 files (1 terabyte of data in total) were transferred in 32 minutes, averaging 550 MB/s.
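
As a quick sanity check on these figures, average throughput is simply total size divided by elapsed time. The sketch below (the function name avg_mb_per_s is hypothetical) evaluates this for 32 minutes under two common conventions for the size of a gigabyte. The results bracket the reported 550 MB/s, so the exact figure likely depends on the unit convention and on rounding of the elapsed time.

```shell
# Average throughput in MB/s for size_mb megabytes moved in the given minutes.
avg_mb_per_s() {
    awk -v mb="$1" -v min="$2" 'BEGIN { printf "%.0f\n", mb / (min * 60) }'
}

# 1000 files of 10^9 bytes each = 1,000,000 MB in 32 minutes.
avg_mb_per_s 1000000 32    # prints 521

# 1000 files of 2^30 bytes each, about 1,073,742 MB, in 32 minutes.
avg_mb_per_s 1073742 32    # prints 559
```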


Using FileZilla to Transfer Files to/from the ACF

FileZilla will work with file transfers to the ACF. Please only use the DTNs described in the Data Transfer Node Servers section for data transfer and refrain from doing data transfers to ACF login nodes. ACF login nodes are not optimized for data transfer.


To use the FileZilla client with your NetID, password and Duo multi-factor authentication follow these steps:

  • Open your FileZilla client
  • select "File" -> "Site Manager"

  • Click "New Site" and complete the following sub-steps:

    • For Host, enter one of the following: datamover1.nics.utk.edu, datamover2.nics.utk.edu, or datamover3.nics.utk.edu (these are the DTNs that support Duo)
    • For Protocol, select "SFTP - SSH File Transfer Protocol"
    • For Logon Type, select "Interactive" (this is the only one that will work with Duo)
    • Enter your NetID for User
    • Optionally, rename the entry under "My Sites" from "New Site" to datamover1, datamover2, or datamover3, whichever you entered for Host

  • Click Connect
  • When the FileZilla client prompts, enter your password and click Ok

  • When the FileZilla client prompts again for the Duo challenge, enter "1" in the "Password" field and click Ok

  • You should receive a Duo push request on your smartphone; select "accept" to authorize the authentication
  • FileZilla should then connect successfully

To connect to datamover5 through datamover8, follow the same steps; you will not receive the second password challenge described above.