
SAGA Tutorial Part 4: Adding File Transfer

oleweidner edited this page Oct 22, 2012 · 20 revisions

SAGA Layers

In this fourth part of the tutorial, we again build on the previous example and add some code that copies our job's output file back to the local machine. This is done using the saga.filesystem API package.

Prerequisites

This example assumes that you have SFTP access to the remote resource used in the previous example. As before, you need a working public/private SSH key pair and must be able to sftp into the remote resource using those keys, i.e., your public key is in the ~/.ssh/authorized_keys file on the remote machine. If you are not sure how this works, you might want to read SSH and GSISSH first.

Hands-On: Remote Job Submission with File Staging

Copy the code from the previous example (Part 3) to a new file, saga_example_remote_staging.py. Add the following code after the last print, right before the except statement:

Note: Make sure that you adjust the paths to reflect your home directory on the remote machine.

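# remote file written by the job, and the local directory to copy it into
outfilesource = 'sftp://localhost/Users/oweidner/myjob.stdout'
outfiletarget = 'file://localhost/tmp/'
# 'ses' is the session object created in the previous example
out = saga.filesystem.File(outfilesource, session=ses)
out.copy(outfiletarget)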

print "Staged out %s to %s (size: %s bytes)" % (outfilesource, outfiletarget, out.get_size())
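
Since the source path depends on your home directory on the remote machine, it can be convenient to assemble the URL from its parts instead of editing a hard-coded string. The following is a minimal sketch using only the Python standard library; make_sftp_url is a hypothetical helper written for this tutorial and is not part of SAGA.

```python
import posixpath


def make_sftp_url(host, home_dir, filename):
    """Build an sftp:// URL for a file inside a remote home directory.

    Hypothetical helper, not part of the SAGA API.
    """
    # Remote paths use forward slashes regardless of the local platform.
    path = posixpath.join("/", home_dir.strip("/"), filename)
    return "sftp://%s%s" % (host, path)


# Values taken from the example above; adjust them to your own account.
outfilesource = make_sftp_url("localhost", "/Users/oweidner", "myjob.stdout")
print(outfilesource)
```

The resulting string can be passed to saga.filesystem.File in place of the literal URL.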

Run the Code

Save the file and execute it:

python saga_example_remote_staging.py

The output should look something like this:

Job ID    : None
Job State : saga.job.Job.New

...starting job...

Job ID    : [ssh://localhost]-[67712]
Job State : saga.job.Job.Running

...waiting for job...

Job State : saga.job.Job.Done
Exitcode  : 0

Staged out sftp://localhost/Users/oweidner/myjob.stdout to file://localhost/tmp/ (size: 16 bytes)

Check the Output

Your output file should now be in /tmp/myjob.stdout and contain the string "Hello from SAGA".
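
If you prefer to check the result from Python rather than by hand, a small script along the following lines could do it. This is only a sketch: check_staged_file is a hypothetical helper, it uses only the standard library, and the path and expected string are the ones from this example.

```python
import os


def check_staged_file(path, expected_text):
    """Return True if `path` is an existing file that contains `expected_text`."""
    if not os.path.isfile(path):
        return False
    with open(path) as handle:
        return expected_text in handle.read()


if __name__ == "__main__":
    # Path and content as used in this tutorial; adjust if you changed them.
    ok = check_staged_file("/tmp/myjob.stdout", "Hello from SAGA")
    print("staged output looks correct: %s" % ok)
```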

Details & Discussion

Another important feature of SAGA is its (remote) file and directory handling capabilities. These capabilities are packaged in the bliss.saga.filesystem module (API Doc). Two main classes are defined in this module:

  • The filesystem.File class (API Doc) provides a handle to a (remote) file.
  • The filesystem.Directory class (API Doc) provides a handle to a (remote) directory.

Together, these two classes can be used to traverse and modify local and remote filesystems. Currently (v. 0.2.4), SAGA supports the SFTP protocol, but other protocol plug-ins are under development.
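
Both classes are addressed through URLs such as the ones used above. To illustrate what information such a URL carries, the sketch below splits the two URLs from the staging example into scheme, host and path using Python's standard URL parser. This is for illustration only: describe_url is a hypothetical helper, and SAGA's own URL handling may differ.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def describe_url(url):
    """Split a URL into the parts that identify protocol, host and path."""
    parts = urlparse(url)
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


for url in ("sftp://localhost/Users/oweidner/myjob.stdout",
            "file://localhost/tmp/"):
    print(describe_url(url))
```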

Hands-On: Listing a Remote Directory

NOTE: For security reasons, SAGA does not support SSH authentication via plain username/password. To use the sftp plug-in with remote machines, you therefore need to set up public-key-based ssh-keychain access to the remote hosts you want to use. For this tutorial, we use localhost for the sake of simplicity. If you need help setting up your SSH key, please check our guide to configuring SSH Password-Less Login.

In your $HOME directory, open a new file saga_example_2.py with your favorite editor (e.g., vim) and paste the following content:

import os, sys, getpass
import bliss.saga as saga

def main():

    try:
        # create a new subdirectory in /tmp/
        tmp = saga.filesystem.Directory("sftp://localhost/tmp")
        mydir = tmp.open_dir(getpass.getuser(), saga.filesystem.Create)

        # copy this python file to the newly created directory 
        # os.path.abspath() already starts with "/", so no extra slash is needed
        thisfile = saga.filesystem.File("sftp://localhost"+os.path.abspath(__file__))
        thisfile.copy(str(mydir.get_url()))
        # copy another file
        motdfile = saga.filesystem.File("sftp://localhost/etc/motd")
        motdfile.copy(str(mydir.get_url()))

        # list the directory content
        for entry in mydir.list():
            # 'entry_file' avoids shadowing Python's built-in 'file'
            entry_file = saga.filesystem.File(str(mydir.get_url())+"/"+entry)
            print "%s (%s bytes)" % (entry_file.get_url(), entry_file.get_size())

        # finally, remove the directory 
        mydir.remove()

    except saga.Exception as ex:
        print "An error occurred during file operation: %s" % (str(ex))
        sys.exit(-1)

if __name__ == "__main__":
    main()

Save the file and execute it with the Python interpreter (make sure your virtualenv is activated):

python saga_example_2.py

The output should look something like this:

sftp://localhost/tmp/yourusername//saga_example_2.py (1029 bytes)
sftp://localhost/tmp/yourusername//motd (1758 bytes)
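
The doubled slash in these URLs suggests that the directory URL already ends in "/" before the script appends "/" and the entry name. If you want cleaner URLs, one option is to normalize the join yourself. The following is a minimal sketch; join_url is a hypothetical helper and not part of SAGA.

```python
def join_url(base_url, name):
    """Join a directory URL and an entry name with exactly one slash."""
    return base_url.rstrip("/") + "/" + name.lstrip("/")


print(join_url("sftp://localhost/tmp/yourusername/", "motd"))
# prints: sftp://localhost/tmp/yourusername/motd
```

In the listing loop, join_url(str(mydir.get_url()), entry) could then replace the manual string concatenation.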

Back: [Tutorial Home](SAGA Tutorial)    Next: SAGA Tutorial Part 5: ABC