File history

Tapis records a full history of changes, permission changes, and access events made through the Files API for every file and folder on registered Tapis systems. The recorded history events are a subset of the events thrown by the Files API. Generally speaking, the events saved in a file item's history represent mutations of the physical file item or its metadata.

Direct vs indirect events

Tapis records both direct and indirect events on a file item. Examples of direct events are transferring a directory from one system to another or renaming a file. An example of an indirect event is a user manually deleting a file from the command line. The table below lists all of the provenance actions recorded.

Event                    Description
CREATED                  File or directory was created
DELETED                  The file was deleted
RENAME                   The file was renamed
MOVED                    The file was moved to another path
OVERWRITTEN              The file was overwritten
PERMISSION_GRANT         A user permission was added
PERMISSION_REVOKE        A user permission was deleted
STAGING_QUEUED           File/folder queued for staging
STAGING                  File or directory is currently in flight
STAGING_FAILED           Staging failed
STAGING_COMPLETED        Staging completed successfully
PREPROCESSING            Preparing file for processing
TRANSFORMING_QUEUED      File/folder queued for transform
TRANSFORMING             Transforming file/folder
TRANSFORMING_FAILED      Transform failed
TRANSFORMING_COMPLETED   Transform completed successfully
UPLOADED                 New content was uploaded to the file
CONTENT_CHANGED          Content changed within this file/folder. If a folder, this event will be thrown whenever content changes in any file within this folder at most one level deep.
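
When history records are processed in a script, it can help to bucket the event names from the table by kind. The sketch below is only an illustration: the event names are copied from the table, but the grouping and the helper name `classify_event` are assumptions made for this example and are not defined by Tapis.

```python
# Event names copied from the table above. The grouping into "content",
# "permission", "staging" and "transform" is an assumption made for this
# example; Tapis itself does not define these groups.
EVENT_GROUPS = {
    "content": {"CREATED", "DELETED", "RENAME", "MOVED", "OVERWRITTEN",
                "UPLOADED", "CONTENT_CHANGED"},
    "permission": {"PERMISSION_GRANT", "PERMISSION_REVOKE"},
    "staging": {"STAGING_QUEUED", "STAGING", "STAGING_FAILED",
                "STAGING_COMPLETED"},
    "transform": {"PREPROCESSING", "TRANSFORMING_QUEUED", "TRANSFORMING",
                  "TRANSFORMING_FAILED", "TRANSFORMING_COMPLETED"},
}


def classify_event(status):
    """Return the hypothetical group for a status, or "other" if unlisted."""
    for group, names in EVENT_GROUPS.items():
        if status in names:
            return group
    return "other"
```

Statuses that do not appear in the table fall into "other" instead of raising an error, so a script using this helper keeps working if the service reports an event it does not know about.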

Out of band file system changes

Tapis does not own the storage and execution systems you access through the Science APIs, so it cannot guarantee that every possible change made to the file system is recorded. Thus, Tapis takes a best-effort approach to provenance, allowing you to choose, through your own use of best practices, how thorough you want the provenance trail of your data to be.

Listing file history

List the history of a file item

tapis files history -v agave://tacc.work.taccuser/nryan/picksumipsum.txt
The equivalent curl request:
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://api.tacc.utexas.edu/files/v2/history/nryan/picksumipsum.txt

The response is a JSON array listing the history events recorded for the file item:

[
  {
    "status": "DOWNLOAD",
    "created": "2016-09-20T19:47:56.000-05:00",
    "createdBy": "public",
    "description": "File was downloaded"
  },
  {
    "status": "STAGING_QUEUED",
    "created": "2016-09-20T19:48:12.000-05:00",
    "createdBy": "nryan",
    "description": "File/folder queued for staging"
  },
  {
    "status": "STAGING_COMPLETED",
    "created": "2016-09-20T19:48:16.000-05:00",
    "createdBy": "nryan",
    "description": "Staging completed successfully"
  },
  {
    "status": "TRANSFORMING_COMPLETED",
    "created": "2016-09-20T19:48:17.000-05:00",
    "createdBy": "nryan",
    "description": "Your scheduled transfer of http://129.114.97.92/picksumipsum.txt completed staging. You can access the raw file on iPlant Data Store at /home/nryan/picksumipsum.txt or via the API at https://api.tacc.utexas.edu/files/v2/media/system/data.agaveapi.co//nryan/picksumipsum.txt."
  }
]
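
A minimal sketch of how a script might consume this response, assuming it arrives as the bare JSON array shown above. The helper names `load_history` and `latest_status` and the added `created_at` key are hypothetical and not part of the API. The embedded sample is the first three events of the response above.

```python
import json
from datetime import datetime

# First three events of the sample response above, copied verbatim.
SAMPLE = """
[
  {
    "status": "DOWNLOAD",
    "created": "2016-09-20T19:47:56.000-05:00",
    "createdBy": "public",
    "description": "File was downloaded"
  },
  {
    "status": "STAGING_QUEUED",
    "created": "2016-09-20T19:48:12.000-05:00",
    "createdBy": "nryan",
    "description": "File/folder queued for staging"
  },
  {
    "status": "STAGING_COMPLETED",
    "created": "2016-09-20T19:48:16.000-05:00",
    "createdBy": "nryan",
    "description": "Staging completed successfully"
  }
]
"""


def load_history(text):
    """Parse a history response and return its events sorted oldest first."""
    events = json.loads(text)
    for event in events:
        # "created_at" is a key added by this sketch, not an API field.
        event["created_at"] = datetime.fromisoformat(event["created"])
    return sorted(events, key=lambda event: event["created_at"])


def latest_status(events):
    """Return the status of the most recent event, or None if there are none."""
    return events[-1]["status"] if events else None


history = load_history(SAMPLE)
for event in history:
    print(event["created_at"].isoformat(), event["createdBy"], event["status"])
```

Parsing the `created` field into a timezone-aware `datetime` lets the events be sorted and compared without relying on the order in which the service returns them.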

Basic paginated listing of file item history events is available, as shown in the example above. Currently, the file history service is read-only. The only way to erase the history of a file item is to delete the file item through the API.
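
One way to walk a paginated listing is sketched below. It assumes the history endpoint accepts `limit` and `offset` query parameters and returns a bare list of events per page; check both assumptions against your tenant. The names `history_url`, `iter_history`, and `fetch_page` are hypothetical. `fetch_page` stands for any caller-supplied function that performs the authenticated GET request and returns the decoded list of events.

```python
from urllib.parse import urlencode

# Base URL taken from the curl example earlier in this section.
BASE_URL = "https://api.tacc.utexas.edu/files/v2/history"


def history_url(path, limit=100, offset=0):
    """Build a history URL; `limit` and `offset` are assumed query parameters."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{BASE_URL}/{path.lstrip('/')}?{query}"


def iter_history(path, fetch_page, limit=100):
    """Yield history events page by page until a short page is returned.

    `fetch_page` is a caller-supplied callable (hypothetical) that takes a
    URL, performs the authenticated request, and returns a list of events.
    """
    offset = 0
    while True:
        page = fetch_page(history_url(path, limit, offset))
        yield from page
        if len(page) < limit:
            break  # a short or empty page means there is nothing left
        offset += limit
```

Keeping the HTTP call behind `fetch_page` leaves authentication and error handling to the caller and makes the paging logic easy to exercise with a stub.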