File history¶
A full history of changes, permissions changes, and access events made through the Files API is recorded for every file and folder on registered Tapis systems. The recorded history events represent a subset of the events thrown by the Files API. Generally speaking, the events saved in a file item’s history represent mutations on the physical file item or its metadata.
Direct vs indirect events¶
Tapis will record both direct and indirect events made on a file item. Examples of direct events are transferring a directory from one system to another or renaming a file. Examples of indirect events are a user manually deleting a file from the command line. The table below contains a list of all the provenance actions recorded.
Event | Description |
---|---|
CREATED | File or directory was created |
DELETED | The file was deleted |
RENAME | The file was renamed |
MOVED | The file was moved to another path |
OVERWRITTEN | The file was overwritten |
PERMISSION_GRANT | A user permission was added |
PERMISSION_REVOKE | A user permission was deleted |
STAGING_QUEUED | File/folder queued for staging |
STAGING | File or directory is currently in flight |
STAGING_FAILED | Staging failed |
STAGING_COMPLETED | Staging completed successfully |
PREPROCESSING | Prepairing file for processing |
TRANSFORMING_QUEUED | File/folder queued for transform |
TRANSFORMING | Transforming file/folder |
TRANSFORMING_FAILED | Transform failed |
TRANSFORMING_COMPLETED | Transform completed successfully |
UPLOADED | New content was uploaded to the file. |
CONTENT_CHANGED | Content changed within this file/folder. If a folder, this event will be thrown whenever content changes in any file within this folder at most one level deep. |
Out of band file system changes¶
Tapis does not own the storage and execution systems you access through the Science APIs, so it cannot guarantee that everything that every possible change made to the file system is recorded. Thus, Tapis takes a best-effort approach to provenance allowing you to choose, through your own use of best practices, how thorough you want the provenance trail of your data to be.
Listing file history¶
List the history of a file item
tapis files history -v agave:://tacc.work.taccuser/nryan/picksumipsum.txt
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
https://api.tacc.utexas.edu/files/v2/history/nryan/picksumipsum.txt
The response to this contains a summary listing all permissions on the
[
{
"status": "DOWNLOAD",
"created": "2016-09-20T19:47:56.000-05:00",
"createdBy": "public",
"description": "File was downloaded"
},
{
"status": "STAGING_QUEUED",
"created": "2016-09-20T19:48:12.000-05:00",
"createdBy": "nryan",
"description": "File/folder queued for staging"
},
{
"status": "STAGING_COMPLETED",
"created": "2016-09-20T19:48:16.000-05:00",
"createdBy": "nryan",
"description": "Staging completed successfully"
},
{
"status": "TRANSFORMING_COMPLETED",
"created": "2016-09-20T19:48:17.000-05:00",
"createdBy": "nryan",
"description": "Your scheduled transfer of http://129.114.97.92/picksumipsum.txt completed staging. You can access the raw file on iPlant Data Store at /home/nryan/picksumipsum.txt or via the API at https://api.tacc.utexas.edu/files/v2/media/system/data.agaveapi.co//nryan/picksumipsum.txt."
}
]
Basic paginated listing of file item history events is available as shown in the example. Currently, the file history service is readonly. The only way to erase the history on a file item is to delete the file item through the API.