LifeOmic File Service FAQ
What is LifeOmic File Service?
LifeOmic File Service is a managed service for storing and retrieving file data.
Once files are uploaded, they are available to the other PHC services for use and analysis.
Omics Explorer is one example: VCF files may be uploaded and, once indexed, queried across genetic variants (SNV, CNV, fusion) in real time in the PHC Web Console.
Task Service and Notebook Service are two more examples, where files may be brought in for analysis or compute with your own code.
Commonly uploaded file data includes:
- Genomic Variants, Gene expression, Proteomics, Pharmacogenetics - file formats such as VCF, BAM, CSV, TSV
- Documents, Images, and Audio - file formats such as JPEG, PNG, DICOM, PDF, etc.
How are files organized? What is a project (aka data-set)?
The PHC platform organizes data, including files, under user-defined projects (aka data-sets).
Inside projects, files may be organized further into directory structures so that they do not appear as one large flat list.
Does deleting a project (aka data-set) delete all associated files?
Yes, deleting a project deletes the project and all associated data.
After you select Delete this Project, the project stays active for 14 days before the files are actually deleted. You can cancel the deletion during this time period. You also have the option to delete the files immediately.
What access control is available?
Projects allow for access control to be put in place by the organization.
Application-level access controls are enforced when viewing, downloading, and uploading files.
Example: A user can be configured to query and search genetic data for a subject, but be restricted from downloading the subject's file(s) on a per-project basis.
Access control refers to the ability to control who can interact with a resource within the platform. For cases that require complex access control, the LifeOmic platform uses Attribute Based Access Control (ABAC) to assign attributes and dictate what information users have access to. For more information, see the Account Management Overview.
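The example above (query allowed, download restricted, per project) can be sketched as a generic attribute check. This is only an illustration of the ABAC idea, not the LifeOmic policy format: the action names, attribute names, and the POLICY structure are all hypothetical.

```python
# Generic ABAC illustration. The policy layout, action names, and attribute
# names below are hypothetical and are NOT the LifeOmic policy format.
POLICY = {
    "queryData": [
        {"user": {"group": "analysts"}, "resource": {"project": "proj-a"}},
    ],
    "downloadFile": [
        {"user": {"group": "admins"}, "resource": {"project": "proj-a"}},
    ],
}


def is_allowed(policy, action, user_attrs, resource_attrs):
    """Return True if any rule for `action` matches both attribute sets."""
    for rule in policy.get(action, []):
        user_ok = all(user_attrs.get(k) == v for k, v in rule.get("user", {}).items())
        resource_ok = all(
            resource_attrs.get(k) == v for k, v in rule.get("resource", {}).items()
        )
        if user_ok and resource_ok:
            return True
    # No matching rule: deny by default.
    return False


analyst = {"group": "analysts"}
file_in_proj_a = {"project": "proj-a"}

print(is_allowed(POLICY, "queryData", analyst, file_in_proj_a))     # True
print(is_allowed(POLICY, "downloadFile", analyst, file_in_proj_a))  # False
```

The decision depends only on attributes of the user and the resource, so the same user can be granted query access but denied downloads within one project.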
How durable is LifeOmic File Service?
LifeOmic File Service is powered by AWS S3. For more information, see the AWS S3 FAQ.
Are my files backed up?
Cross-region replication and backups are included as part of the service.
How are my files protected?
Uploaded files are encrypted at rest.
Can arbitrary files be stored without a subject/patient?
The only required information to store a file is a project identifier.
Files are not required to be tied to a specific subject within a project.
If you want to link a file to a subject, create a FHIR DocumentReference for it using the LifeOmic FHIR Service.
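As a rough sketch of what such a link might look like, the snippet below builds a minimal DocumentReference as a Python dictionary using standard FHIR field names. The subject ID, file URL, and title are placeholders, and the way a stored file should be addressed in attachment.url is an assumption; the fields your FHIR version requires (for example type or indexed) may differ.

```python
import json


def build_document_reference(subject_id, file_url, content_type, title):
    """Build a minimal FHIR DocumentReference dict linking a file to a subject."""
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "subject": {"reference": f"Patient/{subject_id}"},
        "content": [
            {
                "attachment": {
                    # Assumption: replace with however your deployment
                    # addresses the stored file.
                    "url": file_url,
                    "contentType": content_type,
                    "title": title,
                }
            }
        ],
    }


# All values below are placeholders for illustration only.
doc_ref = build_document_reference(
    subject_id="example-subject-id",
    file_url="https://example.org/files/example-file-id",
    content_type="application/pdf",
    title="consent-form.pdf",
)
print(json.dumps(doc_ref, indent=2))
```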
How can I get started uploading data?
The LifeOmic CLI is the best option to get started uploading file data to the PHC. Once it is installed, run lo files --help to see the commands available for working with files.
What limits are in place for LifeOmic File Service?
The total number of files you can store is unlimited. The maximum file size
The LifeOmic CLI can manage uploading this amount of data to the PHC.
What file name restrictions are in place?
File names must match the regular expression ^([a-zA-Z0-9!\\-+.*'()&$@=;+ ,?:_/]*)$ and be less than 970 characters in length.
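A name can be checked locally before upload with a short script such as the sketch below. It assumes the doubled backslash in the published pattern is a string-escaping artifact and that a literal hyphen is intended, so the pattern is written with a single escaped hyphen. The function name is hypothetical, and the service remains the authority on what it accepts.

```python
import re

# Assumption: "\\-" in the documented pattern stands for an escaped hyphen.
FILE_NAME_PATTERN = re.compile(r"^([a-zA-Z0-9!\-+.*'()&$@=;+ ,?:_/]*)$")
MAX_NAME_LENGTH = 970  # names must be shorter than this


def is_valid_file_name(name):
    """Return True if `name` uses only allowed characters and is short enough."""
    return len(name) < MAX_NAME_LENGTH and FILE_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_file_name("run-1/sample_01.vcf.gz"))  # True
print(is_valid_file_name("report#final.pdf"))        # False: '#' is not allowed
print(is_valid_file_name("a" * 970))                 # False: too long
```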
What files can be viewed within the PHC Web Console?
Common file types found on the web, such as text, images, and markdown, are viewable within the PHC Web Console.
Certain file types open in web-based viewers. Some examples are CSV/TSV files, DICOM images, ipynb notebook files, and PDF files.
Files larger than 5 MB will be opened in a new tab.
How can I document my files?
When a directory of files contains a README.md, its markdown is rendered below the file listing as an inline description.
How can I reference my file?
A unique identifier is created for every file uploaded to the PHC. To reference a file, use its LifeOmic Resource Name (LRN), which remains a stable pointer to the file. Future file renames will not break the LRN reference.
How can I share my project with collaborators?
Users may create shareable links (URLs) from the PHC Web Console for those who have access to the project.
Access to projects is granted through access control and ABAC.
What different methods are available for transferring data?
- LifeOmic Files API - Use the HTTPS API to upload data with TLS.
- LifeOmic CLI - Use a command-line interface to upload data through the API.
- PHC SDK for Python - Use Python to upload data through the API.
- SFTP upload to a Project - Create a location in a project that allows transfer of files into a project over SFTP.
How can I trigger automation to run against the recent set of file transfers?
Users can define file glob patterns that trigger File Actions. File Actions let you automate the execution of behavior defined with Common Workflow Language (CWL).
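To preview which files a glob pattern would select, a pattern can be tried locally first. The sketch below uses Python's fnmatch module purely as an illustration; the exact glob dialect the service applies is not specified here and may differ (for example in how * treats directory separators). The file names and the helper name are hypothetical.

```python
from fnmatch import fnmatchcase


def files_matching(pattern, file_names):
    """Return the names that match `pattern` using case-sensitive fnmatch rules."""
    return [name for name in file_names if fnmatchcase(name, pattern)]


# Hypothetical set of recently transferred files.
uploaded = ["run1/sample.vcf.gz", "run1/README.md", "run2/sample.bam"]

print(files_matching("*.vcf.gz", uploaded))  # ['run1/sample.vcf.gz']
print(files_matching("run2/*", uploaded))    # ['run2/sample.bam']
```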
What is the best method to send up large files (larger than
For most use cases, the LifeOmic CLI will transfer files successfully and provide an easy-to-use terminal experience.
When individual files are 500 GB or larger, the limiting factors when uploading with the LifeOmic CLI are your internet uplink speed, the time required, and the risk of the connection being interrupted.
Configuring SFTP to a Project is recommended when transferring files of this size. Additionally, SFTP can resume a file transfer for a file that's been partially transferred.
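To see why uplink speed and time matter at this size, a back-of-the-envelope estimate helps. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a fully sustained uplink with no protocol overhead, so real transfers will take longer. The function name is hypothetical.

```python
def estimate_transfer_hours(size_gb, uplink_mbps):
    """Estimate the ideal transfer time in hours for a file of `size_gb` gigabytes."""
    bits = size_gb * 1e9 * 8                  # file size in bits
    seconds = bits / (uplink_mbps * 1e6)      # ideal time at full uplink speed
    return seconds / 3600


print(round(estimate_transfer_hours(500, 100), 1))   # 11.1 hours at 100 Mbps
print(round(estimate_transfer_hours(500, 1000), 1))  # 1.1 hours at 1 Gbps
```

Over a window of many hours an interruption becomes more likely, which is where the ability to resume a partial transfer is useful.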