Uploading Data with flowbio
flowbio lets you upload data to a Flow instance, including specialised uploads such as demultiplexed sample files or multiplexed data.
Uploading Standard Data
A file can be uploaded using the client's upload_data method:
Upload standard data
data = client.upload_data("/path/to/file.fa")
For large data, you may wish to display a progress bar:
Upload standard data with a progress bar
data = client.upload_data("/path/to/file.fa", progress=True)
If you are experiencing network issues, you can instruct flowbio to retry any failed chunk upload, in this case up to five times:
Upload standard data with retries
data = client.upload_data("/path/to/file.fa", retries=5)
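To make the retry behaviour concrete, here is a minimal sketch of a bounded retry loop. It is an illustration only, not flowbio's actual implementation: the names send_with_retries and send_chunk are hypothetical, and it assumes that retries counts re-attempts made after one initial attempt.

```python
def send_with_retries(send_chunk, chunk, retries=0):
    """Call send_chunk(chunk), re-attempting up to `retries` times on failure.

    Hypothetical helper for illustration; flowbio handles this internally
    when you pass retries=... to upload_data.
    """
    for attempt in range(retries + 1):  # one initial attempt plus the retries
        try:
            return send_chunk(chunk)
        except OSError:  # network-style errors (ConnectionError is a subclass)
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
```

With retries=5 a chunk is therefore tried at most six times in this sketch before the error is raised to the caller.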
The full arguments list:
- path (string): The local path to the file to be uploaded.
- chunk_size (int): Files are uploaded in chunks - this sets the size of those chunks in bytes (default 1,000,000). Lowering this improves the reliability of the upload; increasing it reduces the overall time taken.
- progress (bool): Whether or not to display a progress bar (default False).
- retries (int): How many times to re-attempt to upload a chunk before giving up (default 0).
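When choosing a chunk_size, it can help to see how many chunks a given file would produce. The sketch below is plain arithmetic and makes no flowbio calls; the helper name estimate_chunk_count is hypothetical, and it assumes the file is split into consecutive fixed-size chunks with a shorter final chunk.

```python
import math

DEFAULT_CHUNK_SIZE = 1_000_000  # bytes, the documented default


def estimate_chunk_count(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return how many chunks a file of `file_size` bytes would need.

    Assumes fixed-size chunks with a shorter last chunk (an assumption
    about the splitting scheme, not something flowbio exposes).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive number of bytes")
    return math.ceil(file_size / chunk_size)


print(estimate_chunk_count(2_500_000))           # 3
print(estimate_chunk_count(2_500_000, 250_000))  # 10
```

Under that assumption, a smaller chunk_size means more requests, each of which is cheaper to repeat if it fails; a larger one means fewer requests overall.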
Sample Upload
To upload the initial data for a sample, use the upload_sample method. Here you provide the name of the sample, one or two file paths (one for a single-end sample, two for a paired-end sample), and then a dictionary of sample metadata:
- name (string): The name of the sample being created.
- path1 (string): The local path to the initiating data to be uploaded.
- path2 (string): If paired-end, the local path to the second initiating file (default None).
- chunk_size (int): Files are uploaded in chunks - this sets the size of those chunks in bytes (default 1,000,000). Lowering this improves the reliability of the upload; increasing it reduces the overall time taken.
- progress (bool): Whether or not to display a progress bar (default False).
- metadata (dict): Additional attributes for the sample.
Upload a sample
sample = client.upload_sample(
    "My Sample Name",
    "/path/to/reads1.fastq.gz",
    "/path/to/reads2.fastq.gz",  # optional
    progress=True,
    metadata={
        "sample_type": "RNA-Seq",
        "strandedness": "unstranded",
    }
)
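If you have several samples to upload, the same call can be wrapped in a loop. The following is a sketch under stated assumptions rather than part of flowbio: upload_samples is a hypothetical helper, the layout of each entry (a dict with "name", "path1", "metadata" and an optional "path2") is invented for this example, and client is assumed to be an already-authenticated flowbio client.

```python
from pathlib import Path


def upload_samples(client, samples, progress=False):
    """Upload each sample description in turn and return the results.

    `samples` is a list of dicts with "name", "path1" and "metadata" keys,
    plus an optional "path2" for paired-end data (a hypothetical layout).
    """
    uploaded = []
    for entry in samples:
        paths = [entry["path1"]]
        if entry.get("path2"):
            paths.append(entry["path2"])
        # Fail early, before any network traffic, if a file is missing.
        for path in paths:
            if not Path(path).is_file():
                raise FileNotFoundError(f"No such file: {path}")
        uploaded.append(
            client.upload_sample(
                entry["name"],
                *paths,  # path1, then path2 when the sample is paired-end
                progress=progress,
                metadata=entry["metadata"],
            )
        )
    return uploaded


# Usage, given an authenticated client:
# samples = upload_samples(client, [
#     {"name": "Sample A", "path1": "/path/to/a_1.fastq.gz",
#      "path2": "/path/to/a_2.fastq.gz",
#      "metadata": {"sample_type": "RNA-Seq", "strandedness": "unstranded"}},
# ], progress=True)
```

Checking the paths first means a typo in the list is reported before the first upload starts, rather than part-way through the batch.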
Metadata
The metadata is given as a Python dictionary. The permitted attributes are:
- sample_type (string): The sample's type. This can determine whether other fields are required.
- organism (string): The ID (Hs, Mm etc.) of the organism the sample belongs to.
- project (ID): The ID of the project to add the sample to.
- scientist (string): The name of the person who performed the original experiment.
- pi (string): The name of the PI for the original experiment.
- organisation (string): The name of the organisation the sample was generated at.
- purification_agent (string): The purification agent used.
- experimental_method (string): This adds more specific detail to the sample type.
- condition (string): The experimental condition of the sample.
- sequencer (string): The sequencing equipment used to generate the data.
- comments (string): Any additional comments.
- five_prime_barcode_sequence (string): The 5' barcode sequence of the sample.
- three_prime_barcode_sequence (string): The 3' barcode sequence of the sample.
- three_prime_adapter_name (string): The 3' barcode adapter name of the sample.
- three_prime_adapter_sequence (string): The 3' barcode adapter sequence of the sample.
- read1_primer (string): The read 1 primer sequence.
- read2_primer (string): The read 2 primer sequence.
- rt_primer (string): The reverse transcription primer.
- umi_barcode_sequence (string): The UMI barcode sequence.
- umi_separator (string): The UMI separator string in the reads file.
- geo (string): The GEO accession of the sample.
- ena (string): The ENA accession of the sample.
- purification_target (string): The name of the sample's purification target.
- source (string): The name of the sample's cell type.
- strandedness (string): Only needed for RNA-Seq samples. Must be "unstranded", "forward", "reverse" or "auto".
- rna_selection_method (string): The RNA selection method of the sample. Only needed for RNA-Seq samples.
- ribosome_type (string): The ribosome type of the sample. Can be added for Ribo-Seq types.
- size_selection (string): The size selection method of the sample. Can be added for Ribo-Seq types.
- separation_method (string): The separation method of the sample. Can be added for Ribo-Seq types.
- ribosome_stabilisation_method (string): The ribosome stabilisation method of the sample. Can be added for Ribo-Seq types.
- pubmed (string): The PubMed ID associated with the sample.
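Because the metadata is an ordinary dictionary, a misspelt key is easy to miss. The sketch below shows one way to check a dictionary locally before uploading. It is not part of flowbio: check_metadata is a hypothetical helper, the key set is copied from the list above, and the Flow instance remains the authority on what is actually accepted or required.

```python
# Attribute names copied from the list above.
PERMITTED_KEYS = {
    "sample_type", "organism", "project", "scientist", "pi", "organisation",
    "purification_agent", "experimental_method", "condition", "sequencer",
    "comments", "five_prime_barcode_sequence", "three_prime_barcode_sequence",
    "three_prime_adapter_name", "three_prime_adapter_sequence",
    "read1_primer", "read2_primer", "rt_primer", "umi_barcode_sequence",
    "umi_separator", "geo", "ena", "purification_target", "source",
    "strandedness", "rna_selection_method", "ribosome_type", "size_selection",
    "separation_method", "ribosome_stabilisation_method", "pubmed",
}
STRANDEDNESS_VALUES = {"unstranded", "forward", "reverse", "auto"}


def check_metadata(metadata):
    """Return a list of problems found in a metadata dict (empty if none)."""
    problems = []
    # Report keys that are not in the documented attribute list.
    for key in sorted(set(metadata) - PERMITTED_KEYS):
        problems.append(f"unknown attribute: {key}")
    # Check the one attribute whose allowed values are documented.
    strandedness = metadata.get("strandedness")
    if strandedness is not None and strandedness not in STRANDEDNESS_VALUES:
        problems.append(f"invalid strandedness: {strandedness}")
    return problems


print(check_metadata({"sample_type": "RNA-Seq", "strandedness": "unstranded"}))
# []
print(check_metadata({"strand": "forward"}))
# ['unknown attribute: strand']
```

This only catches unknown keys and bad strandedness values; rules that depend on the sample type are left to the server.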