Advanced User Guide - SangerRead (FASTA)

SangerRead is the lowest level in sangeranalyseR, as shown in Figure 1, and corresponds to a single read in Sanger sequencing. This section walks through the sangeranalyseR data analysis steps at the SangerRead level, starting from FASTA file input.

../_images/SangerRead_hierarchy.png

Figure 1. Hierarchy of classes in sangeranalyseR, SangerRead level.


Preparing SangerRead FASTA input

The FASTA input is designed for users who do not want to perform quality trimming and base calling on their SangerRead; a SangerRead created from FASTA therefore does not contain quality trimming or chromatogram input parameters and results in its slots. Before starting the analysis, users need to prepare one target FASTA file. The only strict requirement on the filename is that the file extension must be .fasta or .fa.
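The extension rule can be checked before creating the instance. The following is a minimal base R sketch: hasFastaExtension is a hypothetical helper, not part of sangeranalyseR, and it assumes the match is case-sensitive, which the rule above does not specify.

```r
# Hypothetical helper (not part of sangeranalyseR): TRUE when the file name
# ends in ".fasta" or ".fa". Case-sensitive matching is an assumption.
hasFastaExtension <- function(fileName) {
    grepl("\\.(fasta|fa)$", fileName)
}

# The example file name used in this guide passes the check.
okFa  <- hasFastaExtension("ACHLO006-09[LCO1490_t1,HCO2198_t1]_1_F.fa")
# A file name with a different extension does not.
okAb1 <- hasFastaExtension("ACHLO006-09[LCO1490_t1,HCO2198_t1]_1_F.ab1")
```

Here okFa is TRUE and okAb1 is FALSE.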


Creating SangerRead instance from FASTA

After preparing the SangerRead input FASTA file, the next step is to create the SangerRead S4 instance by running the SangerRead constructor function or the new method. The constructor function is a wrapper for the new method that makes instance creation more intuitive. Most input parameters have default values. The call below lists the important parameters. The filename is assigned to readFileName. Inside the FASTA file, the string after ">" on the header line is the name of the read. Users need to assign this read name to fastaReadName, which is used to match the target read in the FASTA input file. Figure 2 shows a valid FASTA file, for which the value of fastaReadName is Achl_ACHLO006-09_1_Forward.

sangerReadFfa <- new("SangerRead",
                     inputSource          = "FASTA",
                     readFeature          = "Forward Read",
                     readFileName         = "ACHLO006-09[LCO1490_t1,HCO2198_t1]_1_F.fa",
                     fastaReadName        = "Achl_ACHLO006-09_1_Forward",
                     geneticCode          = GENETIC_CODE)

../_images/SangerRead_fasta_input_file.png

Figure 2. SangerRead FASTA input file.
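To find the value to pass as fastaReadName, it can help to list the header lines of the FASTA file first. The sketch below uses base R only; listFastaReadNames is a hypothetical helper, not part of sangeranalyseR, and the temporary file with its short sequence is made up for the demonstration.

```r
# Hypothetical helper (not part of sangeranalyseR): return the read names
# found in a FASTA file, i.e. the text after ">" on each header line.
listFastaReadNames <- function(fastaFile) {
    fastaLines  <- readLines(fastaFile, warn = FALSE)
    headerLines <- grep("^>", fastaLines, value = TRUE)
    trimws(sub("^>", "", headerLines))
}

# Demonstration on a small temporary file with a made-up sequence.
demoFile <- tempfile(fileext = ".fa")
writeLines(c(">Achl_ACHLO006-09_1_Forward", "ACGTACGTAC", "GTACGT"), demoFile)

readNames <- listFastaReadNames(demoFile)
# Check that the intended fastaReadName is present in the file.
hasTargetRead <- "Achl_ACHLO006-09_1_Forward" %in% readNames
```

If hasTargetRead is FALSE for a real input file, the fastaReadName value likely does not match any header in that file.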

The SangerRead constructor function and the new method take the same inputs. For more details about SangerRead inputs and slot definitions, please refer to the sangeranalyseR reference manual.


Writing SangerRead FASTA files (FASTA)

Users can write the SangerRead instance to FASTA files. Because the FASTA input does not support quality trimming or base calling, the sequence in the written FASTA file in this example will be the same as in the input FASTA file. Users can also set the compression level through the one-line function writeFasta, which relies mainly on the writeXStringSet function in the Biostrings R package.

writeFasta(sangerReadFfa,
           outputDir         = tempdir(),
           compress          = FALSE,
           compression_level = NA)

Users can download the output FASTA file of this example.
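Since the written sequence is expected to match the input sequence, the two files can be compared directly. The following is a minimal base R sketch under stated assumptions: readFastaSequence is a hypothetical helper, not part of sangeranalyseR, both files are assumed to be uncompressed (as with compress = FALSE above) and to hold a single read, and the two temporary files stand in for the real input and output paths.

```r
# Hypothetical helper (not part of sangeranalyseR): read a single-record,
# uncompressed FASTA file and return its sequence as one upper-case string.
readFastaSequence <- function(fastaFile) {
    fastaLines    <- readLines(fastaFile, warn = FALSE)
    sequenceLines <- fastaLines[!grepl("^>", fastaLines)]
    toupper(paste(trimws(sequenceLines), collapse = ""))
}

# Demonstration with two temporary files that differ only in line wrapping.
inputFile  <- tempfile(fileext = ".fa")
outputFile <- tempfile(fileext = ".fa")
writeLines(c(">demo_read", "ACGTAC", "GTACGT"), inputFile)
writeLines(c(">demo_read", "ACGTACGTACGT"), outputFile)

# TRUE when both files carry the same sequence, regardless of line wrapping.
sameSequence <- identical(readFastaSequence(inputFile),
                          readFastaSequence(outputFile))
```

Comparing the concatenated sequences rather than the raw lines avoids false mismatches caused by different line widths in the two files.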


Generating SangerRead report (FASTA)

Finally, users can save the SangerRead instance as a report after the analysis. The report is generated as HTML by knitting Rmd files, and the results in the report are static.

generateReport(sangerReadFfa,
               outputDir = tempdir())

SangerRead_Report_fasta.html is the SangerRead HTML report generated in this example. The report contains the 'Basic Information', 'DNA Sequence' and 'Amino Acids Sequence' sections.