DAFS: Data-Adaptive Flag Method for RNA-Sequencing Data
*Nysia George, NCTR 

Keywords: RNA sequencing, low expression, data-adaptive

Next-generation sequencing (NGS) has advanced the application of high-throughput sequencing technologies to the analysis of genetic and genomic variation. Whole-transcriptome sequencing (RNA-seq) uses NGS to measure the RNA levels of transcripts in a sample and is expected to replace microarray technology. Several statistical methods have been developed to accommodate the unique features of RNA-seq data. However, statistical analyses of low expression are difficult to interpret meaningfully, and there is no consensus on the definition of a low-expression region. A number of factors affect the distribution of read counts in a given study, and these factors vary from study to study. Thus, an arbitrary cutoff for separating high- and low-expression regions may be misleading. In this study, a data-adaptive approach is developed to estimate the lower bound of high expression in a given RNA-seq sample. Several RNA-seq datasets are used to demonstrate the robustness of our methodology.