RNA-seq RNAaccess Identified as the Preferred Method for Gene Expression Analysis of Low Quality FFPE Samples

Kai Song, Emon Elboudwarej, Xi Zhao, Luting Zhuo, David Pan, Jinfeng Liu, Carrie Brachmann, Scott D. Patterson, Oh Kyu Yoon, Marianna Zavodovskaya 

Abstract

Preserving clinical tumor tissues as formalin-fixed paraffin-embedded (FFPE) samples results in extensive cross-linking, fragmentation, and chemical modification of RNA, posing significant challenges for RNA-seq-based gene expression profiling. This study sought to define an optimal RNA-seq protocol for FFPE samples. We employed a common RNA extraction method and then compared three RNA-seq library preparation protocols (RNAaccess, RiboZero and PolyA) in terms of sequencing quality and concordance of gene expression using FFPE and case-matched fresh-frozen (FF) triple-negative breast cancer (TNBC) tissues. We found that RNAaccess, a method based on exome capture, produced the most concordant results. Applying RNAaccess to FFPE gastric cancer tissues, we established a minimum RNA DV200 requirement of 10% and an RNA input amount of 10 ng that generated highly reproducible gene expression data. Lastly, we demonstrated that the RNAaccess and NanoString platforms produced highly concordant expression profiles from FFPE samples for shared genes; however, RNA-seq may be preferred for clinical biomarker discovery work because of its broader coverage of the transcriptome. Taken together, these results support the selection of the RNA-seq RNAaccess method for gene expression profiling of FFPE samples. The minimum requirements for RNA quality and input established here may allow clinical FFPE samples of sub-optimal quality to be included in gene expression analyses, ultimately increasing the statistical power of such analyses.

Introduction

Next-generation sequencing (NGS) technologies have advanced rapidly and have become more widely used for clinical biomarker testing in recent years [1, 2]. RNA sequencing (RNA-seq) provides an in-depth and unbiased method for identifying transcripts and gene fusion events that contribute to disease pathogenesis [3–5]. Moreover, gene expression profiling can be performed on tissue samples that are formalin-fixed paraffin-embedded (FFPE), a commonly used method for preservation of clinical tissue samples. However, because of the high degree of RNA degradation, RNA base modification, and the low amount of nucleic acid material that can be extracted from FFPE samples, accurate transcriptional profiling remains challenging [6–9]. Beyond RNA-seq platforms, nCounter technology (NanoString) has been proposed as a suitable method for gene expression profiling of FFPE samples because it directly measures the abundance of target molecules without amplification bias or effects from genomic variation [10]. Although the concordance between the Illumina NGS and NanoString platforms has been established [11], NanoString generates only a partial view of transcriptomic profiles because of the limited number of genes it quantifies [12]. Therefore, establishing a reliable RNA-seq protocol for limited FFPE tissue samples would facilitate its application to biomarker discovery work in clinical trials.

Methods

Tissue samples

All human tissue samples were purchased from commercial vendors. All sample collections were conducted by vendors under IRB-approved protocols, and all donors signed informed consent forms. Samples were collected from 2009 to 2015. The study was conducted from 2015 to 2018. Authors had no access to information that could identify individual participants during or after the study.

Triple-negative breast cancer/breast tissue set (TNBC): Seven cases with paired fresh-frozen (FF) and formalin-fixed paraffin-embedded (FFPE) triple-negative breast cancer samples were purchased from a commercial vendor. One FFPE tissue was excluded from analysis due to sequencing failure. Additionally, two FF cases of commercially procured normal breast tissue were included in the sample set (Panels A-C in S1 Fig, S1 Table).

All tissues were quality controlled in-house by a pathologist to verify tissue of origin, tumor content, and degree of necrosis. A minimum of 50% tumor cell content was used as a cutoff for study inclusion. Samples with less than 50% tumor cells were macro-dissected to increase the tumor content. Gene expression profiling was carried out using RNA sequencing.
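The tumor-content rule described above can be sketched as a small triage step. This is only an illustrative sketch: the `TissueSample` class, its field names, and the `triage` function are hypothetical and are not taken from the study's actual workflow. The only value drawn from the text is the 50% cutoff.

```python
from dataclasses import dataclass

# Cutoff stated in the Methods: at least 50% tumor cell content.
MIN_TUMOR_FRACTION = 0.50


@dataclass
class TissueSample:
    sample_id: str         # hypothetical identifier
    tumor_fraction: float  # pathologist estimate, between 0.0 and 1.0


def triage(sample: TissueSample) -> str:
    """Return 'include' if the sample meets the cutoff, else 'macro-dissect'."""
    if not 0.0 <= sample.tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0.0 and 1.0")
    if sample.tumor_fraction >= MIN_TUMOR_FRACTION:
        return "include"
    # Below the cutoff: the text says such samples were macro-dissected
    # to raise the tumor content before profiling.
    return "macro-dissect"
```

For example, `triage(TissueSample("case-1", 0.35))` flags the sample for macro-dissection rather than excluding it outright.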

Results

RNAaccess library preparation protocol produces more consistent RNA-seq quality control metrics between case-matched FF and FFPE tissues compared to RiboZero

To identify the optimal RNA-seq library preparation protocol for FFPE samples, FFPE and case-matched FF tissues from seven TNBC patients (see Methods for sample details), together with FF normal breast tissues from two separate cases, were used to generate RNA-seq data. The FF samples were processed by all three library preparation protocols (PolyA, RiboZero and RNAaccess), whereas the FFPE samples were processed by RiboZero and RNAaccess only (Panels A-C in S1 Fig; S1 Table). We first confirmed that all library preparation protocols generated at least 50 million reads for each sample (Panel A in S2 Fig). The percentages of reads that were uniquely mapped, multi-mapped, or unmapped to the reference genome were similar across all library preparation methods for both FFPE and FF samples (Panel B in S2 Fig).
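The two checks described above (a read-depth floor and the split of reads into uniquely mapped, multi-mapped, and unmapped) can be sketched as follows. This is a minimal sketch, not the study's pipeline: the function names and the example counts are hypothetical, and only the 50 million read threshold comes from the text.

```python
from typing import Dict

# Depth reported in the text: at least 50 million reads per sample.
MIN_READS = 50_000_000


def passes_depth(total_reads: int, min_reads: int = MIN_READS) -> bool:
    """True if the library reached the minimum sequencing depth."""
    return total_reads >= min_reads


def mapping_fractions(unique: int, multi: int, unmapped: int) -> Dict[str, float]:
    """Convert raw read counts per mapping class into fractions of the total."""
    total = unique + multi + unmapped
    if total <= 0:
        raise ValueError("read counts must sum to a positive number")
    return {
        "unique": unique / total,
        "multi": multi / total,
        "unmapped": unmapped / total,
    }


# Hypothetical counts for one library (not data from the study).
fractions = mapping_fractions(unique=40_000_000, multi=8_000_000, unmapped=2_000_000)
deep_enough = passes_depth(40_000_000 + 8_000_000 + 2_000_000)
```

Comparing the resulting fractions across protocols and tissue types is one simple way to express the "similar across all library preparation methods" observation numerically.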

Discussion

FFPE tumor tissue samples are essential for clinical diagnostics as well as biomarker testing and discovery work in oncology. In our study, we performed a systematic evaluation of RNA expression profiling methods in order to identify an optimal protocol for expression analysis of FFPE tissues. Among the library preparation protocols tested by comparing results from FF and FFPE tissues from the TNBC set, we found that the RNAaccess library preparation method had more consistent mapping metrics, such as genomic distribution and gene body coverage, than RiboZero (Fig 1). RNAaccess, as an exon-capture method, effectively covers exons with minimal intronic and intergenic contamination [19], thus offering a larger quantity of data focused on protein-coding regions. Because protein-coding genes are more frequently studied, and therefore usually better annotated, than non-coding genes, RNAaccess is recommended over RiboZero in this context due to its ability to generate up to 2–3 times more exonic reads at the same sequencing depth. We also observed that, compared with RiboZero, RNAaccess produced slightly higher concordance in protein-coding gene expression profiles and an equal or better concordance score when comparing biological signatures between FF and FFPE samples (Fig 2). Overall, there is a strong rationale supporting the selection of RNAaccess as the preferred method for RNA expression profiling on FFPE samples for protein-coding genes.
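One common way to quantify concordance between matched FF and FFPE expression profiles is a rank correlation over shared genes. The sketch below uses Spearman correlation purely as an assumption; this section does not state which concordance metric the authors used, and the function names and inputs are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(ff_expr: Sequence[float], ffpe_expr: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(ff_expr) != len(ffpe_expr) or len(ff_expr) < 2:
        raise ValueError("need two equal-length profiles with at least 2 genes")
    return pearson(average_ranks(ff_expr), average_ranks(ffpe_expr))
```

Each input would hold expression values for the same genes in the same order, one profile from the FF tissue and one from the matched FFPE tissue; values closer to 1 indicate that the two preparations rank genes similarly.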

Conclusions

We identified RNAaccess as the preferred RNA-seq library preparation method for transcriptomic profiling of FFPE samples due to its consistent performance and the concordant gene expression profiles between FFPE and matched FF tissues. Furthermore, our study defined RNA quality of DV200 above 10% and input of at least 10 ng of RNA as acceptable for sequencing of FFPE samples using RNA-seq RNAaccess, which is likely to increase the number of clinical FFPE samples that can be tested. In addition, the current study demonstrated that FFPE expression profiles of shared genes using RNAaccess were comparable to those of NanoString. However, RNAaccess may be a preferred platform for exploratory and discovery oncology research due to its vastly higher genome-wide coverage.
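The sample requirements stated above can be expressed as a simple eligibility check. This is a hedged sketch with hypothetical names. DV200 is the percentage of RNA fragments longer than 200 nucleotides. The text gives the quality bound both as a "minimum requirement of 10%" and as "above 10%", so treating exactly 10% as acceptable is an assumption made here.

```python
# Thresholds taken from the text; the inclusive DV200 boundary is an assumption.
MIN_DV200_PERCENT = 10.0
MIN_INPUT_NG = 10.0


def eligible_for_rnaaccess(dv200_percent: float, input_ng: float) -> bool:
    """True if an FFPE RNA sample meets both the quality and input minimums."""
    if not 0.0 <= dv200_percent <= 100.0:
        raise ValueError("dv200_percent must be between 0 and 100")
    if input_ng < 0:
        raise ValueError("input_ng must be non-negative")
    return dv200_percent >= MIN_DV200_PERCENT and input_ng >= MIN_INPUT_NG
```

Applied across a sample manifest, a check like this makes it easy to count how many FFPE samples of sub-optimal quality would still qualify for sequencing.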

Acknowledgments

We especially thank Xin Guo at Gilead Sciences, Inc. for providing guidance on data analysis.

We thank Audrey Goddard at Gilead Sciences, Inc. for providing suggestions to the manuscript. We thank Dustin Chernick and Sam Kim at Gilead Sciences, Inc. for writing support. We thank Q2 Solutions | EA Genomics for performing nucleic acid extraction, sequencing and NanoString profiling for all samples in this study.

Citation: Song K, Elboudwarej E, Zhao X, Zhuo L, Pan D, Liu J, et al. (2023) RNA-seq RNAaccess identified as the preferred method for gene expression analysis of low quality FFPE samples. PLoS ONE 18(10): e0293400. https://doi.org/10.1371/journal.pone.0293400

Editor: Anna Sapino, Universita degli Studi di Torino, ITALY

Received: June 2, 2023; Accepted: October 11, 2023; Published: October 26, 2023

Copyright: © 2023 Song et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Gilead Sciences shares anonymized individual patient RNA expression data upon request, or as required by law or regulation, with qualified external researchers based on a submitted curriculum vitae and reflecting no conflict of interest. The request proposal must also include a statistician. Approval of such requests is at Gilead Sciences' discretion and is dependent on the nature of the request, the merit of the research proposed, the availability of the data, and the intended use of the data. Data requests should be sent to [email protected].

Funding: This study was funded by Gilead Sciences, Inc., Foster City, California, USA. The funding organization played roles in the study design, data collection and analysis, decision to publish, and preparation of the manuscript, and provided financial support in the form of authors' salaries.

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: All authors were employees at and owned stock of Gilead Sciences, Inc. This commercial affiliation does not alter our adherence to PLOS ONE policies on sharing data and materials.

 

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0293400#abstract0