Reverse transcriptase enzyme and priming strategy affect quantification and diversity of environmental transcripts

RT-Q-PCR, and RT-PCR amplicon sequencing, provide a convenient, target-specific, high-sensitivity approach for gene expression studies and are widely used in environmental microbiology. Yet, the effectiveness and reproducibility of the reverse transcription step has not been evaluated. Therefore, we tested a combination of four commercial reverse transcriptases with two priming techniques to faithfully transcribe 16S rRNA and amoA transcripts from marine sediments. Both enzyme and priming strategy greatly affected quantification of the exact same target with differences of up to 600-fold. Furthermore, the choice of RT system significantly changed the communities recovered. For 16S rRNA, both enzyme and priming had a significant effect with enzyme having a stronger impact than priming. Inversely, for amoA only the change in priming strategy resulted in significant differences between the same sample. Specifically, more OTUs and better coverage of amoA transcripts diversity were obtained with GS priming indicating this approach was better at recovering the diversity of amoA transcripts. Moreover, sequencing of RNA mock communities revealed that, even though transcript α diversities (i.e. OTU counts within a sample) can be biased by the RT, the comparison of β diversities (i.e. differences in OTU counts between samples) is reliable as those biases are reproducible between environments. Originality-Significance Statement Is the complementary DNA (cDNA) produced after Reverse Transcription (RT) a faithful representation of the starting RNA? This is a fundamental and important question for transcriptomic-based studies in environmental microbiology that aim to quantify and/or examine the diversity of transcripts via RT approaches. Yet little is known about the reliability and reproducibility of this step. Here, we evaluated the effect of the two main components of the RT reaction – the retro transcriptase enzyme and priming strategy (gene specific vs random priming), on the quantification and diversity of cDNA. We found that both have a significant impact. We further provide evidence to enable informed choices as to the enzyme and priming combinations to improve the performance of RT-PCR approaches. Taken together, this work will improve the reliability and reproducibility of transcript-based studies in environmental microbiology.


Originality-Significance Statement
Is the complementary DNA (cDNA) produced after Reverse Transcription (RT) a faithful representation of the starting RNA? This is a fundamental and important question for transcriptomic-based studies in environmental microbiology that aim to quantify and/or examine the diversity of transcripts via RT approaches. Yet little is known about the reliability and reproducibility of this step. Here, we evaluated the effect of the two main components of the RT reaction -the retro transcriptase enzyme and priming strategy (gene specific vs random priming), on the quantification and diversity of cDNA. We found that both have a significant impact. We further provide evidence to enable informed choices as to the enzyme and priming combinations to improve the performance of RT-PCR approaches.
Taken together, this work will improve the reliability and reproducibility of transcript-based studies in environmental microbiology.

Summary
RT-Q-PCR, and RT-PCR amplicon sequencing, provide a convenient, target-specific, highsensitivity approach for gene expression studies and are widely used in environmental microbiology. Yet, the effectiveness and reproducibility of the reverse transcription step has not been evaluated. Therefore, we tested a combination of four commercial reverse transcriptases with two priming techniques to faithfully transcribe 16S rRNA and amoA transcripts from marine sediments. Both enzyme and priming strategy greatly affected quantification of the exact same target with differences of up to 600-fold. Furthermore, the choice of RT system significantly changed the communities recovered. For 16S rRNA, both enzyme and priming had a significant effect with enzyme having a stronger impact than priming. Inversely, for amoA only the change in priming strategy resulted in significant differences between the same sample. Specifically, more OTUs and better coverage of amoA

Introduction
Whereas modifications of the genome can reflect adaptations of living organisms over evolutionary time scales, changes in the transcriptome reflect short term responses of cells (López-Maury et al., 2008;Browning and Busby, 2016). In environmental microbiology, transcriptomics is essential to understanding which biochemical pathways are triggered by environmental conditions at a given time. RNAseq approaches facilitate primer free metatranscriptomics to reveal global gene expression profiles. It is now a widely used method in environmental microbiology and has allowed scientists to gain formidable insight into the genome-scale mechanisms used by microbes to adapt to changing environmental conditions (Shakya et al., 2013;Gutleben et al., 2018). However, it generally comes at high cost and requires extensive data analysis. Plus, as an untargeted approach, it may require enrichment of the mRNA (via removal of ribosomal RNA) and will be dependent on sequencing depth to reveal rare transcripts among the diverse array of transcripts expressed in complex environmental samples. In contrast Reverse-Transcriptase-Quantitative PCR (RT-Q-PCR) is directed via primers towards a single target. While this is much lower throughput in terms of a global overview of transcription, this approach facilitates transcript quantification that is specific, with high-sensitivity and low-detection limits over a wide dynamic range (Sanders et al., 2014). RT-Q-PCR is high-throughput in terms of sample numbers, cost effective (in comparison to metatranscriptomics) and subsequent data processing is fast without the requirement for high computational power and bioinformatic expertise needed for metatranscriptomics analysis. As a consequence, RT-(Q)-PCR is routinely used in most life science research fields including environmental microbiology to target and quantify specific transcripts.
The RT-(Q)-PCR workflow consists of two steps, the RT reaction that converts RNA to cDNA and the subsequent PCR amplification of the cDNA. cDNA can be directly quantified via RT-Q-PCR or undergo end-point PCR for amplicon sequencing of the expressed transcripts. The initial RT reaction requires a reverse transcriptase enzyme, of which there are a number commercially available, and a reverse primer to initiate the RT reaction. There are two main priming strategies, random or gene specific priming. For random priming, short oligonucleotides (e.g. hexamer or decamer) consisting of all possible sequence combinations for that size, are used to randomly initiate the RT across the entire transcriptome. Gene specific, as the name implies, target specific transcripts of interest.
The aim for the RT reaction is that it faithfully produces cDNA that reflects gene expression in the starting RNA sample (Bustin and Nolan, 2017). However, a number of studies in the wider field of molecular biology indicate that the RT reaction has a significant impact on the final results for the same sample. Indeed within clinical studies, the inherent variability of cDNA synthesis has been reported in some cases to be greater than the differences between biological samples (Sanders et al., 2014). This level of variability implies that comparison of results between different studies using different approaches is near impossible (Bustin, 2002).
In environmental microbiology applications, while the subsequent analysis of cDNA by Q- PCR (Smith et al 2006;Smith & Osborn, 2009), and/or PCR for amplicon sequencing (Marotz et al., 2019) has been shown to be robust, reliable and reproducible, the effect of the initial RT reaction on quantification and amplicon sequencing of environmental transcripts has yet to be determined. Indeed environmental samples may provide a number of further challenges for efficient and reproducible RT reactions due to the presence of co-extracted inhibitors; variable target expression (high to low) in a background of high non-target template concentration and low RNA quality and integrity (Cholet et al., 2019). Moreover, there is the need for the RT reaction to faithfully transcribe the diversity of target of interest.
A small number of studies investigating primer-free approaches to characterise 16S rRNA transcripts revealed better accuracy (Mäki and Tiirola, 2018) and sensitivity (Hoshino and Inagaki, 2013) with primer-free approaches for amplicon sequencing than PCR of the cDNA.
Nonetheless, these primer-free approaches still rely on an initial RT reaction, which could impact the outcome of the 16S rRNA transcript sequencing. Moreover, our own personal observations in the laboratory have indicated that RT enzyme and priming strategy greatly impact the results of environmental transcript studies, often meaning the difference between detection or not of a given transcript that in turn results in different ecological interpretation. Therefore, to improve reproducibility and inform best practice and standardisation of RT-(Q)-PCR approaches in environmental microbiology, we have undertaken a detailed study of the effect of the RT reaction on RNA extracted from environmental samples. We aimed to determine the impact of enzyme and priming strategy on quantification and amplicon sequencing of transcripts (spiked artificial RNA, 16S rRNA and ammonia monooxygenase (amoA)). We therefore examined a combination of four commonly used commercial reverse transcriptases (Superscript III, Superscript IV, Omniscript and Sensiscript; designated SSIII, SSIV, Omni and Sensi, respectively, thereafter) and two priming strategies (random hexamer and gene specific; designated RH and GS, respectively, thereafter). We hypothesized that both quantification and alpha diversity (i.e. OTU counts within a sample) of transcripts from the same samples will be affected by RT enzyme and priming strategy.

Effect of enzyme and priming on the detection of exogenous spike
First, the impact of enzyme and priming on the quantification of RNA was determined for an exogenous target that was spiked at known concentrations into a background of environmental RNA. Artificial RNA (sfGFP RNA) that could be distinguished from environmental RNA background, was produced by in vitro transcription of a PCR product amplified from the pTHSSd_8 plasmid (Segall-Shapiro et al., 2014). The resulting RNA was mixed with environmental RNA at different concentrations (10 3 , 5 x 10 3 , 2 x 10 6 and 10 7 copies/µl) and quantified using RT-Q-PCR ( Figure 1A).

sfGFP standard curves
Artificial RNA (sfGFP RNA) that could be distinguished from environmental RNA, was produced by in vitro transcription of a PCR product amplified from the pTHSSd_8 plasmid.
Standard curves were constructed using 10-fold dilutions of sfGFP RNA from 10 10 to 10 1 copies/µl that underwent individual reverse transcription in duplicate using each of the four RT enzymes with two priming strategies. The Cycle thresholds (Cts) of the same sample derived from different enzyme/priming strategies were obtained by Q-PCR amplification of the resulting cDNA. Amplification of the no template controls and log 10 [sfGFP]= 1 and 2 gave no signal. The Limit of Detection (LD) and Quantification (Forootan et al., 2017) was log 10 [sfGFP]= 3 for all RT systems except for Sensi-RH which had a LD at log 10 [sfGFG]= 4.
Excluding the Cts obtained for log 10 [sfGFP]=10 resulted in an improvement of the regression fit (slopes closer to the expected -3.332 and R squared closer to 1; data not shown). As such standard curves ranged between log 10 [sfGFP] = [3;9] (or [4;9] for Sensi-RH). For all enzymes, the use of RH resulted in higher Cts than GS at all sfGFP concentrations. SSIV-GS resulted in the best efficiency (99.3%) while the lowest was obtained by Sensi-RH (84.2%) (  (GraphPad Prism6,www.graphpad.com). The effect of the RT approach on standard curve construction was further investigated using multilevel linear model analysis. Three different models were tested where 1) intercepts only, 2) slopes only and 3) slopes and intercepts varied between groups (i.e. RT method). Models 1 and 2 were then tested against model 3 using an ANOVA. Model 3, allowing for variations in both slopes and intercepts, resulted in a better fit than model 1 (intercept only) or model 2 (slopes) (Table S.2) indicating that the effect of enzyme and priming impacted both the slope (i.e. efficiency) and the intercept (i.e. signal at No Template Control (NTC)) of the standard curves.

RT-Q-PCR detection of the RNA sfGFP spike
The exogenous RNA spike, sfGFP, was added to environmental RNA at known concentrations and the resulting preparations were reverse-transcribed using the eight different combinations of RT enzymes and priming strategies ( Figure 1A). Four different concentrations of spike were added to environmental RNA background: two low and two high, with a five-fold difference in sfGFP copy number between the two low and the two high spikes respectively (10 3 and 5 x10 3 copies/µl for low spikes; 2 x10 6 and 10 7 copies/µl for high spikes). After cDNA synthesis, the spiked target was quantified by Q-PCR and Cycle thresholds (Ct) were converted to copies/µl using standard curves. Both enzyme and priming had a strong effect on the copy number of exogenous target detected (log10 transformed) at all spike concentrations (Table 1; Figure 2). Overall, SSIII and SSIV enzymes were the closest to the expected value. SSIV was slightly more accurate than SSIII, especially at spike concentrations > 5 x 10 3 . The use of Omni resulted in an underestimation of the spike concentration with factors ≈ 4 to ≈ 50 depending on the concentration of the target (higher differences at higher concentrations). Similarly, the use of Sensi also resulted in an underestimation of the exogenous target concentration with factors ≈ 3 to ≈ 30 (higher differences at higher concentrations) ( Figure 2).
The use of GS priming resulted in more accurate quantification for all enzymes except Omni for which it had no effect. For Sensi, RH priming failed at the low spike concentrations while at the high concentration of spike, RH was significantly lower (6-fold) than GS. For the Superscript enzymes, the use of RH versus GS generally resulted in lower quantification of the same target, except when using SSIV at low concentrations where the priming strategy had no effect. Of the two Superscript enzymes, SSIII with GS always overestimated the concentration of spike whereas RH always underestimated it (≈2 fold or less). Priming had the least effect with SSVI, but more accurate quantification was achieved when using GS priming ( Figure 2).
Next, we tested the ability of the RT systems to faithfully report a 5-fold difference in the sfGFP spike concentration between the two low and two high concentrations respectively.
For this, the differential expression (DE), i.e. the ratio of the average transcript number/µl between the two low or the two high spikes respectively, was calculated ( Figure 3). The DE does not report how accurately the system quantifies the spike but rather its ability to reflect the 5-fold change. Again, the choice of enzyme and priming had an effect on the observed when used with RH priming. Therefore, the superscript enzymes preformed best overall, and the DE was improved when SSIII and IV were used in combination with GS priming ( Figure   3).

Effect of standard curve construction on sfGFP quantification
As there are two approaches to constructing RNA standard curves (Smith et al., 2006), we tested if this had any impact on quantification of the spike and the above results. A standard curve can be made by serial dilution of RNA with individual RTs or via a single RT of RNA followed by serial dilution of cDNA. Standard curves for each enzyme and primer combination were made using these two approaches to quantify the spiked sfGFP ( Figure   1A). The percentage error was calculated between the observed and expected copies/µl for each sfGFP spike generated from each standard curve. The standard curve constructed from the dilution of cDNA generally increased the percentage error, and therefore dilutions of RNA for individual RTs were used for subsequent standard curves (Figure S.1).

RNA Quality check
Before proceeding with the quantification of endogenous transcripts, RNA extracted from sediment underwent a quality check (

Quantification of endogenous transcripts
Next, the effect of RT enzyme and priming strategy on the quantification of transcripts from the same sediment sample were tested. For this we targeted in situ highly abundant 16S rRNA and less abundant mRNA from the bacterial ammonia monooxygenase subunit A, amoA, for quantification from cDNA generated using the different combinations of reverse transcriptases and priming ( Figure 1B). Results were converted into copies/µl using paired standard curves, normalized per µg of extracted RNA and log10 transformed (for parametric 2-ways ANOVA tests). The results clearly showed the effect of the RT system was target dependent: for 16S rRNA both enzyme and priming significantly affected the results whereas for amoA only the effect of enzyme was significant ( Figure 4; Table 3).
For amoA, the choice of RT system resulted in differences of up to 600-fold in the detected copies/µg RNA in the samples tested (Omni-RH versus SSIV-RH) and, in the most extreme case, the difference between detection of the target or not (Sensi Interestingly, although the use of RH priming resulted in higher Cts on Q-PCR (i.e. lower quantification), conversion to copies/µg RNA via the standard curve resulted in a higher quantification than that achieved with GS priming (≈+1.2fold for SSIV (p.value=0.99) and ≈ +1.7fold for SSIII (p.value=0.99)). In summary, SSIV was the best choice for the detection of amoA transcripts as it resulted the highest numbers of transcripts detected and produced consistent results between GS and RH. When used in combination with GS as opposed to RH the results were more precise (i.e. lower standard deviation) ( Figure 4).

Effect of enzyme and priming on cDNA amplicon sequencing data
While the quantitative work clearly shows dramatic and significant differences when using different RT enzyme and priming strategies for quantification of the same template, it does not inform if these impact upon community transcript diversity. To examine this, RNA and DNA mock communities of known composition were examined in addition to endogenous 16S rRNA and amoA transcripts from marine sediments.

Effect on mock community composition
As the actual composition of the transcriptome of the environmental samples is unknown, it is virtually impossible to determine which RT system most closely represents the starting RNA.
We thus tested the effect of enzyme and priming on known RNA mock communities. Two mocks community (one even, with all 12 sequences at the same relative proportion, designated EM and one uneven, with the 12 sequences at different relative proportions, designated UM, Table S.3), each composed of twelve different 16S rRNA transcripts were constructed as detailed in Figure 1C. To further tease apart the effects of the PCR from the RT, both DNA and RNA mock communities were constructed. Of the twelve mock community sequences, one (S9) was over-represented in the DNA mock community but under-represented in the RNA mock community sequencing data. In contrast sequence S10 showed the opposite trend (over-represented in the RNA mock community but underrepresented in the DNA mock community). As a result, sequences (S9 & S10) were removed from further analysis.
DNA mock. In the EM community, there were variations from the expected proportions (10%). Some members of the community: S1, S2, S4, S8 and S12 were underrepresented whereas S3, S5, S6, S7, S11 were overrepresented. The most underrepresented member, S4, represented only 4% of the total community whereas the most overrepresented, S7, represented 14%. Yet, even though the observed proportions deviated from the expected 10%, they were within the same order of magnitude ( Figure 5A).
For the UM mock communities, the observed proportion of each member was plotted against the expected proportion ( Figure 5B). A regression line with equation y=x is expected if each sequence is faithfully represented. The UM community results were consistent with those of the EM, with S6, S7, S11 overrepresented and S1, S2, S4 and S12 underrepresented. S5 was at the correct proportion in both EM and UM.
For most sequences, the errors of representations were consistent between EM and UM (i.e. a sequence overrepresented in the EM was generally also over-represented in the UM and vice versa) indicating a sequence specific bias of the PCR/sequencing workflow. And indeed, when the proportions of the UM were corrected by those of the EM, the fit of the regression improved ( Figure 5C).

RNA mock.
As seen for the DNA mock community, the proportions observed in the RNA EM differed from the expected 10% ( Figure 5D, G, J, M). When analysing the sequences abundances in the EM using PERMANOVA test (Bray-Curtis distances), it was found that the priming strategy used had a significant effect on the proportions retrieved (p.value = 0.001), with RH being more accurate than GS (proportions closer to the expected 10%). On the other hand, neither enzyme nor the interaction between enzyme and priming had a significant effect (p.value=0.208 and p.value=0.194 respectively). The data set containing the highest errors was SSIII GS, followed by SSIV GS, SSIII RH and the lowest errors were found for SSIV RH (similar amount of error than for the DNA mock) ( Figure 5 and Figure S.2). For individual sequences, there was a sequence-specific bias: some sequences (S1, S2 and S6) were always over-represented in the data sets but this was different depending on the priming used (proportion S1 < proportion S2 with GS and inversely with RH). Other members of the mock communities (S3, S4, S7 and S8) had proportions lower than the expected 10%; Though S7 was very close to the expected 10% with SSIV RH. Finally, the results from the other members of the mock community were dependent on the enzyme and priming strategy used.
For example, S5 and S12 were always over-represented in the RH-prepared libraries but not in the GS ones. Inversely, S11 was over-represented in the GS libraries (especially with SSIII) and its proportions decreased in the RH ones.
As for the EM, the proportions retrieved for the UM mocks differed from the expected proportions (y=x) ( Figure 5E, H, K, N). As for the EM, RH seemed to perform better than GS with better fits for the regressions (for SSIII: R-squared GS = 0.177 versus 0.402 for RH; for SSIV: R-squared GS = 0.163 versus 0.373 for RH). The use of RH priming also resulted in slopes closer to the expected value of 1 compared to GS indicating a better conservation of the relative proportions using this priming strategy.
However, as observed for the DNA mock, the errors were consistent between EM and UM: A sequence over-represented in the EM would also be over-represented in the UM and inversely. As a consequence, when UM reads were corrected by EM reads ( Figure 5 F, I, L and O), the calculated slopes were very close to the expected value (y=x) and the R-squared values also improved (close to 1) indicating a better fit of the regression. This observation indicates that the same sequences were misrepresented in both EM and UM communities.

Effect of RT enzyme and priming strategy on endogenous community composition
16S rRNA and (Bacterial) amoA PCR amplicons were generated from cDNA prepared using the different combinations of enzymes and priming ( Figure 1B). For amoA, only SSIII and SSIV were compared as the Sensi and Omni enzymes failed to produce PCR amplicons for sequencing (as reflected by Ct values above 30; Figure 3). The combination of Sensi and RH priming also failed to reliably amplify 16S rRNA transcripts and was therefore also excluded from further analysis.
Effect on 16S rRNA community composition. When all four enzymes were taken into account, the effect of enzyme on OTU community composition was always significant (Table 4 and Figure 6). In addition, the priming strategy had a significant effect on community composition but only when the Bray-Curtis dissimilarity matrix was considered (Table 4 and Figure 6).
Still, the choice of priming strategy had less of an effect on 16S rRNA community composition than for amoA, (Figure 6 and Figure  OTUs, 63 did not correctly translate (e.g. "hypothetical protein" or "low quality protein") and were therefore removed from the data set. In terms of percentage of reads, these nontranslating OTUs represented 0.017% to 4.6% of the total. As shown in Figure S.5, the amount of "incorrect OTUs" found was higher in the data set obtained when using GS priming. However, as the number of reads obtained with GS was generally higher, they did not represent a significantly higher percentage of the community, except for the replicate 1  Table 4). One sample, Env1, when prepared using RH priming for both SSIII and SSIV, failed to produce sufficient reads (more than 5000) to proceed and was removed from the analysis pipeline. In contrast, when GS priming was used, sufficient reads were produced to pass this quality step in the analysis pipeline. Indeed, GS priming always resulted in greater OTU richness ( Figure 7) than RH (+13 and +21 OTUs on average for SSIII and SSIV respectively), indicating that this priming option was better at recovering the diversity of amoA transcripts in the samples. This observation supports the Q-PCR results where GS priming always resulted in lower Cts. To determine if the "missing OTUs" in the RH sequencing data sets were dominant or rare phylotypes, the mean abundance of OTUs present only in the GS data set was plotted for each individual OTU (Figure 8), revealing that most of the OTUs missing in the RH data set were low abundance OTUs. On the other hand, interestingly, a very small number of rare OTUs were only detected in the RH data set ( Figure 8). Moreover, the choice of priming also affected the representation of OTUs present in both GS and RH datasets ( Figure 8, Table S.5): with SSIII, 39 OTUs were significantly differentially expressed between GS and RH. With SSIV, it was found that 23 OTUs were significantly differentially expressed between GS and RH ( Figure 8).

Discussion
While RT-Q-PCR, and to a lesser extent RT-PCR amplicon sequencing, is widely used in environmental microbiology to quantify and determine the diversity of transcripts from environmental samples, the effectiveness and reproducibility of the reverse transcription step has not been evaluated. In particular, in complex environmental samples, to the best of our knowledge, there have not been any studies investigating the efficiency of the reverse transcriptase reaction to transcribe RNA to cDNA, despite this being a critical step informing the overall result. Furthermore, based on our own observations in the laboratory, we often noted the impact of different enzyme and primer choice on the same template. Therefore, we assessed the effect of the RT system (enzyme and priming strategy) on RT-Q-PCR and RT-PCR-amplicon sequencing and showed that the choice of enzyme and priming strategy can result in significant difference in both quantitative and qualitative results from the exact same sample. These methodological effects can bias and even alter final conclusions and interpretations of the underlying biological and ecological questions.
From the sfGFP spike experiments (Figures 2 and 3), we showed that the choice of enzyme and priming greatly affected the results of the RT-Q-PCR. When the sfGFP transcript was spiked into an environmental RNA background, it was found that the Superscript enzymes performed better than the Sensiscript and Omniscript enzymes. The Superscript enzymes systematically produced higher detected copy numbers, with values closer to the expected ones and, generally, differential expressions closer to the expected 5-fold difference. In a study by Levesque-Sergerie and co-workers (Levesque-Sergerie et al., 2007) it was found that the Sensiscript and Omniscript enzymes had a dynamic range >50ng RNA versus >0.01ng RNA for Superscript III. Results obtained here are in accordance, with a better detection of the low concentration target by the Superscript enzymes compared to Sensiscript/Omniscript, especially when RH priming was used. Yet, the RT reactions for standard curves constructed using Sensiscript and Omniscript produced reliable Cts at target concentrations as low as 10 3 copies/µl, similar to that observed for SSIV and SSIII (except for Sensi-RH: lower limit at 10 4 copies/µl). This indicates that the lower performances observed for Sensiscript and Omniscript in the environmental spike experiment could be due to inhibition of the enzymes from co-extracted components in environmental RNA (Hata et al., 2015) and In this study, GS priming always performed better than RH for RT-Q-PCR, with higher copy numbers and values closer to the expected ones for the exogenous RNA spike. For the endogenous targets (amoA and 16S rRNA) a similar trend was observed, with GS priming resulting in higher detected average copy numbers, except for SSIII, in the amoA assay and Omni in the 16S rRNA assay. The differences between priming were particularly strong with the use of Sensiscript, where the combination of this enzyme and RH was clearly the least efficient RT strategy. Interestingly, small differences were observed between GS and RH when used with SSIV for the quantification of both the spiked sfGFP and the endogenous amoA showing that this enzyme reliably reverse-transcribed mRNAs. This was also supported by the differential expressions of the exogenous spiked sfGFP always being close to the expected 5-fold difference when using SSIV.
Differences in the performances of the RT enzymes and the priming strategies were similar between the sfGFP spike and the endogenous amoA mRNA with Superscript enzymes performing better than Sensiscript/Omniscript and GS generally performing better that RH. In contrast, for 16S rRNA significant differences were detected only between Omniscript and the other three enzymes (no statistically significant differences between SSIII, SSIV and Sensiscript) and the effect of priming was very important for all four enzymes. For this assay, Omniscript in combination with RH priming yielded the highest copies/µg RNA. These differences could be a reflection of target concentration, i.e. highly abundant 16S rRNA verses low abundance amoA transcripts, or indeed could be target dependant (i.e. ribosomal VS mRNA) reflecting for example the complex secondary structure of the RNA molecule.
Overall this study showed that the combination of Superscript IV with GS priming was the most accurate for the quantification of the exogenous sfGFP spike and showed the lowest variation in quantification when priming was changed to RH. SSIV GS was also the RT system yielding, on average, the highest copy number for the quantification of amoA mRNAs by RT-Q-PCR, coupled with the best precision (lowest standard deviation). This fits with our previous observations, where we would routinely achieve better results (e.g. detection verses no detection) and our subsequent choice of the Superscript enzyme with gene specific priming to quantify a range of N-cycle mRNA targets (Duff et al., 2017, Smith et al., 2015. Next, we investigated the effect of the RT strategy on cDNA sequencing. The result from cDNA sequencing demonstrated that the enzyme and priming strategy employed has an impact on cDNA amplicon diversity. As Sensiscript and Omniscript failed to reliably produce sufficient cDNA to produce PCR amplicons for amoA, they were not included, nor was the combination of Sensiscript and RH for the 16S rRNA diversity study. We have shown that for 1 amoA transcripts, priming is an important consideration (Figures 6, 7 and 8; Table 4). Most notably, the use of RH for sample Env1, resulted in too few sequencing reads (<5000) for further analysis. We attribute this to the lower abundance of the amoA transcript in a high background of RNA. In this case, the choice of priming made the difference between the This observation is further supported by the results from the 16S rRNA assay, where the choice of priming strategy was seen to be less important. In fact, here most of the differences observed were due to enzyme choice and not priming strategy. In contrast to the amoA results, for 16S rRNA the use of GS priming did not necessarily result in a higher number of OTUs compared to RH even though GS priming resulted in a higher number of 16S rRNA copies detected by Q-PCR. It may be that differences in RT performances are abundance or target molecule dependant (i.e. very abundant ribosomal RNA with complex secondary structures versus rare messenger RNA).
As the true representation of our transcripts in the environmental samples was unknown, we tested the RT systems against artificial defined RNA mock communities seeded into background environmental RNA. These artificial sequences were derived from target inserts with additional cloning vector sequence added, which allowed for their selective amplification from the background. To evaluate the bias introduced by the PCR/sequencing steps and separate them from the RT, a similar experiment was carried out using DNA mock communities. This experiment revealed that, for all RT systems, biases were introduced in both the RT and the subsequent PCR step of the reaction as the recovered proportions deviated from the expected ones ( Figure 5; Figure  As anticipated, errors were also seen in the UM resulting in observed regressions deviating from the expected. Interestingly, the errors were consistent between EM and UM (i.e. a sequence over-represented in the EM would also be over-represented in the UM and vice versa). As a result, when the UM proportions were corrected with the EM ones, the observed regressions were close to the expected y=x ( Figure 5). Since the mock communities were constructed separately (Figure 1), this indicated that: 1) these errors are a reflection of sequence specific bias of the RT-PCR workflow and not attributed to user error such as pipetting; 2) Since artificial over/under representations is likely introduced by sequence specific bias, the relative abundance of transcripts within a sample (α diversity) might not always be absolute when small differences (e.g. ≈ 4-fold as in this study) in expression are observed; 3) However, as these biases are reproducible (UM reads corrected by EM reads), comparison between samples (i.e. β diversity) can be undertaken.
In a recent review about the use of RT-Q-PCR, Bustin and Nolan (Bustin & Nolan, 2017) stated that "the majority of published RT-Q-PCR data are likely to represent technical noise".
The intrinsic variability of the RT step and the lack of information on protocols used were key points that lead them to this striking conclusion. This is likely to be similar, if not further amplified in complex environmental samples, from which ecosystem conclusions are drawn.
Here we have shown that primer and RT system choice can range from no detection to a 600fold difference in transcripts for the same template. In environmental studies, this is the difference between no gene expression to the presence of a highly active transcript -striking difference leading to opposite ecosystem conclusions. There is therefore an urgent need to ensure that the approaches we use are tested and recommendations as far as possible for best practice are made, followed and reported in future studies. Our study shows that the choice of correct enzyme and priming can improve the reliability and reproducibility of RT-Q-PCR and RT-sequencing data, facilitating insight into the transcriptionally active microbial communities directly from the environment. This, taken together with steps to monitor the purity and integrity of the extracted RNA prior to downstream analysis (Bustin and Nolan, 2017;Cholet et al., 2019) and detailed documentation of the RT approach used should greatly improve the reliability and reproducibility of transcript based studies in environmental microbiology. From our work, we put forward the following recommendations for best practice:

1: Evaluate and report RNA quality and integrity
As previously reported (Cholet et al., 2019), the quality and in particular, the integrity of the extracted environmental RNA should be determined and reported as the mandatory first step in any RNA based workflow.

2: RT-Q-PCR
i) Gene specific priming was more accurate, precise and sensitive than random hexamer priming for mRNA.
ii) Of the enzymes tested, Superscript IV was accurate, precise and sensitive, and therefore we recommend its use for the detection of transcripts in complex environmental RNA matrixes.
iii) The incorporation of an exogenous RNA target at known concentration into the environmental RNA being tested is an efficient way to validate RT-Q-PCR protocols.
iv) When converting Ct results into copy number, we advise the use of an RNA standard curve (i.e. serial dilution of the target RNA and individual RT-Q-PCR) rather than a cDNA standard curve (i.e. reverse transcription of a fixed concentration of RNA, dilution of the cDNA and Q-PCR).
v) Fully report the RT protocol used.

3: RT-amplicon sequencing
i) For RT-amplicon sequencing of mRNA targets, we recommend the use of gene specific priming as it resulted in better coverage and higher OTU richness of the bacterial amoA transcript. For 16S rRNA RT-sequencing, the choice of priming is less important.
ii) The addition of RNA mock communities into environmental RNA (before reverse transcription) can aid interpret sequencing results: in our case, we deduced from our RNA mock communities that even though relative proportions of individual OTUs within a sample (α diversity) can be biased, the comparisons of changes in OTU composition between samples (β diversity) are reliable.

Sediment sample collection
Surface mud samples (0 to 2 cm) were collected on 11/01/2017 from Rusheen Bay, Ireland transcripts. An additional sample was used for preparing the RNA background for the sfGFP spiking experiment.

RNA preparation from sediment
All surfaces and equipment were cleaned with 70% ethanol and RNase Zap (Ambion) before sample processing. All glassware and stirrers used for solutions were baked at 180°C overnight to inactivate RNases. All plasticware was soaked overnight in RNase away solution (ThermoFisher Scientific). Consumables used, including tubes and pipette tips were RNase free. All solutions were prepared using Diethylpyrocarbonate (DEPC) treated Milli-Q water.
A simultaneous DNA/RNA extraction method, based on that of Griffiths and co-workers (Griffiths et al., 2000) was used to recover nucleic acids from sediment.  et al., 2006).

RNA quality check
The quality, purity and integrity of extracted environmental RNA was determined as follows: Quantity/purity: Total RNA was quantified using three different approaches: spectrophotometry (NanoDrop; Life Technologies), fluorometry (Qubit broad Range RNA; Life Technologies) and microfluidics (Bioanalyser 2100 RNA Nano; Agilent Technologies).
Integrity: RNA integrity was determined using two different approaches: the RNA Integrity Number (RIN), based on the 23S/16S rRNA ratio and the electropherogram of the extracted RNA (Bioanalyser 2100 RNA Nano; Agilent Technologies) and the R amp approach, based on the differential amplification of glnA mRNA amplicons of different length (Cholet et al., 2019).

Preparation of the sfGFP RNA
A plasmid containing the sfGFP (pTHSSd_8) gene (designed by Segall-Shapiro et al. (2014)) was ordered from the Addgene plasmid repository web site (https://www.addgene.org/). A DNA preparation of the plasmid was prepared for PCR and Q-PCR as the target to optimise primers, initially at DNA level and subsequently for RT-Q-PCR as appropriate. The amplicon obtained by PCR amplification using pBRforECO and GFP-Frc (

Spiking experiment
In order to determine which enzyme/priming combination was the most accurate, the exogenous RNA spike (sfGFP RNA) was seeded into a background of environmental RNA ([RNA] background =70.7 ng/µl; ratio 260/280 background = 1.63; ratio 260/230 background = 0.87) at known concentrations: 10 3 , 5 x 10 3 , 2 x 10 6 and 10 7 copies/µl. The RNA background was same for all spikes. These concentrations were chosen to mimic five-fold changes in gene expression at both low and high expression level. After the sfGFP spike was added, total RNA was reverse transcribed in triplicate, using different combinations of enzymes and priming (four different RT enzymes; two different priming strategies) in the same manner as illustrated in Figure 1 and Table S.7. A 300bp fragment of the sfGFP cDNA was then quantified from the cDNA preparations using quantitative PCR (one Q-PCR reaction for each of the 3 RT replicates) with the primer pair sF300_F/ sF300_R. The Q-PCR mix was composed of 10µl EVAGreen Supermixes 2X (SsoFast; Bio-Rad), 0.5µl each primer (10µM each), 8µl water and 1µl cDNA template.

Differential Expression (DE) between consecutive spike concentrations
The fold difference between consecutive spike concentrations was then calculated as the ratio of the mean copies/µl exogenous spike detected: DE "Low" corresponds to the ratio of mean copies/µl detected in the 10 3 spike versus the 5 x 10 3 spike. DE "High" corresponds to the ratio of mean copies/µl detected in the 2 x 10 6 spike versus the 10 7 spike. The standard deviations of the ratios were calculated as: where m1 is the mean copies/µl at concentration C and m2 the mean copies/µl at concentration C x 5; sd1 and sd2 the standard deviations of m1 and m2 respectively.

Effect of RT reaction on quantification of endogenous amoA and 16S rRNA transcripts
amoA and 16S rRNA Q-PCR standard curves amoA and 16S rRNA RNA dilutions were prepared and each individually reverse transcribed (RT) using four different enzymes: Superscript IV (SSIV), Superscript III (SSIII) (Invitrogen), Sensiscript (Sensi) and Omniscript (Omni) (Qiagen) and two priming strategiesgene specific (GS) and random hexamer (RH). A summary of the protocol for each system is presented in Table S.7. The resulting cDNA preparations underwent Q-PCR using the corresponding primer pair and Q-PCR conditions as detailed in Table S.6 (regression coefficients in Table S.6).

RT-Q-PCR endogenous transcripts
RNA was extracted from five biological replicates (marine sediment samples) and reverse transcribed as described in Figure 1 and Table S

Effect of reverse transcription on sequencing of endogenous transcripts
Besides evaluating the effect of the RT enzyme and priming strategy on the quantification of the endogenous amoA and 16S rRNA transcripts, the effect of these on community composition, as determined by amplicon sequencing of cDNA was also studied. To this end, amoA and 16Sr RNA amplicons were generated from the cDNA preparations used in the previous experiment (Figure 1.B were reverse transcribed using two different enzymes (SSIII and SSIV) and two different priming strategies (RH and GS) (Figure 1 C). The GS priming was carried out using the 806R 1 primer (Table S.6). The procedure followed was the same as described in Table S.7. After reverse transcription, the spiked mocks sequences were recovered from the total RNA pool by PCR amplification using the 806R reverse primer and a custom vector-specific forward primer pGEMT_FW2 (Table S.
The NCBI taxonomy was given in the FASTA headers. Subsequently, R's rentrez (Winter, 2017)

Bioinformatics pipeline
Abundance tables were obtained by constructing operational taxonomic units (OTUs) as follows. Paired-end reads were trimmed and filtered using Sickle v1.2 (Joshi & Sickle, 2011) by applying a sliding window approach and trimming regions where the average base quality drops below 20. Following this we apply a 10 bp length threshold to discard reads that fall below this length. We then used BayesHammer (Joshi & Sickle, 2011) from the Spades v2.5.0 assembler to error correct the paired-end reads followed by pandaseq v(2.4) with a minimum overlap of 20 bp to assemble the forward and reverse reads into a single sequence.
The above choice of software was as a result of author's recent work (Schirmer et al., 2015;D'Amore et al., 2016) where it was shown that the above strategy of read trimming followed by error correction and overlapping reads reduces the substitution rates significantly.

amoA OTUs check
To ensure all amoA OTUs were valid amoA sequences they were translated using BLASTx to proteins and the match recorded for each individual OTU. Results of this search were used to filter the OTU_table before further processing, and non-translated amplicons removed from further analysis.

Statistical analysis
All statistical analyses were carried out in R (R Core team 2013). The effect of enzyme and priming on RT-Q-PCR result were tested after a log10 transformation of copy number data for 2-way ANOVA tests because the assumption of homogeneity of variances between groups was violated when using copy number directly. When the two-way ANOVA was significant, differences between enzymes/priming strategy were investigated using Tuckey HSD post-hoc test.
The  p.values for the effect of enzyme (Enz), priming (Prim) and the interaction between the two (Enz:Prim) on Ct for the different spike concentrations.  p.valued for the effect of enzyme (Enz), priming (Prim) and the interaction between the two (Enz:prim) on the Ct of the target transcript.  The effect of the RT reaction was evaluated on: A) the quantification of an exogenous transcript spiked at known concentrations; B) the quantification of two endogenous transcripts and the subsequent sequencing of these transcripts and C) the sequencing of mock communities composed of 12 transcripts with known sequences for this last experiment, DNA mocks were also included as controls. Replicates are indicated by "n=".

Figure 8 Differences in t expression and number of amo
OTUs detected by GS and R priming The Venn Diagrams top of the plots show: the numb of OTUs found in the GS data only (blue), in both the GS a Rh data set (purple) and in the R data set only (red). Results a presented as the avera differences in proportio between GS and RH data s (OTUs with positive values a overexpressed in the GS a inversely). When OTUs we found in only one data set, t results are presented as t average proportion of the OT with positive and negative val for GS and RH respectively. F the OTUs shared between GS a RH, the colour of the poin indicates if the difference expression is significant or not explained in the legend (si ND= Not Determined.

Preparation of the sfGFP RNA spike and standard curves
A plasmid containing the sfGFP gene (designed by Segall-Shapiro et al. (2014)) was ordered from the Addgene plasmid repository web site (https://www.addgene.org/59948/) as a bacterial stab (E. coli). The bacterial stab was streaked on LB agar + Ampicillin (100 µg/ml) and incubated overnight at 37°C. A single colony was re-grown in LB ampicillin (100 µg/ml) and used to generate glycerol stocks and subsequently used for PCR and Q-PCR validation of the primers (see main document). For all PCR amplifications, three negative control were included: E. coli DNA, environmental DNA and a no template control. The primer sF500_R (used for gene specific reverse transcription) was also tested at PCR level to ensure its specificity for subsequent RT experiments. After validation of the primers, a fragment of the plasmid including the T7 promotor site and the quasi full length sfGFP gene (40 bp at the 3' end was not included) was PCR amplified using primers pBRforEco and GFP-Frc (reverse complement of GFP-F) ( Table 1). The PCR product was purified using Agencourt AMPure XP beads (Beckman Coulter) following the manufacturer's recommendations and then used for in vitro transcription using the MEGAscript T7 transcription kit (Invitrogen) to prepare sfGFP RNA. The RNA preparation was treated with the Turbo DNase kit (Ambion) and the full digestion of the DNA template was confirmed by the absence of PCR amplification of a 300bp sfGPF fragment using the sF300_F and sF300_R primer pair ( Table 1). Production of target RNA of the correct length was confirmed on the 2100 Bioanalyser RNA nano (Agilent) and concentration determined using fluorometric quantification method (Qubit RNA BR assay; ThermoFisher Scientific). The number of RNA transcripts was calculated using EndMemo RNA copy number Calculator (http://endmemo.com/bio/dnacopynum.php). A 10 points serial dilution was prepared by successive 1/10 dilutions. RNA dilutions were reverse transcribed using the combination of four enzymes and two priming (Table S. Colonies that gave positive results were re-amplified using M13F (5'-GTAAAACGACGGCCAGT-3') and M13R (5'-CAGGAAACAGCTATGAC-3') and the resulting PCR products were purified using the SureClean Plus DNA purification kit (Bioline), then quantified using Qubit DNA High Sensitivity (ThermoFisher Scientific) and used as template for in-vitro transcription using the MEGAscript T7 transcription kit (ThermoFisher Scientific). RNA was then purified using the RNA clean-up protocol of the RNeasy Mini Kit (Qiagen) and DNase treated using an extended protocol: for 20µl purified RNA, 1µl DNase was added and incubated for 1h at 37°C. Then, another 1µl DNase was added and further incubated for 1h at 37°C. RNA was recovered using the phenolchlorophorm/Isopropanol method (as recommended by the MEGAscript instruction manual) and complete digestion of the DNA template was confirmed by no amplification of the insert using T7F and M13R primers. The concentration and size of the DNA-free RNA were then checked using the Bioanalyzer 6000 RNA Nano (Agilent) assay. The number of RNA transcripts was calculated using EndMemo RNA copy number Calculator (http://endmemo.com/bio/dnacopynum.php). For amoA, an eight points serial dilution (5 10 to 5 3 copies/µl) was prepared by successive 1/5 dilutions. For 16SrRNA, a five points serial dilution (10 9 to 10 5 copies/µl) was prepared by successive 1/10 dilutions. RNA dilutions were reverse transcribed using the combination of four enzymes and two priming ( Bacterial culture (the T7 forward primer was used instead of the 515 forward to ensure that the amplicon had been cloned in the 5' 3' orientation relative to the T7 promoter). The rest of the culture was stored as glycerol stock at -80°C. When band of the expected size were detected on agarose gels, the PCR reaction was purified using the SureClean Plus kit (Bioline) and send for Sanger sequencing. In total 47 purified PCR products were sent for Sanger sequencing.
The sequences obtained were submitted to BLAST search (2) and then clustered at 97% similarity. A phylogenetic tree was constructed using MEGA7 (3) and the twelve most dissimilar sequences were selected for the mock community construction. Colony PCRs were then carried out using the glycerol stock of these 12 clones as template. The amplicons were generated using M13 forward and M13 reverse primers ( Table 1). The resulting amplicons were composed of the full length insert (≈290bp) plus a vector sequence of 100bp at the 5' end containing the T7 promoter site, and 137bp at the 3' end. After purification, the twelve PCR products were quantified using fluorometric quantification method (Qubit DNA High Sensitivity assay) and the corresponding copy number was calculated using EndMemo DNA copy number Calculator (http://endmemo.com/bio/dnacopynum.php). Finally, they were pooled together as explained in Figure 1 and  Table 3 and Figure 1. Once pooled together, the RNA mock communities were diluted to reach ≈ 10 8 copies/µl for each sequence.      The regressions coefficients were calculated based on the linear model (y= ax+b; a=Slope and b=Intercept). The fit of the regression is represented by the R squared values. Efficiency of the Q-PCR was calculated as Efficiency (%) = 100 x (10 -1/slope -1). LD = Limit of Detection. For each model, the fit was assessed using four criteria namely; AIC (Akaike Information Criteria), BIC (Bayesian Information Criteria), LogLik (log Likelihood) and deviance. Models 1 and 2 were compared to model 3 and the statistic of the test is reported in the last column (Pr(>ChiSq))    The black arrow indicates the procedure through time. RH = Random Hexamer; GS = Gene Specific.