How the ‘kitome’ influences the characterization of bacterial communities in lepidopteran samples with low bacterial biomass

We aimed to elucidate whether the DNA extraction kit, and the bacteria it contains, affect the characterization of bacterial communities associated with butterfly samples harbouring different bacterial abundances.


Introduction
The molecular analysis of bacterial communities has greatly extended our understanding of these hidden members of ecosystems (e.g. Torsvik and Ovreas 2002; Riesenfeld et al. 2004; Bringel and Couee 2015; Lievens et al. 2015). Valid comparisons of bacterial communities and analyses of their successions require precise and reproducible descriptions of their composition. The results of the analysis of bacterial communities by PCR and sequencing techniques depend sensitively on a wide range of parameters (Wintzingerode et al. 1997; Fouhy et al. 2016). For instance, storage temperatures of samples and preservation buffers can affect the detected bacterial community diversity, richness and relative abundance (Choo et al. 2015). DNA extraction methods using different cell lysis procedures have an impact on absolute microbial numbers, community richness and relative abundance (Ariefdjohan et al. 2010; Henderson et al. 2013). Furthermore, the choice of primers targeting the bacterial 16S rRNA gene can significantly affect which members are detected. The so-called 'universal primers' (e.g. Ben-Dov et al. 2006), in fact, vary in their efficacy to cover the richness of bacteria present in a sample (Baker et al. 2003). In addition, the different 16S rRNA gene regions amplified by different primers produce varying results when analysing community diversity by next-generation sequencing (Bukin et al. 2019).
Contaminating bacterial DNA is commonly found in different DNA extraction kits (Salter et al. 2014). This so-called 'kitome' can have a great impact on samples with a low bacterial abundance. Thus, when using DNA sequencing-based techniques, negative controls should always be used in parallel to identify those bacterial members belonging to the actual sample (Wintzingerode et al. 1997; Salter et al. 2014). A wide range of studies of bacterial communities, especially in soil, water, vertebrate animals and humans have shown that the treatment of samples prior to analysis, the DNA extraction method, the primers, the sequencing platform and the purity of reagents greatly affect the description of these bacterial communities (Methé et al. 1998; Martin-Laurent et al. 2001; Cuív et al. 2011; Gilbert et al. 2012; Henderson et al. 2013; Burbach et al. 2016; Castelino et al. 2017).
Bacterial communities associated with insects have been recognized as important players that can greatly affect their host biology and its interaction with other trophic levels (Feldhaar 2011; Ferrari and Vavre 2011; Hansen and Moran 2014; Douglas 2015; Paniagua Voirol et al. 2018). However, a great number of economically important insect pests remain unexplored in terms of their microbiota, although knowledge of their microbiota might contribute to a deeper understanding of, for example, development of insect resistance against insecticides.
Lepidoptera (moths and butterflies) are a taxon comprising numerous agricultural and forestry pest species. The extent to which bacterial associates affect the biology of Lepidoptera is difficult to assess. The studies characterizing the bacterial communities associated with various moths and butterflies are hardly comparable because they employ different DNA extraction techniques and 16S primers (Paniagua Voirol et al. 2018). Furthermore, studies of the lepidopteran microbiota are mostly focused on the plant-damaging larval stage (Paniagua Voirol et al. 2018), and analyses of other life stages, such as the eggs or adults, are scarce (see: Chen et al. 2016; Phalnikar et al. 2018; Ravenscraft et al. 2019), although bacterial associates in these stages may be relevant for the insect's fitness. In fact, it has been found that bacterial abundance in lepidopteran adults can be much higher than in other life stages (Hammer et al. 2014, 2017; Ravenscraft et al. 2019; Paniagua Voirol et al. 2020).
Treatments with antibiotics (ABs) are commonly used to manipulate bacterial communities associated with Lepidoptera (Paniagua Voirol et al. 2018, 2020). Samples from AB-treated individuals are frequently analysed along with untreated control samples to assess differences in bacterial community composition. However, no study has so far addressed whether different DNA extraction kits differentially influence the characterization of bacterial communities in Lepidoptera samples with different bacterial abundance. The aim of this study is to find out whether the choice of DNA extraction kit is critical when investigating bacterial communities associated with Lepidoptera samples that contain bacteria in different abundances. We studied the bacterial communities associated with P. brassicae eggs and adults harbouring different bacterial loads. Eggs were expected to have a low bacterial load (see Paniagua Voirol et al. 2020). To produce samples of P. brassicae adults with different bacterial load, butterflies were treated with ABs. A significant reduction of the bacterial abundance in P. brassicae butterflies upon AB treatment was previously confirmed via qPCR and reported by Paniagua Voirol et al. (2020).
We processed the samples using three different commercially available DNA extraction kits and addressed the following questions: (i) Does the applied kit, in combination with different 16S primers, affect the efficacy of the PCR amplification of the 16S rRNA gene from bacteria in eggs, AB-treated and untreated adults? (ii) How does the kit affect the detection of bacterial taxa in untreated and AB-treated adults? We studied this question by sequencing (MiSeq) the bacterial communities and identified bacteria that were consistently present in adults as well as bacteria associated with the kits used. (iii) To what extent does this so-called 'kitome' shape the detected bacterial community in untreated and AB-treated butterflies? Here, we compared the bacterial communities associated with butterflies with those obtained from negative extraction controls (NECs). We determined the dissimilarities and alpha diversities of the bacterial communities of the differently processed butterfly samples and of the NECs.
Our study shows that the results of the PCR amplification of the 16S rRNA gene vary depending on the kit used. The sequencing analysis revealed that the bacterial taxa detected in samples of control P. brassicae with high bacterial abundance were similar, regardless of the type of kit used. However, sequencing of samples with a low bacterial load and of NECs revealed bacterial contamination coming from the DNA extraction kits, which greatly shaped the detected bacterial communities and alpha diversities in these samples. Our study shows that a valid characterization of the bacterial community associated with Lepidoptera, especially with those harbouring a low bacterial load, needs method testing and obligatory consideration of NECs.

Insect rearing
Insects originated from a laboratory-reared colony (Institute of Biology, Applied Zoology/Animal Ecology, Freie Universität Berlin, Germany). Larvae of the Large White (P. brassicae) were reared on Brussels sprouts plants (Brassica oleracea var. gemmifera) in a climate chamber (18-h/6-h light/dark cycle, 160 µmol m⁻² s⁻¹ light intensity, 20°C and r.h. 70%) until pupation. The plants were grown in a greenhouse. Pupae were transferred to a separate climate chamber (18-h/6-h light/dark cycle, 220 µmol m⁻² s⁻¹ light intensity, 23°C and r.h. 70%), where adult butterflies emerged. Adults were fed with a 15% w/v honey solution provided in 1.5-ml Eppendorf tubes placed in the centre of artificial flowers. This insect line is here referred to as untreated or control line.
To elucidate how the choice of DNA extraction kit and bacterial contaminants therein affect the results when analysing insects with low bacterial abundance, we established an AB-treated line of P. brassicae. This AB-treated line was derived from the original P. brassicae rearing. For this line, larvae were reared on 7-week-old Brussels sprouts plants sprayed with a cocktail of four ABs (ampicillin, chloramphenicol, rifampicin and streptomycin), each in a concentration of 0.5 mg ml⁻¹ H₂O until the plant surface was uniformly covered with the solution. After pupation, the treatment with ABs was continued by feeding the butterflies with an AB-spiked aqueous honey solution (15% w/v honey dissolved in the above-mentioned mixture of ABs).

Sampling procedures
An egg sample (biological replicate) of P. brassicae consisted of a pool of 30 eggs collected from three different female butterflies (i.e. 10 eggs per female). We pooled eggs from three different females to minimize variation originating from individual females. For the collection of P. brassicae eggs, females laid egg clutches on the sterile side of a petri dish, while the other side was covered with a Brussels sprouts leaf to stimulate oviposition. The freshly laid eggs were further processed for DNA extraction (see below). In total, we analysed the bacterial abundance in 10 egg samples (biological replicates) by PCR (Fig. 1a).
We sampled adult control and AB-treated butterflies 2 weeks after emerging from the pupal stage. Each biological replicate consisted of an individual female after the removal of its wings. A total of 10 control and 10 AB-treated females were collected and processed with the different DNA extraction kits and subsequent PCR amplification of the 16S rRNA gene. For the sequencing of the bacterial community, we processed seven to eight adult individuals each of the AB-treated and control butterflies (Fig. 1b).
Figure 1 Experimental scheme of the sampling and further processing of Pieris brassicae samples. (a) Eggs were pooled from three females per sample to limit variation of bacteria detection across females. One egg sample consisted of 30 eggs obtained from three females. (b) Each butterfly sample consisted of a single female that was either untreated (control) or treated with antibiotics (AB). Samples were homogenized and TE buffer (280 µl) was added to split the homogenate in three aliquots of 50 µl each. Aliquots from the same sample were processed with each DNA extraction kit (A, B and C) (compare Table 1). The sample obtained by each DNA extraction was used for PCR amplification of the 16S rRNA gene using three different pairs of primers (a, b and c). For the adults, the product yielded with c primers (encircled) was then used for bacterial community sequencing due to its clear signal and adequate amplicon size. N represents the number of biological replicates (biological samples).

DNA extraction
Samples of P. brassicae eggs and adults were frozen in liquid nitrogen immediately after collection in sterile FastPrep® tubes and bead-ground for 15 s at 4500 rev min⁻¹ in a tissue homogenizer (Precellys Evolution®). We used three different DNA extraction kits (Table 1). The kits were chosen based on whether they have been reported for successful DNA extraction of bacteria (i) in insect eggs (Pankewitz et al. 2007), (ii) in insect larvae or adults (Kaltenpoth et al. 2011; Salem et al. 2013
To test each biological replicate with the three different DNA extraction kits, 280 µl of TE buffer was added to the homogenized samples, followed by brief vortexing. Then, three aliquots of 50 µl were taken from the homogenate and processed with each kit. We added the corresponding lysis buffer from each kit to the aliquots of the homogenized samples from adults and eggs, thereby following the manufacturer's instructions. The DNA extraction was continued as indicated in the manufacturer's instruction. Each sample was subjected to a lysozyme (Epicentre Ready-Lyse™) digestion for a period of 30 min following the Epicentre Ready-Lyse™ protocol to maximize the lysis of Gram-positive bacteria (Ketchum et al. 2018). Samples were further processed exactly as indicated by each kit's protocol.
We included NECs, consisting of mock samples (containing no insect sample), and positive controls, consisting of samples containing a resuspended (50 µl TE buffer) pellet of Escherichia coli DH5-α previously cultured in 1 ml of Luria Bertani (LB) broth at 37°C overnight. Extraction controls were processed at the same time as the true samples to control for kit-associated contamination and effectiveness of the bacterial DNA extraction.
Since contamination during DNA extraction can originate from different sources other than the kit itself (McFeters et al. 1993; McAlister et al. 2002; Witt et al. 2009; Motley et al. 2014), extractions were performed under clean bench conditions. All samples processed with each kit were handled at once including their respective controls. Plastic consumables and additional reagents (isopropanol, ethanol and molecular grade water) were of the same lot for all extractions.
Following the extraction, DNA concentrations were measured using the Thermo Scientific™ µDrop™ Plate according to the manufacturer's instructions. The A260/A280 and A260/A230 ratios were verified to be within the values known for high-quality DNA samples (Lucena-Aguilar et al. 2016).

Primers and PCR amplification of the 16S rRNA gene
For the PCR amplification of the bacterial 16S rRNA gene in the samples extracted with the types of kits mentioned above, we used three different pairs of primers (a, b and c) commonly used for detection of bacteria via PCR (Table 2). Primers targeting the V3-V4 regions of the 16S rRNA gene are considered more efficient at capturing bacterial community composition when compared to primers targeting other regions (e.g. V1-V3 region) (Castelino et al. 2017). Furthermore, the V3-V4 region has been targeted in important studies such as the Human Microbiome Project (Huttenhower et al. 2012; Fadrosh et al. 2014). Although no NGS platform is available that allows multiple-sample sequencing of the complete 16S rRNA gene, we also included a pair of primers (b), which amplify almost the entire 16S rRNA gene (Weisburg et al. 1991; Andreolli et al. 2013). Thus, we could assess whether the bacteria signal per se was influenced by the DNA extraction kit. The PCR was conducted using the JumpStart™ Taq ReadyMix™ from Sigma Aldrich with 50 ng of DNA template in a total volume of 50 µl. The PCR mixture was prepared under clean bench conditions. All consumables and reagents were of the same batch. The cycling parameters were as follows: an initial denaturation cycle at 94°C for 2 min followed by 30 cycles of denaturation at 94°C for 30 s, annealing (primers a: 52°C, b: 55°C, c: 55°C) for 30 s, extension at 72°C for 2 min and a final extension cycle at 72°C for 5 min.
In addition to the positive extraction controls and NECs, we included positive and negative PCR controls to validate the effectiveness of the reaction. Positive PCR controls consisted of PCR mixtures containing 50 ng of E. coli DH5-α DNA as template. Negative PCR controls consisted of the PCR mixture only, without any sample. A volume of 10 µl of the PCR product was run on a 1.2% agarose gel stained with ethidium bromide in 1X TAE buffer at 150 V for 30 min. Bands were visualized under UV light. The rest of the PCR product was kept for downstream sequencing.

Bacterial community sequencing
For bacterial community sequencing, we selected the PCR products given by the c primers, which targeted bacterial 16S sequences. These primers reliably produced a single conspicuous band (amplicon) with a suitable size (<500 bp) for the sequencing platform. We sequenced (Illumina MiSeq platform) amplicons from butterflies (control and AB-treated) processed with the three different DNA extraction kits and the respective NECs, to assess the extent to which the 'kitome' shapes the detected bacterial community in samples with high and low bacterial abundance. Amplicons from eggs were not further sequenced as the bacterial load in AB-treated butterflies and eggs has been reported to be similarly low (see Paniagua Voirol et al. 2020).
PCR products were purified with AMPure beads (Beckman Coulter, Brea, CA) and ligated to barcoded Illumina adapters via PCR using a high-fidelity DNA polymerase (Herculase II Fusion DNA Polymerase, Agilent). The cycling parameters were as follows: an initial denaturation cycle at 95°C for 2 min followed by eight cycles of denaturation at 95°C for 20 s, annealing at 52°C for 30 s, extension at 72°C for 30 s and a final extension cycle at 72°C for 3 min. Each library contained a specific combination of index adapters (dual-indexed) to allow later discrimination of samples after pooling. Concentrations of the ligated products were measured using a Qubit® fluorometer (Thermo Fisher Scientific) and brought to an equimolar ratio prior to pooling. The pooled, barcoded samples were sequenced at the Berlin Center for Genomics in Biodiversity Research (BeGenDiv) on the Illumina MiSeq platform (Illumina, San Diego, CA) using v3 600 cycles of (paired-end) sequencing. Sequencing reads were trimmed, denoised and overlapped using a full-stack R (R Core Team 2018) pipeline incorporating dada2 (Callahan et al. 2016a, 2016b) and phyloseq (McMurdie and Holmes 2013). Forward and reverse reads were trimmed to 275 and 175 bp, respectively, truncated at the first instance of a quality score less than 2 and filtered on a maximum expected error rate of two errors per truncated read. The remaining forward and reverse reads were dereplicated and denoised using a parameterized model of substitution errors (Callahan et al. 2016a, 2016b). The resulting denoised read pairs were merged and subjected to de novo chimera removal. Taxonomy was assigned using the Ribosomal Database Project (Cole et al. 2014) training set, version 16. The resulting exact sequence variants (see Table S1 for sequencing variant numbers per sample) were agglomerated at the genus level. The R script applied for the sequencing analysis is provided in File S1.
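The read-filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' R script (File S1) and does not call dada2; the function names, the order of the steps and the handling of reads shorter than the target length are assumptions made for illustration. The expected number of errors of a read is taken as the sum of the per-base error probabilities implied by its Phred scores.

```python
from typing import List, Optional


def expected_errors(quals: List[int]) -> float:
    """Expected number of errors: sum of 10^(-Q/10) over the Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def trim_and_filter(seq: str, quals: List[int], trunc_len: int,
                    trunc_q: int = 2, max_ee: float = 2.0) -> Optional[str]:
    """Return the processed read, or None if the read is discarded.

    Hypothetical helper; step order and the fate of short reads are assumptions.
    """
    # 1. Truncate at the first base whose quality score is below trunc_q
    #    (comparison follows the wording "less than 2" in the text).
    for i, q in enumerate(quals):
        if q < trunc_q:
            seq, quals = seq[:i], quals[:i]
            break
    # 2. Cut to the fixed length (275 bp forward, 175 bp reverse in the text);
    #    reads that are already shorter are discarded (assumption).
    if len(seq) < trunc_len:
        return None
    seq, quals = seq[:trunc_len], quals[:trunc_len]
    # 3. Discard reads with more than max_ee expected errors.
    if expected_errors(quals) > max_ee:
        return None
    return seq
```

For example, `trim_and_filter(forward_seq, forward_quals, trunc_len=275)` would apply the forward-read settings named in the text, where `forward_seq` and `forward_quals` are placeholders for a read and its quality scores.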

Statistical Analysis
Shannon indices were calculated using absolute counts (McMurdie and Holmes 2013) and analysed with Wald tests of linear models with linear combinations of  (Kuhn et al. 2013). We did not rarefy or normalize the libraries for alpha diversity determination because library sizes were not significantly different among the treatment groups (Willis 2019) (see Table S2). Chloroplast sequences were present in small amounts in most samples (see Table S3) and filtered out for the analysis. Plant or insect mitochondrial sequences were present in very minor, negligible amounts (Table S3). Bray-Curtis distance matrices were calculated using relative abundance data and principal coordinate analysis (PCoA) ordination was performed using phyloseq (McMurdie and Holmes 2013). Multivariate analysis of variance was tested using vegan (Oksanen et al. 2018). Differential abundance testing was performed using DESeq2 (Love et al. 2014) in conjunction with phyloseq. Briefly, genus-level counts were modelled using generalized linear models of the negative binomial family with a logarithmic link. For all analyses, dispersion parameters were estimated with a local fit and empirical Bayes shrinkage. Specifically, likelihood-ratio tests were performed to test for the main effect of the applied kit and for a kit by treatment interaction. Counts for each genus were normalized by size factors accounting for variation in sequencing depth across samples. Size factors were estimated using the 'median of ratios' method described by equation 5 in Anders and Huber (2010). A modified geometric mean was used by taking the nth root of the product of the non-zero counts (McMurdie and Holmes 2013). The impact of multiple testing correction was mitigated by independent filtering using the mean normalized count for each genus across all samples. Genera were considered to be differentially abundant at FDR-corrected P < 0.05.
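The size-factor normalization can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the DESeq2 implementation: n is taken to be the total number of samples, the median is taken on the log scale, and only positive counts contribute to a sample's ratios. The function names and the genus-by-sample list layout are hypothetical.

```python
import math
from statistics import median
from typing import List


def modified_geometric_means(counts: List[List[int]]) -> List[float]:
    """Per-genus modified geometric mean; counts[g][s] is genus g in sample s.

    The n-th root of the product of the non-zero counts is taken, with n
    assumed to be the total number of samples.
    """
    n_samples = len(counts[0])
    means = []
    for row in counts:
        positive = [c for c in row if c > 0]
        if positive:
            means.append(math.exp(sum(math.log(c) for c in positive) / n_samples))
        else:
            means.append(0.0)  # genus absent from every sample
    return means


def size_factors(counts: List[List[int]]) -> List[float]:
    """Median-of-ratios size factor for each sample (column)."""
    geo = modified_geometric_means(counts)
    factors = []
    for s in range(len(counts[0])):
        # Log ratios of each positive count to its genus-wise reference value.
        log_ratios = [math.log(row[s]) - math.log(g)
                      for row, g in zip(counts, geo) if g > 0 and row[s] > 0]
        # median() raises StatisticsError if a sample has no positive counts.
        factors.append(math.exp(median(log_ratios)))
    return factors
```

Dividing each sample's counts by its size factor puts samples of different sequencing depth on a common scale; the modified geometric mean keeps genera that are absent from some samples usable as references, which matters for sparse genus-level tables.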

Results
Effects of the DNA extraction kit on PCR amplification of the 16S rRNA gene
We extracted DNA from P. brassicae eggs, AB-treated and untreated (control) adults using three different DNA extraction kits. We assessed whether PCR amplification of the bacterial 16S rRNA gene varies depending on the kit used to process the sample. For control P. brassicae adults, the intensity of the PCR band corresponding to the bacterial 16S rRNA gene varied depending on the combination of primers and extraction kit. Primers a produced the weakest PCR band in combination with kit A, but revealed a clearer band in combination with kits B and C. Primers b and c were highly effective at producing conspicuous amplicons, regardless of the kit used (Fig. 2a).
In AB-treated butterflies, the bacterial signals (PCR bands) were very weak in intensity, irrespective of the type of primers and extraction kits. Hence, these butterflies harbour a very low bacterial biomass. Primers a produced a second amplicon when used in combination with kits B and C. Sanger sequencing (Microsynth AG) confirmed that this amplicon originates from the insect 18S rRNA gene (Table S4). Interestingly, this band was not observed when using kit A (Fig. 2a).
For P. brassicae eggs (from untreated females), the PCR band corresponding to the bacterial 16S rRNA gene was extremely faint or absent in the analysed samples (Fig. 2b). Similar to the samples from AB-treated butterflies, primers a produced an amplicon belonging to the insect 18S rRNA gene (Table S4). This second band was
In summary, the intensity of the bacteria signal (PCR band) obtained from untreated butterflies was dependent on the combination of DNA extraction kit and primers used. In contrast, the bacteria signal intensity was very weak when analysing the butterfly eggs and AB-treated adults, regardless of the kit and primer pair used.

Effects of the DNA extraction kit on detection of bacteria taxa in P. brassicae butterflies
We sequenced bacterial PCR amplicons from untreated and AB-treated butterflies processed with the three different DNA extraction kits (Table 1) to assess whether the kit used affects which bacteria taxa are detected. Out of the two primers (a and c) that can yield an amplicon of adequate size (<500 bp) for the Illumina MiSeq platform, we selected primers c because they yielded a single amplicon of the 16S rRNA gene (compare Fig. 2).
The sequencing of bacteria in P. brassicae adults revealed striking differences in the bacterial communities associated with untreated and AB-treated butterflies. In untreated P. brassicae adults, the community composition showed great homogeneity across samples processed with the different kits. Thus, the sequencing outcome of their bacterial community was not significantly affected by the kit used (Fig. 3). These samples were dominated by four bacteria genera (Gluconobacter, Lactococcus, Serratia and Yersinia), which accounted for 98-99.9% of the reads. In contrast, the bacterial communities found in AB-treated butterflies and the NECs obtained with kits A, B and C were highly heterogeneous and varied with the kit used for processing the samples; they were dominated by bacterial taxa which are commonly reported as contaminants (Salter et al. 2014). The taxa Pseudomonas and Acinetobacter were present in all samples of AB-treated butterflies processed with the three kits.
Furthermore, when analysing the differential abundance of bacteria genera in the differently extracted samples of control and AB-treated butterflies, a likelihood ratio test revealed increased counts of Burkholderia (P < 0.001) and Methylobacterium (P = 0.023) associated with kit A (Fig. 4). This result indicates that those members are contaminants belonging to kit A and are not part of the bacterial community of P. brassicae adults.

Effects of the 'kitome' on the characterization of bacterial communities in P. brassicae butterflies by dissimilarity and diversity indices
We further determined the dissimilarity of the bacterial communities in the differently processed butterfly samples and assessed the factors, which account for this. Since our analysis indicated the presence of bacterial contaminants in the kits, we included the detected bacteria
[Figure legend, partial] Table S5 for proportions represented by these taxa). Each group of three stacked bars represents one individual butterfly processed with DNA extraction kit A, B or C. Negative extraction controls (NECs) correspond to mock samples processed in parallel using each kit. N = 8 untreated adult butterflies (white) for control, 7 AB-treated adult butterflies (red) and 1 NEC per kit.
Bacterial communities differed according to treatment regardless of the extraction kit (F = 14.3, R² = 0.38, P = 0.001). The bacterial community of the pooled control butterfly samples differed significantly from both the one of the pooled AB-treated samples (F = 25.29, R² = 0.37, P = 0.003) and from the pooled NECs (F = 11.70, R² = 0.32, P = 0.003) (Fig. 5).
When further analysing the impact of treatment and kit on the dissimilarities, we found a significant effect of the AB treatment (F = 17.10, R² = 0.38, P = 0.001) and kit (F = 3.25, R² = 0.07, P = 0.001) and an effect of a kit by treatment interaction (F = 2.12, R² = 0.10, P = 0.005) on the dissimilarities.
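To make the reported F and R² values easier to interpret, the following Python sketch computes a Bray-Curtis dissimilarity and the pseudo-F and R² of a one-factor permutational multivariate analysis of variance. It assumes that the vegan analysis partitions squared dissimilarities in the usual way for such tests; the text does not name the vegan function, so this is an assumption. The function names are hypothetical, and the permutation step that yields the P value is omitted.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def permanova_stats(samples: List[Sequence[float]],
                    groups: List[str]) -> Tuple[float, float]:
    """Pseudo-F and R² for a single grouping factor (needs >= 2 groups)."""
    n = len(samples)
    # Squared dissimilarities for every pair of samples (i < j).
    d2 = {(i, j): bray_curtis(samples[i], samples[j]) ** 2
          for i, j in combinations(range(n), 2)}
    ss_total = sum(d2.values()) / n
    labels = sorted(set(groups))
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(d2[pair] for pair in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    a = len(labels)
    pseudo_f = (ss_between / (a - 1)) / (ss_within / (n - a))
    r2 = ss_between / ss_total  # share of total variation explained by the factor
    return pseudo_f, r2
```

Read this way, an R² of 0.38 for the AB treatment means that about 38% of the total variation in the dissimilarities is attributed to treatment, against 7% for the kit.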
When considering only the samples from AB-treated butterflies, we found a significant effect of the type of kit (F = 4.78, R² = 0.31, P = 0.001) on community dissimilarities. Hence, samples of AB-treated butterflies processed with kit A differed from kit B (F = 6.24, R² = 0.34, P = 0.003) and kit C (F = 5.96, R² = 0.31, P = 0.003). In contrast, when considering only the samples from untreated butterflies, there was no effect of the kit (F = 2.66, R² = 0.08, P = 0.063).
To characterize the bacterial communities by their alpha diversities, we compared the Shannon indices of the bacterial communities detected in the differently processed samples of control and AB-treated butterflies as well as in the NECs. We used bacteria genera and their abundance for calculating the index. The alpha diversities varied according to treatment (Fig. 6). The bacterial communities detected in AB-treated butterflies and NECs showed a greater alpha diversity than those in untreated butterflies. Thus, samples from untreated butterflies significantly differed in their diversity from both the AB-treated butterflies (T = -10.5, df = 46, P < 0.01) and NECs (T = -4.84, df = 46, P < 0.01). There was no significant effect of kit nor of a kit by treatment interaction on the alpha diversity index (Fig. S1).
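The Shannon index used here can be written as H = -Σ p_i ln p_i, where p_i is the proportion of reads assigned to genus i. The short Python sketch below illustrates why a community dominated by a few genera scores lower than one in which many genera occur at similar abundance. The natural logarithm is an assumption, and the function name and example counts are hypothetical, not data from this study.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i) over genera with non-zero counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical genus-level read counts, for illustration only.
dominated = [970, 10, 10, 10]   # one genus accounts for almost all reads
even = [250, 250, 250, 250]     # four genera at equal abundance

print(shannon_index(dominated), shannon_index(even))
```

With four genera the index reaches its maximum, ln 4, only when all four are equally abundant, so a sample consisting of many low-abundance contaminants can score higher than a sample with a few dominant taxa.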
In summary, the sequencing of amplicons obtained with primers c revealed that the detected bacterial community composition of samples with high bacterial abundance remained consistent across kits and was dominated only by a few taxa. In the presence of such dominant members, kit-associated bacteria occurring in very low abundances could not be detected upon sequencing. Consequently, the alpha diversity of the bacterial community of untreated control butterflies was very low. In contrast, the sequencing outcome of the community composition in samples with low bacterial abundance (AB-treated adults) was dependent on the extraction kit and the bacterial contaminants therein. Many bacteria genera of low dominance were detectable in these samples upon sequencing, resulting in a high alpha diversity of the respective communities. Hence, the 'kitome' shaped the

Discussion
Assessing the most appropriate DNA extraction technique is fundamental when investigating the bacterial community in an unexplored ecosystem. Here, we analysed how the selection of DNA extraction kit and bacteria therein affect the detection of bacteria in lepidopteran samples with different bacterial abundance: eggs, AB-treated and untreated butterflies. We took P. brassicae as a model and found, as expected, only a very low abundance of bacteria associated with P. brassicae eggs and AB-treated butterflies, but consistent detection of bacteria in conspicuous abundance in untreated P. brassicae adults. In this latter type of sample, the bacteria signal intensity obtained by the PCR analysis varied with the type of DNA extraction kit and the primers used, suggesting differences in the extraction efficiency of the kits and/or their compatibility with the primers. However, Illumina (MiSeq) sequencing revealed that the type of kit used for extraction of bacterial DNA hardly exerts any effects on the taxon identification of the detected bacterial community in those lepidopteran samples with a high bacterial abundance (untreated butterflies). In striking contrast, the kit determined the bacterial community composition in the samples with a low bacterial abundance; the 'kitome' shaped the bacterial community composition and significantly affected the determined alpha diversity index of the bacterial community. How to explain that the PCR analysis revealed clear effects of the kit on the bacteria signal intensity in combination with the type of primers used? In addition to primer pair c, primer pair b also functioned kit independently and provided a strong bacteria signal in samples from control adults. Primers a, however, gave a clear signal when samples were processed with kits B and C, and only a weak signal when samples were processed with kit A. 
The reason behind the weak signal given by primers a in samples processed with kit A might be the presence of DNA extraction-related PCR inhibitors differentially affecting primer annealing (Schrader et al. 2012). This idea is supported by the weaker signal given by the positive extraction control in comparison to the positive PCR control (compare '+' vs '(+)' kit A, Fig. 2a), showing that the extraction with kit A reduced the efficacy of the PCR with primers a, but not with primers b and c. This kit-mediated effect might be caused by differences in the quantitative and/or qualitative chemical composition of the kits, affecting the purity of the DNA product. However, the exact chemical composition of the kit reagents is not publicly available. Whether such PCR inhibition would be observed when using a more sensitive DNA polymerase remains to be tested. We chose the JumpStart™ DNA polymerase due to its high performance at a price that is representative of most DNA polymerases used for the amplification of the 16S gene. Another potential explanation for the weaker signal shown by kit A in combination with primers a is a low bacterial DNA yield in relation to the total DNA yield (from insects and their bacteria) obtained with this kit. We measured the total DNA yield obtained by the tested kits and found that kit A produced a significantly lower yield per sample than kits B and C (Fig. S2), thus showing a reduced extraction efficiency. DNA yield is used as a quality parameter for DNA extraction kits (Pollock et al. 2018). Different DNA yields obtained by different kits may be due to their varying capacity to disrupt cells (de Bruin and Birnboim 2016), especially if the sample contains Gram-positive bacteria and endospores, which are less prone to lysis (Pollock et al. 2018). Thus, poor DNA extraction can produce an incomplete coverage of the true bacterial community during downstream PCR amplification (Ariefdjohan et al. 2010).
Overall, our results indicate that the type of kit can affect the efficacy of the 16S primers.
Interestingly, we found that primers a also amplify 18S rRNA (insect) amplicons. This renders them less suitable for (MiSeq) sequencing of bacterial communities in P. brassicae since it demands an extra amplicon purification process prior to (MiSeq) sequencing. Hence, primers c were chosen for the community sequencing due to the single amplicon they yield and the explicit recommendation by Illumina to use these primers for the MiSeq platform (Illumina 2013; Klindworth et al. 2013). Remarkably, the use of these primers produced very low numbers of plant and insect mitochondrial 16S reads (Table S3). This result may vary when using these primers on other lepidopteran species and life stages (e.g. folivorous larvae with high plant content in their gut). In such cases, use of insect- or plant-specific blocking primers might greatly favour the amplification of bacterial sequences over insect and plant sequences (Vestheim and Jarman 2008).
The simplicity of the bacterial community in untreated P. brassicae butterflies likely explains the lack of significant differences in the composition of the detected adult-associated bacterial community when using different DNA extraction kits. The majority of studies showing a large effect of DNA extraction methods on the retrieved bacterial community composition focus on animal samples containing robust, taxon-rich bacterial communities (Scupham et al. 2007;Yuan et al. 2012;Henderson et al. 2013;Larsen et al. 2015;Burbach et al. 2016;Weber et al. 2017;Ketchum et al. 2018). Lepidoptera, in contrast, are known to be colonized by rather few bacterial taxa (Paniagua Voirol et al. 2018). Our results indicate that the type of kit did not influence the detection of bacterial taxa when handling samples with conspicuous bacterial abundance, but low community diversity. Samples from AB-treated butterflies and NECs had higher bacterial diversity (Shannon indices) than the control adult samples. This higher calculated diversity turned out to be due to several kit-contaminating bacteria, which were detected in these types of samples because of the highly reduced abundance of bacteria associated with the butterflies. The presence of several bacterial contaminants at low and similar abundances, as well as the lack of highly dominant taxa, contributed to the high Shannon index, which increases as more taxa are present at similar abundances. We suggest that the bacterial kit contaminants present in samples from untreated butterflies were amplified only marginally or not at all by PCR, and thus were not identified by the sequencing analysis, presumably due to the high abundance of the bacteria associated with these Lepidoptera samples.
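The behaviour of the Shannon index described above can be illustrated with a minimal sketch. It assumes the standard definition H' = -Σ p_i ln p_i, where p_i is the proportion of reads assigned to taxon i. The function name and the read counts are hypothetical and serve only as illustration; they are not data from this study, nor the pipeline used to produce the reported indices.

```python
import math

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical read counts per genus (illustrative values, not data from this study).
dominated = [9700, 100, 100, 100]       # one highly dominant taxon plus rare taxa
even = [250, 250, 250, 250, 250, 250]   # several taxa at similar abundance, no dominant taxon

h_dominated = shannon_index(dominated)
h_even = shannon_index(even)
print(f"dominated community: H' = {h_dominated:.3f}")
print(f"even community:      H' = {h_even:.3f}")
```

Under these assumed counts, the community without a dominant taxon yields the higher index even though it contains far fewer reads in total, which mirrors how a set of low-abundance contaminants can inflate the diversity estimate of a sample with low bacterial biomass.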
Understanding the impact of bacteria on the biology of organisms associated with them is a challenge not only for entomologists, but for all biologists and physicians who need to analyse samples with low bacterial biomass, such as mammalian placenta or blood plasma. This is challenging because commonly used methods for samples with high microbial abundance (e.g. human gut and faeces) do not generate reliable readouts for samples with low microbial abundance (Weiss et al. 2014). There is increasing evidence that analyses of samples with low biomass are highly susceptible to biased results that overestimate the impact of bacterial taxa belonging to the bacterial community present in the kit, that is, the 'kitome', or to other contaminants (Weiss et al. 2014). Yet, many studies of bacterial communities lack sequencing of negative controls or descriptions of contamination removal methods (Salter et al. 2014;Glassing et al. 2016). Results like ours, showing higher bacterial diversity in the samples from AB-treated butterflies in comparison to control butterflies, could be misleading if no negative controls had been included in the analysis.
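One simple way to make use of sequenced negative controls is to ask what share of a sample's reads belongs to genera that were also detected in the NECs. The following sketch assumes genus-level read counts stored as dictionaries. The function name, the placeholder 'GenusX' and all counts are hypothetical, and the approach is a generic illustration rather than the contamination assessment applied in this study.

```python
def nec_read_fraction(sample_counts, nec_counts):
    """Fraction of a sample's reads assigned to genera also detected in NECs."""
    nec_genera = {genus for genus, n in nec_counts.items() if n > 0}
    total = sum(sample_counts.values())
    if total == 0:
        return 0.0
    shared = sum(n for genus, n in sample_counts.items() if genus in nec_genera)
    return shared / total

# Hypothetical genus-level read counts (illustrative values, not data from this study).
nec = {"Burkholderia": 40, "Methylobacterium": 25, "Pseudomonas": 10}
untreated = {"GenusX": 9800, "Burkholderia": 20, "Pseudomonas": 180}
ab_treated = {"GenusX": 50, "Burkholderia": 300, "Methylobacterium": 250, "Pseudomonas": 100}

frac_untreated = nec_read_fraction(untreated, nec)
frac_ab_treated = nec_read_fraction(ab_treated, nec)
print(f"untreated:  {frac_untreated:.1%} of reads from genera found in NECs")
print(f"AB-treated: {frac_ab_treated:.1%} of reads from genera found in NECs")
```

A high fraction flags a sample whose profile may be dominated by contaminants, but the presence of a genus in an NEC does not by itself prove that its reads in a sample are artefacts; such genera may also be genuine associates of the host.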
The analysis of negative controls should be a basic requirement in the study of bacterial associates of Lepidoptera, especially when sampling life stages containing low bacterial biomass such as lepidopteran eggs or larvae (Hammer et al. 2017;Paniagua Voirol et al. 2020), or when analysing the effects of a treatment with ABs on insect-associated communities. Interestingly, many of the core bacterial genera reported in Lepidoptera, such as Pseudomonas, Bacillus, Enterococcus and Acinetobacter (Paniagua Voirol et al. 2018), have also been reported as common laboratory contaminant genera (Salter et al. 2014;Glassing et al. 2016). Even though this does not imply that these bacteria are mere artefacts, appropriate controls are needed to confirm their association with insects. Since Pseudomonas and Acinetobacter were present in negative controls processed by all three kits tested here, we cannot fully exclude that these bacteria originate from sources of contamination other than the kit, such as plastic consumables (Motley et al. 2014), ultrapure water (McFeters et al. 1993;McAlister et al. 2002), PCR reagents (Grahn et al. 2003) or the laboratory environment (Witt et al. 2009).
Although Burkholderia and Methylobacterium were present in all samples processed by the different kits, their counts were significantly higher in samples processed with kit A (Fig. 4). Hence, the use of kit A was associated with higher counts of these two bacterial genera. This, along with other taxa found in NECs, strongly indicates that DNA extraction kits contain bacterial DNA that influences the results of community sequencing if samples have a low bacterial abundance.
In summary, the DNA extraction kit can affect the characterization of bacterial associates of Lepidoptera, especially when samples contain low bacterial abundance. In contrast to untreated butterflies, the detected bacterial community of AB-treated butterflies with reduced bacterial abundance was largely shaped by kit-associated contamination. Characterizing bacterial communities by alpha diversity indices requires the parallel sequencing of NECs to prevent the 'kitome' from shaping conclusions about the diversity of the actual bacterial community under study.

Supporting Information
Additional Supporting Information may be found in the online version of this article:
Data S1. Materials and methods.
Data S2. Results.
Table S1. Amplicon sequence variants (ASV) per sample before agglomeration at the genus level.
Table S2. Sequencing library size per sample.
Table S3. Percentage of reads from chloroplast and insect/plant mitochondria per sample.
Table S4. Sequence of the insect 18S rRNA gene amplified by 515F/806R primers.
Table S5. Proportion of the Pieris brassicae bacterial community represented by the 12 most abundant taxa.
Figure S1. Alpha diversity (Shannon index) of butterfly-associated bacterial communities extracted with different DNA extraction kits.
Figure S2. Total DNA yield per sample obtained by using three different extraction kits.