SARS-CoV-2 viral RNA is stable at room temperature for 31 days when preserved with BioSaliva
To determine the stability of SARS-CoV-2 viral RNA in gargle specimens, we spiked 4 gargle specimens from healthy volunteers with SARS-CoV-2 viral RNA at a final concentration of 20 copies/µl, mixed them with BioSaliva Collection Buffer at a 1:1 ratio, and stored them at room temperature (22–27 °C), in a refrigerator (2–8 °C; referred to below as 4 °C), in a − 20 °C freezer (− 15 to − 20 °C), and in a − 80 °C freezer (− 75 to − 85 °C). A total of 80 gargle specimens were extracted on days 0, 7, 14, 21, and 31 to check the stability of the spiked SARS-CoV-2 viral RNA. We found that the detection of both target genes remained stable for 31 days at room temperature, − 20 °C, and − 80 °C (Fig. 1A, B), while no viral RNA was detected in gargle specimens stored at 4 °C after 7 days.


(A), (B) Stability of SARS-CoV-2 viral RNA at room temperature, 4 °C, − 20 °C, and − 80 °C for 31 days. Both target genes, helicase (A) and RdRP (B), were detected with no increase in Ct values at room temperature and − 80 °C, while no SARS-CoV-2 viral RNA was detected in samples stored at 4 °C. (C) Comparison of Ct values between spiked gargle specimens with and without BioSaliva Collection Buffer. An increased Ct value was observed in spiked gargle specimens without BioSaliva Collection Buffer.
To assess whether BioSaliva Collection Buffer preserved RNA integrity, we compared the spiked gargle specimens with and without the addition of the Collection Buffer at room temperature. We observed a shift of 5 Ct values or higher in spiked gargle specimens without Collection Buffer after incubation at room temperature for 2 days (Fig. 1C). This implies that adding the Collection Buffer to gargle specimens preserves RNA integrity.
As no viral RNA was detected in gargle specimens stored at 4 °C, a Kruskal–Wallis test was performed to compare the median differences between samples stored at room temperature, − 20 °C, and − 80 °C. For the helicase target gene, there was no significant difference in detection among the three temperatures (p-value = 0.8305). There was also no significant difference in the detection of RdRP among the three temperatures (p-value = 0.6478).
Comparison between NPOP swabs and gargle specimens in detecting SARS-CoV-2 in the inpatient cohort
We first sought to determine the sensitivity and specificity of both naso-oropharyngeal swabs (NPOP swabs) and gargle specimens for diagnosing COVID-19. Fifty-three inpatients and 13 healthy volunteers from RSDK and RSND were recruited with written informed consent for the collection of NPOP swabs and gargle specimens after their confirmation of positive or negative detection of SARS-CoV-2. The mean age (± SD) of the participants was 45.4 ± 16.5 years, and the majority were females (n/N = 41/66; 60.2%). All of the samples were collected within a median (IQR) duration of 3 days (1.75 days) since symptom onset (Supplementary Table S1). We found the overall agreement between NPOP swabs and gargle specimens to be 86.36%, with a sensitivity of 87.23% (95% CI: 74.83–94.02%) and specificity of 84.21% (95% CI: 62.43–94.48%). Cohen’s κ analysis (κ = 0.682) indicated substantial agreement between NPOP swabs and gargle specimens. Among the 50 patients with positive viral detection, both NPOP swabs and gargle specimens were positive in 41 (82%), while 6 (12%) had NPOP swab-positive/gargle-negative results and 3 (6%) had gargle-positive/NPOP swab-negative results. Compared to the previous qRT-PCR results, NPOP swabs did not detect 6 samples from previously confirmed positive patients, which resulted in a sensitivity of 88.68% (95% CI: 77.42–94.71%) and specificity of 100% (95% CI: 77.19–100%). On the other hand, gargle specimens did not detect 9 samples from previously confirmed positive patients, giving a sensitivity of 83.02% (95% CI: 70.77–90.80%) and specificity of 100% (95% CI: 77.19–100%). Overall, our results show that gargle specimens can be a viable alternative to NPOP swabs for specimen collection (Table 1 and Supplementary Table S2A, B).
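As an illustration of how the agreement statistics above relate to the paired counts, the following minimal Python sketch recomputes them from a 2×2 table. It is not the authors' analysis code. The both-negative count (16) is not stated directly and is inferred here as 66 participants minus the 50 with positive detection; the Wilson score interval is an assumption about the confidence interval method, chosen because it matches the reported bounds. The function names are illustrative.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def agreement_stats(both_pos, ref_only, test_only, both_neg):
    """Sensitivity, specificity, overall agreement, and Cohen's kappa
    for an index test (gargle) against a reference test (NPOP swab)."""
    n = both_pos + ref_only + test_only + both_neg
    ref_pos = both_pos + ref_only      # reference-positive total
    test_pos = both_pos + test_only    # index-test-positive total
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance from the marginal totals
    expected = (ref_pos * test_pos + (n - ref_pos) * (n - test_pos)) / n**2
    return {
        "sensitivity": both_pos / ref_pos,
        "specificity": both_neg / (both_neg + test_only),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Inpatient cohort: 41 both positive, 6 NPOP-only, 3 gargle-only,
# 16 both negative (inferred, see the note above).
stats = agreement_stats(both_pos=41, ref_only=6, test_only=3, both_neg=16)
sens_ci = wilson_ci(41, 47)   # gargle-positive among NPOP-positive
spec_ci = wilson_ci(16, 19)   # gargle-negative among NPOP-negative

for name, value in stats.items():
    print(f"{name}: {value:.4f}")
print(f"sensitivity 95% CI: {sens_ci[0]:.4f} to {sens_ci[1]:.4f}")
print(f"specificity 95% CI: {spec_ci[0]:.4f} to {spec_ci[1]:.4f}")
```

Under these assumptions, the sketch reproduces the reported sensitivity (87.23%), specificity (84.21%), overall agreement (86.36%), and κ (0.682), along with the reported confidence bounds.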
Comparison between NPOP swabs and gargle specimens in detecting SARS-CoV-2 in the outpatient cohort
To further validate the performance of gargle specimens, a total of 244 outpatients from RSDK and RSND were recruited for comparison between NPOP swabs and gargle specimens. The mean age (± SD) of the participants was 34.3 ± 12.5 years, and the majority were females (n/N = 144/244; 63.2%). Histories of symptom onset and close contact with COVID-19 patients were obtained from 219 (89.75%) subjects, and 22.54% (n = 55) of them were reported to be asymptomatic. All of the samples were collected within a median (IQR) duration of 3 days (2 days) since symptom onset (Supplementary Table S1). For the outpatient group, we found an overall agreement of 86.48% between NPOP swabs and gargle specimens, indicating substantial agreement (κ = 0.722), with a sensitivity of 85.14% (95% CI: 78.52–89.97%) and specificity of 88.54% (95% CI: 80.64–93.48%). Similar to the results of the inpatient study, 79.25% (n/N = 126/159) of participants with positive viral detection had both NPOP swab-positive and gargle-positive results, 13.84% (n = 22) had NPOP swab-positive/gargle-negative results, and 6.92% (n = 11) had gargle-positive/NPOP swab-negative results. The performance of gargle specimens on outpatients was found to be similar to the performance on inpatients, indicating that gargle specimens can be used in a population context (Table 2).
Preference of sample type was collected from all participants, where 97.10% (n/N = 301/310) of the participants preferred the usage of gargle specimens as a sample to diagnose COVID-19 (Supplementary Table S3).
High sensitivity of gargle specimens for detecting SARS-CoV-2 across low, moderate, and high NPOP Ct groups
When comparing the Ct values between NPOP swabs and gargle specimens, we found significant differences in the two virus target genes between the sample types (Wilcoxon matched-pairs signed rank test, p-value < 0.0001), with the median difference for each target gene > 8 Ct (Fig. 2A, B). On the other hand, the median Ct value of the human internal control target gene RPP30 was lower in gargle specimens than in NPOP swabs (Fig. 2C). Additionally, we observed that the majority of the discrepant samples occurred when NPOP swab Ct values were > 31, which indicates that disagreement occurred when the viral load was low. This confirms that the viral load in gargle specimens was lower than that in NPOP swabs and that some variation exists between sample types 4,11.


Sensitivity of gargle specimens is comparable to NPOP swabs for the detection of COVID-19. Higher Ct values were observed for the majority of gargle samples on both helicase (A) and RdRP (B) target genes. (C) RPP30’s Ct values were significantly lower in gargle specimens. (D) Difference in Ct value distribution when analyzed by NPOP Ct groups (low: < 20; moderate: 20–29; high: > 30). (E) However, the lower viral load in gargle specimens does not impact the performance of gargle specimens in detecting COVID-19, as the sensitivity is still 91.38% for Ct < 35.
Analyzing the Ct differences in the low Ct (NPOP Ct < 20), moderate Ct (NPOP Ct 20–29), and high Ct (NPOP Ct > 30) groups, we found that the median difference widened as the NPOP Ct values decreased (Fig. 2D). The median differences for the low Ct groups were 10.40 for helicase and 9.34 for RdRP. For the moderate Ct groups, the median differences were 8.85 for helicase and 8.07 for RdRP. For the high Ct groups, the median differences were 2.39 for helicase and 0.03 for RdRP. However, the effect of a lower viral load was marginal, as the sensitivity of gargle specimens to diagnose COVID-19 was still 91.38% for NPOP Ct values below 35 (Fig. 2E).
Ct values were lower during the earlier period of infection in both NPOP swabs and gargle specimens (Fig. 3A, B), with NPOP swabs having lower Ct values than gargle specimens during that period. The sensitivity of gargle specimens in detecting SARS-CoV-2 dropped when samples were collected more than 5 days after symptom onset (Fig. 3C), although more samples were detected as positive for SARS-CoV-2 in gargle specimens (n = 8) than in NPOP swabs (n = 5). On the other hand, age and total number of symptoms did not correlate with the Ct values observed in either NPOP swabs or gargle specimens (Supplementary Fig. S1A, B). This confirms the observation by Zou et al. 12 that higher viral loads were detected soon after symptom onset and that there was no difference in viral load between asymptomatic and symptomatic patients.


Ct values were lower in earlier periods of infection for both NPOP swab and gargle specimens, as evidenced by the trend in the virus target genes helicase (A) and RdRP (B). This resulted in a lower sensitivity (80%) on samples collected longer than 5 days since symptom onset (C).
Sensitivity and specificity of gargle specimens are highly replicable with a different RNA extraction kit and qRT-PCR kit
To assess the repeatability of the performance of gargle specimens in diagnosing COVID-19, we performed a comparison between NPOP swabs and gargle specimens using other commercial extraction and qRT-PCR kits at Universitas Indonesia (Table 3). Similar to the results on the inpatient and outpatient cohorts of RSDK and RSND, we found an overall agreement of 85% with a sensitivity of 85% (95% CI: 63.96–94.76%) and specificity of 100% (95% CI: 72.25–100%). However, there was no significant median difference observed between the Ct values of the NPOP swabs and gargle specimens (Supplementary Fig. S2A, B). We also observed high sensitivity (94.12%) of gargle specimens for NPOP swab Ct < 35 in the validation by Universitas Indonesia (Supplementary Fig. S2C). Thus, our results demonstrate the wide applicability of gargle specimens for adoption with other existing workflows.

