Study population
A total of 342 faecal samples were collected from healthy subjects recruited in different studies. Six subjects provided a second stool sample one year after the first collection: for them, only data from the first sampling were considered in the analyses while the second sample was used to assess intra-individual variability. One subject was excluded because few sRNA-Seq reads were aligned on miRNA sequences. The final study population consisted of 335 subjects provided by both stool sRNA-seq data and lifestyle questionnaires (Fig. 1). Subjects had an average age of 44.7 ± 14.7 years old (range: 18–81) and 63.6% were females (Table 1).


Stool miRNA profiles and analysis of the intra/inter-individual expression variability
On average, 10.3 million single-end reads per sample were obtained from sRNA-Seq with a median of 64,281 reads (0.92%) assigned to human miRNA annotations (Supplementary Table 1A). In total, 3,277 miRNAs (93% of our miRNA reference) were detected in at least one subject, but an extensive variability was observed among individual miRNA profiles (ranging from 188 to 1,949 detected miRNAs per sample), with a median of 594 miRNAs detected in stool of the studied population. The chromosomal distribution of all the stool miRNAs detected was compared with that of the total reference miRNome investigated (n = 3,524): an average of 93.7% of detected miRNAs per chromosome was observed, with the highest percentage on chromosome Y and the lowest on chromosome 2 (100% and 88.2% of the total miRNA coding genes, respectively; Supplementary Table 1B).
Four hundred and forty-nine miRNAs (13.8%) were detected in at least half of the analysed samples (Supplementary Table 1B). Among them, nine miRNAs (miR-320e-5p, miR-607, miR-647-3p, miR-1246-3p, miR-1302, miR-3125, miR-5698, miR-6075, and miR-6777-5p) were detected in all the samples analysed. miR-3125 was characterised by the highest median expression levels (2,051 reads), followed by miR-6075 (921 reads), and miR-1246-3p (884 reads).
To evaluate the inter-individual variability of stool miRNA expression levels across the study population, a coefficient of variation (CV) was calculated for all miRNAs with a median read expression level of at least one (n = 2,031). The resulting coefficients were used to rank miRNAs according to their expression variability (Supplementary Table 1C). miR-647-3p, miR-6075, and miR-3125, also expressed in all samples, showed the lowest expression variability (CV ranging from 0.48 to 0.50, Fig. 2a).


(a) Scatter plot showing the stool miRNA inter-individual variability expressed as a relation between median expression levels and coefficient of variation (CV). Red dots represent miRNAs detected in all samples. On the left of the dotted line are reported those miRNAs characterised by the lowest CV. (b) Scatter plots showing the stool miRNA intra-individual variability of four selected subjects who provided a stool sample at two different time points.
Repeated samples collected from six subjects were used to assess the stability of stool miRNA expression levels over time. A second faecal sample was collected approximately one year after the first collection (min = 378 days, max = 560 days). No significant changes in lifestyle habits were reported from the questionnaires compiled by the participants on both occasions. A Spearman correlation analysis between miRNA expression levels at the two time points showed a correlation coefficient ranging from 0.54 (p < 0.001) to 0.72 (p < 0.0001) (Fig. 2b). A Wilcoxon paired test performed on the set of nine miRNAs detected in all samples showed that they were also expressed in the repeated samples and did not show any significant difference in mean expression levels (p > 0.05), except for miR-3125 (Supplementary Table 1D).
miRNA profiles in relation to common traits
miRNA expression levels were explored in relation to sex, age, menopausal status, and BMI. Initially, miRNA profiles of 122 males and 213 females were compared and nine DEmiRNAs were observed between sexes (Supplementary Table 2). Specifically, five were up-regulated (miR-324-3p, miR-324-5p, miR-1255b-5p, miR-3935, and miR-4675) and four down-regulated (miR-3615-5p, miR-4326, miR-4418, and miR-4632-5p) in males. The expression levels of miR-324-5p, miR-4326, and miR-4418 are reported as an example of DEmiRNAs associated with sex (Fig. 3a).


(a–c) Box plots showing the expression levels of selected differentially expressed miRNAs among individuals stratified according to the investigated common trait: sex (a), age (b), and BMI (c). P-values were computed by DESeq2 and adjusted using the FDR method. ***adj.p < 0.001, **adj.p < 0.01, *adj.p < 0.05.
Considering the age distribution of the study population, miRNA profiles were investigated comparing three categories: < 37 (n = 122), 37–53 (n = 111), and > 53 (n = 102) years old. In total, 19 DEmiRNAs were identified in at least one comparison among these categories (Supplementary Table 2). miR-1231 and miR-4276-3p were down-regulated and miR-4487 was up-regulated in subjects of the age class 37–53 compared with those of the age class < 37. Comparing subjects of age class > 53 with those of the class < 37, 7 DEmiRNAs (6 down- and miR-3169 up-regulated in the older group) were identified. Between the age classes 37–53 and > 53, 10 DEmiRNAs (3 down- and 7 up-regulated in the eldest group) were identified. Progressive reduced expression levels with increasing age were observed across categories for miR-4276-3p and increased for miR-3169 and miR-4505-3p (Fig. 3b).
For the 19 DEmiRNAs, a similar expression pattern among age classes was observed when the analysis was performed considering males and females separately (Supplementary Table 2). However, only miR-550a-3-5p was significantly down-regulated in males of the age class 37–53 with respect to the other two classes. Additionally, 13 and 6 DEmiRNAs were specifically associated with age in males and females, respectively (Supplementary Table 2).
The relationship between stool miRNA expression levels and age was also explored by Spearman correlation analysis, considering the whole study population or stratified by sex. In the whole population, four miRNAs (miR-922, miR-3619-3p, miR-4276-3p, and miR-8074) decreased with age (-0.26 < SCC < -0.19; adj. p < 0.05). In females, the expression of miR-8074 was inversely correlated with age (SCC = -0.28, adj. p < 0.001) while in males no significant correlation was found (Supplementary Table 3).
Stool miRNA expression levels of women in premenopausal status (n = 132) were compared with those in post menopause (n = 77). Only miR-3615-5p was significantly down-regulated in postmenopausal women (Supplementary Table 2). This miRNA was also significantly more expressed in females compared to men.
miRNA levels in relation to BMI were assessed comparing obese (n = 18), overweight (n = 66), and underweight (n = 25) individuals with normal weight subjects (n = 215). A total of 92 miRNAs were significantly differentially expressed in at least one comparison (Supplementary Table 2). In the obese group, ten and eight miRNAs were respectively up- and down-regulated, 69 miRNAs (64 down- and five up-regulated) were altered in the overweight group, and, finally, in the underweight group, 18 DEmiRNAs (13 down- and 5 up-regulated) were identified. Many of the DEmiRNAs showed a trend of expression going from underweight to obese, including miR-194-3p, miR-200b-3p, and miR-4748-3p (Fig. 3c and Supplementary Table 2).
Stratifying for sex, the 92 above mentioned DEmiRNAs showed a coherent trend of expression either in males or in females (Supplementary Table 2) with 4 and 18 DEmiRNAs also significant in males and females, respectively. Additional differentially expressed miRNAs were found in males (n = 4) or females (n = 33) only.
The stool miRNA expression profiles according to BMI (considered as a continuous variable) were further investigated by Spearman correlation analysis. With an increasing BMI, the expression levels of 22 miRNAs significantly decreased (-0.16 < SCC < -0.24, adj. p < 0.05) while 4 miRNAs increased (-0.22 < SCC < 0.17, adj. p < 0.05) (Supplementary Table 3). Twenty out of the 26 miRNAs significantly correlated with BMI were also differentially expressed among the BMI groups. No significant correlations were observed after the stratification for sex.
miRNA profiles according to lifestyle habits
miRNA levels were further investigated in relation to smoking status, alcohol, and coffee consumption as well as physical activity.
For smoking habits, miRNA levels were analysed comparing subjects who smoked more than 16 cigs/day (n = 16), those smoking less than 16 cigs/day (n = 41) and former smokers (n = 94) with never smokers (n = 181). Overall, 84 DEmiRNAs were identified from the three comparisons (Supplementary Table 2). Comparing individuals who smoke more than 16 cigs/day with non-smokers, 59 DEmiRNAs were identified (50 up- and nine down-regulated) while 22 miRNAs were differentially expressed in those who smoke less than 16 cigs/day compared to non-smokers (three up- and 19 down-regulated). Interestingly, mir-8075-5p and miR-12128 were down-regulated in both smoking categories compared to non-smokers. Finally, 13 miRNAs were differentially expressed in former smokers vs non-smokers, with miR-5090-3p up-regulated and the other 12 DEmiRNAs down-regulated in the former group. Some of the 84 DEmiRNAs associated with smoking showed a trend of expression from non-smoker individuals to those who smoke more than 16 cigs/day (Supplementary Table 2). miR-302c-5p, miR-4508-3p and miR-6745-5p are reported in Fig. 4a as examples of such trends.


(a–d) Box plots showing the expression levels of selected DEmiRNAs among individuals stratified according to the investigated lifestyle habits: smoking status (a), alcohol (b) or coffee (c) consumption, and physical activity (d). P-values were computed by DESeq2 and adjusted using the FDR method. ***adj.p < 0.001, **adj.p < 0.01, *adj.p < 0.05.
A similar expression pattern was observed for the 84 DEmiRNAs when the population was stratified by sex, with 14 and 26 out of 84 DEmiRNAs significantly dysregulated in men and women, respectively. Other additional sex-specific DEmiRNAs were uniquely identified in men (n = 17) and women (n = 14) (Supplementary Table 2).
The relationship between miRNA expression levels and alcohol consumption was assessed comparing non-drinkers (n = 29) with low (n = 230) and high (n = 75) intake drinkers. Comparing high intake drinkers with non-drinkers, miR-3972 was down-regulated in the latter, whereas in low intake drinkers vs non-drinkers, miR-4254 and miR-4254-5p were down-regulated, and miR-6895-3p up-regulated in drinkers. The expression levels of miR-3972, miR-4254, and miR-4254-5p, characterised by a trend of expression in non-drinkers, low, and high intake drinkers, are reported in Fig. 4b. The aforementioned 4 alcohol-related DEmiRNAs showed a similar trend of expression also stratifying by sex, with miR-4254 and miR-3972 being significant in males and miR-4254-5p and miR-6895-3p in females. Moreover, other eight DEmiRNAs were uniquely observed in males and 11 in females in both drinking categories vs non-drinkers.
miRNA profiles in relation to coffee consumption were studied comparing low (n = 149) and high (n = 50) intake coffee drinkers with non-drinkers (n = 135). In high intake coffee consumers vs non-drinkers, 44 miRNAs were differentially expressed, with three up- and 41 down-regulated, in drinkers (Supplementary Table 2). Five out of these 44 miRNAs were also differentially expressed with the same expression trend in low coffee drinkers compared to non-drinkers: miR-1281 was up-regulated and miR-622, miR-1231, miR-7515-5p, and miR-8053 were down-regulated in drinkers. Three miRNAs characterised by decreasing expression levels going from non-drinkers to high drinker subjects are reported in Fig. 4c.
Performing the same comparisons and stratifying the population according to sex, similar trends of expression were observed for the 44 DEmiRNAs previously reported. Two additional miRNAs in males and seven in females were uniquely identified (Supplementary Table 2).
The stratification of the study population according to the physical activity identified the following categories: inactive (n = 90), moderately inactive (n = 114), moderately active (n = 108), and active (n = 21) subjects. Comparing each of the first three categories against active individuals, 11 DEmiRNAs were identified (Supplementary Table 2). Six miRNAs, five down- and one up-regulated (miR-182-3p), resulted from the comparison between inactive vs active subjects. Also six DEmiRNAs (two up- and four down-regulated) were found in moderately inactive individuals and, finally, seven miRNAs (four up- and three down-regulated) in those moderately active. Two DEmiRNAs (miR-4493 and miR-4700-5p) with significant increasing expression levels in the active category and miR-646 with a mild trend of up-regulation going from inactive to the active group are reported in Fig. 4d.
The same comparisons were also performed stratifying according to sex, confirming similar trends of expression for the 11 DEmiRNAs in both males and females. However, among them, miR-646, miR-5090-3p, and miR-12121 were found significantly altered in females and just miR-4700-5p altered in males. Additionally, eight DEmiRNAs were found only in males (n = 4) or females (n = 4) (Supplementary Table 2).
Overview of common miRNAs altered among investigated variables
From a total of 3,041 miRNAs detected, 151 (5%) were associated with at least one of the analysed common traits or lifestyle habits while 52 DEmiRNAs were significant in two or more comparisons (Supplementary Table 2). Considering separately for males and females the stool levels of the latter group of DEmiRNAs, a subtle clustering of miRNAs emerged for both sexes, mainly related to smoking habit, BMI and coffee consumption, as reported in the heatmap in Fig. 5.


Heatmap reporting, separately for males and females, the results of the hierarchical clustering of 52 DEmiRNAs significantly associated with two or more variables. For each DEmiRNA, the z-score of log10 read count for each sample is reported.
To test the temporal stability of all the identified stool DEmiRNAs expression levels in repeated samples collected from the same individuals, a Wilcoxon paired test was performed between the available samples collected at two time points. One hundred and seventy-three miRNAs (85.2%) showed no significant variation between the two measurements (p ≥ 0.05, Supplementary Table 1D). The remaining 30 DEmiRNAs (among which two variables shared 11 miRNAs), showing a significant variability between the two time points, were mostly related to BMI (n = 13), smoking habit (n = 14) or coffee consumption (n = 11).
Target gene enrichment analysis
Functional enrichment analysis was performed to explore whether independent sets of faecal DEmiRNAs identified in this study were involved in relevant biological processes for the investigated variables. The analysis, carried out considering the validated target genes of DEmiRNAs, revealed a total of 298 significantly enriched terms (Supplementary Table 4). Two hundred and eleven, six and 81 terms were enriched for GO, KEGG, and REACTOME gene set libraries, respectively. An overlap of five REACTOME enriched terms among different variables was observed and reported in Supplementary Table 4.

