$\begingroup$

I have a dataset with 10 biomarkers tested in both cerebrospinal fluid (CSF) and serum across 3 groups. I want to compare 6 analytes (which have elevated levels) post hoc between group pairs. Additionally, patients with inflammation are more likely to have multiple analyte elevations at once. I need to adjust p-values for multiple comparisons, but I’m unsure of the best approach.

Here are my specific questions:

1. Holm-Bonferroni vs. FDR: Given that I’m making post-hoc comparisons and controlling for multiple comparisons, which adjustment method is more suitable for my situation?
2. CSF vs. serum: Since I have both CSF and serum measurements, can I consider testing them separately (biologically distinct behavior) when applying the p-value adjustment, or would it be better to treat them as a single set of comparisons?
3. Which tests to adjust: I am comparing 6 analytes (those with elevated levels) between group pairs. Should I adjust the p-values based only on these 6 analytes, or should I adjust for all 10 biomarkers tested (even if only 6 are elevated)?
4. Multicollinearity: Given that patients with inflammation tend to have multiple analyte elevations at once, I’m concerned about multicollinearity. How should I address this?

I appreciate any guidance or recommendations!

$\endgroup$

2 Answers

$\begingroup$

As Michael Lew said, much depends on the nature of the study. If this is a preliminary study to identify matters for further detailed investigation, then you are the primary audience and you can proceed in whatever way makes sense to you. The risk in that case is that you might lead yourself down a blind alley if you put too much emphasis on an analyte that was indeed a false positive result.

To answer your Question 1, FDR control makes a lot more sense here than the family-wise error rate controlled by the Holm-Bonferroni procedure, which protects you (at the specified Type-I error rate) from making any errors in your determinations of "statistically significant" results. That's awfully stringent for a study like yours.*

Whether FDR control itself makes sense leads to your Question 4 about multicollinearity, a topic that typically receives much more attention than it deserves except in very large-scale data mining. Multicollinearity of predictor variables in a regression model inflates the variances of individual coefficient estimates. What you have, however, is correlations among the outcome values.

The Benjamini-Hochberg (BH) FDR control procedure assumes that the tests are independent. That probably isn't the case when you have multiple correlated outcomes, so FDR control would probably be too stringent. In gene-expression studies with many thousands of outcome variables, many of which might be highly correlated, FDR can be a useful heuristic for limiting the number of top targets for future study. I don't know that there's much value in interpreting the actual FDR rate, however.
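
To make the contrast with the Holm-Bonferroni procedure concrete, here is a minimal sketch (using hypothetical p-values and `statsmodels`) applying both adjustments to the same family of 10 tests:

```python
# A sketch comparing Holm (family-wise error) and Benjamini-Hochberg (FDR)
# adjustment on the same hypothetical raw p-values from 10 biomarker tests.
from statsmodels.stats.multitest import multipletests

raw_p = [0.001, 0.004, 0.012, 0.03, 0.04, 0.045, 0.2, 0.35, 0.6, 0.9]  # hypothetical

holm_reject, holm_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
bh_reject, bh_p, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")

print("Holm rejections:", holm_reject.sum())  # 2: the family-wise criterion is stricter
print("BH rejections:  ", bh_reject.sum())    # 3: BH admits one more discovery
```

Both calls treat the 10 tests as one family; with correlated outcomes the nominal 5% FDR of the BH procedure is only approximate (conservative under positive dependence), as discussed above.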

If you do have substantial correlations among analytes, you might consider modeling their first few principal components instead.
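
A minimal sketch of that idea, assuming the analyte measurements form a patients × analytes matrix (simulated here with a shared latent "inflammation" signal driving all 6 analytes):

```python
# A sketch of summarizing correlated analytes with principal components.
# Data are simulated: one shared latent signal plus analyte-specific noise.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))            # shared signal across patients
noise = rng.normal(scale=0.3, size=(50, 6))  # analyte-specific noise
X = latent + noise                           # 50 patients x 6 correlated analytes

pca = PCA(n_components=2)
scores = pca.fit_transform(X)                # per-patient component scores
print(pca.explained_variance_ratio_)         # PC1 dominates when analytes move together
```

The first component's scores could then serve as a single outcome in the group comparisons, reducing 6 correlated tests to one, at the cost of interpretability in terms of the individual analytes.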

The answer to Question 3 should also give you pause about the implications of multiple-comparison correction here. If you are going to do such correction, it must be done for all comparisons together. If you learned from your data that only 6 out of 10 biomarkers are "elevated," that learning required the equivalent of 10 comparisons (at least; more if you did all pairwise comparisons among the 3 groups). Even the BH procedure then compares the i-th smallest uncorrected p-value against i times the FDR control rate divided by the number of comparisons; for example, for an FDR of 5% over 10 comparisons, the smallest p-value is compared against a threshold of 0.005.
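
The arithmetic of those BH thresholds is simple enough to spell out; assuming m = 10 comparisons at an FDR rate of q = 0.05:

```python
# BH step-up thresholds for m = 10 comparisons at FDR q = 0.05:
# the i-th smallest p-value is compared against i * q / m.
m, q = 10, 0.05
thresholds = [i * q / m for i in range(1, m + 1)]
print(thresholds[0], thresholds[-1])  # rank-1 threshold is q/m; rank-m threshold is q
```

So even the least stringent procedure under discussion holds the smallest p-value to a 0.005 threshold once all 10 biomarkers are counted in the family.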

Question 2 might best be answered by a different approach: model the CSF and serum values of an analyte together in a single model of the concentrations. Include the sampleType as a categorical predictor with levels of CSF and serum, and include an interaction of sampleType with group. That directly evaluates both systematic differences between CSF and serum and whether they have different associations with the groups. Although there are p-values reported for each coefficient in a model, they typically aren't corrected for multiple comparisons when the model as a whole is "significant."
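
A sketch of that single-model approach, assuming a long-format data frame with hypothetical columns `concentration`, `sampleType`, and `group` (simulated here):

```python
# A sketch of one model for both sample types: concentration ~ sampleType * group.
# Data are simulated in long format with hypothetical column names.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 30),      # 3 groups, 30 samples each
    "sampleType": np.tile(["CSF", "serum"], 45),  # CSF and serum measurements
})
df["concentration"] = (
    rng.normal(size=90)
    + 0.5 * (df["group"] == "B")           # a group effect
    + 1.0 * (df["sampleType"] == "serum")  # a systematic CSF/serum difference
)

fit = smf.ols("concentration ~ sampleType * group", data=df).fit()
print(fit.params.round(2))  # includes sampleType:group interaction coefficients
```

The interaction coefficients directly test whether the group differences differ between CSF and serum, which is the question that splitting the adjustment into two separate families only gets at indirectly.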

Finally, as much depends on details specific to your study and the questions you are trying to answer, look for a local experienced statistician who can work through those details with you and provide guidance on study design, analysis, and write-up.


*I'm assuming that you used "analyte" as a synonym for "biomarker"; that is, you analyzed 10 biomarkers and found that 6 of them had elevated levels in some group.

$\endgroup$
$\begingroup$

You do not give enough context for anyone to be certain about what advice to give, but given your description I am assuming that this is a preliminary, hypothesis-generating study. In that case, do not adjust the p-values at all. See here: p value correction in multiple outcomes study

With 10 biomarkers, 6 analytes, and 3 groups any 'correction' for multiplicity will rob your analysis of a lot of statistical power for very little practical benefit. Choose the comparisons that are most interesting and test their hypotheses with a new dataset. (Yes, I know that it is hard to get clinical samples, but good science is often harder to do than crappy science.) The raw p-values are much more useful as an input into the 'interestingness' considerations than are 'corrected' p-values because the corrections are designed to protect against global error rates and not the evidential information encoded in the p-values. See here: https://link.springer.com/chapter/10.1007/164_2019_286

$\endgroup$