I’m having a statistical problem (a rather major one) and I was wondering if you could help. I’m researching microbial chemotaxis and analysing colony perimeters by scanning their fluorescence. Chemotaxis is the ability of a microbial cell to swim towards an attractant, in this case succinate. I have a set of ~10 scans (A, B, C, ...) of these colonies, some of which have errors.
I’m comparing the perimeter (cm) of:
Alpha: chemotactic mono-culture
Beta: non-chemotactic mono-culture
Gamma: co-culture
I have two other strain combinations (Delta, Epsilon), but these three are the ones I’m focussing on now. Each is grown in three different treatment conditions: 0 mM sugar, 1 mM sugar and 10 mM sugar (succinate).
I want to statistically compare the perimeters of the three strain mixtures pairwise:
Alpha vs. Beta
Alpha vs. Gamma
Beta vs. Gamma
So far I’ve tried comparing the raw values (cm), which are generally non-normally distributed, judging by density plots and a Shapiro-Wilk test. I then used a Wilcoxon rank-sum test to compare them (Fig. 1). I’ve done this both with and without outliers removed based on their IQR (Fig. 2, Unpair.xlsx).
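To make the outlier step concrete, here is a minimal sketch in plain Python. It assumes the common 1.5 × IQR fences; the multiplier, the function name `remove_iqr_outliers` and the perimeter values are illustrative and are not taken from the spreadsheets.

```python
from statistics import quantiles

def remove_iqr_outliers(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is an assumed convention, not a value stated in the post.
    """
    # "inclusive" treats the data as the whole population of interest and
    # interpolates linearly between the observed values.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]

# Made-up perimeters (cm) for a single strain/treatment group.
perimeters = [5.1, 5.4, 5.6, 5.8, 6.0, 6.2, 12.5]
kept = remove_iqr_outliers(perimeters)
print(kept)
```

The filter would be applied separately to each strain/treatment group, since pooling groups would change the quartiles.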
There is quite a lot of variability, so I decided to standardise the perimeter values of Beta and Gamma (and Delta, Epsilon) against the Alpha chemotactic control strain within each scan/batch (A, B, C, ...). This requires pairs within each scan and a successful Alpha scan: if there was an issue with Alpha growth, such as contamination, successful scans of the other strains (Beta, for example) within the same batch were deleted (Fig. 3). I then removed outliers based on their IQR (Fig. 4) and standardised the values. For example: Alpha perimeter 7 cm / Alpha perimeter 7 cm = 1, and Beta perimeter 2 cm / Alpha perimeter 7 cm = 0.286 (Fig. 5, Pair.xlsx).
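The standardisation can be sketched as follows, in plain Python. This assumes one Alpha measurement per scan and treatment; the row layout loosely follows the Pair.xlsx columns, but the key names, the function `standardise_to_alpha` and all values are hypothetical.

```python
# Made-up rows; "Perimeter" is in cm.
rows = [
    {"Scan": "A", "Bacteria": "Alpha", "Treatment": "1 mM", "Perimeter": 7.0},
    {"Scan": "A", "Bacteria": "Beta",  "Treatment": "1 mM", "Perimeter": 2.0},
    {"Scan": "A", "Bacteria": "Gamma", "Treatment": "1 mM", "Perimeter": 5.0},
    {"Scan": "B", "Bacteria": "Alpha", "Treatment": "1 mM", "Perimeter": 8.0},
    {"Scan": "B", "Bacteria": "Beta",  "Treatment": "1 mM", "Perimeter": 4.0},
    # Scan C has no usable Alpha (e.g. contamination), so its Beta row is dropped.
    {"Scan": "C", "Bacteria": "Beta",  "Treatment": "1 mM", "Perimeter": 3.0},
]

def standardise_to_alpha(rows):
    """Divide each perimeter by the Alpha perimeter from the same scan and treatment."""
    # Assumption: at most one Alpha row per (scan, treatment) pair.
    alpha = {
        (r["Scan"], r["Treatment"]): r["Perimeter"]
        for r in rows
        if r["Bacteria"] == "Alpha"
    }
    standardised = []
    for r in rows:
        reference = alpha.get((r["Scan"], r["Treatment"]))
        if reference is None:
            continue  # no Alpha in this batch: the row cannot be standardised
        standardised.append({**r, "Standardised": r["Perimeter"] / reference})
    return standardised

result = standardise_to_alpha(rows)
for r in result:
    print(r["Scan"], r["Bacteria"], round(r["Standardised"], 3))
```

Every Alpha row comes out as exactly 1 by construction, and the resulting ratio is unitless.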
Unpair.xlsx columns: UniqueID, Bacteria, Treatment, Date, Perimeter (cm), Scan, Biological samples (the exact culture number I used).
Pair.xlsx columns: UniqueID, Bacteria, Treatment, Date, Perimeter (cm), Standardised (cm), Scan, Biological samples (the exact culture number I used).
Fig. 1-4: a), b), c) Treatment comparison. Fig. 1-5: d), e), f) Strain comparison.
Link: https://drive.google.com/file/d/113-u7nVtAuCyVpBUU-cyW_mwvJn4mHAI/view?usp=sharing
As you can see from the figures, the results are fairly inconsistent and I’m unsure whether the standardisation method I applied works, so I was wondering what your take is on the most robust statistical test I can use to analyse the strains and treatments.
For example, if I conduct a Wilcoxon rank test of Alpha vs. Beta on the standardised values, the Alpha group is nothing but ties, because every standardised Beta value is being compared against 1 (Alpha). Is there any kind of paired batch-effect standardisation across scans or biological samples, or a comparison to the mean or average values, that you’d suggest? Are there any smoothing techniques I could apply to the perimeter (cm) values? Or is there any way I could, say, model the likelihood of one perimeter being greater than the other?
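To illustrate the ties issue and the last question, here is a minimal sketch in plain Python with made-up paired perimeters. It only computes a descriptive share of scans in which Beta exceeds Alpha; it is not a significance test, and the names `pairs` and `share_greater` are hypothetical.

```python
# Made-up paired perimeters (cm), one Alpha and one Beta value per scan.
pairs = {
    "A": {"Alpha": 7.0, "Beta": 2.0},
    "B": {"Alpha": 8.0, "Beta": 4.0},
    "C": {"Alpha": 6.5, "Beta": 6.9},
    "D": {"Alpha": 7.2, "Beta": 3.1},
}

# After standardisation every Alpha value is exactly 1, so the Alpha group
# has no spread and all of its observations are tied with each other.
alpha_std = [p["Alpha"] / p["Alpha"] for p in pairs.values()]
beta_std = [p["Beta"] / p["Alpha"] for p in pairs.values()]
print(set(alpha_std))

def share_greater(ratios, reference=1.0):
    """Share of ratios above the reference, counting exact ties as half."""
    wins = sum(r > reference for r in ratios)
    ties = sum(r == reference for r in ratios)
    return (wins + 0.5 * ties) / len(ratios)

# Share of scans in which the Beta perimeter exceeds the Alpha perimeter.
print(share_greater(beta_std))
```

Because the Alpha column is constant after standardisation, the comparison is effectively between the Beta ratios and the single value 1, rather than between two samples.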