Timeline for data visualization RNAseq: scaling data for PCA and cluster dendrogram
Current License: CC BY-SA 4.0
5 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Oct 8, 2020 at 18:07 | vote | accept | Mee | | |
| Oct 8, 2020 at 18:05 | comment | added | Mee | | OK, thank you for your answer, now I understand. But regarding the cpm() function: is it OK to use it on data that is already TMM-normalized? Thanks |
| Oct 8, 2020 at 9:30 | comment | added | haci | | What TMM does is "make different samples comparable" by adjusting for library size and more. It does not deal with individual genes. So yes, scaling would help when expression levels of different genes vary a lot (which is often the case). Since you are using edgeR, you can use the function cpm() to calculate log2 CPM and then scale these values before clustering, for example. |
| Oct 8, 2020 at 8:16 | comment | added | Mee | | Yes, I understand that. But the data has already been standardized using the trimmed mean of M-values (TMM). And when performing PCA with and without scaling, the results are pretty similar. I am wondering whether scaling and centering are unnecessary because the data has already been standardized. |
| Oct 7, 2020 at 13:43 | history | answered | haci | CC BY-SA 4.0 | |
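
The distinction drawn in the comments (TMM adjusts between samples, scaling adjusts between genes) can be illustrated with a small sketch. This is plain Python, not edgeR: the count values, the normalization factors, and the helper names `log2_cpm` and `scale_genes` are all hypothetical, and the prior count is handled in a simplified way that differs from edgeR's own `cpm(log=TRUE)`. It only shows the arithmetic of computing log2 CPM with per-sample factors and then centering and scaling each gene.

```python
import math
from statistics import mean, stdev

# Hypothetical toy counts: one row per gene, one column per sample.
counts = {
    "geneA": [10, 12, 30, 28],
    "geneB": [1000, 1100, 3000, 2900],
    "geneC": [500, 480, 510, 495],
}

# Hypothetical per-sample normalization factors, standing in for TMM factors.
norm_factors = [1.0, 1.05, 0.95, 1.0]


def log2_cpm(counts, norm_factors, prior=0.5):
    """Simplified log2 counts-per-million using effective library sizes.

    Assumption: a fixed prior is added to every count to avoid log(0);
    this is not how edgeR treats its prior count.
    """
    n_samples = len(norm_factors)
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    effective = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return {
        gene: [math.log2((row[j] + prior) / effective[j] * 1e6)
               for j in range(n_samples)]
        for gene, row in counts.items()
    }


def scale_genes(expr):
    """Center each gene to mean 0 and scale to sample standard deviation 1."""
    scaled = {}
    for gene, values in expr.items():
        m = mean(values)
        s = stdev(values)
        scaled[gene] = [(v - m) / s if s > 0 else 0.0 for v in values]
    return scaled


logcpm = log2_cpm(counts, norm_factors)
scaled = scale_genes(logcpm)

# Per-sample normalization leaves genes at very different expression levels.
for gene in counts:
    print(gene, "mean log2 CPM:", round(mean(logcpm[gene]), 2))

# After per-gene scaling, every gene contributes on the same footing.
for gene in counts:
    print(gene, "scaled mean:", round(mean(scaled[gene]), 6),
          "scaled sd:", round(stdev(scaled[gene]), 6))
```

In this toy data the mean log2 CPM of `geneB` stays far above that of `geneA` after normalization, which is the between-gene difference that scaling removes before PCA or clustering.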