Timeline for PCA and the train/test split

Current License: CC BY-SA 4.0

15 events
when what by license comment
Jan 7, 2021 at 18:50 history edited cbeleites CC BY-SA 4.0
added 64 characters in body
Oct 30, 2018 at 14:16 history edited amoeba CC BY-SA 4.0
added 180 characters in body
Apr 13, 2017 at 12:44 history edited CommunityBot
replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/
Dec 1, 2016 at 19:40 comment added amoeba I found your newer answer here stats.stackexchange.com/questions/239898 very relevant for this old thread, perhaps you will even want to add a link to it in this answer (into the third bullet?). In any case I wanted to leave the link here.
Jul 19, 2016 at 8:13 comment added cbeleites @FelipeAlmeida: yes
Jul 11, 2016 at 15:37 comment added Felipe So basically I should fit the PCA model using only the training set and transform both the training and the test sets, right? (using sklearn vocabulary here)
Oct 17, 2015 at 9:39 history edited amoeba CC BY-SA 3.0
added one more link to the first paragraph, as this issue comes up over and over again...
Dec 17, 2014 at 10:13 comment added cbeleites @amoeba, thank you very much. Yes, that is an important point you added. And many thanks for the work you put into cleaning up the collection of questions.
Dec 16, 2014 at 21:53 comment added amoeba Hi @cbeleites, I want to make this thread a "canonical" thread for the questions about PCA and train/test splitting (there are many!) and mark those as duplicates. I took the liberty to add one sentence to your answer that might clear up a misunderstanding that often arises in the duplicate questions. Hope you are happy with my edit, but please check! +1, btw.
Dec 16, 2014 at 21:51 history edited amoeba CC BY-SA 3.0
very light editing + included one additional sentence to make it a canonical answer
Apr 10, 2013 at 21:25 vote accept Bitwise
Apr 10, 2013 at 20:59 comment added cbeleites @Bitwise: please see my edit
Apr 10, 2013 at 20:59 history edited cbeleites CC BY-SA 3.0
added 694 characters in body
Apr 10, 2013 at 18:10 comment added Bitwise Thanks, this is exactly what I thought, so it is good to hear it from an independent source. I am still finding it difficult to get a feeling for how an initial PCA on the whole dataset would bias the results without seeing the class labels.
Apr 10, 2013 at 17:15 history answered cbeleites CC BY-SA 3.0