Timeline for PCA and the train/test split

Current License: CC BY-SA 4.0

15 events

when	what		by	license	comment
Jan 7, 2021 at 18:50	history	edited	cbeleites	CC BY-SA 4.0	added 64 characters in body
Oct 30, 2018 at 14:16	history	edited	amoeba	CC BY-SA 4.0	added 180 characters in body
Apr 13, 2017 at 12:44	history	edited	CommunityBot		replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/
Dec 1, 2016 at 19:40	comment	added	amoeba		I found your newer answer here stats.stackexchange.com/questions/239898 very relevant for this old thread, perhaps you will even want to add a link to it in this answer (into the third bullet?). In any case I wanted to leave the link here.
Jul 19, 2016 at 8:13	comment	added	cbeleites		@FelipeAlmeida: yes
Jul 11, 2016 at 15:37	comment	added	Felipe		So basically I should fit the PCA model using only the training set and transform both the training and the test sets, right? (using sklearn vocabulary here)
Oct 17, 2015 at 9:39	history	edited	amoeba	CC BY-SA 3.0	added one more link to the first paragraph, as this issue comes up over and over again...
Dec 17, 2014 at 10:13	comment	added	cbeleites		@amoeba, thank you very much. Yes, that is an important point you added. And many thanks for the work you put into cleaning up the collection of questions.
Dec 16, 2014 at 21:53	comment	added	amoeba		Hi @cbeleites, I want to make this thread a "canonical" thread for the questions about PCA and train/test splitting (there are many!) and mark those as duplicates. I took the liberty to add one sentence to your answer that might clear up a misunderstanding that often arises in the duplicate questions. Hope you are happy with my edit, but please check! +1, btw.
Dec 16, 2014 at 21:51	history	edited	amoeba	CC BY-SA 3.0	very light editing + included one additional sentence to make it a canonical answer
Apr 10, 2013 at 21:25	vote	accept	Bitwise
Apr 10, 2013 at 20:59	comment	added	cbeleites		@Bitwise: please see my edit
Apr 10, 2013 at 20:59	history	edited	cbeleites	CC BY-SA 3.0	added 694 characters in body
Apr 10, 2013 at 18:10	comment	added	Bitwise		Thanks, this is exactly what I thought so it is good to hear it from an independent source. I am still finding it difficult to get a feeling of how an initial PCA on the whole dataset would bias the results without seeing the class labels.
Apr 10, 2013 at 17:15	history	answered	cbeleites	CC BY-SA 3.0
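
The procedure confirmed in the Jul 2016 comments (fit the PCA model on the training set only, then transform both the training and the test set) can be sketched roughly as follows. This is a minimal illustration, not code from the answer: the helper names `fit_pca` and `transform_pca`, the random data, and the 80/20 split are all assumptions made for the example, and NumPy's SVD stands in for whatever PCA implementation is actually used.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Estimate the centering vector and principal axes from training data only."""
    mean = X_train.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform_pca(X, mean, components):
    """Project data using the mean and axes learned from the training set."""
    return (X - mean) @ components.T

# Hypothetical data: 100 samples, 5 features, split 80/20.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X_train, X_test = X[:80], X[80:]

# Fit on the training set only; the test set plays no part in this step.
mean, components = fit_pca(X_train, n_components=2)

# Apply the same centering and projection to both sets.
Z_train = transform_pca(X_train, mean, components)
Z_test = transform_pca(X_test, mean, components)
```

In scikit-learn vocabulary, this corresponds to calling `PCA.fit` on the training data and then `PCA.transform` on each set; the test set is centered with the training mean, not its own.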