In all implementations of recommender systems I've seen so far, the train-test split is performed in this manner:
    +------+------+--------+
    | user | item | rating |
    +------+------+--------+
    | u1   | i1   |  2.3   |
    | u2   | i2   |  5.3   |
    | u1   | i4   |  1.0   |
    | u3   | i5   |  1.6   |
    | ...  | ...  |  ...   |
    +------+------+--------+

This is transformed into a rating matrix of the form:
    +------+-------+-------+-------+-------+-------+-----+
    | user | item1 | item2 | item3 | item4 | item5 | ... |
    +------+-------+-------+-------+-------+-------+-----+
    | u1   |  2.3  |  1.7  |  0.5  |  1.0  |  NaN  | ... |
    | u2   |  NaN  |  5.3  |  1.0  |  0.2  |  4.3  | ... |
    | u3   |  NaN  |  NaN  |  2.1  |  1.3  |  1.6  | ... |
    +------+-------+-------+-------+-------+-------+-----+

where NaN marks an item that the user has not rated.
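For concreteness, here is a minimal sketch of that long-to-wide transformation in pandas (the column names and toy values are mine, matching the tables above):

```python
import pandas as pd

# Long-format ratings, one (user, item, rating) triple per row.
ratings = pd.DataFrame({
    "user":   ["u1", "u2", "u1", "u3"],
    "item":   ["i1", "i2", "i4", "i5"],
    "rating": [2.3, 5.3, 1.0, 1.6],
})

# Pivot into the user x item rating matrix; unrated pairs become NaN.
rating_matrix = ratings.pivot(index="user", columns="item", values="rating")
```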
Now, from each row (user) of the matrix, a certain percentage of the numeric (non-NaN) values is removed and set aside into a new matrix, which serves as the test set. The model is then trained on the initial matrix with the test samples removed, and the goal of the recommender is to fill in the missing values with the smallest possible error.
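A minimal NumPy sketch of that per-user holdout (the function name and fraction are my own choices, not from any library):

```python
import numpy as np

rng = np.random.default_rng(0)

def rating_holdout_split(R, test_frac=0.25):
    """Move test_frac of each user's observed (non-NaN) ratings
    into a test matrix, replacing them with NaN in the train copy."""
    train = R.copy()
    test = np.full_like(R, np.nan)
    for u in range(R.shape[0]):
        observed = np.flatnonzero(~np.isnan(R[u]))
        n_test = int(round(test_frac * observed.size))
        held_out = rng.choice(observed, size=n_test, replace=False)
        test[u, held_out] = R[u, held_out]
        train[u, held_out] = np.nan
    return train, test

# The rating matrix from the example above.
R = np.array([[2.3,    1.7,    0.5, 1.0, np.nan],
              [np.nan, 5.3,    1.0, 0.2, 4.3],
              [np.nan, np.nan, 2.1, 1.3, 1.6]])
train, test = rating_holdout_split(R)
```

Each of the three users has four observed ratings here, so one rating per user ends up in the test matrix; train and test are disjoint by construction.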
My question is: can the train-test split instead be done user-wise? For example, keep a set of users separate, train the recommender on the remaining users, and then try to predict the ratings for the held-out users. I know this goes somewhat against the idea that "if a recommender does not know you, it cannot recommend something you like", but I am wondering whether something like k-NN could work here.
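To make the idea concrete, here is one way such a user-wise scheme could look as a sketch (all names and the similarity choice are mine, not from any particular library): hold out whole users, and for a held-out user with a few known ratings, score similarity to each training user on the co-rated items, then predict the unseen ratings as an average over the k nearest training users.

```python
import numpy as np

def knn_predict_for_new_user(train_R, new_ratings, k=2):
    """Predict the NaN entries of new_ratings (one held-out user's row)
    from the k training users most similar on co-rated items."""
    sims = np.full(train_R.shape[0], -np.inf)
    for u in range(train_R.shape[0]):
        both = ~np.isnan(train_R[u]) & ~np.isnan(new_ratings)
        if both.any():
            # Negative mean absolute difference as a simple similarity.
            sims[u] = -np.mean(np.abs(train_R[u, both] - new_ratings[both]))
    neighbours = np.argsort(sims)[-k:]  # indices of the k most similar users
    pred = new_ratings.copy()
    for i in np.flatnonzero(np.isnan(new_ratings)):
        vals = train_R[neighbours, i]
        vals = vals[~np.isnan(vals)]
        if vals.size:
            pred[i] = vals.mean()
    return pred

# Three training users, one held-out user who has only rated item 0.
train_R = np.array([[5.0, 4.0, 1.0],
                    [4.0, 5.0, 2.0],
                    [1.0, 1.0, 5.0]])
new_user = np.array([5.0, np.nan, np.nan])
pred = knn_predict_for_new_user(train_R, new_user, k=2)
```

Note this still needs at least a few known ratings for the new user to compute similarities; with zero ratings it degenerates to the cold-start problem the quoted objection describes.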