I have an RDD with this structure:
RDD[((String, String), List[(Int, Timestamp, String)])]
and this data:
((D2,Saad Arif),List((4,2011-10-05 00:00:00.0,C101), (5,2010-01-27 00:00:00.0,C101)))
((D3,Faran Abid),List((7,2016-10-05 00:00:00.0,C101)))
((D1,Atif Shahzad),List((1,2012-04-15 00:00:00.0,C101), (2,2011-10-05 00:00:00.0,C101), (3,2006-12-25 00:00:00.0,C101)))
Think of this as a table, where
'(D2,Saad Arif)' acts as the key and
'List((4,2011-10-05 00:00:00.0,C101), (5,2010-01-27 00:00:00.0,C101))' holds the rows for that key. For each row, I want to check whether the same key has a history record with code 'C101' dated two or more years earlier. If it does, the level should be set to 2, otherwise to 1. The resulting RDD should look like this:
((D2,Saad Arif),List((4,2011-10-05 00:00:00.0,C101, 1), (5,2010-01-27 00:00:00.0,C101, 1)))
((D3,Faran Abid),List((7,2016-10-05 00:00:00.0,C101, 1)))
((D1,Atif Shahzad),List((1,2012-04-15 00:00:00.0,C101, 2), (2,2011-10-05 00:00:00.0,C101, 2), (3,2006-12-25 00:00:00.0,C101, 1)))
Notice the new level appended at the end of each row. How can I do this with map or flatMap?
map and flatMap? This is clearly a use case for map.
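As a rough sketch of what that map could look like, the per-key logic can be written as a plain function over the list and then applied to every value of the RDD. The names LevelSketch, addLevels, historyCode and years are made up for illustration, and the rule is an assumption read off the expected output in the question: a row gets level 2 when another row under the same key carries the code and is dated at least two calendar years earlier.

```scala
import java.sql.Timestamp

// Hypothetical helper object; none of these names come from the question.
object LevelSketch {
  type Row = (Int, Timestamp, String)
  type LeveledRow = (Int, Timestamp, String, Int)

  // Assumed rule: level 2 if another row in the same list has `historyCode`
  // and is dated at least `years` years before this row, otherwise level 1.
  def addLevels(rows: List[Row],
                historyCode: String = "C101",
                years: Long = 2L): List[LeveledRow] =
    rows.map { case (id, ts, code) =>
      // Latest date a history record may have and still count as old enough.
      val cutoff = ts.toLocalDateTime.minusYears(years)
      val hasOldHistory = rows.exists { case (otherId, otherTs, otherCode) =>
        otherId != id &&
        otherCode == historyCode &&
        !otherTs.toLocalDateTime.isAfter(cutoff)
      }
      (id, ts, code, if (hasOldHistory) 2 else 1)
    }

  def main(args: Array[String]): Unit = {
    // The rows for key (D1,Atif Shahzad) from the question.
    val d1 = List(
      (1, Timestamp.valueOf("2012-04-15 00:00:00"), "C101"),
      (2, Timestamp.valueOf("2011-10-05 00:00:00"), "C101"),
      (3, Timestamp.valueOf("2006-12-25 00:00:00"), "C101")
    )
    addLevels(d1).foreach(println)

    // With Spark on the classpath and `rdd` typed as in the question, the same
    // function would be applied per key, for example:
    //   val result = rdd.mapValues(rows => LevelSketch.addLevels(rows))
  }
}
```

Since the key stays untouched and only the list changes, mapValues should be enough here; a plain map over the (key, rows) pairs would work the same way. Each row is compared against every other row of its key, which is quadratic in the list length and probably only acceptable while the per-key lists stay small.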