The easiest way I can think of to satisfy the third criterion is to aggregate the y_ids for each x_id into a single value per row, so that two rows can be compared directly.
Using a common table expression (CTE) makes this more readable for me, but it can also be written without one.
test setup: http://rextester.com/APVZQS37775
    create table x(
        x_id int
      , [date] datetime
      , text varchar(32)
    );
    insert into x values
       ( 1,'2017-02-22 20:40:30.617','txt1')
      ,( 2,'2017-02-22 20:40:06.103','txt1')
      ,( 3,'2017-02-22 20:28:21.393','txt2');

    create table xy (
        x_id int
      , y_id int
    );
    insert into xy values
       ( 1,3 )
      ,( 1,10)
      ,( 2,3 )
      ,( 2,10)
      ,( 3,5 );
query:
    ;with cte as (
      select
          x.*
        , y_ids = stuff((
            select ','+convert(varchar(10),xy.y_id)
            from xy
            where x.x_id = xy.x_id
            order by xy.y_id
            for xml path (''), type).value('.','varchar(max)')
          ,1,1,'')
      from x
    )
    select *
    from cte
    where exists (
      select 1
      from cte as i
      where i.x_id <> cte.x_id
        and abs(datediff(minute,i.date,cte.date))<=5
        and i.text = cte.text
        and i.y_ids = cte.y_ids
    )
results:
    +------+---------------------+------+-------+
    | x_id | date                | text | y_ids |
    +------+---------------------+------+-------+
    |    1 | 2017-02-22 20:40:30 | txt1 | 3,10  |
    |    2 | 2017-02-22 20:40:06 | txt1 | 3,10  |
    +------+---------------------+------+-------+
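If you are on SQL Server 2017 or later (an assumption, the question does not say which version is in use), the `stuff(... for xml path(''))` idiom in the query above can probably be replaced with `string_agg`. This is only a sketch of the same idea against the same test tables, not a tested drop-in replacement:

```sql
-- Assumes SQL Server 2017+, where string_agg ... within group is available.
;with cte as (
  select
      x.x_id
    , x.[date]
    , x.[text]
      -- ordered, comma-separated list of y_ids for each x_id
    , y_ids = string_agg(convert(varchar(10), xy.y_id), ',')
                within group (order by xy.y_id)
  from x
    inner join xy
      on x.x_id = xy.x_id
  group by x.x_id, x.[date], x.[text]
)
select *
from cte
where exists (
  select 1
  from cte as i
  where i.x_id <> cte.x_id
    and abs(datediff(minute, i.[date], cte.[date])) <= 5
    and i.[text] = cte.[text]
    and i.y_ids = cte.y_ids
);
```

The inner join drops rows of x that have no rows in xy. In the original query those rows get a null y_ids, which never satisfies `i.y_ids = cte.y_ids`, so they should not show up in the results of either version.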
A method without aggregating the y_ids:
    ;with cte as (
      select
          x.*
        , xy.y_id
        , cnt = count(*) over (partition by x.x_id)
      from x
        inner join xy
          on x.x_id = xy.x_id
    )
    select x.x_id, x.date, x.text
    from cte as x
      inner join cte as x2
         on x.x_id <> x2.x_id
        and x.y_id = x2.y_id
        and x.text = x2.text
        and x.cnt = x2.cnt
        and abs(datediff(minute,x.date,x2.date))<=5
    group by x.x_id, x.date, x.text, x.cnt
    having count(*) = x.cnt
returns:
    +------+---------------------+------+
    | x_id | date                | text |
    +------+---------------------+------+
    |    1 | 2017-02-22 20:40:30 | txt1 |
    |    2 | 2017-02-22 20:40:06 | txt1 |
    +------+---------------------+------+
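One thing to keep in mind with both queries: `datediff(minute, ...)` counts the minute boundaries crossed, not elapsed time, so `<= 5` can match rows that are almost six minutes apart. The timestamps below are made up for illustration and are not from the test data:

```sql
-- Hypothetical timestamps, 5 minutes 58 seconds apart.
select
    minute_diff = datediff(minute, '2017-02-22 20:40:01', '2017-02-22 20:45:59')  -- 5
  , second_diff = datediff(second, '2017-02-22 20:40:01', '2017-02-22 20:45:59'); -- 358
```

If the requirement is a strict five-minute window, comparing in seconds, for example `abs(datediff(second, i.date, cte.date)) <= 300`, may be closer to what you want.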