5

Question

The following is a snippet of my data:

Create Table Emps(person VARCHAR(50), started DATE, stopped DATE); Insert Into Emps Values ('p1','2015-10-10','2016-10-10'), ('p1','2016-10-11','2017-10-11'), ('p1','2017-10-12','2018-10-13'), ('p2','2019-11-13','2019-11-13'), ('p2','2019-11-14','2020-10-14'), ('p3','2020-07-15','2021-08-15'), ('p3','2021-08-16','2022-08-16'); 

db<>fiddle.

I want to use T-SQL to get a count of how many persons fulfil the following criteria at least once - multiples should also count as one:

For a person:

  1. One of the dates in 'started' (say s1) is larger than at least one of the dates in 'ended' (say e1)
  2. s1 and e1 are in the same year, to be set manually - e.g. '2021-01-01' until '2022-01-01'

Example expected response

If I put the date range '2016-01-01' until '2017-01-01' somewhere in a WHERE / HAVING clause, the output should be 1 as only p1 has both a start date and an end date that fall in 2016 where the start date is larger than the end date:

s1 = '2016-10-11', and e1 = '2016-10-10'.

Why can't I do this myself

The reason I'm stuck is that I don't know how to do this rowwise comparison between groups. The question requires comparing values across columns (start with end) across rows, within a person ID.

2
  • Upvote for providing CREATE TABLE...INSERT and a fiddle. "fulfil the following criteria at least once" Does that mean either of the criteria once, or both once? If the latter does it have to be the same row as each other? Also shouldn't p2 be included, as 2019-11-14 is greater than 2019-11-13 Commented Aug 4, 2022 at 15:28
  • Both once, otherwise it would have said fulfil at least one of the following criteria. Also, it should be the same row, as s1 = s1 and e1 = e1. Commented Aug 4, 2022 at 20:41

4 Answers 4

3

Use conditional aggregation to get the maximum start date and the minimum stop date in the given range.

select person from emps group by person having max(case when started >= '2016-01-01' and started < '2017-01-01' then started end) > min(case when stopped >= '2016-01-01' and stopped < '2017-01-01' then stopped end); 

Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=45adb153fcac9ce72708f1283cac7833

Sign up to request clarification or add additional context in comments.

Comments

3

I would choose to use a self-outer-join with an exists correlation, it should be pretty much the most performant, all things being equal.

select Count(*) from emps e where exists ( select * from emps e2 where e2.person = e.person and e2.stopped > e.started and e.started between '20160101' and '20170101' and e2.started between '20160101' and '20170101' ); 

2 Comments

This is more or less what I came up with as well. In practice, it's probably technically better to declare a variable for upper and lower bounds and reference those, but that's really more for maintainability - I doubt it'd make tons of difference to performance (if any)
Yes and it really depends on the compiled parameters and the cardinality of the data, and SQL Server's estimates, it's not black-and-white.
1

You said you plan to set the dates manually, so this works where we set the start date in one CTE, and the end date in another CTE. Then we calculate the min/max for each, and use that criteria in the query where statement.

with min_max_start as ( select person, min(started) as min_start, --obsolete max(started) as max_start from emps where started >= '2016-01-01' group by person ), min_max_end as ( select person, min(stopped) as min_stop, max(stopped) as max_stop --obsolete from emps where stopped < '2017-01-01' group by person ) select count(distinct e.person) from emps e join min_max_start mms on e.person = mms.person join min_max_end mme on e.person = mme.person where mms.max_start> mme.min_stop 

Output: 1

Comments

1

Try the following:

With CTE as ( Select D.person, D.started, T.stopped, case when Year(D.started) = Year(T.stopped) and D.started > T.stopped then 1 else 0 end as chk From (Select person, started From Emps Where started >= '2016-01-01') D Join (Select person, stopped From Emps Where stopped <= '2017-01-01') T On D.person = T.person ) Select Count(Distinct person) as CNT From CTE Where chk = 1; 

To get the employee list who met the criteria use the following on the CTE instead of the above Select Count... query:

Select person, started, stopped From CTE Where chk = 1; 

See a demo from db<>fiddle.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.