Counting Values in a repeated field in BigQuery

Question

I want to select rows that have more than k values in a repeated field (for example, selecting users that have more than 3 email addresses).

In Standard SQL I know I can use

SELECT * FROM dataset.users WHERE array_length(email_address) > 3

But what is the way to do this in BigQuery legacy SQL?

Elliott Brossard · Accepted Answer · 2016-09-19 16:19:47Z

No need for a subquery; you should be able to filter with OMIT RECORD IF directly:

SELECT * FROM dataset.users OMIT RECORD IF COUNT(email_address) <= 3;

Do you mind commenting on why you want to use legacy SQL, though? If you encountered a problem with standard SQL I'd like to understand what it was so that we can fix it. Thanks!

2 Comments

S.Mohsen sh Over a year ago

No, I have not experienced any problems with Standard SQL. This is part of a query in my system that is already written in legacy SQL, and converting the whole query to Standard SQL is not entirely straightforward.

Elliott Brossard Over a year ago

Thanks! I appreciate the feedback.

Mikhail Berlyant · Accepted Answer · 2016-09-19 16:53:25Z

BigQuery Legacy SQL

SELECT COUNT(email_address) WITHIN RECORD AS address_count FROM [dataset.users]

If you then want to count the output rows, you can use the query below:

SELECT COUNT(1) AS rows_count FROM ( SELECT COUNT(email_address) WITHIN RECORD AS address_count FROM [dataset.users] ) WHERE address_count > 3
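
For comparison, the same row count can be sketched in Standard SQL using the ARRAY_LENGTH filter from the question. The inline users table, its name column, and the sample addresses are made up for illustration so the query runs on its own; in practice you would select from dataset.users instead.

```sql
#standardSQL
-- Hypothetical inline data standing in for dataset.users.
WITH users AS (
  SELECT 'a' AS name,
         ['a1@example.com', 'a2@example.com', 'a3@example.com', 'a4@example.com'] AS email_address
  UNION ALL
  SELECT 'b', ['b1@example.com']
  UNION ALL
  SELECT 'c', ['c1@example.com', 'c2@example.com', 'c3@example.com']
)
-- Count the rows whose repeated field holds more than 3 values.
SELECT COUNT(*) AS rows_count
FROM users
WHERE ARRAY_LENGTH(email_address) > 3;
-- With the sample data above, only user 'a' qualifies, so rows_count is 1.
```

Replacing COUNT(*) with the columns you need returns the matching rows themselves rather than their count.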
