I want to select rows that have more than k values in a repeated field (for example, selecting users that have more than 3 email addresses).

In Standard SQL, I know I can use:

SELECT * FROM dataset.users WHERE array_length(email_address) > 3 

But what is the way to do this in BigQuery legacy SQL?

2 Answers

8

No need for a subquery; you should be able to filter with OMIT RECORD IF directly:

SELECT * FROM dataset.users OMIT RECORD IF COUNT(email_address) <= 3; 

Do you mind commenting on why you want to use legacy SQL, though? If you encountered a problem with standard SQL I'd like to understand what it was so that we can fix it. Thanks!

2 Comments

No, I have not experienced any problem with Standard SQL. This is just part of a query in my system that is already written in legacy SQL, and converting the whole query to Standard SQL is not completely straightforward.
Thanks! I appreciate the feedback.
0

Counting Values in a repeated field in BigQuery

BigQuery Legacy SQL

SELECT COUNT(email_address) WITHIN RECORD AS address_count FROM [dataset.users] 

If you then want to count the rows that have more than 3 addresses, you can use the query below:

SELECT COUNT(1) AS rows_count FROM ( SELECT COUNT(email_address) WITHIN RECORD AS address_count FROM [dataset.users] ) WHERE address_count > 3
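
For comparison, the same row count in Standard SQL could look like the sketch below. It assumes the `dataset.users` table and the repeated `email_address` field from the question, and it reuses the `ARRAY_LENGTH` filter shown there. It is only meant as a reference point in case the surrounding query is migrated later.

```sql
-- Standard SQL sketch (assumes the question's dataset.users table,
-- with email_address stored as a repeated field, i.e. an ARRAY).
SELECT COUNT(*) AS rows_count
FROM dataset.users
WHERE ARRAY_LENGTH(email_address) > 3;
```

No subquery is needed here, because the array length can be tested directly in the WHERE clause.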
