0

I have a column from a Google Bigquery Table which has certain results (na, pass, warn, and error) based on criteria (c1, c2, c3, ...), for example.

 data c1 c2 c3 ... ------- ------- ------- ------- ------- data1 pass na warn ... data2 pass na pass ... data3 error pass error ... 

Now I want to get a histogram for each column of the input table, and I want to create the summarized table as follows.

 criteria na pass warn error ---------- ------- ------- ------- ------- c1 0 2 0 1 c2 2 1 0 0 c3 0 1 1 1 ... ... ... ... ... 

Actually, there are many criteria, so I want to create the summarized table without enumerating the criteria. How can I get the output table?

2 Answers 2

2

You may consider below,

CREATE TEMP FUNCTION json_keys(input STRING) RETURNS Array<String> LANGUAGE js AS """ return Object.keys(JSON.parse(input)); """; CREATE TEMP FUNCTION json_values(input STRING) RETURNS Array<String> LANGUAGE js AS """ return Object.values(JSON.parse(input)); """; SELECT * FROM ( SELECT criteria, result FROM (SELECT * EXCEPT(data) FROM sample_table) t, UNNEST (json_keys(TO_JSON_STRING(t))) criteria WITH OFFSET JOIN UNNEST (json_values(TO_JSON_STRING(t))) result WITH OFFSET USING (offset) ) PIVOT (COUNT(result) FOR result IN ('na', 'pass', 'warn', 'error')); +----------+----+------+------+-------+ | criteria | na | pass | warn | error | +----------+----+------+------+-------+ | c1 | 0 | 2 | 0 | 1 | | c2 | 2 | 1 | 0 | 0 | | c3 | 0 | 1 | 1 | 1 | +----------+----+------+------+-------+ 
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you so much. I tried your script, but my table is too big and the script couldn't be done. The problem was solved using pivot and unpivot.
@shok1122, thanks for confirming that. obviously it makes sense that if you can get column information from information_schema directly rather than from expensive json manipulation, it will run more efficiently.
1

You can unpivot then pivot again to get the table you want as such:

input as ( select "data1" as data, "pass" as c1,"warn" as c2,"na" as c3, "pass" as c4 union all select "data2" as data, "na" as c1,"pass" as c2,"pass" as c3, "error" as c4 union all select "data3" as data, "warn" as c1,"pass" as c2,"na" as c3, "warn" as c4 union all select "data4" as data, "pass" as c1,"error" as c2,"pass" as c3, "error" as c4 union all select "data5" as data, "pass" as c1,"na" as c2,"na" as c3, "error" as c4 union all select "data6" as data, "pass" as c1,"warn" as c2,"na" as c3, "na" as c4 ), step_unpivot as ( select * from input unpivot(results for criterias in (c1,c2,c3,c4)) ) select * from step_unpivot pivot(count(data) for results in ('pass', 'warn','na','error')) 

And since you want dynamic column selection for the unpivot(), you can probably use the answer to this question for that part: Dynamic UnPivot in Bigquery

Edit: Below a version with the dynamic unpivot

 DECLARE myunpivot STRING; SET myunpivot = ( SELECT CONCAT('(', STRING_AGG( column_name, ','), ')'), From( SELECT column_name FROM dataset_name.INFORMATION_SCHEMA.COLUMNS where table_name ="unpivot_pivot" and column_name not in("data"))); EXECUTE IMMEDIATE format(""" with step_unpivot as ( select * from dataset_name.table_name unpivot(results for criterias in %s) ) select * from step_unpivot pivot(count(data) for results in ('pass', 'warn','na','error')) """, myunpivot); 

3 Comments

Thank you so much. Your sample script partially does what I want to do. Is there a way to specify unpivot column names by variable? I can enumerate the column names as follows, but can I use this result? SELECT column_name FROM <sample_dateset>.<sample_table>.INFORMATION_SCHEMA.COLUMNS WHERE table_name = <table_name>
The link I provided at the end should help you do that. It uses INFORMATION_SCHEMA.COLUMNS to select the columns you want to unpivot in a variable that is used in a subsequent query. I edited my answer with that part integrated
Thank you so much. The process performed as expected.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.