2

If i have 2 columns viz., ID & Name, ID column containing duplicates, and if i want to group by ID to get unique ID's but name column should be a comma-separated list, can this be possible in Google Query?

| ID | Name | =============== | 1001 | abc | --------------- | 1001 | def | --------------- | 1002 | kjg | --------------- | 1003 | aof | --------------- | 1003 | lmi | --------------- | 1004 | xyz | --------------- 

into

| ID | Name | ==================== | 1001 | abc, def | -------------------- | 1002 | kjg | -------------------- | 1003 | aof, lmi | -------------------- | 1004 | xyz | -------------------- 
6
  • Actually I don't think so. But it can be obtained by standard formulas. Please let me know if you are interested in this kind of solution. Commented Dec 6, 2020 at 19:23
  • hmm..wish there was someway with Google Query mixed with functions like Textjoin etc. Commented Dec 6, 2020 at 19:27
  • I'm pretty sure it can be done with query/textjoin/split/etc. I'll have a go at it... Commented Dec 7, 2020 at 0:01
  • There is a limit of 50,000 characters for TextJoin, Split. Commented Dec 7, 2020 at 5:34
  • I think somehow using the undocumented FLATTEN function should overcome this limit. Commented Dec 7, 2020 at 7:00

4 Answers 4

3

try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, "select Col1,max(Col2) where Col1 is not null group by Col1 pivot Col3"), "select Col1 offset 1", 0), REGEXREPLACE(TRIM( TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, "select max(Col2) where Col1 is not null and Col2 <> ',' group by Col1 pivot Col3"), "offset 1", 0)),,999^9))), ",$", )}) 

enter image description here

however, this may not work for massive datasets due to TRIM (which is needed to remove empty spaces) and REGEXREPLACE (which is needed to remove the end comma) limitations. otherwise, without it, the formula can handle anything:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, "select Col1,max(Col2) where Col1 is not null group by Col1 pivot Col3"), "select Col1 offset 1", 0), TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, "select max(Col2) where Col1 is not null and Col2 <> ',' group by Col1 pivot Col3"), "offset 1", 0)),,999^9))}) 
Sign up to request clarification or add additional context in comments.

2 Comments

Very ingenious answer! This is what i call sheer brilliance. Thanks a lot. I am sure that Query without the select statement and large count of headers did the trick. I am mainly puzzled by the Offset 1. Can you explain the entire workings of the formula?
@sifar the query pivots 3rd column so the query outputs the 3rd column as header row number 1, which is totally unnecessary so we use "offset 1" to offset the whole query output by 1 row which results in removing the header row (pivot residue)
2

I looked through Query specification. I could not find a solution. So I made some formulas that do the job (because I found this task interesting). enter image description here

D2 contains =unique(a2:a)

E2 contains =join(", ",transpose(filter($B$2:$B,$A$2:$A=D2)))and it's copied down.

I had to copy formulas down (far from beautiful formula) Hope you find it helpful.

Reference

Comments

1

Here is an answer using QUERY.

=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(SPLIT( CONCATENATE(TRANSPOSE(QUERY({"♦"&A2:A&"♠", B2:B&", "}, "select max(Col2) where Col2 is not null group by Col2 pivot Col1", 0))), "♦")), "♠")), ",$", )) 

This comes directly from this question. Player0 has answers with just amazing formulas that are able to reorganise data in a huge variety of ways.

enter image description here

3 Comments

i am getting Text result of CONCATENATE is longer than the limit of 50000 characters.
Is that with sample data, or your "production" sheet? How many rows of data in columns A and B? Are the values in A or B large blocks of text? That would explain the error. It is possible that this formula could be redeveloped to avoid this issue.
it is with my production sheet which contains 6K rows.
1

if you could live with the end-comma present in the output you can try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, "select Col1,max(Col3) where Col1 is not null and Col3 <> ',' group by Col1 pivot Col2"), "select Col1 offset 1", 0), TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(QUERY(QUERY({A2:B, B2:B}, "select Col1,max(Col3) where Col1 is not null and Col3 <> ',' group by Col1 pivot Col2"), "select Col1 offset 1", 0), QUERY(QUERY({A2:B, B2:B&","}, "select Col1,max(Col3) where Col1 is not null and Col3 <> ',' group by Col1 pivot Col2"), "offset 1", 0), SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IF(QUERY(QUERY({A2:B, B2:B&","}, "select max(Col3) where Col1 is not null and Col3 <> ',' group by Col1 pivot Col2"), "offset 1", 0)="",,COLUMN(B2:XXX)&",")),,999^99)), ","), 0))),,999^99))}) 

enter image description here

(tho this was never tested on an ultra-massive dataset but in theory, it should handle anything too)

2 Comments

Wow! brilliant. I will check it in the morning with a larger dataset.
can you help me with this?

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.