4

I am using the below script to delete duplicate rows from the google spreadsheet. The script was working good but as the data in the spreadsheet is being added daily, now the script is throwing "Exceeded maximum execution time" error. As I am new to scripting I don't understand what is my problem.

Could someone help me in solving this problem of mine.

 function Deleteduplicates() { var SpreadSheetKey = "My key"; var sheetD = SpreadsheetApp.openById(SpreadSheetKey).getSheetByName("Daily"); var sheetW = SpreadsheetApp.openById(SpreadSheetKey).getSheetByName("Weekly"); var dataD = sheetD.getDataRange().getValues(); var dataW = sheetW.getDataRange().getValues(); //Daily var newDataD = new Array(); for(i in dataD){ var row = dataD[i]; var duplicate = false; for(j in newDataD){ if(row.join() == newDataD[j].join()){ duplicate = true; } } if(!duplicate){ newDataD.push(row); } } //weekly var newDataW = new Array(); for(i in dataW){ var row = dataW[i]; var duplicate = false; for(j in newDataW){ if(row.join() == newDataW[j].join()){ duplicate = true; } } if(!duplicate){ newDataW.push(row); } } sheetD.clearContents(); sheetW.clearContents(); sheetD.getRange(1, 1, newDataD.length, newDataD[0].length).setValues(newDataD); sheetW.getRange(1, 1, newDataW.length, newDataW[0].length).setValues(newDataW); } 
3
  • From the VIEW, EXECUTION TRANSCRIPT menu, at the bottom of the list, there is a time given for how long it takes the script to run. How long does it take? Commented May 7, 2015 at 14:28
  • Execution failed: Exceeded maximum execution time [363.923 seconds total runtime] Commented May 7, 2015 at 14:29
  • Maybe a post like this can help: Exceeded Max Exec Time Commented May 7, 2015 at 14:31

2 Answers 2

3

Conceptually, this should be quite a bit faster. I have not tried it on a large data set. The first version will leave the rows sorted as they were originally. The second version will be faster but will leave the rows sorted according to the columns from first to last on first text.

function Deleteduplicates() { var SpreadSheetKey = "My key"; var ss = SpreadsheetApp.openById(SpreadSheetKey); var sheetD = ss.getSheetByName("Daily"); var sheetW = ss.getSheetByName("Weekly"); var sheets = [sheetD, sheetW]; var toSs = {}; for(s in sheets) { var data = sheets[s].getDataRange().getValues(); for(i in data){ // EDIT: remove commas from join("") for blank test data[i].unshift(data[i].join(""),(1000000 + i).toString()); } data.sort(); // remove blank rows -- Edit var blank = 0; while(data[blank][0].trim().length == 0) {blank++}; if(blank > 0) data.splice(0, blank); // end Edit var len = data.length - 1; for(var x = len; x > 0; x-- ) { if(data[x][0] == data[x-1][0]) { data.splice(x, 1); }; }; for(i in data) { data[i].splice( 0, 1); }; data.sort(); for(i in data) { data[i].splice(0, 1); }; toSs[sheets[s].getSheetName()] = data; }; for(s in sheets) { var data = toSs[sheets[s].getSheetName()]; sheets[s].clearContents(); sheets[s].getRange(1, 1, data.length, data[0].length).setValues(data); } } 

Faster leaving rows sorted by join() created to test for duplicates

function Deleteduplicates() { var SpreadSheetKey = "My key"; var ss = SpreadsheetApp.openById(SpreadSheetKey); var sheetD = ss.getSheetByName("Daily"); var sheetW = ss.getSheetByName("Weekly"); var sheets = [sheetD, sheetW]; var toSs = {}; for(s in sheets) { var data = sheets[s].getDataRange().getValues(); for(i in data){ // EDIT: remove commas from join("") for blank test data[i].unshift(data[i].join("")); } data.sort(); // remove blank rows -- Edit var blank = 0; while(data[blank][0].trim().length == 0) {blank++}; if(blank > 0) data.splice(0, blank); // end Edit var len = data.length - 1; for(var x = len; x > 0; x-- ) { if(data[x][0] == data[x-1][0]) { data.splice(x, 1); }; }; for(i in data) { data[i].splice( 0, 1); }; toSs[sheets[s].getSheetName()] = data; }; for(s in sheets) { var data = toSs[sheets[s].getSheetName()]; sheets[s].clearContents(); sheets[s].getRange(1, 1, data.length, data[0].length).setValues(data); } } 

Edited per Henrique's comment.

Edited 5/8: Remove blank rows(2 edited areas marked)

Sign up to request clarification or add additional context in comments.

7 Comments

There's one important Apps Script enhancement that can be applied here, which is to remove setValues from the loop and do it afterwards (in a separate loop). Because if you setValues and then call `getValues^ (in the next iteration), it will force a spreadsheet flush and recalculation. Since the values in the 2nd get may have changed due to the 1st set call. So, general tip, get everything first, set everything in the end. Do not call set and get in a mixed order.
Thank you Henrique. It's always a pleasure and enlightening to hear from you. Have a good one.
Thank you for the script @ScampMichael, it works good except for two things 1. a new blank line at the top of table is being created and 2. the data is not being sorted according to date, which is the first column but it is by second column i.e Name. Thank you Henrique, your tip really made the difference in execution time
if the data was sorted by date to begin with the first version of the code should leave it so. See 2 edited areas in code to solve blank row.
The edited code has solved the problem of blank row, but the data is still not being sorted by date. @ScampMichael
|
2

There is no problem with your script. It is just exceeding the "maximum execution time" allowed for any script (which is currently 6 minutes).

To workaround this problem you'll have to split your problem into "less than 6 minutes" parts.

For example, in your code you're clearing duplicates from 2 sheets. Trying creating two functions, one for each, and run them separately.

Also, there could be some performance enhancements that could make the script run under 6 minutes. For example, I'm not sure joining each row is the best way (performance-wise) to do an array comparison.

Creating a new array to re-set the data might not be optimal either, I'd probably go with a map verification, which is constant-time, instead of O(n^2) double array checking you're doing.

Bottom line, this is a limitation you have to live with in Apps Script. And any solution anyone proposes is just a workaround, that will also eventually fail if your data gets overly big.

5 Comments

Great answer with O() and all
to be picky, using a map is not O(1) its O(log n) per lookup
I always thought it was O(1), but I never searched it. Anyway, I did it now and it seems browsers's implementation are constant. Have you tested Apps Script implementation? (thanks for bringing it up)
Those answers are not strictly correct. Yes you can get O(1) if using enough memory (much more than the entire object space) but only up to a certain point. Then it has rocgo to log(n). Theres no free lunch, just that log(n) needs a really huge n value to notice that its not a constant.
These other one is more complete. stackoverflow.com/questions/2771368/… one easy way to see where log n is introduced is by looking at the key itself. To compare keys or produce hashes its proportional to the key string length which is basically log n. The trick people use to claim o(1) is by saying "few collitions" but that implies a certain key length thus log n is there.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.