
I have the following code.

    for item in my_list:
        print(item[0])
        temp = []
        current_index = my_list.index(item)
        garbage_list = creategarbageterms(item[0])

        for ele in my_list:
            if my_list.index(ele) != current_index:
                for garbage_word in garbage_list:
                    if garbage_word in ele:
                        print("concepts: ", item, ele)
                        temp.append(ele)
        print(temp)

Now, I want to remove ele from my_list when it gets appended to temp (so that it won't get processed in the main loop, as it is a garbage word).

I know it is bad practice to remove elements directly from a list while looping over it. Is there an efficient way of doing this?

For example, if my_list is as follows:

    my_list = [["tim_tam", 879.3000000000001],
               ["yummy_tim_tam", 315.0],
               ["pudding", 298.2],
               ["chocolate_pudding", 218.4],
               ["biscuits", 178.20000000000002],
               ["berry_tim_tam", 171.9],
               ["tiramusu", 158.4],
               ["ice_cream", 141.6],
               ["vanilla_ice_cream", 122.39999999999999]]

1st iteration

For the first element, tim_tam, I get garbage words such as yummy_tim_tam and berry_tim_tam, so they get added to my temp list.

Now I want to remove yummy_tim_tam and berry_tim_tam from the list (because they have already been added to temp), so that they are not processed again from the beginning.

2nd iteration

Now, since yummy_tim_tam is no longer in the list, pudding is processed next. For pudding I get a different set of garbage words, such as chocolate_pudding, biscuits and tiramusu. They get added to temp and removed from the list.

3rd iteration

ice_cream will be selected, and the process goes on.

My final objective is to get three separate lists as follows.

    ["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["berry_tim_tam", 171.9]
    ["pudding", 298.2], ["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["tiramusu", 158.4]
    ["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]
  • It's not clear what you're trying to do with the whole .indexing thing. It also seems that garbage_list should really be a set instead... Commented Jan 14, 2018 at 11:40
  • @RoadRunner I want to gather similar words as follows. ["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["berry_tim_tam", 171.9] , ["pudding", 298.2], ["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["tiramusu", 158.4] , ["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]. I will update the question again :) Commented Jan 14, 2018 at 11:50
  • @JCena Are you only trying to remove yummy_tim_tam and berry_tim_tam from temp? Or are there other words you're trying to remove? Could you not just add everything and filter out what you don't want at the end of the loop? Or you could have some sort of pre-check beforehand so you never add them to begin with. Commented Jan 14, 2018 at 11:50
  • It is not a good idea to modify lists which are being iterated. Better just check in the main loop if the term is already in temp before processing it. If True, continue Commented Jan 14, 2018 at 11:56
  • It is the way to go. If you need to obtain other lists, just create them and fill them by appending items inside the for loops, as you do for temp, or, if needed, after the main loop is finished. But do not modify mylist during the loops. Also use enumerate in the for loops to get the item index. Do not use 8 spaces for indentation, just 4. Commented Jan 14, 2018 at 12:26

3 Answers

3

This code produces what you want:

    my_list = [['tim_tam', 879.3], ['yummy_tim_tam', 315.0],
               ['pudding', 298.2], ['chocolate_pudding', 218.4],
               ['biscuits', 178.2], ['berry_tim_tam', 171.9],
               ['tiramusu', 158.4], ['ice_cream', 141.6],
               ['vanilla_ice_cream', 122.39]]

    creategarbageterms = {
        'tim_tam': ['tim_tam', 'yummy_tim_tam', 'berry_tim_tam'],
        'pudding': ['pudding', 'chocolate_pudding', 'biscuits', 'tiramusu'],
        'ice_cream': ['ice_cream', 'vanilla_ice_cream']}

    all_data = {}
    temp = []
    for idx1, item in enumerate(my_list):
        if item[0] in temp:
            continue
        all_data[idx1] = [item]
        garbage_list = creategarbageterms[item[0]]
        for idx2, ele in enumerate(my_list):
            if idx1 != idx2:
                for garbage_word in garbage_list:
                    if garbage_word in ele:
                        temp.append(ele[0])
                        all_data[idx1].append(ele)

    for item in all_data.values():
        print('-', item)

This produces:

    - [['tim_tam', 879.3], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]]
    - [['pudding', 298.2], ['chocolate_pudding', 218.4], ['biscuits', 178.2], ['tiramusu', 158.4]]
    - [['ice_cream', 141.6], ['vanilla_ice_cream', 122.39]]

Note that for the purpose of the example I created a mock creategarbageterms (as a dictionary rather than a function) that produces the term lists as you defined them in your post. Note also the use of the all_data dictionary, keyed by the index of each group's first item, which allows an unlimited number of iterations, that is, an unlimited number of final lists.
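As a possible variant of the same idea, here is a minimal sketch that rebuilds the list of remaining items on each round instead of tracking names in temp. It is an illustration under assumptions, not the asker's actual code: group_by_garbage_terms and MOCK_TERMS are hypothetical names, and the lookup passed in only stands in for the real creategarbageterms(), which is not shown in the question.

```python
def group_by_garbage_terms(items, garbage_terms):
    """Split items into groups; each group starts with the first unprocessed item.

    `garbage_terms` is a hypothetical stand-in for creategarbageterms():
    it takes a name and returns the names considered related to it.
    """
    remaining = list(items)  # work on a copy; the caller's list is left untouched
    groups = []
    while remaining:
        head, rest = remaining[0], remaining[1:]
        related = set(garbage_terms(head[0]))  # set gives fast membership tests
        group = [head]
        remaining = []
        for ele in rest:
            # Matched elements join the group, the others wait for a later round
            if ele[0] in related:
                group.append(ele)
            else:
                remaining.append(ele)
        groups.append(group)
    return groups


# Mock lookup table (assumption), mirroring the mock dictionary used above
MOCK_TERMS = {
    'tim_tam': ['tim_tam', 'yummy_tim_tam', 'berry_tim_tam'],
    'pudding': ['pudding', 'chocolate_pudding', 'biscuits', 'tiramusu'],
    'ice_cream': ['ice_cream', 'vanilla_ice_cream'],
}

my_list = [['tim_tam', 879.3], ['yummy_tim_tam', 315.0],
           ['pudding', 298.2], ['chocolate_pudding', 218.4],
           ['biscuits', 178.2], ['berry_tim_tam', 171.9],
           ['tiramusu', 158.4], ['ice_cream', 141.6],
           ['vanilla_ice_cream', 122.39]]

groups = group_by_garbage_terms(my_list, lambda name: MOCK_TERMS.get(name, [name]))
for group in groups:
    print('-', group)
```

Because every element is either moved into the current group or carried over to the next round, nothing is deleted from a list that is being iterated, and each element is looked at once per round.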


2

I would propose to do it like this:

    mylist = [["tim_tam", 879.3000000000001],
              ["yummy_tim_tam", 315.0],
              ["pudding", 298.2],
              ["chocolate_pudding", 218.4],
              ["biscuits", 178.20000000000002],
              ["berry_tim_tam", 171.9],
              ["tiramusu", 158.4],
              ["ice_cream", 141.6],
              ["vanilla_ice_cream", 122.39999999999999]]

    d = set()  # remembers unique keys, first one in wins

    for i in mylist:
        shouldAdd = True
        for key in d:
            if i[0].find(key) != -1:  # a key already in the set is part of this name
                shouldAdd = False     # do not add it

        if not d or shouldAdd:        # empty set or unique: add to set
            d.add(i[0])

    # clean list that keeps only the entries whose name is a key in the set
    myCleanList = [x for x in mylist if x[0] in d]

    print(myCleanList)

Output:

[['tim_tam', 879.3000000000001], ['pudding', 298.2], ['biscuits', 178.20000000000002], ['tiramusu', 158.4], ['ice_cream', 141.6]] 

If the order of things in the list is not important, you could use a dictionary directly - and create a list from the dict.
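A rough sketch of what the dictionary variant might look like, assuming the same substring rule as above; the names groups, base and clean_list are illustrative and not part of the original answer. Since Python 3.7 a dict keeps insertion order, so the first-seen order of the keys is preserved here as well.

```python
mylist = [["tim_tam", 879.3000000000001],
          ["yummy_tim_tam", 315.0],
          ["pudding", 298.2],
          ["chocolate_pudding", 218.4],
          ["biscuits", 178.20000000000002],
          ["berry_tim_tam", 171.9],
          ["tiramusu", 158.4],
          ["ice_cream", 141.6],
          ["vanilla_ice_cream", 122.39999999999999]]

groups = {}  # base name -> entries whose name contains that base name
for entry in mylist:
    name = entry[0]
    # Look for an already-known base name that is a substring of this name
    base = next((key for key in groups if key in name), None)
    if base is None:
        groups[name] = [entry]      # first occurrence becomes a new base name
    else:
        groups[base].append(entry)  # otherwise it joins the existing group

# The "clean" list is the first entry of every group
clean_list = [entries[0] for entries in groups.values()]
print(clean_list)
```

The dictionary gives both results in one pass: its keys are the unique names, and its values are the sublists of similar entries.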

If you need sublists, create them:

    similarThings = [[x for x in mylist if x[0].find(y) != -1] for y in d]

    print(similarThings)

Output:

    [
      [['tim_tam', 879.3000000000001], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]],
      [['tiramusu', 158.4]],
      [['ice_cream', 141.6], ['vanilla_ice_cream', 122.39999999999999]],
      [['pudding', 298.2], ['chocolate_pudding', 218.4]],
      [['biscuits', 178.20000000000002]]
    ]

As @joaquin pointed out in the comments, I am missing the creategarbageterms() function that groups tiramusu and biscuits with pudding, so this does not fit the question 100%. My answer advocates the following: do not modify lists during iteration; use an appropriate set or dict to filter the list into groups. Unique keys here are keys that do not contain an earlier key as a substring.

2 Comments

Please note that your results do not correspond with the expected output as defined by the OP.
@joaquin Agreed - I am missing the "broader" creategarbageterms() function that makes biscuits and tiramusu belong to the pudding variety. The question evolved quite a bit and my answer was built with less data than there is now. The creategarbageterms() function is still not in the question, and your answer approximated it from the "later added" information about which 3 results should be given. Your answer is better suited to answer "How to remove list elements within a loop effectively in python"; mine is "how to get to the result without modifying the list while iterating over it".
1

You want to have an outer loop that's looping through a list, and an inner loop that can modify that same list.

I saw you got suggestions in the comments to simply not remove entries during the inner loop at all, but instead check if terms already are in temp. This is possible, and may be easier to read, but is not necessarily the best solution with respect to processing time.

I also see you received an answer from Patrick using dictionaries. This is probably the cleanest solution for your specific use-case, but does not address the more general question in your title which is specifically about removing items in a list while looping through it. If for whatever reason this is really necessary, I would propose the following:

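    idx = 0
    while idx < len(my_list):
        item = my_list[idx]
        print(item[0])
        temp = []
        garbage_list = creategarbageterms(item[0])

        ele_idx = 0
        while ele_idx < len(my_list):
            ele = my_list[ele_idx]
            removed = False
            if ele_idx != idx:
                for garbage_word in garbage_list:
                    if garbage_word in ele:
                        print("concepts: ", item, ele)
                        temp.append(ele)
                        del my_list[ele_idx]
                        if ele_idx < idx:
                            idx -= 1  # the current item shifted one position left
                        removed = True
                        break  # ele is gone, so stop checking garbage words
            if not removed:
                ele_idx += 1  # advance only if nothing was deleted at this position
        print(temp)
        idx += 1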

The key insight here is that, by using a while loop instead of a for loop, you can take more detailed, "manual" control of the program's control flow, and more safely do "unconventional" things in your loop. I'd only recommend doing this if you really have to, though. This solution is closer to the literal question you asked and to your original code, but it may not be the easiest to read or the most Pythonic.
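To illustrate why a plain for loop is risky here, the following small sketch (with made-up sample data, not the asker's list) removes matching elements inside a for loop and then shows one common alternative for the simpler case where you only need to filter: build the filtered list first and assign it back with a slice.

```python
words = ['tim_tam', 'yummy_tim_tam', 'berry_tim_tam', 'pudding']

# Removing inside a for loop shifts later elements left,
# so the element right after each removal is never visited.
broken = list(words)
for w in broken:
    if w.endswith('tim_tam'):
        broken.remove(w)
print(broken)  # 'yummy_tim_tam' is still there: it was skipped

# Alternative: compute the elements to keep, then replace the contents.
kept = [w for w in words if not w.endswith('tim_tam')]
words[:] = kept  # slice assignment updates the same list object in place
print(words)
```

The slice assignment matters only if other variables refer to the same list; otherwise a plain `words = kept` does the job.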

1 Comment

@JohnKugelman yes I know, that was the entire point of the answer, because this is a requirement of the question if we read the question's title literally. If you meant to point out that this means the second loop should also be a while loop instead of a for loop, that's correct, I missed that and just edited to fix that
