81

I have a list that looks like this:

[[1,2,3],[1,2],[1,4,5,6,7]] 

and I want to flatten it into [1,2,3,1,2,1,4,5,6,7]

Is there a lightweight function to do this without using numpy?

1
  • If the goal is to do something without Numpy, then it isn't a Numpy question and shouldn't be tagged that way. That said, it seems strange to expect Numpy to be helpful here, since the inputs aren't the same length and thus there's no approach involving a rectangular array. Sure, the elements shown here are all integers, but they'd have to be boxed anyway unless we start with Numpy arrays. Commented Sep 6, 2022 at 0:54

3 Answers

127

Without numpy (ndarray.flatten), one way is to use chain.from_iterable, which is an alternate constructor for itertools.chain:

>>> from itertools import chain
>>> list(chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]]))
[1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
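One detail worth keeping in mind: chain.from_iterable returns a lazy iterator, not a list, which is why the call above is wrapped in list(). A minimal sketch of that behaviour (the variable names `nested` and `lazy` are only illustrative):

```python
from itertools import chain

nested = [[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]

lazy = chain.from_iterable(nested)  # nothing has been copied yet
print(next(lazy))                   # 1
print(list(lazy))                   # [2, 3, 1, 2, 1, 4, 5, 6, 7]
print(list(lazy))                   # [] because the iterator is exhausted
```

If you only need to loop over the items once, you can skip list() and iterate over the chain object directly.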

Or, as another Pythonic approach, you can use a list comprehension:

[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub] 

Another functional approach, suitable only for short lists, is reduce (a built-in in Python 2, functools.reduce in Python 3). Don't use it for long lists: every addition copies the list accumulated so far, so the total work grows quadratically.

In [4]: from functools import reduce  # Python3
In [5]: reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[5]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

To make it slightly faster, you can use operator.add from the standard library instead of a lambda:

In [6]: from operator import add
In [7]: reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[7]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

In [8]: %timeit reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
789 ns ± 7.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [9]: %timeit reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
635 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Benchmark:

:~$ python -m timeit "from itertools import chain;chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 1.58 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 0.791 usec per loop
:~$ python -m timeit "[j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i]"
1000000 loops, best of 3: 0.784 usec per loop

A benchmark of @Will's answer, which uses sum (it's fast for short lists but not for long ones):

:~$ python -m timeit "sum([[1,2,3],[4,5,6],[7,8,9]], [])"
1000000 loops, best of 3: 0.575 usec per loop
:~$ python -m timeit "sum([range(100),range(100)], [])"
100000 loops, best of 3: 2.27 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[range(100),range(100)])"
100000 loops, best of 3: 2.1 usec per loop
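The inputs above are all very small, so the differences are hard to see. To compare the approaches on a longer input, something like the following sketch could be used. The test data (1,000 sublists of 10 integers) and the wrapper function names are assumptions made for illustration. No timings are quoted here because they depend on the machine and Python version.

```python
from functools import reduce
from itertools import chain
from operator import add
from timeit import timeit

# Hypothetical test data: 1,000 sublists of 10 integers each.
nested = [list(range(10)) for _ in range(1000)]

def with_chain(lists):
    return list(chain.from_iterable(lists))

def with_comprehension(lists):
    return [item for sub in lists for item in sub]

def with_sum(lists):
    # Each addition builds a new list, so the work grows quadratically.
    return sum(lists, [])

def with_reduce(lists):
    # Same repeated-copy behaviour as sum.
    return reduce(add, lists)

for func in (with_chain, with_comprehension, with_sum, with_reduce):
    seconds = timeit(lambda: func(nested), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

On inputs of this size, the repeated copying in sum and reduce should show up clearly next to chain and the list comprehension.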

6 Comments

how does sum([[1,2,3],[4,5,6],[7,8,9]], []) compare to these?
@will For short lists it's faster than reduce, but for longer lists it's not!
@Kasramvd Awesome response! But I got confused about how a non-nested list comprehension like [j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i] could flatten the 2d list, could you give some more illustration?
@WeisiZhan List comprehensions of this kind are usually called nested because of the nested for loops. To understand how such a comprehension behaves, you can write it as a nested for loop that appends every item to a previously defined list. Like, lst = []; for sublist in all_lists: for item in sublist: lst.append(item)
[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub] This is the way :-)
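The nested-loop reading described in the comments can be written out in full. This is only an illustrative sketch (the names `all_lists`, `flat_loop` and `flat_comp` are made up here). The for clauses of the comprehension run in the same left-to-right order as the nested loops.

```python
all_lists = [[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]

# Explicit nested loops: the outer loop picks a sublist, the inner loop its items.
flat_loop = []
for sub in all_lists:
    for j in sub:
        flat_loop.append(j)

# The same thing as a comprehension; the for clauses appear in the same order.
flat_comp = [j for sub in all_lists for j in sub]

print(flat_loop == flat_comp)  # True
print(flat_comp)               # [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
```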
104

For a list like this, my favourite neat little trick is to use sum.

sum has an optional argument: sum(iterable [, start]), so you can do:

list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
print(sum(list_of_lists, []))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

This works because the + operator is the concatenation operator for lists, and you've told sum that the starting value is [], an empty list.

However, the documentation for sum advises using itertools.chain to concatenate a series of iterables, and it is also much clearer.
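To make the mechanics concrete, here is a small sketch of what sum does with start=[], next to the itertools.chain alternative. The names `via_sum`, `unrolled` and `via_chain` are only illustrative.

```python
from itertools import chain

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# sum starts from the empty list and adds each sublist in turn.
via_sum = sum(list_of_lists, [])
unrolled = (([] + list_of_lists[0]) + list_of_lists[1]) + list_of_lists[2]

# The alternative suggested by the documentation.
via_chain = list(chain.from_iterable(list_of_lists))

print(via_sum == unrolled == via_chain)  # True

# Without the start argument, sum begins at 0, and 0 + [1, 2, 3] is a TypeError.
try:
    sum(list_of_lists)
except TypeError as exc:
    print("TypeError:", exc)
```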

8 Comments

How do I do this for a list of strings?
@pyd the code above works for any object type... why not test before asking?
my_list looks like ["A",["B","C"],"D",["E","F"]], and it was not working.
@pyd It's not working because "A" isn't a list but ["B","C"] is, so the attempt to concatenate them with + fails (you can't add a string to a list).
Using + as concatenation in sum is like a miracle. Thank you. This is a great solution for any list of lists.
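For a mixed list like the one in the comments, where some elements are bare strings and others are lists, one possible workaround is to wrap the bare elements before flattening. This is a sketch only. The helper name `as_list` is hypothetical, and it assumes that only list instances should be unpacked.

```python
from itertools import chain

my_list = ["A", ["B", "C"], "D", ["E", "F"]]

def as_list(item):
    # Hypothetical helper: leave lists alone, wrap anything else in a list.
    return item if isinstance(item, list) else [item]

flat = list(chain.from_iterable(as_list(item) for item in my_list))
print(flat)  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Checking for list explicitly matters here, because strings are themselves iterable and would otherwise be split into single characters.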
9

This will work in your particular case. A recursive function would work best if you have multiple levels of nested iterables.

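def flatten(nested):
    # Flattens exactly one level of nesting.
    # The parameter is named nested so it does not shadow the built-in input().
    new_list = []
    for i in nested:
        for j in i:
            new_list.append(j)
    return new_list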
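The function above removes only one level of nesting. For the multi-level case mentioned in this answer, a recursive version might look like the following sketch. The name `flatten_deep` is hypothetical, and treating only list and tuple as containers is an assumption. Strings and other iterables are kept as single items.

```python
def flatten_deep(nested):
    """Recursively flatten lists and tuples nested to any depth."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            # Descend into the container and append its flattened contents.
            flat.extend(flatten_deep(item))
        else:
            flat.append(item)
    return flat

print(flatten_deep([1, [2, [3, [4, 5]], 6], (7, 8)]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Because it recurses once per nesting level, extremely deep structures could hit Python's recursion limit.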

1 Comment

This is the only solution that actually works for deeply nested lists.
