itertools groupby + repeat + chain
This is one solution using the itertools module. In essence these are the only operations we need to undertake:
- Group items according to whether they start with \xa0.
- Repeat headers for each list within your list of lists after grouping.
- Chain results for series A and B to remove nested lists.
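To make the three steps concrete, here is a minimal sketch that prints nothing and only builds the intermediate values on a tiny input. The names `sample` and `is_header` are hypothetical and chosen for illustration only.

```python
from itertools import chain, groupby, repeat

# Hypothetical sample input: headers are marked by a leading '\xa0'.
sample = ['\xa0h1', 'a', 'b', '\xa0h2', 'c']

def is_header(x):
    return x.startswith('\xa0')

# Step 1: groupby yields (key, run) pairs for consecutive items sharing a key.
runs = [(key, list(run)) for key, run in groupby(sample, key=is_header)]
# runs == [(True, ['\xa0h1']), (False, ['a', 'b']), (True, ['\xa0h2']), (False, ['c'])]

headers = [run[0] for key, run in runs if key]
elements = [run for key, run in runs if not key]

# Step 2: repeat each header once per element in the run that follows it.
repeated = [list(repeat(h, len(e))) for h, e in zip(headers, elements)]
# repeated == [['\xa0h1', '\xa0h1'], ['\xa0h2']]

# Step 3: chain.from_iterable flattens the nested lists.
flat = list(chain.from_iterable(repeated))
# flat == ['\xa0h1', '\xa0h1', '\xa0h2']
```

Note that groupby only groups consecutive items, which is exactly what is wanted here: each run of non-header items belongs to the header immediately before it.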
Crucially, these operations are already implemented lazily and efficiently in the standard library, so there's no need to reproduce them in pure Python (although doing so is, in itself, a good learning exercise).
Complete solution:
    from itertools import chain, groupby, repeat

    import pandas as pd

    chainer = chain.from_iterable

    data = ['\xa0header1', 'element1', 'element2', 'element3',
            '\xa0header2', 'element4', 'element5']

    def condition(x):
        return x.startswith('\xa0')

    # create list of lists for elements
    elements = [list(j) for i, j in groupby(data, key=condition) if not i]

    # create list of headers
    headers = [next(j) for i, j in groupby(data, key=condition) if i]

    # chain list of lists, and use repeat for headers
    df = pd.DataFrame({'A': list(chainer(elements)),
                       'B': list(chainer(repeat(i, j) for i, j in
                                         zip(headers, map(len, elements))))})

    print(df)

              A        B
    0  element1  header1
    1  element2  header1
    2  element3  header1
    3  element4  header2
    4  element5  header2
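For comparison, the pure Python learning exercise mentioned earlier might look roughly like the sketch below. The function name `pair_with_headers` and its `marker` parameter are hypothetical, not part of the solution above, and the loop is only one of several ways to write it.

```python
def pair_with_headers(items, marker='\xa0'):
    """Pair each non-header item with the most recent header seen.

    Items that appear before any header are paired with None.
    """
    pairs = []
    current = None
    for item in items:
        if item.startswith(marker):
            current = item  # remember the latest header
        else:
            pairs.append((item, current))
    return pairs

data = ['\xa0header1', 'element1', 'element2', 'element3',
        '\xa0header2', 'element4', 'element5']

pairs = pair_with_headers(data)
```

The resulting list of tuples can be passed to pd.DataFrame(pairs, columns=['A', 'B']). One behavioural difference worth knowing: if several headers appear in a row, this loop uses the last of them, whereas the groupby version takes the first header of the run.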