As simple as it sounds, I can't think of a straightforward way of doing the following in Python.

my_string = "This is a test.\nAlso\tthis"
list_i_want = ["This", "is", "a", "test.", "\n", "Also", "this"]

I need the same behaviour as str.split(), i.e. remove any kind and amount of whitespace, except for line breaks (\n), each of which should be kept as a standalone list item.

How could I do this?

  • First split it at newline, and insert a newline between each resulting string. Then split each non-newline string at whitespace. Commented Mar 25, 2022 at 18:47
  • It makes total sense! Commented Mar 25, 2022 at 18:49
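
The approach from the first comment could be sketched roughly like this. The function name `split_keep_newlines` is made up for illustration, and the sketch assumes that lines are separated by "\n" only:

```python
def split_keep_newlines(text):
    # Split on whitespace like str.split(), but keep each newline as its own item.
    tokens = []
    for index, line in enumerate(text.split("\n")):
        if index > 0:
            tokens.append("\n")      # one newline item between adjacent lines
        tokens.extend(line.split())  # words of this line, whitespace removed
    return tokens

print(split_keep_newlines("This is a test.\nAlso\tthis"))
# ['This', 'is', 'a', 'test.', '\n', 'Also', 'this']
```

Splitting on "\n" explicitly means that consecutive line breaks each produce their own "\n" item.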

2 Answers

Split the string using re.findall()

import re

my_string = "This is a test.\nAlso\tthis"
my_list = re.findall(r"\S+|\n", my_string)
print(my_list)

How it Works:

  • "\S+": "\S" matches any non-whitespace character. "+" is a greedy quantifier, so "\S+" matches each run of non-whitespace characters, i.e. each word
  • "|": OR logic (alternation)
  • "\n": matches a newline, so each line break is returned as its own item in the list

Output:

['This', 'is', 'a', 'test.', '\n', 'Also', 'this'] 
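
To see how the pattern behaves beyond the example above, here is a small sketch with a few extra inputs. The helper name `tokenize` and the test strings are made up for illustration:

```python
import re

def tokenize(text):
    # \S+ matches a run of non-whitespace characters; \n matches a single newline.
    return re.findall(r"\S+|\n", text)

print(tokenize("a  b\n\nc"))   # ['a', 'b', '\n', '\n', 'c']
print(tokenize("a\r\nb\n"))    # ['a', '\n', 'b', '\n']
```

Consecutive line breaks each become their own item and a trailing newline is kept, while all other whitespace, including "\r", is dropped.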

Here's some code that works, but it is definitely not efficient or Pythonic:

my_string = "This is a test.\nAlso\tthis"
lines = my_string.splitlines()  # split the string into lines
list_i_want = []
for line in lines:
    list_i_want.extend(line.split())  # add the words of this line
    list_i_want.append('\n')  # add a newline item after each line
list_i_want.pop()  # remove the trailing newline item
print(list_i_want)

Output:

['This', 'is', 'a', 'test.', '\n', 'Also', 'this'] 
