
Is there an easy way to split text into separate lines each time a specific pattern (here, a capitalized name) appears? For example, I have text that looks like this:

BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible. 

I'd like to split the text into lines for each speaker:

BILLY: The sky is blue.
SALLY: It really is blue.
SAM: I think it looks like this: terrible.

The speaker's name is always in all caps and followed by a colon.

2 Answers

    import re

    a = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."
    # Split on whitespace that is followed by an all-caps name and a colon
    print(re.split(r"\s(?=[A-Z]+:)", a))

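You can use re.split for this. The lookahead (?=[A-Z]+:) matches only where an all-caps name and a colon follow, so the split happens on the whitespace before each speaker and the name itself stays in the result.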

Output: ['BILLY: The sky is blue.', 'SALLY: It really is blue.', 'SAM: I think it looks like this: terrible.']
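To get from that list to the one-speaker-per-line text the question asks for, the pieces can be joined with newlines. This is a minimal sketch, assuming Python 3; the function name split_speakers is hypothetical, and \s+ is used instead of \s so that runs of several spaces are also consumed.

```python
import re

def split_speakers(text):
    # Hypothetical helper: split on whitespace that is immediately
    # followed by an all-caps name and a colon. The lookahead (?=...)
    # does not consume the name, so it stays at the start of each piece.
    return re.split(r"\s+(?=[A-Z]+:)", text.strip())

text = ("BILLY: The sky is blue. SALLY: It really is blue. "
        "SAM: I think it looks like this: terrible.")

# Join the pieces so each speaker starts on a new line
print("\n".join(split_speakers(text)))
```

Note that the lowercase "this:" in SAM's line is not treated as a speaker, because the pattern only accepts capital letters before the colon.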


1 Comment

Thank you! I'm new to this, so this is extremely helpful. I appreciate it!

If you just want to change the text rather than have a list, you could do the following:

    import re

    text = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."
    # Insert a newline before each all-caps name followed by a colon
    print(re.sub(r'([A-Z]+\:)', r'\n\1', text).lstrip())

This would print:

BILLY: The sky is blue.
SALLY: It really is blue.
SAM: I think it looks like this: terrible.
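One detail: the substitution above inserts the newline in front of each name but keeps the space that preceded it, so every line except the last ends with a trailing space. If that matters, a possible variant is to replace the whitespace itself. This is a sketch, assuming Python 3; the function name one_speaker_per_line is hypothetical.

```python
import re

def one_speaker_per_line(text):
    # Hypothetical helper: replace the whitespace that precedes each
    # all-caps name and colon with a single newline, so no trailing
    # spaces are left at the end of a line.
    return re.sub(r"\s+(?=[A-Z]+:)", "\n", text.strip())

text = ("BILLY: The sky is blue. SALLY: It really is blue. "
        "SAM: I think it looks like this: terrible.")
print(one_speaker_per_line(text))
```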
