.srt files are structured in blocks of three parts (a frame number, a frame duration, and the caption text), as in this example:

    228
    00:39:06,680 --> 00:39:13,460
    Lorem ipsum dolor sit amet

Now suppose that some of the closed captions contain a speaker quoting someone else's literary work, as in this additional example:

    228
    00:39:06,680 --> 00:39:13,460
    According to Erasmus, book 1, chapter 23...

Problem: Using Vim, I want to extract only the text from the .srt file by deleting the frame numbers and the frame duration lines, without erasing the cardinal numbers that appear inside the captions as part of quotations.
Attempts: Using a regular expression with the substitute command, I found a way to "delete" the duration lines with :%s/\d\d:\d\d:\d\d,\d\d\d --> \d\d:\d\d:\d\d,\d\d\d/ /g. I handle the frame numbers with the same idea, except that the search matches every cardinal number, so I use the flags /gc to confirm each match and skip the numbers that sit in the middle of the text.
However, I have a considerable number of such quotations whose cardinal numbers must be kept, so answering yes/no for every match becomes a tedious task.
Since my regex skills are lacking, I presume there is at least a less "ugly" way to carry out this strategy, and perhaps a more elegant one that not only deletes the unwanted portions but also yields the raw text without the frame and duration lines, like:

    Lorem ipsum dolor sit amet
    According to Erasmus, book 1, chapter 23...

Does anyone know how to do that?
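
To pin down the result I am after, here is a minimal sketch of the transformation written outside Vim, in Python. It is not a Vim solution, only an illustration of the intended behaviour. It assumes that a frame number always sits alone on its own line directly above a duration line, which is what distinguishes it from numbers inside the captions. The function name srt_to_text and the second sample block (229 and its timings) are made up for the example.

```python
import re

# Matches a duration line such as "00:39:06,680 --> 00:39:13,460".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\s*$")
# Matches a line that holds nothing but digits (a candidate frame number).
COUNTER = re.compile(r"^\d+\s*$")


def srt_to_text(srt: str) -> str:
    """Return only the caption text, one caption line per output line."""
    lines = srt.splitlines()
    kept = []
    for i, line in enumerate(lines):
        if TIMING.match(line):
            continue  # drop the duration line
        # Drop a digits-only line only when a duration line follows it,
        # so a caption that is just a number (e.g. "1984") survives.
        if COUNTER.match(line) and i + 1 < len(lines) and TIMING.match(lines[i + 1]):
            continue
        if line.strip():  # skip the blank separator lines
            kept.append(line.strip())
    return "\n".join(kept)


# Hypothetical two-block sample; block 229 is invented for illustration.
sample = """228
00:39:06,680 --> 00:39:13,460
Lorem ipsum dolor sit amet

229
00:39:13,500 --> 00:39:20,000
According to Erasmus, book 1, chapter 23...
"""

print(srt_to_text(sample))
```

The numbers inside "book 1, chapter 23" are kept because they never occupy a whole line by themselves. I would expect the same anchoring idea (matching from the start of the line to its end) to carry over to a Vim pattern, which is the part I am asking about.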