Using regular expressions, I'm trying to detect conditional clauses in sentences. In cases where the sentence starts with the conditional clause and the comma is correctly present (e.g., "If I see a snake, I run."), this is reasonably straightforward.
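
To make the easy case concrete, here is a minimal sketch. The pattern name `FRONTED_IF` and the helper `fronted_conditional` are made up for illustration, and the pattern assumes the conditional clause itself contains no comma.

```python
import re

# Hypothetical pattern: sentence-initial "if", a comma-free clause,
# a comma, then the main clause.
FRONTED_IF = re.compile(
    r"^\s*if\b(?P<condition>[^,]+),\s*(?P<main>.+)$",
    re.IGNORECASE,
)


def fronted_conditional(sentence):
    """Return (condition, main_clause) for a sentence-initial 'if' clause
    followed by a comma, or None if the sentence does not fit that shape."""
    match = FRONTED_IF.match(sentence)
    if match is None:
        return None
    return match.group("condition").strip(), match.group("main").strip()


if __name__ == "__main__":
    print(fronted_conditional("If I see a snake, I run."))
    print(fronted_conditional("I run if I see a snake"))
```

This only covers the fronted clause with a comma; it says nothing about an "if" that appears later in the sentence.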

However, I have problems in cases where the conditional clause is at the end. For example, I can rewrite the previous example as

I run if I see a snake 

where there is no comma. Now, there are also sentences such as

I wonder if I passed the test

I'm not sure if I passed the test

Is there a straightforward way to tell using (simple) rules that the second sentence does not contain a conditional clause?

Intuitively, if I can replace "if" with "whether", then it's not a conditional clause. But how could an algorithm tell that the new sentence is still correct?

In case it helps: I have for each word the Part-of-Speech tag, and should be able to get the tense of each verb.

  • For this particular issue, it depends on the subcategorisation frame (not just the part of speech) of the predicate (verb or adjective) of the matrix clause. So you have to know that "sure" can take a "whether" clause in order to know that "I'm not sure if I passed the test" has a different parse from "I'm celebrating if I passed the test". You can't do it just on part of speech. Commented Apr 18, 2016 at 22:27
  • Yeah, I thought that POS (alone) won't do. I just wanted to mention what kind of information I have on hand. I don't even know if "can be replaced with whether" is a good rule anyway. I only noticed it when formulating my examples. Perhaps there's a more underlying reason that distinguishes a conditional from a non-conditional "if". Commented Apr 18, 2016 at 23:12
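
The subcategorisation point in the first comment can be sketched roughly as a lexicon lookup on the predicate that precedes "if". Everything named here is an assumption for illustration: `WHETHER_PREDICATES` is a tiny hand-picked list, not a real subcategorisation resource, and `classify_if` only inspects the single word immediately before "if", so it will misjudge sentences where the matrix predicate is further away or where a listed verb is followed by a genuine conditional.

```python
import re

# Hypothetical, hand-picked set of matrix predicates (verbs and
# adjectives) assumed to accept a "whether" complement clause.
WHETHER_PREDICATES = {
    "wonder", "sure", "unsure", "certain", "know", "ask", "doubt",
}


def classify_if(sentence):
    """Label the first 'if' in a sentence as 'conditional' or 'complement'.

    Returns 'no-if' when the sentence contains no 'if' token.
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if "if" not in tokens:
        return "no-if"
    position = tokens.index("if")
    if position == 0:
        # Sentence-initial "if" is treated as a conditional clause.
        return "conditional"
    # If the word right before "if" licenses a "whether" clause,
    # treat the "if" clause as its complement, not as a condition.
    if tokens[position - 1] in WHETHER_PREDICATES:
        return "complement"
    return "conditional"


if __name__ == "__main__":
    for example in (
        "I run if I see a snake",
        "I wonder if I passed the test",
        "I'm not sure if I passed the test",
        "I'm celebrating if I passed the test",
    ):
        print(example, "->", classify_if(example))
```

The lookup stands in for the "can be replaced with whether" test: instead of judging whether the rewritten sentence is still grammatical, it asks whether the governing predicate is one that is known to take such a clause.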
