If you have preprocessed text that is already tagged, what are the rules for extracting Subject-Verb-Object (SVO) triples of the form (word, word, word)? Can you give a sentence as an example and extract all of its triples? Do you just need to find all combinations without repetition from a set of N words?
1 Answer
The goal of Subject-Verb-Object (SVO) extraction is to produce a single triple for a sentence, not every combination of its words.
The sentence:
A rare black squirrel has become a regular visitor to a suburban garden.
results in the following SVO:
(squirrel, become, visitor)
"Triplet Extraction From Sentences" by Rusu et al. outlines how to do that. First, you'll need a parse tree (Stanford Parser and OpenNLP are common choices). The three items can then be extracted:
The subject will be found by performing breadth first search and selecting the first descendent of NP that is a noun.
… the predicate of the sentence, a search will be performed in the VP subtree. The deepest verb descendent of the verb phrase will give the second element of the triplet.
… we look for objects. These can be found in three different subtrees, all siblings of the VP subtree containing the predicate. The subtrees are: PP (prepositional phrase), NP and ADJP (adjective phrase). In NP and PP we search for the first noun, while in ADJP we find the first adjective.
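The three rules quoted above can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the parse tree for the example sentence is written by hand as nested tuples (a real pipeline would take it from a parser), all function names are made up for this sketch, and "siblings" is read here as the siblings of the deepest verb inside its VP, which is one interpretation of the quoted text.

```python
from collections import deque

# Hand-written constituency tree for the example sentence (an assumption;
# normally a parser produces this). Node: (label, child, ...); leaf: (tag, word).
TREE = ("S",
    ("NP", ("DT", "A"), ("JJ", "rare"), ("JJ", "black"), ("NN", "squirrel")),
    ("VP", ("VBZ", "has"),
        ("VP", ("VBN", "become"),
            ("NP",
                ("NP", ("DT", "a"), ("JJ", "regular"), ("NN", "visitor")),
                ("PP", ("TO", "to"),
                    ("NP", ("DT", "a"), ("JJ", "suburban"), ("NN", "garden")))))),
    (".", "."))

def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)

def first_word_bfs(node, tag_prefix):
    """Breadth-first search for the first leaf whose tag starts with tag_prefix."""
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        if is_leaf(cur):
            if cur[0].startswith(tag_prefix):
                return cur[1]
        else:
            queue.extend(cur[1:])
    return None

def child_with_label(node, label):
    return next((c for c in node[1:] if c[0] == label), None)

def deepest_verb(vp):
    """Return (verb_leaf, parent_node) for the deepest verb in the VP subtree."""
    best_depth, best_leaf, best_parent = -1, None, None
    stack = [(vp, 0)]
    while stack:
        parent, depth = stack.pop()
        for c in parent[1:]:
            if is_leaf(c):
                if c[0].startswith("VB") and depth > best_depth:
                    best_depth, best_leaf, best_parent = depth, c, parent
            else:
                stack.append((c, depth + 1))
    return best_leaf, best_parent

def find_object(verb_parent):
    """Look in the verb's NP/PP siblings for a noun, in ADJP siblings for an adjective."""
    for sib in verb_parent[1:]:
        if is_leaf(sib):
            continue
        prefix = {"NP": "NN", "PP": "NN", "ADJP": "JJ"}.get(sib[0])
        if prefix:
            word = first_word_bfs(sib, prefix)
            if word:
                return word
    return None

def extract_svo(tree):
    np, vp = child_with_label(tree, "NP"), child_with_label(tree, "VP")
    if np is None or vp is None:
        return None
    subject = first_word_bfs(np, "NN")
    verb, verb_parent = deepest_verb(vp)
    if subject is None or verb is None:
        return None
    return (subject, verb[1], find_object(verb_parent))

print(extract_svo(TREE))
```

For the hand-built tree this prints ('squirrel', 'become', 'visitor'), matching the triple given earlier. Sentences with coordination, passives, or several clauses would need more care than this sketch provides.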