Natural Language Processing Model [closed]

Question

I'm working on a project to parse lines of English input from a user and understand their intentions.

Here is what I think I should do:

1. Create a corpus of sentences, each POS-tagged and labeled by hand with its intention.
2. Create a model, say a decision tree, and train it on those sentences.
3. Try the model on user input: tokenize and POS-tag the input sentence, then run it through the model to predict its intention.
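
A minimal sketch of the plan above, in plain Python with no libraries. The training sentences, the tag set, the intent labels, the toy LEXICON tagger, and all function names are assumptions made up for illustration; a real project would need far more labeled data and a proper tagger. The tree is a small ID3-style learner over boolean features, not a tuned implementation.

```python
import math
from collections import Counter

# Hand-labeled training data (hypothetical): POS-tagged sentences plus an intent.
TRAINING = [
    ([("what", "WP"), ("is", "VBZ"), ("the", "DT"), ("time", "NN")], "question"),
    ([("where", "WRB"), ("is", "VBZ"), ("the", "DT"), ("station", "NN")], "question"),
    ([("open", "VB"), ("the", "DT"), ("door", "NN")], "command"),
    ([("close", "VB"), ("the", "DT"), ("window", "NN")], "command"),
    ([("the", "DT"), ("door", "NN"), ("is", "VBZ"), ("red", "JJ")], "statement"),
    ([("the", "DT"), ("station", "NN"), ("is", "VBZ"), ("closed", "JJ")], "statement"),
]

# Toy lookup tagger standing in for a real POS tagger (assumption).
LEXICON = {word: tag for sentence, _ in TRAINING for word, tag in sentence}


def tag(sentence):
    """Whitespace-tokenize and tag each word; unknown words default to NN."""
    return [(w, LEXICON.get(w, "NN")) for w in sentence.lower().split()]


def extract_features(tagged):
    """Boolean features as a set: tags present, plus the first token's tag."""
    features = {f"tag={t}" for _, t in tagged}
    if tagged:
        features.add(f"first_tag={tagged[0][1]}")
    return features


def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def build_tree(examples):
    """Return a label (leaf) or a (feature, yes_subtree, no_subtree) tuple."""
    labels = [label for _, label in examples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority
    best, best_score = None, entropy(labels)
    for feature in sorted(set().union(*(fs for fs, _ in examples))):
        yes = [label for fs, label in examples if feature in fs]
        no = [label for fs, label in examples if feature not in fs]
        if not yes or not no:
            continue  # this feature does not split the examples
        score = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if score < best_score:  # lower weighted entropy = higher information gain
            best, best_score = feature, score
    if best is None:
        return majority
    yes = [(fs, label) for fs, label in examples if best in fs]
    no = [(fs, label) for fs, label in examples if best not in fs]
    return (best, build_tree(yes), build_tree(no))


def classify(tree, features):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, yes, no = tree
        tree = yes if feature in features else no
    return tree


TREE = build_tree([(extract_features(s), intent) for s, intent in TRAINING])


def predict_intent(sentence):
    return classify(TREE, extract_features(tag(sentence)))


print(predict_intent("where is the door"))
```

With data this small the tree only learns to look at the first token's tag, so the choice of features matters far more than the choice of model.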

All of this may be completely wrong or silly, but I'm determined to learn how to do it. I don't want to use ready-made solutions, and the programming language is not a concern.

How would you approach this task? Which model would you choose, and why? What steps are normally involved in building an NLP parser?

Can you explain exactly what you mean by "intention"?

– hrzafer, Commented Nov 8, 2011 at 16:30

Poorna Prudhvi · Accepted Answer · 2018-03-07 11:43:46Z

4

I would use NLTK. Its online book has a chapter on tagging and a chapter on parsing, and NLTK also provides models in Python.

edited Mar 7, 2018 at 11:43 by Poorna Prudhvi

answered Oct 30, 2011 at 22:27 by cyborg


Muyinda Rogers · Accepted Answer · 2019-12-11 17:06:30Z

Here is a simple example based on NLTK and a Naive Bayes classifier:

    import random

    import nltk
    from nltk.corpus import movie_reviews

    # Requires the corpus: run nltk.download("movie_reviews") once beforehand.

    # One (word list, category) pair per review; the categories are "pos" and "neg".
    documents = [
        (list(movie_reviews.words(fileid)), category)
        for category in movie_reviews.categories()
        for fileid in movie_reviews.fileids(category)
    ]
    random.shuffle(documents)

    # Count every lowercased word once and keep the 3000 most frequent as features.
    all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
    word_features = [w for w, _ in all_words.most_common(3000)]

    def find_features(document):
        """Map each feature word to whether it occurs in the document."""
        words = set(document)
        return {w: (w in words) for w in word_features}

    print(find_features(movie_reviews.words("neg/cv000_29416.txt")))

    featuresets = [(find_features(rev), category) for (rev, category) in documents]

    # Train on the first 1900 reviews and test on the rest.
    training_set = featuresets[:1900]
    testing_set = featuresets[1900:]

    classifier = nltk.NaiveBayesClassifier.train(training_set)
    print("Naive Bayes accuracy (%):", nltk.classify.accuracy(classifier, testing_set) * 100)
