Goal: Represent text numerically and apply ML models.
Techniques:
- One-Hot Encoding (OHE) – sparse vectors; encodes no relationship between words.
- Bag of Words (BoW) – counts word frequencies, losing word order.
- TF-IDF – counts weighted by inverse document frequency, so rarer words get higher importance.
- Word2Vec / GloVe – dense word embeddings that capture semantic meaning (e.g., king – man + woman ≈ queen). Sketches of both approaches follow below.
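
A minimal scikit-learn sketch contrasting raw counts with TF-IDF weighting (assumes scikit-learn is installed; the toy sentences are made up):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]

bow = CountVectorizer().fit_transform(docs)    # Bag of Words: raw counts, word order lost
tfidf = TfidfVectorizer().fit_transform(docs)  # TF-IDF: rarer words weighted higher

print(bow.toarray())    # integer count matrix
print(tfidf.toarray())  # "the" gets a low per-occurrence weight; "cat"/"dog" get higher ones
```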
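And a word-embedding sketch of the king – man + woman analogy, using gensim's pretrained GloVe vectors (assumes the gensim package; the first call downloads the vectors):

```python
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # pretrained 50-dim GloVe embeddings

# king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# expected top result: ('queen', ...)
```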
Sequence Models (RNNs):
Neural networks for sequences, introduced to capture context across time steps.
- Simple RNN – learns short dependencies, but vanishing gradients prevent it from capturing long-range context.
- LSTM – gates (input, forget, output) preserve long-term memory.
- GRU – a simplified LSTM with fewer gates; faster to train.
- Bi-directional RNN – reads the sequence both forward and backward, so each position sees past and future context (a PyTorch sketch comparing these follows below).
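
A minimal PyTorch sketch (assumes PyTorch is installed; the sizes are arbitrary) showing that LSTM, GRU, and bidirectional RNNs share the same interface, with the bidirectional output doubling the hidden dimension:

```python
import torch
import torch.nn as nn

x = torch.randn(4, 25, 32)  # (batch, seq_len, input_size)

lstm = nn.LSTM(32, 64, batch_first=True)                      # gated long-term memory
gru = nn.GRU(32, 64, batch_first=True)                        # fewer gates than LSTM
birnn = nn.RNN(32, 64, batch_first=True, bidirectional=True)  # reads past and future

print(lstm(x)[0].shape)   # torch.Size([4, 25, 64])
print(gru(x)[0].shape)    # torch.Size([4, 25, 64])
print(birnn(x)[0].shape)  # torch.Size([4, 25, 128]) – forward + backward states concatenated
```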
Seq2Seq (Encoder–Decoder):
- Encoder compresses the entire input into a single fixed-size context vector.
- Decoder generates the output step by step from that vector.
- 🚨 Issue: the fixed-size vector is a bottleneck → long sentences lose information (see the sketch below).
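
A toy PyTorch seq2seq sketch (vocabulary size and dimensions are arbitrary) that makes the bottleneck concrete: the encoder returns one fixed-size hidden state, and the decoder must generate everything from it:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):
        _, hidden = self.rnn(self.embed(src))
        return hidden  # one fixed-size vector per input sequence

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tgt, hidden):
        output, hidden = self.rnn(self.embed(tgt), hidden)
        return self.out(output), hidden

src = torch.randint(0, 1000, (2, 12))    # two source "sentences" of 12 tokens
tgt = torch.randint(0, 1000, (2, 9))
context = Encoder(1000)(src)             # (1, 2, 128): the whole input squeezed in here
logits, _ = Decoder(1000)(tgt, context)  # (2, 9, 1000)
```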
Attention:
- The decoder attends to all encoder hidden states instead of a single context vector.
- A fresh context vector is computed dynamically at every decoding step (see the sketch below).
- ✅ Huge improvement in translation and summarization.
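
A sketch of one decoding step with simple dot-product attention (random tensors stand in for real encoder states and the decoder hidden state):

```python
import torch
import torch.nn.functional as F

encoder_states = torch.randn(2, 12, 128)  # (batch, src_len, hid): all encoder positions
dec_hidden = torch.randn(2, 128)          # current decoder state

scores = torch.bmm(encoder_states, dec_hidden.unsqueeze(2)).squeeze(2)  # (2, 12)
weights = F.softmax(scores, dim=1)                                      # attention weights
context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)    # (2, 128), rebuilt each step
```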
Transformers:
- Introduced in “Attention Is All You Need” (Vaswani et al., 2017).
- Key idea: drop recurrence → use only self-attention.
Advantages:
- Parallelizable across positions (fast training; no step-by-step recurrence).
- Captures long-range dependencies better.
- Stacks multiple layers of self-attention + feed-forward blocks (a self-attention sketch follows below).
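
A minimal scaled dot-product self-attention sketch in PyTorch (single head, arbitrary dimensions), showing how every position attends to every other in one parallel matrix operation:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model = 64
x = torch.randn(2, 10, d_model)  # (batch, seq_len, d_model)
Wq, Wk, Wv = nn.Linear(d_model, d_model), nn.Linear(d_model, d_model), nn.Linear(d_model, d_model)

Q, K, V = Wq(x), Wk(x), Wv(x)
scores = Q @ K.transpose(1, 2) / math.sqrt(d_model)  # (2, 10, 10): all pairs scored at once
weights = F.softmax(scores, dim=-1)
out = weights @ V                                    # (2, 10, 64), no recurrence needed
```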
Variants:
- BERT → encoder-only (understanding tasks like classification, QA).
- GPT → decoder-only (generation tasks).
- T5 / BART → encoder–decoder (translation, summarization). A quick usage sketch follows below.
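
A quick sketch with the Hugging Face transformers library (assumes it is installed; each pipeline downloads a default pretrained model on first use):

```python
from transformers import pipeline

# Encoder-only (BERT-style): understanding
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers dominate NLP."))

# Decoder-only (GPT-style): generation
generator = pipeline("text-generation", model="gpt2")
print(generator("Attention is all", max_new_tokens=10))

# Encoder-decoder (BART-style): summarization
summarizer = pipeline("summarization")
print(summarizer("Transformers replaced recurrence with self-attention, enabling parallel "
                 "training and better handling of long-range dependencies across NLP tasks."))
```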
⚡ Today, Transformers dominate NLP (translation, chatbots, summarization, coding assistants, etc.).
Evolution: ML features → RNNs → Seq2Seq → Attention-enhanced Seq2Seq → Transformers

