Word Embeddings #125
Conversation
Created a section for the introduction to word embeddings, and also started writing the tutorial for creating an Embedding layer.
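For anyone following along, here's a minimal sketch of what an Embedding layer lookup does; the sizes are illustrative and not taken from the notebook:

```python
import tensorflow as tf

# Illustrative sizes only: a 1,000-word vocabulary embedded into 5 dimensions.
embedding_layer = tf.keras.layers.Embedding(input_dim=1000, output_dim=5)

# Looking up a batch of integer word indices returns one dense vector per index.
result = embedding_layer(tf.constant([[0, 1, 2], [3, 4, 5]]))
print(result.shape)  # (2, 3, 5): batch size, sequence length, embedding dim
```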
…nd training for the second model; plotted training history for both
Thank you for starting this! Here's a round of edits. I think it's almost ready to go. Could you take a look and see if there's anything we can improve in this version? When you're happy with it, please submit a PR to the TF Docs repo, and we can continue refining there. Before we publish, I'd like to add some graphics of the embedding projector, add more references to educational resources, and improve the intro to embeddings, but we can work on those changes in the docs PR whenever you're ready. Thanks again!
So there's good news and bad news.
👍 The good news is that everyone who needs to sign a CLA (the pull request submitter and all commit authors) has done so. Everything is all good there.
😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are OK with their commits being contributed to this project. Please have them confirm that here in the pull request.
Note to project maintainer: This is a terminal state, meaning the…
Please remove the .ipynb_checkpoints/ directory from the PR.
I'd prefer to avoid the use of Flatten because it ties your model to a specific sequence length. The model has to learn the meaning of the embedding at each location.
I would use GlobalAveragePooling, an RNN, a CNN, or an attention module to squeeze the sequence to a fixed size and enforce the idea that the absolute position of a word in a sequence doesn't matter (much).
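For example, something along these lines (sizes and the binary-classification head are illustrative, not the exact architecture in the notebook):

```python
import tensorflow as tf

vocab_size = 10000   # illustrative
embedding_dim = 16   # illustrative

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # Average over the time dimension: the output no longer depends on the
    # sequence length, and absolute word positions are ignored.
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```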
There is a lot of overlap between this and basic_text_classification.ipynb. The new content seems to mainly be the setup to use the TensorBoard embedding projector. Is there a way we can put more emphasis on that and reduce the duplication? Should this maybe be a sub-section of basic_text_classification.ipynb?
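For context, the projector-specific part is roughly this export step. A rough sketch, assuming a trained `model` whose first layer is the Embedding and a hypothetical `vocab` list mapping each index to its word:

```python
import io

# Assumes `model` has the Embedding as its first layer and `vocab` is a
# hypothetical list mapping integer index -> word.
weights = model.layers[0].get_weights()[0]   # shape: (vocab_size, embedding_dim)

with io.open('vectors.tsv', 'w', encoding='utf-8') as out_v, \
     io.open('metadata.tsv', 'w', encoding='utf-8') as out_m:
    for index, word in enumerate(vocab):
        out_m.write(word + '\n')
        out_v.write('\t'.join(str(x) for x in weights[index]) + '\n')

# vectors.tsv / metadata.tsv can then be loaded in TensorBoard's projector
# tab or at https://projector.tensorflow.org.
```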
Could we add an inline plot for people who are only running in Colab?
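Something like this would do, assuming a `history` object returned by `model.fit` with validation data (TF 2.x metric key names assumed):

```python
import matplotlib.pyplot as plt

# Assumes `history` comes from model.fit(..., validation_data=...) and that
# the metric keys use the TF 2.x names ('accuracy' / 'val_accuracy').
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, 'bo', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```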
Thanks for the feedback! Good comments. Thinking about this more, how does the following sound?
From there, we can include a few diagrams helping to communicate the basic idea of embeddings. WDYT?
I will review the comments above and implement in a little bit. Thanks, guys.
@random-forests do you need to provide consent for the CLA above?
@random-forests kindly take a look at the recent updates. Thanks.
@MarkDaoust kindly review recent changes, thanks.
Hi Robert, thanks again for working on this! I will circle back in a week or so (been OOO for a bit, please stay tuned).
I'm only a few months late here (wowzers, my bad!). I'm going to close this PR out and work on a new version for a bit. Will definitely credit you when we get it out! Thanks again for your help, and sorry again for the delay.
This is the first version of our Keras embeddings tutorial.
https://colab.sandbox.google.com/github/securetorobert/docs/blob/master/site/en/tutorials/keras/intro_word_embeddings.ipynb