Overview

These files investigate the new sentence embedding models published by Google on TF Hub. The models are based on the paper "Universal Sentence Encoder".

File Overview

VectorSpaceExample.ipynb

Simple example of how to represent 2D vectors in a vector space and how patterns such as clustering can be identified visually.

ReducedVocabularyEmbeddings.ipynb

Uses Principal Component Analysis (PCA) on a one-hot encoding example to generate a 2D graph of 4D points.
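As a rough illustration of the idea, the sketch below projects four one-hot 4D points onto their first two principal components using NumPy. The vocabulary and the `pca_2d` helper are hypothetical and are not taken from the notebook, which may use a different library or dataset.

```python
import numpy as np

# Hypothetical 4-word vocabulary; each word becomes a one-hot 4D vector.
vocab = ["cat", "dog", "car", "bus"]
one_hot = np.eye(len(vocab))


def pca_2d(points):
    """Project points onto their first two principal components."""
    # PCA works on mean-centered data.
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


coords = pca_2d(one_hot)
for word, (x, y) in zip(vocab, coords):
    print(f"{word}: ({x:+.3f}, {y:+.3f})")
```

The resulting `coords` array has one 2D row per word and can be passed directly to a scatter plot.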

GenerateEmbedding.ipynb

Example of how to use the Universal Sentence Encoder (USE) module to generate a sentence embedding and visualize the range of its 512 dimensions.

BaselineTest.ipynb

Code to measure the similarity between a baseline of curated queries and 1000 questions from the Quora dataset.
It goes through each question and finds the best match for that question from the Quora list.
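The best-match step can be sketched as follows. This is a minimal example that assumes cosine similarity as the score; the notebook may use a different measure. The `best_match` helper and the toy 3D vectors are hypothetical stand-ins for real 512-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_vec, candidates):
    """Return (text, score) for the candidate most similar to query_vec."""
    scored = ((text, cosine_similarity(query_vec, vec))
              for text, vec in candidates.items())
    return max(scored, key=lambda pair: pair[1])


# Toy vectors (assumption): real embeddings would come from the USE module.
quora_embeddings = {
    "How do I learn Python?": [0.9, 0.1, 0.0],
    "What is the capital of France?": [0.0, 0.2, 0.9],
}
baseline_query = [0.8, 0.2, 0.1]

text, score = best_match(baseline_query, quora_embeddings)
print(f"best match: {text!r} (score {score:.3f})")
```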

VisualComparison.ipynb

Code to visually compare pairs of sentence embeddings via scatter plot and bar chart.

SaveEmbeddings.ipynb

Code that saves the embeddings for our entire dataset in a pickle file.
This lets you test the sentence embeddings quickly and easily via TestSentences.ipynb.
You do not need to run this notebook.
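For readers unfamiliar with pickle, the save-and-reload round trip looks roughly like this. The dictionary layout (sentence mapped to vector) and the file name `embeddings_demo.pkl` are assumptions for illustration; the notebook's actual data structure may differ.

```python
import os
import pickle
import tempfile

# Toy stand-in (assumption): sentence text mapped to its embedding vector.
embeddings = {
    "How do I learn Python?": [0.9, 0.1, 0.0],
    "What is the capital of France?": [0.0, 0.2, 0.9],
}

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "embeddings_demo.pkl")

with open(path, "wb") as f:
    pickle.dump(embeddings, f)

# Loading is much faster than recomputing the embeddings.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(f"restored {len(restored)} embeddings from {path}")
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.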

TestSentences.ipynb

Use this notebook to test out the sentence embeddings. How good are they? Do the matches make sense?
What about the scores? Are they too high or too low? You can just enter some test sentences here to get started.

quora_recomend.py

Code to find the top five recommendations of similar sentences from the Quora test list.
This is an example of how sentence embeddings could be used to create a recommender for a chatbot.
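A top-five lookup of this kind might be sketched as below. It assumes cosine similarity as the ranking score and uses toy 2D vectors with placeholder question names; the `top_k` helper is hypothetical and is not the script's actual function.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, candidates, k=5):
    """Return the k (score, text) pairs with the highest similarity, best first."""
    scored = ((cosine_similarity(query_vec, vec), text)
              for text, vec in candidates.items())
    return heapq.nlargest(k, scored)


# Toy 2D vectors (assumption) standing in for real sentence embeddings.
test_list = {
    "question A": [1.0, 0.0],
    "question B": [0.9, 0.1],
    "question C": [0.7, 0.3],
    "question D": [0.5, 0.5],
    "question E": [0.2, 0.8],
    "question F": [0.0, 1.0],
}

recommendations = top_k([1.0, 0.0], test_list, k=5)
for score, text in recommendations:
    print(f"{score:.3f}  {text}")
```

`heapq.nlargest` avoids sorting the whole list, which helps when the candidate list is much larger than `k`.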
