Channel: Data Science, Analytics and Big Data discussions - Latest topics

In the tutorial "Introduction to Flair for NLP in Python - State-of-the-art Library for NLP!", is the tutorial doing word embedding, sentence embedding, or document embedding?


@spiel wrote:

URL: https://www.analyticsvidhya.com/blog/2019/02/flair-nlp-library-python/

Please help me understand the embedding section of this tutorial.

"Step 5: Vectorizing the text" lists two steps:

  1. Generate word embedding for each word
  2. Calculate the mean of the embeddings of each word to obtain the embedding of the sentence

So are we embedding each word separately (a sentence with 3 words gives 3 vectors) and then taking the mean of those word vectors (the mean of 3 vectors) to get the sentence embedding?

Example: the sentence "today is monday".
We get 3 embedding vectors, one per token: token 1 "today", token 2 "is", token 3 "monday".
Finally, we take the mean of the 3 token vectors to get the sentence embedding. Is my understanding correct?
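
To make the question concrete, here is a minimal sketch of what I think the two steps do. The 3-dimensional vectors in `token_vectors` are made-up toy values, not real Flair embeddings, and `mean_pool` is a hypothetical helper name; a real model would return much longer vectors.

```python
# Toy 3-dimensional vectors standing in for real word embeddings (assumed values).
token_vectors = {
    "today": [1.0, 2.0, 3.0],
    "is": [4.0, 5.0, 6.0],
    "monday": [7.0, 8.0, 9.0],
}


def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


# Step 1: one embedding per word.
tokens = "today is monday".split()
word_embeddings = [token_vectors[t] for t in tokens]

# Step 2: the mean of the word embeddings is the sentence embedding.
sentence_embedding = mean_pool(word_embeddings)

print(len(word_embeddings))  # 3
print(sentence_embedding)    # [4.0, 5.0, 6.0]
```

The sentence vector has the same length as each word vector, whatever the number of words in the sentence.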

There is also one more section, "Document Embedding: Vectorizing the entire Tweet". I can read it in two ways:

(a) We create a single vector for all the sentences in the Tweet data set. For instance, if there are 100 sentences, all 100 are represented together as one vector.

(b) Document embedding creates an embedding directly for each sentence. For instance, if there are 100 sentences, each one is embedded as its own vector, so in the end there are 100 sentence vectors, similar to the word embedding plus mean approach above.

Which one is it?
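
The second reading (one vector per sentence or tweet) would look roughly like the sketch below. This is only an illustration of that reading, not the tutorial's code: `toy_word_vector` and `embed_document` are hypothetical helpers, and the toy features (word length and vowel count) stand in for a real embedding model.

```python
def toy_word_vector(token):
    # Hypothetical stand-in for a real word embedding:
    # two features, the token length and its vowel count.
    return [float(len(token)), float(sum(c in "aeiou" for c in token))]


def embed_document(text):
    """Return one vector for one text by mean-pooling its word vectors."""
    vectors = [toy_word_vector(t) for t in text.lower().split()]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


tweets = ["today is monday", "flair makes nlp easy", "i like embeddings"]

# One vector per tweet: N tweets in, N vectors out.
tweet_vectors = [embed_document(t) for t in tweets]

print(len(tweet_vectors))  # 3
```

Under this reading, 100 tweets would give 100 vectors, each of the same fixed length, ready to be used as rows of a feature matrix.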

Posts: 1

Participants: 1
