@spiel wrote:
URL: https://www.analyticsvidhya.com/blog/2019/02/flair-nlp-library-python/
Please help me understand the embedding section of this tutorial.
Under "Step 5: Vectorizing the text" it says:
- Generate word embedding for each word
- Calculate the mean of the embeddings of each word to obtain the embedding of the sentence
So are we embedding each word (a sentence with 3 words gives 3 vectors) and then taking the mean of those word vectors to get the sentence embedding?
Example: the sentence "today is monday"
We get 3 embedding vectors, one per token: token 1 "today", token 2 "is", token 3 "monday".
Finally, we take the mean of the 3 token vectors to get the sentence embedding. Is my understanding correct?
There is also one more section:
Document Embedding: Vectorizing the entire Tweet
Are we creating a single vector for all the sentences in the Tweet data set? For instance, if there are 100 sentences, are all 100 represented as a single vector?
Or does document embedding create a sentence embedding directly for each sentence? For instance, if there are 100 sentences, each one is embedded as its own vector, so in the end there are 100 sentence vectors, similar to the word-embedding-plus-mean approach above?
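To make the question concrete, here is a minimal NumPy sketch of the second reading (one vector per sentence, obtained by averaging its word vectors). It does not use Flair: `embed_word`, `embed_sentence`, the lookup `table`, and the random 4-dimensional vectors are made-up stand-ins for a real word embedding model, so only the shapes and the averaging step are meaningful.

```python
import numpy as np

DIM = 4  # toy embedding size; real models use hundreds or thousands of dimensions
rng = np.random.default_rng(0)


def embed_word(word, table):
    """Hypothetical word embedding: a fixed random vector per distinct word."""
    if word not in table:
        table[word] = rng.normal(size=DIM)
    return table[word]


def embed_sentence(sentence, table):
    """Mean-pool the word vectors of one sentence into one sentence vector."""
    tokens = sentence.split()  # naive whitespace tokenization
    word_vectors = np.stack([embed_word(t, table) for t in tokens])  # (n_tokens, DIM)
    return word_vectors.mean(axis=0)  # (DIM,)


table = {}

# "today is monday": 3 token vectors, averaged into 1 sentence vector
sentence_vec = embed_sentence("today is monday", table)

# 100 sentences: each is embedded on its own, giving 100 vectors, not 1
tweets = [f"tweet number {i}" for i in range(100)]
tweet_matrix = np.stack([embed_sentence(t, table) for t in tweets])

print(sentence_vec.shape)   # one vector of length DIM
print(tweet_matrix.shape)   # one row per sentence
```

Under the first reading, the whole data set would instead collapse into a single vector of length `DIM`, which would leave nothing to distinguish one tweet from another.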