Consider the following corpus c1 of 4 sentences. What is the total count of unique bi-grams for which the likelihood will be estimated? Assume we do not perform any preprocessing.

today is Nayan's birthday
she loves ice cream
she is also fond of cream cake
we will celebrate her birthday with ice cream cake

User Artscan
by
7.7k points

1 Answer

1 vote

Answer:

The total count of unique bi-grams is 18.

Step-by-step explanation:

A bi-gram, also known as a 2-gram, is a sequence of two adjacent elements in a given text. Here the elements are words, and because no preprocessing is applied, each whitespace-separated token is kept exactly as written (for example, "Nayan's" is one token). The count of unique bi-grams is found by listing every pair of adjacent words within each sentence of the corpus and then counting the distinct pairs.

Let's break down the sentences to find all unique bi-grams:

  1. Sentence 1: "today is Nayan's birthday" Bi-grams: "today is", "is Nayan's", "Nayan's birthday"
  2. Sentence 2: "she loves ice cream" Bi-grams: "she loves", "loves ice", "ice cream"
  3. Sentence 3: "she is also fond of cream cake" Bi-grams: "she is", "is also", "also fond", "fond of", "of cream", "cream cake"
  4. Sentence 4: "we will celebrate her birthday with ice cream cake" Bi-grams: "we will", "will celebrate", "celebrate her", "her birthday", "birthday with", "with ice", "ice cream", "cream cake"
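The breakdown above can be reproduced with a short Python sketch. It assumes whitespace tokenization and no sentence-boundary markers, matching the "no preprocessing" condition; the helper name `bigrams` is made up for this illustration.

```python
def bigrams(sentence):
    """Return the adjacent word pairs of a whitespace-tokenized sentence."""
    tokens = sentence.split()  # no lowercasing, no punctuation stripping
    return list(zip(tokens, tokens[1:]))

corpus = [
    "today is Nayan's birthday",
    "she loves ice cream",
    "she is also fond of cream cake",
    "we will celebrate her birthday with ice cream cake",
]

# One list of bi-grams per sentence, in the same order as the breakdown.
per_sentence = [bigrams(s) for s in corpus]
counts = [len(b) for b in per_sentence]
print(counts)  # [3, 3, 6, 8]
```

A sentence of n words always yields n - 1 bi-grams, which is why the four sentences give 3, 3, 6, and 8 pairs.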

Now, let's count the unique bi-grams:

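  • "today is", "is Nayan's", "Nayan's birthday", "she loves", "loves ice", "ice cream", "she is", "is also", "also fond", "fond of", "of cream", "cream cake", "we will", "will celebrate", "celebrate her", "her birthday", "birthday with", "with ice"

The four sentences contribute 3 + 3 + 6 + 8 = 20 bi-grams in total. "ice cream" appears in sentences 2 and 4, and "cream cake" appears in sentences 3 and 4, so each is counted only once. That leaves 20 - 2 = 18 unique bi-grams.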
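As a quick check of the count, here is a minimal Python sketch that tallies every bi-gram in the corpus and reports which ones repeat. It assumes whitespace tokenization with no sentence-boundary markers; the names `bigrams`, `freq`, and `repeated` are illustrative only.

```python
from collections import Counter

corpus = [
    "today is Nayan's birthday",
    "she loves ice cream",
    "she is also fond of cream cake",
    "we will celebrate her birthday with ice cream cake",
]

def bigrams(sentence):
    """Return the adjacent word pairs of a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return list(zip(tokens, tokens[1:]))

# Count how often each bi-gram occurs across the whole corpus.
freq = Counter(bg for s in corpus for bg in bigrams(s))

total = sum(freq.values())  # all bi-gram occurrences
unique = len(freq)          # distinct bi-grams
repeated = sorted(bg for bg, n in freq.items() if n > 1)

print(total, unique, repeated)
# 20 18 [('cream', 'cake'), ('ice', 'cream')]
```

If start and end markers were added to each sentence, the count would be higher; the answer here follows the question's instruction to skip preprocessing.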
User ZSynopsis
by
8.8k points