In the "Combining Taggers" section of the textbook, an example of a backoff tagger is provided. Extend that example by defining a TrigramTagger called t3 which backs off to t2. Train this tagger on all of the sentences from the Brown corpus with the category news. Then evaluate your tagger using the "accuracy" function on all of the sentences from the Brown corpus with the category lore. What is that number? How does this number compare to when this tagger is evaluated on all of the sentences from the Brown corpus with the category news?
a. The trained trigram tagger shows higher accuracy when evaluated on the news category than when evaluated on lore
b. The trained trigram tagger shows the same accuracy when evaluated on news as when evaluated on lore
c. The trained trigram tagger shows lower accuracy when evaluated on the news category than when evaluated on lore

by User Amarnath R (9.1k points)

1 Answer

Final answer:

Extend the example by defining a TrigramTagger t3 that backs off to a BigramTagger t2, train it on the Brown corpus news sentences, and evaluate it on the lore sentences. Since the tagger is trained on news data, its accuracy on news is expected to be higher than on lore (option a).

Step-by-step explanation:

To extend the backoff tagger example from the Combining Taggers section, we first define a TrigramTagger called t3 that uses a BigramTagger called t2 as its backoff (t2 in turn backs off through a unigram tagger to a default tagger, as in the textbook). This TrigramTagger is trained on all of the sentences from the Brown corpus in the category news. After training, we use the accuracy function to evaluate how well t3 performs on all of the sentences from the Brown corpus with the category lore.

To obtain an accuracy number, we would have to write and execute Python code using a toolkit like NLTK to perform these actions. However, once we have the numbers, we would compare the accuracy when t3 is evaluated on the lore category to the accuracy when it is evaluated on the news category. Generally, you might expect that the tagger will have higher accuracy on the news category because it was trained on news data, and models tend to perform best on the data they were trained on. Nonetheless, this is an empirical question and requires actual computation to answer.

by User Everspader (8.0k points)