Write code to search the Brown Corpus for particular words and phrases according to tags. Report your findings separately for the following categories of the Brown Corpus: humor, romance, government.

Produce an alphabetically sorted list of the distinct words tagged with JJ. What is the number of distinct words tagged with JJ for each category?
Humor:___
Romance:___
Government:___

by User Sullan (7.8k points)

1 Answer

Final answer:

To find the distinct words tagged as JJ (adjective) in each category of the Brown Corpus using NLTK, filter the tagged corpus by category, keep only the words tagged JJ, remove duplicates, sort the result alphabetically, and count the words for each category.

Step-by-step explanation:

Searching the Brown Corpus by tag is most easily done in Python with the Natural Language Toolkit (NLTK). NLTK ships the Brown Corpus, a collection of text samples from a wide range of sources, categorized by genre, and exposes it as a sequence of (word, tag) pairs through brown.tagged_words(). We can use this to find the words tagged with JJ (adjective) in the humor, romance, and government categories. The process involves filtering the tagged words by category and tag, extracting the unique words, sorting them alphabetically, and then counting them.

Steps to Find Distinct Words Tagged as JJ

  1. Import the NLTK library and download the Brown Corpus.
  2. Filter the words in the Brown Corpus by the chosen categories (humor, romance, government).
  3. Within each category, further filter to extract only words tagged as JJ.
  4. Convert the list of words to a set to remove duplicates and obtain the distinct words.
  5. Sort the distinct words alphabetically (sorting a set returns a list).
  6. Count the number of distinct JJ-tagged words in each category.

We will not provide the actual code here, but following these steps will allow anyone familiar with Python and NLTK to achieve the desired results.
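
As a starting point for those steps, here is a minimal sketch rather than a definitive solution. The function names distinct_words_with_tag and report_brown_adjectives are made up for this illustration. It assumes an exact match on the tag "JJ", so related Brown tags such as JJR, JJS, or JJ-TL are not counted, and it treats words as case-sensitive unless lowercase=True is passed. Either choice changes the counts, so check which one the exercise expects.

```python
CATEGORIES = ("humor", "romance", "government")


def distinct_words_with_tag(tagged_words, tag="JJ", lowercase=False):
    """Return the alphabetically sorted distinct words whose tag equals `tag`.

    `tagged_words` is any iterable of (word, tag) pairs, which is the shape
    returned by brown.tagged_words(). Matching is exact (an assumption):
    "JJR" or "JJ-TL" do not count as "JJ".
    """
    words = {
        (word.lower() if lowercase else word)
        for word, word_tag in tagged_words
        if word_tag == tag
    }
    # sorted() turns the set into an alphabetically ordered list
    return sorted(words)


def report_brown_adjectives(categories=CATEGORIES):
    """Map each Brown category to its sorted list of distinct JJ words."""
    # Imported here so the helper above can be used without NLTK installed.
    import nltk
    from nltk.corpus import brown

    nltk.download("brown", quiet=True)  # does nothing if the data is present
    return {
        category: distinct_words_with_tag(brown.tagged_words(categories=category))
        for category in categories
    }


# Example use (requires NLTK and the Brown Corpus data):
# for category, words in report_brown_adjectives().items():
#     print(category, len(words))
#     print(words)
```

The length of each returned list is the number to fill in for that category, and the list itself is the alphabetically sorted answer to the first part of the question.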

by User Vbullinger (7.3k points)