0 votes
Write a Java program (WordFreqCount.java) that reads a file (book.txt) and counts the frequencies of words using the Java Map interface. Display both the 20 most frequent words and the 20 least frequent words.

Detailed Instructions.

If you use a TreeMap, the keys are already sorted, but the values are not. You are required to implement a sortByValue method that sorts a map by its values and returns a list of map entries ordered by value. To sort a map by value, use the Java Collections.sort method with your own Comparator, implementing its compare method yourself instead of relying on the default ordering. For example, a general framework may look like the following:

public static List<Map.Entry> sortByValue(Map map) {
    ......
    Collections.sort(list, new Comparator<Map.Entry>() {
        public int compare(Map.Entry e1, Map.Entry e2) {
            .....
        }
    });
    ......
}
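One way the framework above could be filled in is sketched below. It is only an illustration, not the expected solution: the class name SortByValueSketch, the String/Integer type parameters, and the ascending order are assumptions added here, and the elided parts of the framework may have been meant differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
class SortByValueSketch {

    // Copies the map's entries into a list and sorts that list by ascending value.
    // Collections.sort is stable, so entries with equal values keep the map's
    // iteration order (alphabetical by key for a TreeMap).
    public static List<Map.Entry<String, Integer>> sortByValue(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(map.entrySet());
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> e1, Map.Entry<String, Integer> e2) {
                // Integer.compare avoids the overflow risk of subtracting the two values.
                return Integer.compare(e1.getValue(), e2.getValue());
            }
        });
        return list;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("the", 3);
        counts.put("cat", 1);
        counts.put("sat", 2);
        for (Map.Entry<String, Integer> e : sortByValue(counts)) {
            System.out.println(e.getKey() + " --> " + e.getValue());
        }
    }
}
```

Swapping e1 and e2 inside compare would give descending order, which is what the most frequent words list needs.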

You are expected to properly handle any possible exception in your program.
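As a rough illustration of what handling exceptions could look like when reading the file, the sketch below separates a missing file from other I/O failures. The class name SafeReadSketch, the countLines helper, and the -1 return value are all hypothetical choices made for this example, not part of the assignment.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical class name, used only for this sketch.
class SafeReadSketch {

    // Counts the lines in a file. Returns -1 if the file cannot be opened or read;
    // the -1 sentinel is an assumption made for this sketch only.
    public static int countLines(String filename) {
        int lines = 0;
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
            while (br.readLine() != null) {
                lines++;
            }
        } catch (FileNotFoundException e) {
            // Thrown by FileReader when the file does not exist or cannot be opened.
            System.err.println("File not found: " + filename);
            return -1;
        } catch (IOException e) {
            // Any other failure while reading.
            System.err.println("Error reading " + filename + ": " + e.getMessage());
            return -1;
        }
        return lines;
    }

    public static void main(String[] args) {
        String filename = args.length > 0 ? args[0] : "book.txt";
        int n = countLines(filename);
        if (n >= 0) {
            System.out.println(filename + " has " + n + " lines");
        }
    }
}
```

FileNotFoundException is a subclass of IOException, so it has to be caught first; reversing the two catch blocks would not compile.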

Sample Output

Top 20 Least Appeared Words:
(1): frowning --> 1
(2): instep --> 1
(3): abrupt --> 1
(4): Folk --> 1
(5): crisply --> 1
(6): Gang --> 1
(7): Eventually --> 1
(8): gloomy --> 1
(9): perspired --> 1
(10): spreading --> 1
(11): investment --> 1
(12): scratch --> 1
(13): require --> 1
(14): incites --> 1
(15): neatness --> 1
(16): Game --> 1
(17): teetotaler --> 1
(18): rounded --> 1
(19): role --> 1
(20): glade --> 1

Top 20 Most Appeared Words:
(1): the --> 5426
(2): I --> 3038
(3): and --> 2887
(4): to --> 2788
(5): of --> 2733
(6): a --> 2595
(7): in --> 1747
(8): that --> 1664
(9): was --> 1393
(10): it --> 1303
(11): you --> 1283
(12): he --> 1168
(13): is --> 1131
(14): his --> 1103
(15): have --> 908
(16): my --> 907
(17): with --> 849
(18): had --> 821
(19): as --> 780
(20): which --> 770

by User Partyd (8.1k points)

1 Answer

3 votes

Answer:

Here's an example Java program, WordFreqCount.java, that reads a file (book.txt) and counts the frequencies of words using the Java Map interface. It displays the top 20 most appeared words and the top 20 least appeared words.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;

public class WordFreqCount {

    public static void main(String[] args) {
        String filename = "book.txt"; // Replace with the actual file name
        try {
            Map<String, Integer> wordFrequencyMap = countWordFrequencies(filename);
            List<Map.Entry<String, Integer>> mostAppearedWords = getMostAppearedWords(wordFrequencyMap, 20);
            List<Map.Entry<String, Integer>> leastAppearedWords = getLeastAppearedWords(wordFrequencyMap, 20);
            System.out.println("Top 20 Most Appeared Words:");
            displayWordFrequencies(mostAppearedWords);
            System.out.println("\nTop 20 Least Appeared Words:");
            displayWordFrequencies(leastAppearedWords);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Reads the file line by line and counts how often each whitespace-separated token occurs.
    public static Map<String, Integer> countWordFrequencies(String filename) throws IOException {
        Map<String, Integer> wordFrequencyMap = new HashMap<>();
        try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] words = line.split("\\s+");
                for (String word : words) {
                    if (word.isEmpty()) {
                        continue; // split() yields an empty token for blank or indented lines
                    }
                    wordFrequencyMap.put(word, wordFrequencyMap.getOrDefault(word, 0) + 1);
                }
            }
        }
        return wordFrequencyMap;
    }

    // Returns up to 'count' entries with the highest frequencies.
    public static List<Map.Entry<String, Integer>> getMostAppearedWords(Map<String, Integer> wordFrequencyMap, int count) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(wordFrequencyMap.entrySet());
        entries.sort((e1, e2) -> e2.getValue().compareTo(e1.getValue())); // Sort by value in descending order
        return entries.subList(0, Math.min(count, entries.size()));
    }

    // Returns up to 'count' entries with the lowest frequencies.
    public static List<Map.Entry<String, Integer>> getLeastAppearedWords(Map<String, Integer> wordFrequencyMap, int count) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(wordFrequencyMap.entrySet());
        entries.sort((e1, e2) -> e1.getValue().compareTo(e2.getValue())); // Sort by value in ascending order
        return entries.subList(0, Math.min(count, entries.size()));
    }

    // Prints each entry as "(rank): word --> count".
    public static void displayWordFrequencies(List<Map.Entry<String, Integer>> wordFrequencies) {
        int rank = 1;
        for (Map.Entry<String, Integer> entry : wordFrequencies) {
            System.out.println("(" + rank + "): " + entry.getKey() + " --> " + entry.getValue());
            rank++;
        }
    }
}

To use this program, make sure to replace "book.txt" with the actual file name you want to read. The program reads the file, counts the word frequencies using a HashMap, and then uses the provided methods getMostAppearedWords and getLeastAppearedWords to get the top 20 most and least appeared words, respectively. The displayWordFrequencies method is used to print the results.
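Since getMostAppearedWords and getLeastAppearedWords each copy and sort the full entry list, a possible variation is to sort once and read both ends of the result. The sketch below is only one way to do that; the class TopAndBottomSketch and its helper names are hypothetical and do not appear in the program above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class and method names, used only for this sketch.
class TopAndBottomSketch {

    // Sorts a copy of the map's entries once, by ascending count.
    public static List<Map.Entry<String, Integer>> sortedAscending(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        return entries;
    }

    // The first n elements of an ascending list are the least frequent words.
    public static <T> List<T> firstN(List<T> sorted, int n) {
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // The last n elements, reversed, are the most frequent words in descending order.
    public static <T> List<T> lastNReversed(List<T> sorted, int n) {
        int size = sorted.size();
        List<T> tail = new ArrayList<>(sorted.subList(Math.max(0, size - n), size));
        Collections.reverse(tail);
        return tail;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 5);
        counts.put("glade", 1);
        counts.put("and", 3);
        List<Map.Entry<String, Integer>> sorted = sortedAscending(counts);
        System.out.println("Least: " + firstN(sorted, 2));
        System.out.println("Most: " + lastNReversed(sorted, 2));
    }
}
```

Words with equal counts may come out in a different order than with two separate sorts, because reversing the tail also reverses the order of ties.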

Make sure the input file book.txt is in the directory from which you run the program (the working directory), or provide the full path to the file.

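Note: The program assumes that words are separated by whitespace, so punctuation attached to a word (for example "word," or "word.") is counted as part of that word. If your file contains punctuation or special characters, you may need to modify the splitting logic.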
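One possible way to handle punctuation is sketched below. It treats any run of characters that is not a letter, digit, or apostrophe as a separator. The class TokenizeSketch, the tokenize helper, and the choice of pattern are assumptions made for illustration; case is left unchanged because the sample output lists capitalized words such as "Folk" and "Gang" separately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class TokenizeSketch {

    // Splits a line into words, using any run of characters that is not a
    // Unicode letter, a decimal digit, or an apostrophe as the separator.
    // Empty tokens (from leading separators or blank lines) are dropped.
    public static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.split("[^\\p{L}\\p{Nd}']+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("  Hello, world! It's fine."));
    }
}
```

Keeping the apostrophe preserves contractions such as "It's", at the cost of leaving single quotation marks attached to quoted words; whether that trade-off is acceptable depends on the text being counted.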


by User Stephan Unrau (7.6k points)