3 votes
1. Introduction

Create a C++ program that utilizes a hash map to store a document's word frequency. Feel free to use any standard C++ library including
2. Description
In an attempt to find a secret informant, your company has temporarily begun sending two types of emails to its employees: one containing fabricated information and the other containing true information. You are tasked with creating a program that decides whether an email is true based on the frequency of the words it uses.
Email is FALSE if the frequency of each word is equal to 1.
Email is TRUE for every other case
Additionally:
The greeting line ("Dear ") is to be ignored when counting word frequency.
Ignore periods, commas, exclamation marks, question marks, and quotation marks.
3. Input Files
Each input file will represent your email inbox.
Each email will always start with the greeting "Dear name."
The email body will never start with "Dear".
A blank line will separate each email.
There will be no empty files.
Remove all \\ and \r
4. Output Files
For each email, you will either output "True" or "False" along with additional information.
If the email contains true information, output "True" and the frequency of each word in lowercase and alphabetical order.
If the email is fabricated, output "False"

User Tom Melo
by
7.6k points

1 Answer

7 votes

Final answer:

The task is to write a C++ program that uses a hash map to decide whether each email is true or fabricated based on word frequency, ignoring the greeting line and the listed punctuation. An email in which every word occurs exactly once is false; any email with at least one repeated word is true. For true emails, output each word's frequency, with the words in lowercase and in alphabetical order.

Step-by-step explanation:

The core of the program is a hash map from each word to the number of times it occurs. For every email, read the text, strip the ignored punctuation, convert each word to lowercase, and increment that word's count in the map. Once the whole email has been counted, check the counts: if every word has a frequency of one, the email is 'False'; if any word has a frequency greater than one, it is 'True'. For a 'True' email, the program also outputs the frequency of each word in lowercase and alphabetical order.
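The counting and classification step described above could be sketched roughly as follows. The function names (normalizeWord, countWords, isTrueEmail) are made up for illustration and are not part of the assignment. The sketch assumes the greeting line has already been removed from the text it receives, and that "quotations" means double quotes only.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper: lowercase a token and drop the punctuation the task
// says to ignore. Assumption: only double quotes count as "quotations".
std::string normalizeWord(const std::string& token) {
    std::string word;
    for (unsigned char ch : token) {
        if (ch == '.' || ch == ',' || ch == '!' || ch == '?' || ch == '"') {
            continue;
        }
        word += static_cast<char>(std::tolower(ch));
    }
    return word;
}

// Count every whitespace-separated word in an email body.
// Assumption: the greeting line is not part of `body`.
std::unordered_map<std::string, int> countWords(const std::string& body) {
    std::unordered_map<std::string, int> freq;
    std::istringstream in(body);
    std::string token;
    while (in >> token) {
        std::string word = normalizeWord(token);
        if (!word.empty()) {
            ++freq[word];  // operator[] creates the entry with 0 on first use
        }
    }
    return freq;
}

// An email is "True" as soon as any word occurs more than once.
bool isTrueEmail(const std::unordered_map<std::string, int>& freq) {
    for (const auto& entry : freq) {
        if (entry.second > 1) {
            return true;
        }
    }
    return false;
}
```

With this sketch, a body with no words at all is classified as 'False', since no word repeats. The assignment does not say how to treat that case.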

To start, read the input file and split its content on blank lines to separate the emails. For each email, drop the greeting line, convert the remaining text to lowercase, and strip the ignored punctuation. Then iterate over the words, increment each word's count in the hash map, and check the counts to determine the email's status. If the email is 'False', simply output 'False'. If it is 'True', output the word frequencies in the specified format. The hash map can be std::unordered_map from the C++ standard library, but note that it does not keep its keys in any particular order, so the words must be sorted before printing, for example by copying the entries into a std::map or into a vector that is then sorted.
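A minimal sketch of the splitting and output steps is below. The names splitEmailBodies and report are hypothetical. It assumes the greeting sits on its own line as the first line of each email, and it prints one "word count" pair per line, which is only a guess because the question does not give the exact output format.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Split the inbox into email bodies on blank lines, skipping each greeting.
// Assumption: the greeting is the first line of an email and starts with "Dear".
std::vector<std::string> splitEmailBodies(const std::string& inbox) {
    std::vector<std::string> bodies;
    std::istringstream in(inbox);
    std::string line;
    std::string body;
    bool insideEmail = false;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) {
            if (insideEmail) bodies.push_back(body);  // a blank line ends an email
            body.clear();
            insideEmail = false;
        } else if (!insideEmail) {
            insideEmail = true;
            // First line of the email: keep it only if it is not the greeting.
            if (line.rfind("Dear", 0) != 0) body = line + ' ';
        } else {
            body += line + ' ';
        }
    }
    if (insideEmail) bodies.push_back(body);  // last email may lack a trailing blank line
    return bodies;
}

// Build the output for one email body: "False", or "True" plus sorted counts.
std::string report(const std::string& body) {
    const std::string ignored = ".,!?\"";  // assumption: double quotes only
    std::unordered_map<std::string, int> freq;
    bool repeated = false;
    std::istringstream in(body);
    std::string token;
    while (in >> token) {
        std::string word;
        for (unsigned char ch : token) {
            if (ignored.find(static_cast<char>(ch)) == std::string::npos) {
                word += static_cast<char>(std::tolower(ch));
            }
        }
        if (!word.empty() && ++freq[word] > 1) repeated = true;
    }
    if (!repeated) return "False\n";

    // std::unordered_map has no ordering, so copy into std::map to sort by word.
    std::map<std::string, int> sorted(freq.begin(), freq.end());
    std::ostringstream out;
    out << "True\n";
    for (const auto& entry : sorted) {
        out << entry.first << ' ' << entry.second << '\n';
    }
    return out.str();
}
```

In a complete program, main would read the whole input file into a string, call splitEmailBodies on it, and write report(body) for each body to the output file. Copying into std::map costs an extra O(n log n) step per true email, which keeps the hash map for counting while still meeting the alphabetical-order requirement.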

User Ricky Kazuo Miller
by
8.5k points