20.1k views
2 votes
Submit Levenshtein.java

Write a program that computes the edit distance (also called the Levenshtein distance) between two words. The edit distance between two strings is the minimum number of operations that are needed to transform one string into the other. For this program, an operation is a substitution of a single character, such as from "brisk" to "brick". The edit distance between the words "dog" and "cat" is 3, following the chain of "dot", "cot", and "cat" to transform "dog" into "cat". When you compute the edit distance between two words, each intermediate word must be an actual valid word. Edit distances are useful in applications that need to determine how similar two strings are, such as spelling checkers.

Read your input from a test.txt file or a dictionary.text file. From this file, compute a map from every word to its immediate neighbors, that is, the words that have an edit distance of 1 from it. Once this map is built, you can walk it to find paths from one word to another.

A good way to process paths to walk the neighbor map is to use a linked list of words to visit, starting with the beginning word, such as "dog". Your algorithm should repeatedly remove the front word of the list and add all of its neighbors to the end of the list, until the ending word (such as "cat") is found or until the list becomes empty, which indicates that no path exists between the two words.

Your Levenshtein class has to have a private Map> field so the following test code works:

Levenshtein.java

// section of code to read data file
Scanner file = new Scanner(new File("test.txt"));
ArrayList words = new ArrayList();
while (file.hasNext()) words.add(file.next());
file.close();

// section of code to test your data structure
Levenshtein structure = new Levenshtein(words); // builds Map from List of words
System.out.println(structure.getMap()); // displays the Map from above
System.out.println(structure.getMap().size()); // size of above Map
System.out.println(structure.getPath("dog","cat")); // displays path as described in text
System.out.println(structure.getDistance("dog","cat")); // the "distance" as described in text
test.txt

dog
dot
cot
cat
fat
mat
rat
rut

Output from above with my small test dictionary test.txt should yield output:
{rut=[rat], mat=[cat, fat, rat], rat=[cat, fat, mat, rut], cat=[cot, fat, mat, rat], dot=[dog, cot], fat=[cat, m at, rat], cot=[dot, cat], dog=[dot]}
8
[dog, dot, cot , cat]
3

The first line is the whole Map, which has a size of 8 key items (second line).

And the third line is the "path of how to change "dog" into a "cat" which is 3 steps.

Additional considerations:

If we pass two words with different lengths, then define path as -1 and an empty path
getPath should return a List
getDistance returns an int that is one less than the size from getPath
Above test is with test.txtstudent submitted image, transcription available belowbut I will use dictionary.txtstudent submitted image, transcription available belowfor my final testing, results will vary
Using dictionary.txtstudent submitted image, transcription available belowin above example, you should get a Map size of 19911 and path:
[dog, cog, cot, cat]

User Dragan
by
6.6k points

1 Answer

4 votes

Final answer:

The Levenshtein distance is the minimum number of operations needed to transform one word into another.

Step-by-step explanation:

The question is asking for a Java program that can compute the edit distance, also known as the Levenshtein distance, between two words. The edit distance is defined as the minimum number of operations required to transform one string into another, where an operation is a substitution of a single character. The program should read input from a text file, compute a map of each word with its immediate neighbors, and then find paths from one word to another using a linked list of words. The Levenshtein class should have a private Map<String, List<String>> field to store the map.

To implement this program, you can start by reading the words from the text file and storing them in an ArrayList. Then, you can create a method in the Levenshtein class to build the map by iterating through each word and finding its neighbors based on the edit distance. Finally, you can implement a getPath method to find the path from one word to another using the map and the linked list data structure.

When testing the program, you can use the provided test code to read words from the test.txt file, build the map, and find the path and distance between two words. The output should match the expected results given in the question.

User Algot
by
8.5k points