Given a collection of unencrypted plain-text files, and the MD5 hash of one of the files, build a solution to identify which of the plain-text files matches the hash given to you. The dataset for this milestone is attached to this dropbox.

Deliverables:
1. The name of the file that matches the hash
2. Any source code (commented and readable) involved. If you used a pre-existing tool or tools, instead of source code, a list of the tools and a description of how you used them to identify the correct file.

by User Ethry (8.1k points)

1 Answer

7 votes

Final answer:

The solution involves creating a script that reads each file in a collection, calculates its MD5 hash, and compares it to the given hash. Upon finding a match, the script returns the name of the corresponding file. This requires basic knowledge of programming and usage of hashing functions.

Step-by-step explanation:

The task is to find which plain-text file in the collection produces the given MD5 hash. Because MD5 is a one-way function, the hash cannot be reversed to recover the file. Instead, you write a program or script that reads each file, computes its MD5 hash, and compares the result with the given hash. When a match is found, the script returns the name of that file. Below is an example in Python:

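import hashlib

def find_matching_file(files, target_hash):
    """Return the first filename whose MD5 digest equals target_hash, or None."""
    # hexdigest() returns lowercase hex, so normalize the target the same way
    target_hash = target_hash.strip().lower()
    for filename in files:
        # Read in binary mode so the hash covers the exact bytes on disk
        with open(filename, 'rb') as file:
            content = file.read()
        md5_hash = hashlib.md5(content).hexdigest()
        if md5_hash == target_hash:
            return filename
    # No file in the collection produced the target hash
    return None

# list_of_files and given_md5_hash are placeholders: supply the paths of the
# dataset files and the MD5 hash you were given before running this part
matching_file = find_matching_file(list_of_files, given_md5_hash)
print(f'The matching file is: {matching_file}')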

To use this solution, you need Python 3.6 or later (the example uses an f-string), a list of the filenames to check (list_of_files), and the given MD5 hash as a hexadecimal string (given_md5_hash). The function find_matching_file reads each file, calculates its hash, and compares it with the target hash. It returns the name of the first file whose hash matches, or None if no file matches. Commented and readable source code, which the deliverables ask for, is also considered a best practice when developing such tools.
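The two inputs can be prepared along the following lines. This is only a minimal sketch: the helper names collect_files and normalize_hash are hypothetical, and it assumes the dataset has been extracted into a single directory with no subfolders that need searching.

```python
from pathlib import Path
import string

def collect_files(directory):
    """Return the sorted paths of the regular files directly inside directory."""
    return sorted(str(p) for p in Path(directory).iterdir() if p.is_file())

def normalize_hash(text):
    """Strip whitespace, lowercase, and check the value looks like an MD5 hex digest."""
    cleaned = text.strip().lower()
    # An MD5 digest written in hexadecimal is always 32 characters long
    if len(cleaned) != 32 or any(c not in string.hexdigits for c in cleaned):
        raise ValueError(f'not a 32-character hex MD5 digest: {text!r}')
    return cleaned
```

The results of collect_files (called with wherever the dataset was extracted) and normalize_hash (called with the hash from the assignment) can then be passed to find_matching_file as list_of_files and given_md5_hash.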

This example is a sketch and needs to be adapted to the environment in which it runs. For instance, the file paths must point at the dataset, and very large files should be hashed in chunks rather than read into memory all at once.
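For large files, one possible adaptation is to feed the file to the hash object piece by piece. The sketch below is an assumption about how that could look, not part of the original answer: the name md5_of_file and the 64 KiB chunk size are illustrative choices.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        # iter() with a sentinel keeps calling read() until it returns b'' at end of file
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Updating the hash chunk by chunk gives the same digest as hashing the whole content in one call, so md5_of_file(filename) could stand in for the read-then-hash step inside find_matching_file.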

by User Ivan Fernandez (7.4k points)