165k views
5 votes
You will be given a set of string called T={T1,T2,…,Tk} and another string called P. You will have to find the number of occurrences of P in T. And to do that, you will have to build a string matching automaton. The strings will contain only small letters from the English alphabet a to z if the length of the pattern P is m then your automaton will have m+1 state labelled by 0,1,2,…,m Each of these states will have 26 state transitions. 1. Create an (m+1)×26 tate transition table. (coding) 2. Feed the strings Ti to the automaton and see how many times P occur in Ti for all i (coding) 3. Compute the total time and space complexity for your solution in terms of n,m, kgiven that the maximum length of a string in T (complexity analysis )

User Ray Vol
by
8.2k points

1 Answer

6 votes

Answer:

Here's the code to create an (m+1)×26 state transition table for the string matching automaton:

def create_table(pattern):

m = len(pattern)

table = [[0]*26 for _ in range(m+1)]

lps = [0]*m

for i in range(m):

# Fill in transition for current state and character

c = ord(pattern[i])-ord('a')

if i > 0:

for j in range(26):

table[i][j] = table[lps[i-1]][j]

table[i][c] = i+1

# Fill in fail transition

if i > 0:

j = lps[i-1]

while j > 0 and pattern[j] != pattern[i]:

j = lps[j-1]

lps[i] = j+1

# Fill in transitions for last row (sink state)

for j in range(26):

table[m][j] = table[lps[m-1]][j]

return table

Here's the code to feed the strings Ti to the automaton and count the number of occurrences of P in Ti:

def count_occurrences(T, P):

m = len(P)

table = create_table(P)

count = 0

for Ti in T:

curr_state = 0

for i in range(len(Ti)):

c = ord(Ti[i])-ord('a')

curr_state = table[curr_state][c]

if curr_state == m:

count += 1

return count

The time complexity of create_table is O(m26), which simplifies to O(m), since we are only looking at constant factors. The time complexity of count_occurrences is O(nm26), since we are processing each Ti character by character and looking up state transitions in the table, which takes constant time. The space complexity of our solution is O(m26), since that's the size of the state transition table we need to store.

Overall, the time complexity of our solution is O(n*m), where n is the number of strings in T and m is the length of P.

Step-by-step explanation:

User Selena
by
8.0k points