129k views
1 vote
1. Extract title Write a function extract_title that takes one parameter, the filename of a GenBank formatted file, and returns a string with the title of the dataset. The title is what follows the first occurrence of the tag "TITLE" and goes all the way before the tag "JOURNAL". Be aware that the title might extend in more than one lines; i

User Imbrizi
by
5.1k points

1 Answer

4 votes

Answer:

def extract_title(file):

import re

a =''

with open(file,'r') as file:

for line in file:

a += line

m = re.search("^(TITLE)(.*?)(JOURNAL)", a, re.M + re.S)

print(m.groups()[1])

extract_title('new.txt')

Step-by-step explanation:

The programming language used is python 3.

The function is first defined and the regular expression module is imported.

A variable is initialized to an empty string that will hold the content of the GenBank formatted file.

The file is opened and every line in the file is assigned to the string variable. The WITH statement allows files to be closed automatically.

Regular expression is used to capture all the files between TITLE and JOURNAL in a group.

The group is printed and the function is called.

I have attached a picture of the code in action.

1. Extract title Write a function extract_title that takes one parameter, the filename-example-1
User Mysterion
by
4.6k points