\$\begingroup\$

I have two scripts. The first is a tkinter GUI script where the user provides specific inputs; the second takes those inputs, modifies them, and sends the results back to the GUI script to be written or printed. However, as the number of user inputs grew, the argument lists of my functions got longer and uglier. As you'll see in the GUI script, the function I import takes 7 arguments, which makes the call quite long. Is there a better way to pass user inputs from one script to another?

    #GUI Script (NOTE: I'm not posting all the global and input functions, since they are basically the same thing. Don't want to be repetitive)
    #basic tkinter setup
    root=tk.Tk()
    #with the loop and everything setup

    #globals where filenames and directories to files are saved, to be called on in the functions
    sparta_file=()
    sparta_directory=()
    seq_file=()
    seq_directory=()

    #browse options to choose files
    def input_file():
        fullpath = filedialog.askopenfilename(parent=root, title='Choose a file')
        global sparta_directory
        global sparta_file
        sparta_directory=os.path.dirname(fullpath)
        sparta_file= os.path.basename(fullpath)
        label2=Label(root,text=fullpath).grid(row=0,column=1)

    def input_seq():
        fullpath = filedialog.askopenfilename(parent=root, title='Choose a file')
        global seq_file
        global seq_directory
        seq_directory=os.path.dirname(fullpath)
        seq_file= os.path.basename(fullpath)
        label3=Label(root,text=fullpath).grid(row=1,column=1)

    #All the user inputs are designed more or less the same, user browses, clicks on file, and files directory and filename are saved as globals.

    #function that will be run to use user inputs, modify them, and then write modifications
    def sparta_gen_only():
        from sparta_file_formatter import check_sparta_file_boundaries
        os.chdir(save_directory)
        with open(save_file_sparta,'w') as file:
            for stuff_to_write in check_sparta_file_boundaries(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start):
                file.write(stuff_to_write+'\n')

So right off the bat, you can see the exact issue I'm having: check_sparta_file_boundaries takes a lot of inputs.

    #2nd sparta_file_formatter
    import re
    import os

    def create_seq_list(seq_file,seq_directory,seq_start):
        os.chdir(seq_directory)
        amino_acid_count=(0+seq_start)-1
        sequence_list=[]
        with open(seq_file) as sequence_file:
            for amino_acid in sequence_file:
                stripped_amino_acid=amino_acid.strip().upper()
                for word in stripped_amino_acid:
                    amino_acid_count+=1
                    sequence_list.append(str(amino_acid_count)+word)
        return sequence_list

    def format_sparta(sparta_file,sparta_directory):
        os.chdir(sparta_directory)
        sparta_file_list1=[]
        proline_counter=0
        with open(sparta_file) as sparta_predictions:
            for line in sparta_predictions:
                modifier=line.strip().upper()
                if re.findall('^\d+',modifier):
                    A=modifier.split()
                    del A[5:8]
                    del A[3]
                    A[0:3]=["".join(A[0:3])]
                    joined=" ".join(A)
                    proline_searcher=re.search('\BP',joined)
                    if proline_searcher != None:
                        proline_counter+=1
                        if proline_counter<2:
                            proline_count=re.search('^\d+',joined)
                            sparta_file_list1.append(f'{proline_count.group(0)}PN'+' 1000'+' 1000')
                        else:
                            if proline_count == 4:
                                proline_count=re.search('^\d+',joined)
                                sparta_file_list1.append(f'{proline_count.group(0)}PHN'+' 1000'+' 1000')
                                proline_counter=0
                    sparta_file_list1.append(joined)
        return sparta_file_list1

    #Each function the entries get longer and longer as they start using the outputs of the previous functions
    def add_mutation(mutation_list1,mutation_list2,sparta_file,sparta_directory):
        sparta_file_list2=[]
        if mutation_list1==() or mutation_list2==():
            for amino_acids in format_sparta(sparta_file,sparta_directory):
                sparta_file_list2.append(amino_acids)
        else:
            for mutations,mutations2 in zip(mutation_list1,mutation_list2):
                for amino_acids in format_sparta(sparta_file,sparta_directory):
                    if re.findall(mutations,amino_acids):
                        splitting=amino_acids.split()
                        mutation=re.sub(mutations,mutations2,splitting[0])
                        mutation_value=re.sub('\d+.\d+',' 1000',splitting[1])
                        mutation_value2=re.sub('\d+.\d+',' 1000',splitting[2])
                        mutation_replacement=mutation+mutation_value+mutation_value2
                        sparta_file_list2.append(mutation_replacement)
                    else:
                        sparta_file_list2.append(amino_acids)
        return sparta_file_list2

    def filter_sparta_using_seq(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start):
        sparta_file_list3=[]
        sparta_comparison=create_seq_list(seq_file,seq_directory,seq_start)
        for aa in add_mutation(mutation_list1,mutation_list2,sparta_file,sparta_directory):
            modifiers=aa.strip()
            splitter=modifiers.split()
            searcher=re.search('^\d+[A-Z]',splitter[0])
            compiler=re.compile(searcher.group(0))
            sparta_sequence_comparison=list(filter(compiler.match,sparta_comparison))
            if sparta_sequence_comparison != []:
                sparta_file_list3.append(aa)
        return sparta_file_list3

    def check_sparta_file_boundaries(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start):
        temp_list=[]
        temp_counter=0
        sparta_filtered_list=filter_sparta_using_seq(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start)
        for checker in sparta_filtered_list:
            temp_modifier=checker.strip()
            temp_split=temp_modifier.split()
            temp_finder=re.search('^\d+',temp_split[0])
            temp_list.append(temp_finder.group(0))
            temp_counter+=1
            if temp_counter==5:
                if int(temp_finder.group(0))==int(temp_list[0]):
                    break
                else:
                    del sparta_filtered_list[0:4]
                    break
        if len(sparta_filtered_list)%6 != 0:
            del sparta_filtered_list[-5:-1]
        return sparta_filtered_list

Edit:

As for exactly what sparta is and what my code is doing: I won't go into much detail about sparta, beyond saying that it is a text file with information we want. This is the format:

    REMARK SPARTA+ Protein Chemical Shift Prediction Table
    REMARK All chemical shifts are reported in ppm:
    ....
    3 Y HA 0.000 4.561 4.550 0.018 0.000 0.201
    3 Y C 0.000 175.913 175.900 0.021 0.000 1.272
    3 Y CA 0.000 58.110 58.100 0.017 0.000 1.940
    3 Y CB 0.000 38.467 38.460 0.011 0.000 1.050
    4 Q N 3.399 123.306 119.800 0.179 0.000 2.598
    ...

We only care about the lines that start with a number, so I use a regex search to extract only those. The info I want is the first 3 columns plus the 5th column (4.561 in the first data row above). I want each entry formatted like 3YHA 4.561 (2nd function). Every number should have 6 values associated with it, but those that are P have only 4, so I add 2 extra values. (As the sample data shows, a full residue runs N, HA, C, CA, CB, HN, so I add the missing values to give P the same N, HA, C, CA, CB, HN format.)

Sometimes the user will want to change a specific letter (a mutation). They indicate which letter, its number, and what to change it to (3rd function).

Finally, these files can sometimes contain extra info we don't care about. The user specifies the range of info they want by using a seq file (1st and 4th functions).

As stated, every letter should have 6 values. However, the first letter will always have only 4, and the last letter only 5, so these need to be removed (5th function).

Here are some sample input files as examples:

    seq_number=1

    #seq.txt
    MSYQVLARKW

    #sparta_pred.tab
    3 Y HA 0.000 4.561 4.550 0.018 0.000 0.201
    3 Y C 0.000 175.913 175.900 0.021 0.000 1.272
    3 Y CA 0.000 58.110 58.100 0.017 0.000 1.940
    3 Y CB 0.000 38.467 38.460 0.011 0.000 1.050
    4 Q N 3.399 123.306 119.800 0.179 0.000 2.598
    4 Q HA 0.146 4.510 4.340 0.039 0.000 0.237
    4 Q C -2.091 173.967 176.000 0.097 0.000 0.914
    4 Q CA -0.234 55.623 55.803 0.092 0.000 1.065
    4 Q CB 3.207 32.000 28.738 0.092 0.000 1.586
    4 Q HN 0.131 8.504 8.270 0.173 0.000 0.484
    5 V N 0.131 120.091 119.914 0.078 0.000 2.398
    5 V HA 0.407 4.575 4.120 0.080 0.000 0.286
    5 V C 0.162 176.322 176.094 0.109 0.000 1.026
    5 V CA -1.507 60.840 62.300 0.078 0.000 0.868
    5 V CB 0.770 32.625 31.823 0.052 0.000 0.982
    5 V HN 0.418 8.642 8.190 0.057 0.000 0.443
    6 L N 7.083 128.385 121.223 0.130 0.000 2.123
    6 L HA -0.504 4.085 4.340 0.415 0.000 0.217
    6 L C 1.827 178.814 176.870 0.195 0.000 1.081
    6 L CA 3.308 58.271 54.840 0.205 0.000 0.772
    6 L CB -1.005 41.051 42.059 -0.005 0.000 0.890
    6 L HN 0.241 8.694 8.230 0.097 -0.164 0.437
    7 A N -4.063 118.812 122.820 0.092 0.000 2.131
    7 A HA -0.337 4.023 4.320 0.067 0.000 0.220
    7 A C 0.433 178.071 177.584 0.090 0.000 1.158
    7 A CA 2.471 54.552 52.037 0.073 0.000 0.665
    7 A CB -0.332 18.690 19.000 0.036 0.000 0.795
    7 A HN -0.517 7.889 8.150 0.063 -0.219 0.460
    8 R N -4.310 116.247 120.500 0.096 0.000 2.191
    8 R HA -0.056 4.313 4.340 0.048 0.000 0.196
    8 R C 2.152 178.488 176.300 0.060 0.000 0.991
    8 R CA 1.349 57.485 56.100 0.060 0.000 1.075
    8 R CB 0.834 31.147 30.300 0.023 0.000 1.040
    8 R HN 0.244 8.408 8.270 0.109 0.172 0.526
    9 K N 0.144 120.608 120.400 0.108 0.000 2.283
    9 K HA -0.130 4.148 4.320 -0.069 0.000 0.202
    9 K C 0.691 177.214 176.600 -0.129 0.000 1.048
    9 K CA 2.415 58.707 56.287 0.008 0.000 0.948
    9 K CB -0.114 32.430 32.500 0.074 0.000 0.742
    9 K HN -0.617 7.728 8.250 0.159 0.000 0.458
    10 W N -4.007 117.283 121.300 -0.016 0.000 2.846
    10 W HA 0.195 4.850 4.660 -0.009 0.000 0.391
    10 W C -1.455 175.056 176.519 -0.013 0.000 1.011
    10 W CA -1.148 56.191 57.345 -0.011 0.000 1.832
    10 W CB 0.166 29.622 29.460 -0.007 0.000 1.151
    10 W HN -0.634 7.728 8.180 0.377 0.045 0.582
    11 R N 1.894 122.475 120.500 0.134 0.000 2.483
    11 R HA -0.096 4.293 4.340 0.083 0.000 0.329
    11 R C -1.368 174.959 176.300 0.045 0.000 0.961
    11 R CA -0.713 55.431 56.100 0.073 0.000 1.041
    11 R CB 0.187 30.506 30.300 0.033 0.000 0.930
    11 R HN -0.880 7.272 8.270 0.107 0.182 0.413
    12 P HA -0.173 4.278 4.420 0.051 0.000 0.257
    12 P C -1.027 176.281 177.300 0.014 0.000 1.162
    12 P CA 0.741 63.865 63.100 0.040 0.000 0.762
    12 P CB 0.046 31.768 31.700 0.036 0.000 0.753
    13 Q N 1.152 120.951 119.800 -0.001 0.000 2.396
    13 Q HA 0.193 4.514 4.340 -0.032 0.000 0.220
    13 Q C 0.275 176.261 176.000 -0.024 0.000 0.900
    13 Q CA 0.394 56.181 55.803 -0.027 0.000 0.925
    13 Q CB 2.516 31.223 28.738 -0.051 0.000 1.065
    13 Q HN 0.012 8.472 8.270 0.002 -0.188 0.535
\$\endgroup\$
2
  • 2
    \$\begingroup\$ What is Sparta? What is this actually doing? \$\endgroup\$ Commented Jun 28, 2020 at 3:00
  • \$\begingroup\$ I added edits to address this. But it's basically a text file with data; we extract the data we want and modify it based on user input (mutations to change specific letters, a sequence to indicate the bounds). However, my question here is how to find another technique for using functions like this without having a bunch of entries. \$\endgroup\$ Commented Jun 28, 2020 at 5:30

2 Answers

\$\begingroup\$

Returns, not globals

Don't declare these at the global level:

    sparta_file=()
    sparta_directory=()
    seq_file=()
    seq_directory=()

Instead, return them from functions; e.g.

    def input_file():
        fullpath = filedialog.askopenfilename(parent=root, title='Choose a file')
        sparta_directory=os.path.dirname(fullpath)
        sparta_file= os.path.basename(fullpath)
        return sparta_directory, sparta_file
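
A minimal sketch of how a caller might consume those return values. To keep it runnable without a GUI, the dialog is passed in as a callable: `ask_path` and the sample path are assumptions made for illustration, standing in for `filedialog.askopenfilename`.

```python
import os

def input_file(ask_path):
    # ask_path: hypothetical zero-argument callable that returns the chosen path
    # (in the GUI it would wrap filedialog.askopenfilename).
    fullpath = ask_path()
    sparta_directory = os.path.dirname(fullpath)
    sparta_file = os.path.basename(fullpath)
    return sparta_directory, sparta_file

# The caller unpacks the result instead of reading module-level globals.
sparta_directory, sparta_file = input_file(lambda: "/data/run1/sparta_pred.tab")
print(sparta_directory, sparta_file)
```

Because the function no longer writes to globals, it can be exercised with any fake path, which is what makes it easy to test.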

Pathlib

Probably best to replace your use of os.path with pathlib, whose object-oriented interface is nicer to use.
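
As a rough illustration of what that could look like, here is a small sketch in which one `Path` object replaces the separate `os.path.dirname` and `os.path.basename` calls. The path itself is made up for the example.

```python
from pathlib import Path

# Hypothetical path; in the GUI this would come from the file dialog.
fullpath = Path("/data/run1/sparta_pred.tab")

sparta_directory = fullpath.parent  # directory part, itself a Path
sparta_file = fullpath.name         # final component as a string

# Joining is done with the / operator rather than string concatenation.
rebuilt = sparta_directory / sparta_file
print(sparta_file, fullpath.suffix)
```

Since the directory and file name can always be recovered from the single `Path`, there is little reason to store them separately.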

Local imports

such as

 from sparta_file_formatter import check_sparta_file_boundaries 

should be moved to the top of the file.

\$\endgroup\$
2
  • \$\begingroup\$ Is there any advantage to not redefine them at the global level? It seems a little bit less clean since in my other script, I'd have to call the function in the argument I.E. some_function(input_file()), instead of some_function(input_file). And in some cases when I have multiple inputs, then I'll have multiple functions I'll be calling in as entries \$\endgroup\$ Commented Jun 30, 2020 at 17:20
  • \$\begingroup\$ Yes, there are many advantages. The fewer side-effects a function has, the better; and setting a global is a side-effect. Functions that accept parameters are easier to test, reuse, maintain and understand than functions that rely on globals. \$\endgroup\$ Commented Jul 2, 2020 at 17:09
\$\begingroup\$

Architecture

Your main architectural problem is that instead of

    def make_a(params):
        return a

    def make_b(a, params):
        return b

    def make_c(b, params):
        return c

    def make_result(c, params):
        return result

    a = make_a(params_a)
    b = make_b(a, params_b)
    c = make_c(b, params_c)
    result = make_result(c, params_result)

you do

    def make_a(params):
        return a

    def make_b(params_a, params_b):
        a = make_a(params_a)
        return b

    def make_c(params_a, params_b, params_c):
        b = make_b(params_a, params_b)
        return c

    def make_result(params_a, params_b, params_c, params_result):
        c = make_c(params_a, params_b, params_c)
        return result

    result = make_result(params_a, params_b, params_c, params_result)

Instead of calling function_1 to generate the necessary artefacts and passing them to function_2, you call function_1 inside function_2, and therefore you have to pass function_1's requirements to function_2 as well.

In your case, in the function

    def check_sparta_file_boundaries(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start):
        temp_list=[]
        temp_counter=0
        sparta_filtered_list=filter_sparta_using_seq(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start)
        for checker in sparta_filtered_list:
            temp_modifier=checker.strip()
            temp_split=temp_modifier.split()
            temp_finder=re.search('^\d+',temp_split[0])
            temp_list.append(temp_finder.group(0))
            temp_counter+=1
            if temp_counter==5:
                if int(temp_finder.group(0))==int(temp_list[0]):
                    break
                else:
                    del sparta_filtered_list[0:4]
                    break
        if len(sparta_filtered_list)%6 != 0:
            del sparta_filtered_list[-5:-1]
        return sparta_filtered_list

you should call filter_sparta_using_seq before calling check_sparta_file_boundaries and pass sparta_filtered_list instead of the parameters required for filter_sparta_using_seq:

    def check_sparta_file_boundaries(sparta_filtered_list):
        temp_list=[]
        temp_counter=0
        # line removed ...
        for checker in sparta_filtered_list:
            temp_modifier=checker.strip()
            temp_split=temp_modifier.split()
            temp_finder=re.search('^\d+',temp_split[0])
            temp_list.append(temp_finder.group(0))
            temp_counter+=1
            if temp_counter==5:
                if int(temp_finder.group(0))==int(temp_list[0]):
                    break
                else:
                    del sparta_filtered_list[0:4]
                    break
        if len(sparta_filtered_list)%6 != 0:
            del sparta_filtered_list[-5:-1]
        return sparta_filtered_list

    def main_program_flow():
        sparta_filtered_list = filter_sparta_using_seq(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start)
        sparta_filtered_list = check_sparta_file_boundaries(sparta_filtered_list)

Next you do the same for filter_sparta_using_seq and so on.
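
To make "and so on" a bit more concrete, here is a hedged sketch of the earlier stages written in the same style, with each stage receiving the previous stage's output. It is a simplified stand-in, not your actual logic: the stages work on in-memory lines, `format_sparta` skips the proline padding, the mutation step is left out, and `filter_by_sequence` is a hypothetical name.

```python
import re

def create_seq_list(sequence, seq_start):
    # Label each residue with its position: "MSY" from 1 -> ["1M", "2S", "3Y"].
    return [f"{i}{aa}" for i, aa in enumerate(sequence.strip().upper(), start=seq_start)]

def format_sparta(lines):
    # Keep only data rows and reduce each to "<number><residue><atom> <5th> <9th>".
    # Simplified: no proline padding, unlike the original function.
    formatted = []
    for line in lines:
        fields = line.strip().upper().split()
        if len(fields) == 9 and fields[0].isdigit():
            formatted.append(f"{fields[0]}{fields[1]}{fields[2]} {fields[4]} {fields[8]}")
    return formatted

def filter_by_sequence(entries, seq_list):
    # Keep entries whose leading "<number><residue>" appears in the sequence list.
    wanted = set(seq_list)
    kept = []
    for entry in entries:
        match = re.match(r"\d+[A-Z]", entry)
        if match and match.group(0) in wanted:
            kept.append(entry)
    return kept

def main_program_flow(sparta_lines, sequence, seq_start):
    # Each call needs only its own inputs plus the previous result.
    seq_list = create_seq_list(sequence, seq_start)
    formatted = format_sparta(sparta_lines)
    return filter_by_sequence(formatted, seq_list)

sample = [
    "REMARK SPARTA+ Protein Chemical Shift Prediction Table",
    "3 Y HA 0.000 4.561 4.550 0.018 0.000 0.201",
    "4 Q N 3.399 123.306 119.800 0.179 0.000 2.598",
    "11 R N 1.894 122.475 120.500 0.134 0.000 2.483",
]
result = main_program_flow(sample, "MSYQVLARKW", 1)
print(result)
```

No function here takes more than two arguments, because none of them has to carry the inputs of the stages before it.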

I tried to answer your specific question and hope you got the idea.


EDIT:

The same is valid for your function in the first file

    def sparta_gen_only():
        from sparta_file_formatter import check_sparta_file_boundaries
        os.chdir(save_directory)
        with open(save_file_sparta,'w') as file:
            for stuff_to_write in check_sparta_file_boundaries(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start):
                file.write(stuff_to_write+'\n')

where you did not pass the parameters but acted on globals. Again, we do not call from the inside; we call beforehand and pass the results. We also pass parameters instead of using globals.

    def sparta_gen_only(sparta_filtered_list, directory_name, file_name):
        os.chdir(directory_name)
        with open(file_name, 'w') as file:
            for stuff_to_write in sparta_filtered_list:
                file.write(stuff_to_write + '\n')

    def main_program_flow():
        sparta_filtered_list = filter_sparta_using_seq(seq_file,seq_directory,mutation_list1,mutation_list2,sparta_file,sparta_directory,seq_start)
        sparta_filtered_list = check_sparta_file_boundaries(sparta_filtered_list)
        sparta_gen_only(sparta_filtered_list, save_directory, save_file_sparta)

Some other points

  • Get rid of the habit of changing directory. At least for reading files this is a no-go. Let the user determine the working directory.
  • There is nothing wrong with fully qualified file names. You do not need to split them into directory and basename.
  • After restructuring your code according to the pattern above, there should be no more globals.
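
A small sketch of the first two points, assuming a hypothetical helper `read_lines` and a temporary file created only for the demonstration: the file is opened by its full path and the working directory is the same before and after.

```python
import tempfile
from pathlib import Path

def read_lines(path):
    # Open the file by its full path; no os.chdir is needed.
    with open(path) as handle:
        return [line.strip() for line in handle]

with tempfile.TemporaryDirectory() as tmp:
    # Temporary stand-in for the user's seq.txt.
    seq_path = Path(tmp) / "seq.txt"
    seq_path.write_text("MSYQVLARKW\n")

    before = Path.cwd()
    lines = read_lines(seq_path)
    after = Path.cwd()

print(lines, before == after)
```

The same applies when writing: passing the full output path to `open` removes the need for the `os.chdir` call in `sparta_gen_only`.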
\$\endgroup\$
1
  • \$\begingroup\$ OH I see, so if I want to call the function and import its value, all I need to do is use it as an entry for the next function. The only issue is how do I use its value though in the next function. I assumed I'd need to redefine that value (that's why I do a=previous fun(previous entry) to use its values. \$\endgroup\$ Commented Jun 30, 2020 at 4:21
