edited Oct 11, 2020 at 3:34

Simplify code!

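`with open('bmrb.csv') as file:` followed by `for lines in file:` can be shortened to `for lines in open("bmrb.csv").readlines():`, although that form no longer closes the file for you.
Either way, blank lines are still read as `'\n'`, so the `if (lines == '\n')` clause can be removed only if the file contains no blank lines.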
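
As a minimal sketch of reading a file line by line while skipping blank lines, assuming a hypothetical helper named `read_nonblank_lines` (not part of the original code) and made-up sample rows:

```python
import os
import tempfile


def read_nonblank_lines(path):
    """Return the lines of a text file, stripped, skipping blank ones."""
    with open(path) as file:  # the with-block closes the file when done
        return [line.strip() for line in file if line.strip()]


# Demo with a throwaway file; the rows below are made-up sample data.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("ALA,CA,10\n\nGLY,HA,12\n")
    demo_path = tmp.name

rows = read_nonblank_lines(demo_path)
os.remove(demo_path)
print(rows)
```

Filtering on `line.strip()` handles both empty lines and lines that contain only whitespace, so no separate comparison against `'\n'` is needed.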

Use `Enum` for clarity

In `split_lines[0]` and `split_lines[1]`, the indices 0 and 1 are what are called magic numbers.

A magic number is a numeric literal (for example, 8080, 2048) that is used in the middle of a block of code without explanation. It is considered good practice to avoid magic numbers by assigning the numbers to named constants and using the named constants instead.

Instead, what if you made an Enum called Data and named those constants?
Enums in Python

    from enum import Enum

    class Data(Enum):
        residue = 0
        atom = 1
        # the rest of the elements

Now when you want to refer to the atom column, you can simply write `split_lines[Data.atom.value]`. It is a little more typing, but it is also clearer what the line means.

This also means you can stop creating copies: instead of a new variable `residue`, just use `split_lines[Data.residue.value]`.
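
One possible variation, sketched here as an assumption rather than as part of the original suggestion, is `IntEnum`: its members behave like integers, so they can index a list directly without `.value`. The class name `Column`, the member names, and the sample row are all hypothetical, and the positions assume the column order comp_id, atom_id, count.

```python
from enum import IntEnum


class Column(IntEnum):
    # Hypothetical names; positions assume the order comp_id, atom_id, count.
    RESIDUE = 0
    ATOM = 1
    COUNT = 2


# Made-up row, split the same way the original code splits each line.
split_lines = "ALA,CA,10,1.0,2.0,1.5,0.2".split(",")

residue = split_lines[Column.RESIDUE]  # no .value needed with IntEnum
atom = split_lines[Column.ATOM]
print(residue, atom)
```

The trade-off is that `IntEnum` members compare equal to plain integers, which a plain `Enum` deliberately avoids.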

Format your code

If you write `x = y + 65` instead of `x=y+65`, and `x = float(y)` instead of `x=float(y)`, your code becomes much more readable.

More simplification

    question=input('input carbon and hydrogen values: ')
    split_question=question.split()
    search_fun(float(split_question[0]),float(split_question[1]))

becomes

    carbon, hydrogen = map(float, input("Enter carbon and hydrogen values: ").split())
    search_fun(carbon, hydrogen)
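
To make the unpacking step easier to test without typing at a prompt, the parsing can be pulled into its own function. This is only a sketch; the name `parse_two_floats` is hypothetical and the sample values are made up.

```python
def parse_two_floats(text):
    """Split text on whitespace and convert exactly two fields to float.

    Raises ValueError if there are not exactly two fields, or if a
    field is not a valid number.
    """
    carbon, hydrogen = map(float, text.split())
    return carbon, hydrogen


# Made-up input, standing in for what input() would return.
carbon, hydrogen = parse_two_floats("54.3 4.2")
print(carbon, hydrogen)
```

In the real script the call would be something like `parse_two_floats(input("Enter carbon and hydrogen values: "))`.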

Split work into functions

You have this line:

    if float(split_carbon[3]) > (0.25*float(split_carbon[2])) or float(split_hydrogen[3]) > (0.25*float(split_hydrogen[2])):
        print(f'{values} {values2} HIGH ERROR')

Give a meaningful name to a new function that takes the various args and returns True or False based on the formula. This way you can get rid of a lot of clutter in the search_fun() function.

    if formula_1(Args...) or formula_2(Args...):
        print(f'{values} {values2} HIGH ERROR')

The same idea can apply to many other code segments, and make your code much more readable.
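
A rough sketch of what such a helper could look like for the 0.25 check. The function names `exceeds_fraction` and `is_high_error` are hypothetical, the sample lists are made up, and since it is not stated here what the fields at indices 2 and 3 represent, the parameters are named generically.

```python
def exceeds_fraction(value, reference, fraction=0.25):
    """Return True if value is greater than fraction * reference."""
    return float(value) > fraction * float(reference)


def is_high_error(split_carbon, split_hydrogen):
    """Apply the same check to the carbon and the hydrogen fields."""
    return (exceeds_fraction(split_carbon[3], split_carbon[2])
            or exceeds_fraction(split_hydrogen[3], split_hydrogen[2]))


# Made-up rows; only indices 2 and 3 matter for this check.
sample_carbon = ["ALA", "CA", "4.0", "2.0"]    # 2.0 > 0.25 * 4.0
sample_hydrogen = ["ALA", "HA", "8.0", "1.0"]  # 1.0 <= 0.25 * 8.0

if is_high_error(sample_carbon, sample_hydrogen):
    print("HIGH ERROR")
```

With the threshold as a keyword argument, the 0.25 is named in one place instead of being repeated inside the condition.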

Using `csv.DictReader`

As suggested by @Graipher, it will be much better to use csv.DictReader as it will do a lot of the splitting work for you

    import csv

    with open("csvfile.csv") as csvfile:
        reader = csv.DictReader(csvfile, delimiter=',')
        for line in reader:
            print(line['atom_id'])

This will split the values of each row into a dictionary, where the keys are the column names from the first line of the file: comp_id,atom_id,count,min,max,avg,std. This is much better as you won't need to split the lines manually, and there won't be any magic numbers, because the keys of the dictionary come from the header row.
csv file handling in Python
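
A self-contained sketch of the same idea, using an in-memory file so it runs without `bmrb.csv`. The header matches the one quoted above; the two data rows are made up for illustration.

```python
import csv
import io

# In-memory stand-in for the real file; the data rows are made-up values.
sample = io.StringIO(
    "comp_id,atom_id,count,min,max,avg,std\n"
    "ALA,CA,10,50.0,56.0,53.1,1.9\n"
    "GLY,HA,12,3.0,4.5,3.9,0.4\n"
)

reader = csv.DictReader(sample)  # keys are taken from the header row
rows = list(reader)

atom_ids = [row["atom_id"] for row in rows]
first_avg = float(rows[0]["avg"])  # values are read as strings, so convert
print(atom_ids, first_avg)
```

Note that `DictReader` leaves every value as a string, so numeric columns still need an explicit `float()` or `int()` conversion.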

answered Oct 10, 2020 at 3:27

user228914