3

I have some code and in it there are functions which can be run in one of two modes (in my case server mode and local mode). For example, most functions look something like this:

def path_join(path_elements,run_type='local'): if run_type == 'local': path = os.path.join(*path_elements).replace("/", os.path.sep) elif run_type == 'blob': path = clean_path('/'.join(path_elements)) return path def clean_path(path: str, run_type='local') -> str: """ Replaces repeated path separators in a path. """ if run_type == 'local': separator = get_os_path_sep() elif run_type == 'blob': separator = constants.BLOB_PATH_SEP separator = separator * 2 if separator == constants.WINDOWS_PATH_SEP else separator path = re.sub(fr"{separator}+", f"{separator}", path) return path def read_text(filename,run_type,container): root = get_root_path() if run_type == 'local': with open(filename, 'r') as file: return file.read() elif run_type == 'blob': stream = read_blob(container, blob_path, run_type, use_lease=True) return stream.read().decode('utf-8') 

I'm trying to refactor this code because it feels redundant to check for case in all these functions. One option I guess is to break down the functions and have the and create a class to decide which variation of the function to call. The problem is that all the functions don't have too much in common so I think in the end I would just have a lot of small classes which actually may be more complex (but maybe I'm wrong OOP is still fresh to me). Does anyone have any advice on how to make this better?

4
  • Does this answer your question? Style for control flow with validation checks Commented Dec 22, 2021 at 18:17
  • @gnat I don't think so (or at least I don't see how). That seems to be focused on running a set of validations. My control flow doesn't do validations/the same thing every-time. I Commented Dec 22, 2021 at 19:20
  • 2
    I think it would help to see a code snippet where those methods are called. How is run_type actually determined? Commented Dec 22, 2021 at 19:21
  • Python supports object-oriented programming via classes and duck typing. Does that suggest anything? Commented Jan 4, 2022 at 0:51

5 Answers 5

5

If the value of run_type can be determined before these are called then classes may be a good solution.

Currently it looks like you call the code this way:

read_text( clean_path( path_join(path, run_type), run_type ), run_type, container ) 

When it could be:

run_type.read_text(path, container) 

This would use polymorphism to call different read_text methods based on the type of run_type.

You may be reluctant to “create a lot of small classes” (in this case it'd be two, for now) but that is typical of OOP. It's a very different style. Rather then pull details towards you and try to know everything, you push details behind abstractions so you can forget about them.

Of course OOP, classes, and the strategy patern are hardly the only way to solve this problem. This question reminds me of the Boolean argument code smell. You can find example solutions for that here and here.

1

I like to look at how usable the abstraction is from the perspective of the consuming client first, and only second from the perspective of its internal implementation.

If you already have a sizeable code base, what I would do is to look at the code that is consuming this (set of) abstraction(s).

Do calling consumers pass constants or variables for that parameter?

  • At one extreme, if users of these methods always pass constants, then those are really two different methods that should simply have two different names, and the parameter should be removed from both.

  • At the other extreme, if users never pass constants, always a variable, then it makes sense to keep a single signature with a differentiating parameter.

Of course, we can do both, if there are significant uses of constants being passed as well as uses of the variable, where one implementation (e.g. taking the variable or separate methods) can be written in terms of the other(s), and vice versa, without necessarily affecting the consuming clients.

0

Please ignore the kneejerk reaction of creating a new class whenever you find a pattern. No one thing can solve all problems.

Creating a new class is a solution, a solution to a type of problems. Its not a solution to all types of problems. The type of problem it solves is when your data has a pattern - a commonality, a repetition of sorts.

The problem you are solving is not based on that pattern. The problem you are facing is a cross-cutting concern. Its a commonality in code. Code and data are two different things.

Object Oriented Programming is great for adding same functionality to similar objects. That is, apply same functions to same pattern of data.

Functional Programming is opposite of Object Oriented Programming. Its great for applying functions that have a pattern, to same data.

You can use OOP to teach singing to sparrows, magpies and cuckows. You can use FP to teach different types of singing to sparrows.

As data can have patterns, functions can have patterns too.

You can only refactor a pattern and nothing else.

When data has pattern you make class hierarchies using inheritance, or make class maps as you would link tables in Relational Database using composition. Both of these are based on commonalities in data.

When code has pattern you use Higher Order Functions.

The code you want to refactor has this pattern: This function divides the flow in two parts, that function over there also divides the flow in two parts, that function further away also divides the flow in two parts. Thats the pattern. Ofcourse being in real world the functions also do something before or after the division. Real world problems arent textbook examples.

There is no language construct in imperative programming to refactor this out. Your only choice is using a functional programming construct: higher order function.

You have to make a higher order function that takes into arguments four functions. In its body the higher order function will first call the first argument function - its the part your unfactored functions have before the if-else statements. Then the higher order function will have the if-else statements i.e. the conditions - this is the fixed part in the unfactored functions, the conditions are always same.

Inside the if statement the higher order function will call the second argument function, and inside the else statement the third argument function.

Finally, after the if-else statements the higher order function will call the fourth argument function.

-1

this looks like you need to configure the calls before you use the functions. This is done by "dependency injection" in OOP or callbacks/"higher order functions" in FP. What you do is you abstract away what is inside of the if blocks and then construct the function and return it. I am not super familiar how you would solve this in python (in JS world you have module.exports/export that make this easier).

def _path_join_local(path_elements): return os.path.join(*path_elements).replace("/", os.path.sep) def _path_join_blob(path_elements): return _clean_path('/'.join(path_elements)) def _get_separator_local(): get_os_path_sep() def _get_separator_blob(): constants.BLOB_PATH_SEP def _get_separator(system_separator): return system_separator * 2 if system_separator == constants.WINDOWS_PATH_SEP else system_separator # separator can be determined at the start # Yes... I know this is ugly and over engineered sparator = _get_separator(_get_separator_local() if run_type == 'local' else _get_separator_blob()) def _clean_path(path: str) -> str: return re.sub(fr"{separator}+", f"{separator}", path) def _read_text_local(filename): with open(filename, 'r') as file: return file.read() def _read_text_blob(filename): # I have no idea if container is a parameter or if you can get it here container = ... stream = read_blob(container, blob_path, use_lease=True) return stream.read().decode('utf-8') public = { "path_join": _path_join_local if run_type == 'local' else _path_join_blob, "clean_path": _clean_path, "read_text": _read_text_local if run_type == 'local' else _read_text_blob, } 

The optimist in me would now expect the above used as follows:

from pkg import public as util util.path_join(["aesd", "unqwfav"]) util.read_text("jazbec") 

The super optimistic claim would be that the following would be possible:

from pkg.public import path_join, read_text ... 

but I don't think that is how python works.

Anyways, this is the idea: have public object configured on startup by some sort of a factory (the factory "methods" in this case are the ternary expressions defining the public object properties). Hopefully this makes sense, but I feel like it really overcomplicates the code. It might be of some help if these were split into multiple files.

-1

In OOP style, you would create a class that contains all the functions let say for server mode. Then you derive a new class from the first, that inherits all the functions. Here you can overwrite the functions that are different for local mode.

Now you can create instances (objects) from the one or other class, depending on your mode. That way you always call the right function without decision.

4
  • Is that the way you do it in OO? What does "inheritance" model? Does that relationship show up in the question's code - the difference between "local" and "blob"? For example: given the question, why did you pick "server" to be the superclass instead of "blob"? Couldn't it have been the other way around? You suggest so yourself (the word "say"). Does "inheritance" model that apparently arbitrary choice? Commented Jan 3, 2022 at 23:59
  • Isn't this what you said?: > ...run in one of two modes (in my case server mode and local mode) Inheritance is what happens, if you derive one class from the another: The second one inherits all the properties of the first one. But you can overwrite functions to let the second class behave differently. In this example, it does not matter which class is created first. But there may be situations where you create an abstract class first and derive both from the abstract class. Commented Jan 4, 2022 at 1:18
  • Derivation is for an IS-A relationship. One thing IS-A another thing, just more refined. A Giraffe IS-A Mammal which IS-A Animal. It is not the case that Local IS-A Blob and it is also not the case that a Blob IS-A Local. Derivation - i.e., inheritance from a class - is not appropriate. There are other appropriate O-O techniques, e.g., implementation of an interface that would be proper in this case. Or, as you say in your comment, derive from a common abstract class (that implements an interface plus any common behavior). But that's not what you say in your answer. Commented Jan 4, 2022 at 2:14
  • Sorry, maybe I misunderstood your question. Commented Jan 6, 2022 at 19:57

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.