Should I add functionality by adding a new method to a class - or should I "register" the new functionality into a data structure?

Question

I have one large class that computes ~50 different metrics (each metric has no side effects).

My code is similar to this:

    import pandas as pd


    class ReportingMetrics:
        def __init__(self, data: pd.DataFrame, config: dict):
            self.data = data
            ...  # Common data validation, etc...

        def calculate_metric1(self) -> pd.Series: ...
        def calculate_metric2(self) -> pd.Series: ...
        def calculate_metric3(self) -> pd.Series: ...
        def calculate_metric4(self) -> pd.Series: ...
        def calculate_metric5(self) -> pd.Series: ...
        def calculate_metric6(self) -> pd.Series: ...
        ...

And as you can imagine, I add metrics fairly often.

I am considering a refactor where each calculate_metric{i} is either a function or a class - and they get "registered" to a data structure of some kind.

It would look something like:

    from abc import ABC, abstractmethod

    import pandas as pd


    class BaseReportingMetric(ABC):  # inherit from ABC so @abstractmethod is enforced
        @abstractmethod
        def calculate(self) -> pd.Series: ...


    class ReportingMetric1(BaseReportingMetric): ...


    class ReportingMetric2(BaseReportingMetric): ...


    reporting_metric_register = [
        ReportingMetric1,
        ReportingMetric2,
        ...
    ]

I've also seen a common pattern where the BaseReportingMetric class implements a .register(self, reporting_metric_register) if I want to keep the registering/unregistering as part of the metric code.
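A minimal sketch of that self-registering variant, under a few assumptions that are not in the question: register is written as a classmethod (since the register holds classes, not instances), an unregister counterpart is added, the metric names TotalMetric and MaxMetric are hypothetical, and plain lists of floats stand in for pd.DataFrame / pd.Series so the example runs without pandas.

```python
from abc import ABC, abstractmethod


class BaseReportingMetric(ABC):
    """Base class; a plain list stands in for the DataFrame (assumption)."""

    def __init__(self, data: list):
        self.data = data

    @abstractmethod
    def calculate(self) -> float:
        ...

    @classmethod
    def register(cls, reporting_metric_register: list) -> None:
        # Hypothetical helper: add this metric class to the register once.
        if cls not in reporting_metric_register:
            reporting_metric_register.append(cls)

    @classmethod
    def unregister(cls, reporting_metric_register: list) -> None:
        # Hypothetical helper: remove this metric class if it is present.
        if cls in reporting_metric_register:
            reporting_metric_register.remove(cls)


class TotalMetric(BaseReportingMetric):  # hypothetical metric
    def calculate(self) -> float:
        return sum(self.data)


class MaxMetric(BaseReportingMetric):  # hypothetical metric
    def calculate(self) -> float:
        return max(self.data)


reporting_metric_register: list = []
TotalMetric.register(reporting_metric_register)
MaxMetric.register(reporting_metric_register)

# Run every registered metric against the same data.
results = {
    metric_cls.__name__: metric_cls([1.0, 2.0, 3.0]).calculate()
    for metric_cls in reporting_metric_register
}
```

With this shape, adding a metric means writing one new subclass and one register call; nothing in the base class or in the loop that runs the metrics needs to change.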

However, doing this would be a fairly large refactor, so happy to hear of good/bad experiences from people that have changed their code from one style to the other.

Christophe · Accepted Answer · 2022-02-03 11:52:23Z

As your metrics evolve and get enriched quite often, it’s important to keep in mind that classes should be open for extension but closed for modification (Open-Closed Principle).

In this regard, your current mega-class has the following drawbacks:

- This kind of mega-class exposes its internals to all the metric calculations. That risks losing the benefits of encapsulation and increases the chance of accidental side effects. Depending on the internals, in the worst case it could even lead to spaghetti code that relies on assumptions about the order in which metrics are calculated.
- Every new metric requires changing the class. Unfortunately, such a change might affect the whole class and would require extensive non-regression testing.
- Every new metric changes the interface of the class. These kinds of changes could propagate to all the code using your mega-class, which would also require extensive non-regression testing.

These are considerable drawbacks for long-term maintenance, with hidden risks and costs. Addressing the issue early has the inconvenience of some substantial refactoring, but the significant long-term benefits should largely offset that (very temporary) inconvenience.

Conclusion: yes, go for your new approach. It provides better encapsulation, is compliant with OCP, facilitates separation of concerns and promotes reusability (e.g. different sets of metrics in different contexts).
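As a rough illustration of that conclusion (a sketch, not the OP's actual code), the example below validates the data once, hands each metric only the validated data, and reuses the same metrics in two different sets. All names here (ValidatedData, validate, run_report, MeanMetric, RangeMetric, DAILY_METRICS, MONTHLY_METRICS) are hypothetical, and a tuple of floats stands in for the DataFrame so the example needs no third-party packages.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ValidatedData:
    """Immutable input shared by all metrics, so none can alter it for another."""
    values: tuple


def validate(raw: Iterable) -> ValidatedData:
    # Common validation happens once, outside of any individual metric.
    values = tuple(float(v) for v in raw)
    if not values:
        raise ValueError("data must not be empty")
    return ValidatedData(values)


class MeanMetric:  # hypothetical metric
    name = "mean"

    def calculate(self, data: ValidatedData) -> float:
        return sum(data.values) / len(data.values)


class RangeMetric:  # hypothetical metric
    name = "range"

    def calculate(self, data: ValidatedData) -> float:
        return max(data.values) - min(data.values)


def run_report(raw: Iterable, metrics: list) -> dict:
    # Closed for modification: new metrics never require editing this function.
    data = validate(raw)
    return {metric.name: metric.calculate(data) for metric in metrics}


# Different sets of metrics for different contexts.
DAILY_METRICS = [MeanMetric()]
MONTHLY_METRICS = [MeanMetric(), RangeMetric()]

daily = run_report([1, 2, 3], DAILY_METRICS)
monthly = run_report([1, 2, 3], MONTHLY_METRICS)
```

Each metric sees only the frozen ValidatedData, so one calculation cannot depend on another having run first, and extending the report is a matter of adding a class and listing it in the relevant set.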

I would add to this: unless you really need access to the internal state of the metrics, just use functions instead of classes. Python gives you first-class functions, so you might as well make use of them.
– Turksarama, Commented Feb 3, 2022 at 12:19
@Turksarama That’s indeed a very valid point for languages like Python that allow free functions. But it depends on how the functions will be used. In OP’s example there is, for instance, some validation of the dataset and other hidden common preparatory steps, which is why I elaborated on the proposed approach. Moreover, if you wanted to pass the set of metrics as parameters to other functions, working with arrays of closures would become less convenient.
– Christophe, Commented Feb 3, 2022 at 20:10
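For comparison, the function-based variant discussed in these comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the decorator name metric, the METRICS dictionary, the calculate_all helper and the two example metrics are all hypothetical, and a list of floats stands in for the DataFrame. The shared validation Christophe mentions is placed in calculate_all so each metric function stays free of it.

```python
from typing import Callable, Dict, List

Metric = Callable[[List[float]], float]

# Hypothetical registry mapping a metric's name to its function.
METRICS: Dict[str, Metric] = {}


def metric(func: Metric) -> Metric:
    """Decorator that registers a metric function under its own name."""
    METRICS[func.__name__] = func
    return func


@metric
def total(data: List[float]) -> float:  # hypothetical metric
    return sum(data)


@metric
def mean(data: List[float]) -> float:  # hypothetical metric
    return sum(data) / len(data)


def calculate_all(data: List[float]) -> Dict[str, float]:
    # Common validation runs once here, before any metric is computed.
    if not data:
        raise ValueError("data must not be empty")
    return {name: func(data) for name, func in METRICS.items()}


report = calculate_all([2.0, 4.0])
```

Whether this or the class-based form fits better depends on the trade-off raised above: functions are lighter, while classes are easier to pass around as a group when they carry shared setup or state.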

Joe · Accepted Answer · 2022-02-09 12:46:50Z

While I agree with @Christophe's answer that from a design perspective, a refactor would likely improve the quality of the code (and he gave very compelling reasons in what ways it would), you should think about the motivation behind your change. If you refactor it "just because you feel like it", it's probably YAGNI and in its current state a waste of (your) time.

If you already see valid use cases that justify adding additional complexity to your system, and you have a clear idea of how that complexity will evolve in the future, then it might be a good idea to do so now. Otherwise there's a chance you will introduce the wrong abstraction.
