
What is the best practice for creating an object based on the value of a string argument?

I am loading my configuration from a file and passing a string from it as an argument that gets mapped to a specific OCR engine subclass constructor. Each class in the code block below is a subclass of OcrEngine. My current solution is to use a dictionary to look up the class and create the object from the kwargs that are passed in. Is this a design issue? Should I refactor my hierarchy to use composition instead of inheritance? I'm looking for any thoughts.

I feel that passing a string like this negates the point of polymorphism. If anyone knows of books or docs on design patterns that address issues like this, please send them my way.

Please ignore the lack of error checking - this is only a snippet.

    def main():
        config_data = load_config(config_path_file)
        preprocessing_settings = config_data["preprocessing"]
        engine_name = preprocessing_settings["ocr_engine"]
        engine_config_args = parse_args(preprocessor(f'{engine_name}-settings'))
        with create_ocr_engine(engine_name, engine_config_args) as engine:
            engine.batch_load_from_dir(data_path)
            engine.process_all()
            documents = (engine.document_from_batch()
                         .to_json())

    def create_ocr_engine(name, kwargs) -> OcrEngine:
        return # instance of specified subclass based on name and kwargs

    #
    # class TesseractOcrEngine(OcrEngine):
    #     pass
    # class DeepSeekOcrEngine(OcrEngine):
    #     pass
    # class EasyOcrEngine(OcrEngine):
    #     pass

4 Comments

  • Don't overthink it - a simple if/elif chain or a lookup dictionary mapping valid names/aliases to the corresponding types will do. Commented Nov 18 at 19:02
  • If you will need a ton of classes and/or they will depend a lot on runtime data, you can use the type() function to construct them. Optionally, you can use a dictionary to return the same class every time the same constructor parameters are used. Commented Nov 18 at 19:20
  • You can try googling "python factory pattern" to find ideas. Commented Nov 18 at 19:40
  • I decided to go with the method user Wim suggested. I modified the constructors to all take **kwargs for the most flexibility. Thanks for all the help! Commented Nov 18 at 20:20
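
The lookup-dictionary idea from the comments could look roughly like the sketch below. The stand-in OcrEngine base class, the ENGINES mapping, and the particular names and aliases in it are assumptions for illustration; the real classes and config values are not shown in the question.

```python
class OcrEngine:
    """Stand-in base class; the real one is not shown in the question."""

    def __init__(self, **kwargs):
        # Keep the settings so the example has something to inspect.
        self.settings = kwargs


class TesseractOcrEngine(OcrEngine):
    pass


class DeepSeekOcrEngine(OcrEngine):
    pass


class EasyOcrEngine(OcrEngine):
    pass


# Hypothetical mapping of config names (and aliases) to engine classes.
ENGINES: dict[str, type[OcrEngine]] = {
    "tesseract": TesseractOcrEngine,
    "deepseek": DeepSeekOcrEngine,
    "easyocr": EasyOcrEngine,
    "easy": EasyOcrEngine,  # alias for the same class
}


def create_ocr_engine(name: str, kwargs: dict) -> OcrEngine:
    """Look up the class for `name` (case-insensitive) and instantiate it."""
    try:
        engine_cls = ENGINES[name.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown OCR engine {name!r}; expected one of {sorted(ENGINES)}"
        ) from None
    return engine_cls(**kwargs)
```

The string only selects which class to build. Once the object exists, the calling code talks to it through the OcrEngine interface, so polymorphism still does its job after construction.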

1 Answer

This might be overcomplicating things, but I've used a solution in which a factory class keeps a registry mapping names to classes and provides a method to instantiate a class based on a name. The registry is managed through a decorator that is applied to the classes themselves.

    from typing import Callable

    ########################################################################
    # for testing only
    class OcrEngine:
        def __init__(self, *args, **kwargs):
            # Print the constructor call so the demo lines at the bottom show something.
            argstrs = [f'{a!r}' for a in args] + [f'{k}={v!r}' for (k, v) in kwargs.items()]
            print(
                f'''Creating {self.__class__.__name__}('''
                f'''{', '.join(argstrs)}'''
                f''')'''
            )

    ########################################################################
    class OcrEngineFactory:
        """Factory class to create OcrEngine subclasses based on registered names."""

        _registry: dict[str, type[OcrEngine]] = {}
        """The registry mapping names to OcrEngine subclasses"""

        @classmethod
        def register(cls, name: str) -> Callable[[type[OcrEngine]], type[OcrEngine]]:
            """Decorator to register a class with the given name"""
            def add_engine_to_registry(engine: type[OcrEngine]) -> type[OcrEngine]:
                # Names are stored lowercased so lookups are case-insensitive.
                cls._registry[name.lower()] = engine
                return engine
            return add_engine_to_registry

        @classmethod
        def create_ocr_engine(cls, name: str, *args, **kwargs) -> OcrEngine:
            """Create an OcrEngine instance based on the engine name and arguments"""
            if name.lower() in cls._registry:
                return cls._registry[name.lower()](*args, **kwargs)
            raise ValueError(f'No such OcrEngine {name}')

    ########################################################################
    @OcrEngineFactory.register("DeepSeek")
    class DeepSeekOcrEngine(OcrEngine):
        pass

    @OcrEngineFactory.register("Tesseract")
    class TesseractOcrEngine(OcrEngine):
        pass

    @OcrEngineFactory.register("Easy")
    class EasyOcrEngine(OcrEngine):
        pass

    ########################################################################
    def main():
        config_data = load_config(config_path_file)
        preprocessing_settings = config_data["preprocessing"]
        engine_name = preprocessing_settings["ocr_engine"]
        engine_config_args = parse_args(preprocessor(f'{engine_name}-settings'))
        with OcrEngineFactory.create_ocr_engine(engine_name, engine_config_args) as engine:
            engine.batch_load_from_dir(data_path)
            engine.process_all()
            documents = (engine.document_from_batch()
                         .to_json())

    # testing only
    OcrEngineFactory.create_ocr_engine('Easy', 'easy goes it as you')
    OcrEngineFactory.create_ocr_engine('DeepSeek', 'deep seek through')
    OcrEngineFactory.create_ocr_engine('Tesseract', 'the tesseract')


4 Comments

Thank you so much to whoever downvoted this answer without leaving a comment explaining why.
Some might see this solution as pretty clunky, but it makes more sense than the structure I use now. It lets me keep all of the supported engines, along with their instantiation, in the OcrEngineFactory class.
I use this structure because it keeps control over the name-to-class mapping at the class definition, so when you add something new you only work in one place in the source, rather than having to remember to add the new class to a name-to-class mapping elsewhere.
BTW - I forgot to put this in my answer, but: if you have ownership over the OcrEngine class, the OcrEngineFactory functionality can be put directly into OcrEngine. (This is mostly what I do, since my factories are generally for classes I am developing anyway.)
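
A minimal sketch of that last suggestion, with the registry and both class methods moved onto OcrEngine itself. The method name create, the settings attribute, and the keyword-only constructor are assumptions for illustration and do not come from the answer.

```python
from __future__ import annotations

from typing import Callable


class OcrEngine:
    # Shared registry mapping lowercased names to subclasses.
    _registry: dict[str, type[OcrEngine]] = {}

    def __init__(self, **kwargs):
        # Stand-in constructor: just remember the settings.
        self.settings = kwargs

    @classmethod
    def register(cls, name: str) -> Callable[[type[OcrEngine]], type[OcrEngine]]:
        """Decorator that registers the decorated class under `name`."""
        def add_engine_to_registry(engine: type[OcrEngine]) -> type[OcrEngine]:
            OcrEngine._registry[name.lower()] = engine
            return engine
        return add_engine_to_registry

    @classmethod
    def create(cls, name: str, **kwargs) -> OcrEngine:
        """Instantiate the subclass registered under `name` (case-insensitive)."""
        try:
            engine_cls = OcrEngine._registry[name.lower()]
        except KeyError:
            raise ValueError(f"No such OcrEngine {name}") from None
        return engine_cls(**kwargs)


@OcrEngine.register("Tesseract")
class TesseractOcrEngine(OcrEngine):
    pass


@OcrEngine.register("Easy")
class EasyOcrEngine(OcrEngine):
    pass
```

With this layout a call such as OcrEngine.create("tesseract", lang="eng") returns a TesseractOcrEngine, and adding an engine still means touching only its own class definition.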
