
I have a jupyter-notebook (in Microsoft Fabric) in which I define functions, imagine like this:

NOTEBOOK_1: CELL_1:

def add(a, b): return a + b

Then I want to run this notebook in another one to access the function like this: NOTEBOOK_2: CELL_1:

%run NOTEBOOK_1 

CELL_2:

x, y = 5, 3
print(add(x, y))

Since I have several notebooks like notebook 2 and run them in sometimes arbitrary order, I need access to the function at all times but want to skip running notebook 1 if the function is already defined. How can I do this?
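One simple pattern (a sketch, untested in Fabric) is to guard the run in the calling notebook by checking whether the function's name already exists in the top-level namespace:

```python
# Sketch: only (re)define `add` if it is not already present in the
# notebook's top-level namespace. In a real notebook, the body of the
# `if` would be replaced by the %run of NOTEBOOK_1.
if 'add' not in globals():
    def add(a, b):
        return a + b

print(add(2, 3))  # 5
```

Whether Fabric's `%run` magic can be placed inside an `if` block like this is an open question; the `globals()` check itself is plain Python and works in any notebook.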

I tried this: NOTEBOOK_1: CELL_0:

try:
    prep_source_metadata
except NameError:
    print("Execution necessary. Proceeding...")
else:
    print("Execution unnecessary. Exiting...")
    mssparkutils.notebook.exit(1)

However, this also exits notebook 2 since there is no dedicated separate notebook run.
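If the exit call is what breaks things, one alternative (a sketch, with `prep_source_metadata` standing in for the real function) is to flip the logic: set a flag instead of exiting, and have the definition cells guard on that flag:

```python
# NOTEBOOK_1, CELL_0 (sketch): record whether definitions are needed,
# instead of exiting the whole run.
try:
    prep_source_metadata  # probe: raises NameError if not yet defined
except NameError:
    run_needed = True
    print("Execution necessary. Proceeding...")
else:
    run_needed = False
    print("Execution unnecessary. Skipping definitions...")

# Later cells guard their definitions on the flag instead of exiting.
if run_needed:
    def prep_source_metadata():
        """Placeholder body; the real implementation goes here."""
        return None
```

This avoids `mssparkutils.notebook.exit` entirely, at the cost of wrapping every definition cell in an `if`.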

I also found this post, How to pass a variable to magic 'run' function in IPython, where a variable is inserted into the %run part. Since I wouldn't want an error, I suppose I would have to create an empty dummy notebook for the "don't run" case. That would work well enough, I guess, but it would be rather inelegant. Is there a better way?

EDIT1: Regarding the comments (thanks!):

  • Since I'm constrained by Microsoft Fabric, I can't just use a .py file, I have to work with notebooks.
  • importnb is not finding the other notebook (might be due to the Microsoft Fabric framework...):
import importnb
with importnb.Notebook():
    from NOTEBOOK_2 import add

leads to

Cell In[7], line 4
      1 import importnb
      3 with importnb.Notebook():
----> 4     from Gold_Automation_Functions import prep_source_metadata
ModuleNotFoundError: No module named 'Gold_Automation_Functions'
  • Running another notebook with MSSparkUtils does not work because the namespace is not shared: the calling notebook gets "NameError: name 'my_function' is not defined".

Another thing I tried which doesn't work is exiting the called notebook with MSSparkUtils: that exits the calling notebook as well, unfortunately. Trying to skip all cells might be a worthwhile approach, but there doesn't seem to be an easy, elegant solution for that either.

EDIT2: Most promising avenue so far seems to be using a .whl file (https://learn.microsoft.com/en-us/fabric/data-engineering/environment-manage-library).
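If the wheel route is taken, a minimal packaging sketch might look like this (all names are illustrative; the built .whl would then be uploaded as a custom library in the Fabric environment):

```toml
# pyproject.toml (sketch) for a package holding the shared functions
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "fabric-shared-functions"   # illustrative name
version = "0.1.0"
```

With the shared functions in a module inside this package, building via `python -m build` produces a wheel, after which any notebook in the environment could do `from fabric_shared_functions import add` without a %run at all.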

  • maybe keep the functions in a normal .py file and use import to load them. Commented Jun 27, 2024 at 15:54
  • fura's idea of import is a good one. Though there is no need to necessarily move to .py if you are happy with the .ipynb file, and the way it is set up lends itself to use with importnb, see here and here. Commented Jun 27, 2024 at 15:58
  • if you're running a Spark notebook, try MSSparkUtils rather than %run to call the subnotebook; you can put that in an if clause so it runs or not based on whether a function exists learn.microsoft.com/en-us/azure/synapse-analytics/spark/… Commented Jun 28, 2024 at 6:41

