I created a notebook on Kaggle and attached a dataset to it as input.
The main content of the notebook is as follows:
    %%bash
    pip install xxx  # Install dependencies

    if [ ! -d "/kaggle/working/latex-ocr-pytorch" ]; then
        echo "Directory does not exist, copying..."
        cp -r /kaggle/input/latex-ocr-pytorch /kaggle/working
    else
        echo "Directory already exists, skipping copy"
    fi

    cd /kaggle/working/latex-ocr-pytorch
    python train.py  # Train and save the model (save checkpoints)

After that, I ran the notebook using "Save & Run All (Commit)".
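(For context, the checkpoint files mentioned below are written by train.py itself; I have not included that script here, but the save step is presumably something along these lines using torch.save, where the function name and fields are only a hypothetical sketch:)

    import os
    import torch

    def save_checkpoint(model, optimizer, epoch, ckpt_dir="checkpoints"):
        # Hypothetical sketch of how checkpoint_xxx.pth.tar files could be produced;
        # the real train.py in latex-ocr-pytorch may structure this differently.
        os.makedirs(ckpt_dir, exist_ok=True)
        state = {
            "epoch": epoch,
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        }
        # Written under /kaggle/working/..., so the files count as notebook output
        torch.save(state, os.path.join(ckpt_dir, f"checkpoint_{epoch:03d}.pth.tar"))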
After the run completes, several checkpoint_xxx.pth.tar files are generated under /kaggle/working/latex-ocr-pytorch/checkpoints. Downloading them from the notebook's output page is tedious, because each file has to be clicked and downloaded individually.
I also tried downloading them with the command kaggle kernels output user_name/kaggle-latex-ocr-pytorch -p /path/to/dest, but what I got were log files rather than the checkpoints I wanted.
So I referred to an online blog and added a code cell at the end of the notebook:
    %%bash
    cd /kaggle/working/latex-ocr-pytorch/
    if [ -d "checkpoints" ]; then
        tar -czf checkpoints.tar.gz checkpoints
        echo "Compression successful"
    else
        echo "Warning: checkpoints directory does not exist"
        exit 1
    fi

I also noticed that after running the notebook with "Save & Run All (Commit)", the checkpoints folder under /kaggle/working/latex-ocr-pytorch/ is gone when I open the notebook for editing again, presumably because each new session starts with a fresh /kaggle/working. So I cannot package the checkpoints folder in a draft session either.
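The packaging cell itself works whenever it actually gets to run; for reference, an equivalent pure-Python version using shutil.make_archive (assuming the same paths as above) would be roughly:

    import os
    import shutil

    src = "/kaggle/working/latex-ocr-pytorch/checkpoints"
    if os.path.isdir(src):
        # Produces /kaggle/working/latex-ocr-pytorch/checkpoints.tar.gz,
        # equivalent to the tar -czf command in the bash cell above
        shutil.make_archive(os.path.join(os.path.dirname(src), "checkpoints"),
                            "gztar", root_dir=os.path.dirname(src), base_dir="checkpoints")
        print("Compression successful")
    else:
        print("Warning: checkpoints directory does not exist")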
However, the model takes a long time to train and a single Kaggle session is limited to 12 hours, so the session was cut off partway through execution and the final cell that packages the checkpoints never ran. How can I work around this?
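One workaround I have been considering (but have not verified) is to drive training and packaging from a single Python cell, so that the packaging step is at least attempted even if train.py is stopped early. A rough sketch of the idea, where the 11-hour timeout is my own guess at a safe margin and not a Kaggle-documented value:

    import shutil
    import subprocess

    PROJECT = "/kaggle/working/latex-ocr-pytorch"

    try:
        # Assumed safety margin below the session limit; raises TimeoutExpired
        # and kills train.py if it runs longer than 11 hours
        subprocess.run(["python", "train.py"], cwd=PROJECT, timeout=11 * 3600)
    finally:
        # Package whatever checkpoints exist, even if training was cut short
        shutil.make_archive(f"{PROJECT}/checkpoints", "gztar",
                            root_dir=PROJECT, base_dir="checkpoints")

I am not sure whether this is the right approach, though, for example whether the finally block still executes if Kaggle force-stops the session at the hard limit, or whether the output of a failed commit is kept at all, so I would appreciate any advice.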