I would like to run my MD simulation in parallel on a cluster (16 CPUs per node).
From the ASE doc we can see that:
ASE will automatically run in parallel, if it can import an MPI communicator from any of the supported libraries. ASE will attempt to import communicators from these external libraries: GPAW, Asap, Scientific MPI and MPI4PY.
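As a basic check of that statement, ASE exposes whatever communicator it imported as ase.parallel.world, so a tiny diagnostic run under srun should print one line per rank (this is just a sketch of the check, separate from my production script):

    from ase.parallel import world

    # With a working MPI setup, each task prints its own rank and
    # world.size should equal --ntasks-per-node.
    print(f"rank {world.rank} of {world.size}")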
I submit a slurm script with these options:
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=16
    ...
    srun python ase_script.py
    ...

It looks like the single-CPU run is always the fastest, and increasing --ntasks-per-node (from 1 to 16) only makes the calculation slower.
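To make sure the slowdown is not simply srun failing to start the tasks, I can check the rank count directly with the same launch line (assuming mpi4py is the communicator that actually gets imported):

    srun python -c "from mpi4py import MPI; c = MPI.COMM_WORLD; print(c.Get_rank(), 'of', c.Get_size())"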
The only difference I see is that the log output contains multiple lines per step (one per task, i.e. ntasks-per-node of them), each with slightly different results. E.g., for a Velocity Verlet calculation with 8 CPUs per node, I get:
    Time[ps]      Etot[eV]     Epot[eV]     Ekin[eV]    T[K]
    ...
    0.0440       -8297.460    -8299.281       1.821     79.6
    0.0440       -8297.476    -8299.305       1.829     80.0
    0.0440       -8297.477    -8299.304       1.827     79.8
    0.0440       -8297.741    -8299.557       1.816     79.4
    0.0440       -8297.744    -8299.563       1.819     79.5
    0.0440       -8297.744    -8299.570       1.826     79.8
    0.0440       -8297.746    -8299.584       1.838     80.3
    0.0440       -8297.749    -8299.612       1.863     81.4
    ...

I have also tried decorating my dynamics function with @parallel_function (both before the function definition and before the run() call), but I get an error.
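For context, here is a minimal sketch of the kind of script I am submitting (the EMT calculator and the copper cell are placeholders for my real system; logfile='-' is what produces the Time/Etot/Epot/Ekin/T columns above, and the @parallel_function placement shown is where I tried the decorator):

    from ase.build import bulk
    from ase.calculators.emt import EMT
    from ase.md.verlet import VelocityVerlet
    from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
    from ase.parallel import parallel_function
    from ase import units

    # Placeholder system; my real script uses a different structure and calculator.
    atoms = bulk('Cu', cubic=True) * (4, 4, 4)
    atoms.calc = EMT()
    MaxwellBoltzmannDistribution(atoms, temperature_K=80)

    @parallel_function  # this is where I applied the decorator in my real script
    def run_md(atoms):
        # logfile='-' writes the Time[ps]/Etot/Epot/Ekin/T table to stdout
        dyn = VelocityVerlet(atoms, timestep=2 * units.fs, logfile='-', loginterval=10)
        dyn.run(500)

    run_md(atoms)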
I have installed all the required libraries in my conda environment:
    $ conda list > conda_list
    $ cat conda_list | grep -e asap -e gpaw -e mpi
    asap3        3.12.12     py310hb818612_2           conda-forge
    gpaw         23.9.1      py310_mpi_openmpi_omp_0   conda-forge
    gpaw-data    0.9.20000   hd8ed1ab_2                conda-forge
    mpi          1.0         openmpi                   conda-forge
    mpi4py       3.1.4       py310h6075a6b_0           conda-forge
    openmpi      4.1.5       h414af15_101              conda-forge

Is there something else I can try?
A small side question: how can I also print the cell volume in the ASE log output above?
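What I have in mind is something like an extra observer attached next to the default logger (atoms.get_volume() is the standard ASE call; the attach pattern below is just my guess at how to wire it into the sketch above):

    # dyn and atoms as in the sketch above; print_volume is a hypothetical helper
    def print_volume(a=atoms):
        print(f"V = {a.get_volume():.3f} Angstrom^3")

    dyn.attach(print_volume, interval=10)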