I realize this might be hard to answer without knowing how my cluster is set up, but I am trying to submit jobs (via SGE) to a cluster, and the environment is not set up correctly, so the jobs fail. Moreover, there are two different master nodes I can log in to in order to submit jobs to the same cluster, and my scripts work on one but not on the other.
This is the machine info for the master node that my script does work on:
cat /proc/version
Linux version 2.6.32-279.el6.x86_64 ([email protected]) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Jun 13 18:24:36 EDT 2012

The machine it does not work on:
cat /proc/version
Linux version 3.10.0-514.6.2.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Thu Feb 23 03:04:39 UTC 2017

Here is a test script I am using:
#!/bin/bash -I
#$ -wd ~
#$ -N test
#$ -o ~/test.log
#$ -j y
#$ -terse
#$ -V
#$ -notify
#$ -l h_vmem=2G -pe smp 1 -l athena=true

ls
hostname
nproc

Here is the output after running "qsub test.sh":
/bin/bash: module: line 1: syntax error: unexpected end of file
/bin/bash: error importing function definition for `BASH_FUNC_module'
/opt/sge/default/spool/execd/node156/job_scripts/1063646: line 11: ls: command not found
/opt/sge/default/spool/execd/node156/job_scripts/1063646: line 12: hostname: command not found

To add to the confusion, when I ssh directly into those job nodes (node156 in the above example) I can run the ls and hostname commands just fine!
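If it would help with diagnosis, this is roughly the kind of snippet I could drop into the job script to dump the environment the batch job actually sees, so it can be compared with an interactive ssh session (a sketch only; the directives mirror my test script above, and the envtest.log / envtest.env names are just placeholders):

#!/bin/bash
#$ -N envtest
#$ -o ~/envtest.log
#$ -j y
#$ -V

# Show what the batch job actually sees, for comparison with `ssh node156`
echo "PATH=$PATH"
echo "SHELL=$SHELL"
type module                 # is the exported module function usable here?
env | sort > ~/envtest.env  # full environment dump to diff against ssh

Running the same commands in an interactive shell on node156 and diffing the two env dumps should show exactly which variables (PATH in particular) differ between the batch job and a login shell.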
I've been in contact with the cluster admins, and they are unable to replicate my issue (even when they log in as me). We first tested whether resetting ~/.bashrc and ~/.bash_profile to their default contents would fix it, but it did not. Here are those files:
cat ~/.bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

.bash_profile:
cat ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs
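Since the two master nodes behave differently, I can also compare what each one exports before qsub ever runs; something along these lines on both submit hosts (again a sketch, only checking the pieces that show up in the error above):

# Run on each master node before submitting:
echo "$PATH"                          # baseline PATH that -V will export
env | grep -A3 'BASH_FUNC_module'     # how the module function is exported
declare -f module | head -n 5         # the function body the shell actually has

Any suggestions?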