I'm trying to submit a job to a school server (HPC) with:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -o ./out_$JOB_ID.txt
#$ -e ./err_$JOB_ID.txt
#$ -notify
#$ -pe orte 1
date
pwd
##################################
RESULT_DIR=~/Results
SCRIPT_FILE=sample_job
##################################
. /etc/profile
. /etc/bashrc
module load packages/comsol/4.4
module load packages/matlab/r2012b
comsol server matlab "sample_job, exit" -nodesktop -mlnosplash
/bin/uname -a
mkdir $RESULT_DIR/$name
cp *.csv $RESULT_DIR/$name

The job aborts saying:

Sun Jun 8 14:20:21 EDT 2014
COMSOL 4.4 (Build: 150) started listening on port 2036
Use the console command 'close' to exit the program
/usr/bin/xterm Xt error: Can't open display:
/usr/bin/xterm: DISPLAY is not set
Program_did_not_exit_normally
Exception:
    com.comsol.util.exceptions.FlException: Program did not exit normally
Messages:
    Program did not exit normally
Stack trace:
    at com.comsol.mli.application.a.a(Unknown Source)
    at com.comsol.mli.application.MatlabApplication.doStart(Unknown Source)
    at com.comsol.util.application.ComsolApplication.doStart(Unknown Source)
    at com.comsol.util.application.ComsolApplication.doRun(Unknown Source)
    at com.comsol.bridge.Bridge$2.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
ERROR: Could not start COMSOL Application.
See log file: /home/.comsol/v44/logs/server2.log
java.lang.IllegalStateException: Shutdown in progress
    at java.lang.ApplicationShutdownHooks.add(Unknown Source)
    at java.lang.Runtime.addShutdownHook(Unknown Source)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:699)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:322)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:451)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at com.comsol.util.application.ServerApplication.a(Unknown Source)
    at com.comsol.util.application.ServerApplication.a(Unknown Source)
    at com.comsol.util.application.ServerApplication.a(Unknown Source)
    at com.comsol.util.application.ServerApplication.main(Unknown Source)

What might be the reason and how should I fix it?

1 Answer

I'm assuming that you're using GridEngine as the clustering software when you submit this script to run. Something like this:

$ qsub myscript.sh 

You can use qsub's -v option to pass environment variables into the shells that get spawned on the HPC cluster nodes, like so:

$ qsub -v DISPLAY=$(hostname):0.0 myscript.sh 

This should "inject" the hostname of the system that you're submitting from as the system where any GUIs should be remotely displayed.
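One way to confirm that the variable actually reached the batch shell is to check it near the top of the job script, before anything that needs X is launched. A minimal sketch, assuming a bash job script; the function name check_display is made up for illustration:

```shell
#!/bin/bash
# Report whether DISPLAY reached this shell.
# Returns non-zero (and writes to stderr) if it is unset or empty.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set" >&2
        return 1
    fi
    echo "DISPLAY is ${DISPLAY}"
}

# In a job script this would run before the first GUI-dependent command.
check_display || echo "continuing without an X display"
```

The message ends up in the job's -o or -e file, so you can tell whether the -v option took effect without waiting for the application to fail.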

You may also need to allow your local system to "receive" the remotely displayed window. The easiest and least secure way to do this is like so:

$ xhost + 

If this works and you're concerned about making this "more secure", you can be more explicit with xhost by naming the specific hosts to allow instead of using a bare +, but it's likely not necessary. Let us know how you make out and we can adjust this further, if needed.
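If you do want to tighten this up, the idea would be to turn access control back on and then allow only the nodes your job can land on. The sketch below only prints the commands for review rather than running them; xhost_commands is a hypothetical helper and the node names are placeholders, not real hosts:

```shell
#!/bin/bash
# Print the xhost commands that would restrict X access to the named hosts.
# Nothing is executed here; the output is meant to be reviewed first.
xhost_commands() {
    local host
    echo "xhost -"                # turn access control back on (undoes "xhost +")
    for host in "$@"; do
        echo "xhost +${host}"     # allow connections from this host only
    done
}

# Placeholder node names; substitute the nodes of your own cluster.
xhost_commands node01.example.edu node02.example.edu
```

Once the list looks right, the printed lines can be run by hand on the machine whose X server should receive the windows.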

What if the above doesn't work?

Some versions of qsub (Torque's, for example) include a switch, -X, which is purported to pass the $DISPLAY environment variable along correctly, like so:

$ qsub -X myscript.sh 

You could also try using the submitting host's IP address instead of the hostname. It may be the case that the HPC nodes do not have DNS set up properly.

$ qsub -v DISPLAY="$(hostname -i):0.0" myscript.sh 
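To try the hostname and IP variants side by side, it can help to build the DISPLAY string in one place. A small sketch: display_value is a hypothetical helper, login01 and 192.0.2.10 are placeholder values, and the 0.0 default assumes your X server is display 0, as in the commands above:

```shell
#!/bin/bash
# Build a DISPLAY value such as "login01:0.0".
# $1 = host name or IP address, $2 = display number (defaults to 0.0).
display_value() {
    local host="$1"
    local display="${2:-0.0}"
    printf '%s:%s\n' "$host" "$display"
}

# Dry run with placeholder values: print the commands instead of submitting.
echo "qsub -v DISPLAY=$(display_value login01) myscript.sh"
echo "qsub -v DISPLAY=$(display_value 192.0.2.10) myscript.sh"
```

On the submit host you would pass $(hostname) or $(hostname -i) as the first argument in place of the placeholders.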

  • Hi, so it comes from the no-display feature of the HPC node, right? I did xhost + and qsub -v DISPLAY=$(hostname):0.0 run.sh just now, but the error persists. Commented Jun 8, 2014 at 20:35
  • @FarticlePilter The HPC node should be able to remote display GUIs, so you'll have to work this out if you really want to get a GUI from it. Can you use qsh? This should return an xterm GUI. Commented Jun 8, 2014 at 20:39
  • Yes, I can do qsh to get the GUI. Sorry for misleading you. I do not really need the GUI. It can be completely suppressed! The thing is, I just want to get the software started, and because of the GUI issue it cannot start. Commented Jun 8, 2014 at 20:42
  • @FarticlePilter - NP, I just wanted to confirm that the GUI could be displayed from the HPC node. So getting the display set will resolve your issue. When you run your command, try qsub -V myscript.sh. This will pass ALL the environment variables to the batch job's shell. You can also give the commands like this: echo env | qsub -V. The resulting -e and -o files should contain the env. vars. that are available. Commented Jun 8, 2014 at 21:35
  • @FarticlePilter - you might want to try the -X switch to qsub too. See here: hpc.uark.edu/hpc/support/interactive.html. Commented Jun 8, 2014 at 21:46
