
We ran into the following problem at our company. We have multiple Red Hat Enterprise Linux servers running "SAP HANA S/4". We created a systemd service to start and stop the daemon automatically, so that nobody has to intervene manually on reboot or shutdown.

Autostart works well, but stopping the daemon correctly on shutdown does not. The daemons run as a different user (an individual one for each server). It seems that systemd starts killing the user sessions before the actual service is stopped; as a result, the service can't stop properly.

Service

[Unit]
Description=saphana
After=remote-fs.target user.slice sapinit.service multi-user.target
Requires=user.slice

[Service]
KillMode=none
Type=oneshot
ExecStart=/hana/source/scripts/sapHanaControl.pl start
ExecStop=/hana/source/scripts/sapHanaControl.pl stop
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

The script which is called in ExecStart and ExecStop basically executes the following commands.

On Start:

sudo -u $username csh -c "sapcontrol -nr $instance -function Start"

On Stop:

sudo -u $username csh -c "sapcontrol -nr $instance -function Stop"
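The Perl script itself is not shown in the question. As a rough illustration of the dispatch it describes, here is a minimal shell sketch; the function name sap_control_cmd is hypothetical, and only the sudo/sapcontrol command line is taken from the question. It prints the command instead of running it, so it can be tried safely.

```shell
#!/bin/sh
# Hypothetical stand-in for sapHanaControl.pl (not the author's script).
# Prints the sapcontrol command for a given OS user, instance number and action.
sap_control_cmd() {
    user=$1
    instance=$2
    action=$3
    case "$action" in
        start) fn=Start ;;
        stop)  fn=Stop ;;
        *)
            echo "usage: sap_control_cmd USER INSTANCE start|stop" >&2
            return 2
            ;;
    esac
    printf 'sudo -u %s csh -c "sapcontrol -nr %s -function %s"\n' \
        "$user" "$instance" "$fn"
}
```

For example, `sap_control_cmd userxy 00 stop` prints the same command line that appears in the shutdown log further down.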

Shutdown log

Output of the Systemd Log shows the following:

Jun 20 16:23:05 host123 systemd[1]: Stopping Session c4 of user **userxy**.
Jun 20 16:23:05 host123 sapHanaControl.pl[15003]: sudo -u **userxy** csh -c "sapcontrol -nr 00 -function Stop"
Jun 20 16:23:05 host123 sapHanaControl.pl[15003]: 20.06.2018 16:23:05
Jun 20 16:23:05 host123 sapHanaControl.pl[15003]: Stop
Jun 20 16:23:05 host123 sapHanaControl.pl[15003]: FAIL: NIECONN_REFUSED (Connection refused), NiRawConnect failed in plugin_fopen()

Update

I see the following processes running when the system is running normally:

[root@wsstadt325 ~]# ps -ef | grep sapstartsrv
d61adm 1740 1 0 11:56 ? 00:00:01 /usr/sap/D61/HDB05/exe/sapstartsrv pf=/usr/sap/D61/SYS/profile/D61_HDB05_wsstadt325 -D -u d61adm
sapadm 1741 1 0 11:56 ? 00:00:04 /usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D
d21adm 1946 1 0 11:56 ? 00:00:02 /usr/sap/D21/ASCS01/exe/sapstartsrv pf=/usr/sap/D21/SYS/profile/D21_ASCS01_wsstadt325 -D -u d21adm
d21adm 2182 1 0 11:56 ? 00:00:02 /usr/sap/D21/D00/exe/sapstartsrv pf=/usr/sap/D21/SYS/profile/D21_D00_wsstadt325 -D -u d21adm

I changed my script to log the output of "ps -ef | grep sapstartsrv" when the system is rebooted or powered off:

ps -ef | grep sapstartsrv
sapadm 1683 1 0 13:52 ? 00:00:01 /usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D
root 5706 5522 0 14:00 ? 00:00:00 sh -c ps -ef | grep sapstartsrv
root 5708 5706 0 14:00 ? 00:00:00 grep sapstartsrv

The sapstartsrv service is started by a default SAP service (sapinit), which is started before my own systemd service, so on a reboot the order is reversed: my systemd service is stopped first, then the sapinit service. The problem seems to be that systemd starts killing the user sessions (in my case for the users d21adm and d61adm) in which the sapstartsrv processes are running before my actual systemd service is stopped. (I hope that makes at least some sense.)
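One way to check this suspicion is to look at the cgroup of each sapstartsrv process: a process that belongs to a user session sits under user.slice, while system services are normally placed under system.slice. A minimal shell sketch, assuming /proc/<pid>/cgroup is readable and pgrep is available; the function names are made up for illustration.

```shell
#!/bin/sh
# Succeeds if the cgroup file ($1, e.g. /proc/1946/cgroup) places the
# process under user.slice.
in_user_slice() {
    grep -q ':/user\.slice/' "$1"
}

# Print one line per running sapstartsrv process, saying whether it
# lives under user.slice.
report_sapstartsrv_slices() {
    for pid in $(pgrep -x sapstartsrv); do
        if in_user_slice "/proc/$pid/cgroup"; then
            echo "$pid: in user.slice"
        else
            echo "$pid: not in user.slice"
        fi
    done
}
```

Running `report_sapstartsrv_slices` as root while the system is up shows which of the processes from the ps listing would go away together with the user sessions.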

Here's an image of the whole systemd chain (my services are at the very end). The services involved:
- sapinit.service (the default one)
- saphana.service (my custom one)

Image Systemd Chain

  • What if you try to stop sapcontrol manually? Does the error occur as well? Commented Jun 22, 2018 at 9:42
  • I've found an article, which indicates that NiRawConnect error can occur if sapstartsrv service isn't running while stopping sapcontrol. Check it out here: blogs.sap.com/2015/09/07/… Commented Jun 22, 2018 at 9:58
  • Manually stopping sapcontrol works well. The error only occurs on a reboot or shutdown of the system. As you found out, the problem seems to be with the sapstartsrv service, which in my case is running under a different user. (I will edit my question with the new information.) @sys463 Commented Jun 22, 2018 at 11:54
  • Under which user are you running the systemd unit? Commented Jun 22, 2018 at 14:43
  • Both systemd units are running as root. Commented Jun 24, 2018 at 8:59

1 Answer


I figured out the cause of my problem; it is described in the following KB article: https://www.suse.com/de-de/support/kb/doc/?id=7022671

Systemd kills every user.slice after 90 seconds (this timeout can't be changed). It looks like systemd just isn't made to stop SAP HANA instances automatically without modifying pam.d. The solution described there seems a bit "hackish", but it works.

cp /etc/pam.d/system-auth /etc/pam.d/custom-su-session
vim /etc/pam.d/custom-su-session

Insert the following line before "session optional pam_systemd.so"

session [success=1 new_authtok_reqd=ok default=ignore] pam_listfile.so item=user sense=allow file=/etc/custom-su-session

This line skips the user.slice creation when the su command is executed and the user is listed in the file /etc/custom-su-session.

vim /etc/pam.d/su 

Replace "session include system-auth" with "session include custom-su-session".
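The two manual edits can also be sketched as shell functions. This is only an illustration under assumptions: the function names are hypothetical, both functions print the modified file to stdout rather than touching /etc/pam.d, and the patterns assume the usual layout of these files (a "session ... pam_systemd.so" line, optionally prefixed with "-", and a "session include system-auth" line with arbitrary spacing). The file /etc/custom-su-session is read by pam_listfile as a plain list with one user name per line. Note that /etc/pam.d/su only affects su, while sudo uses its own PAM file.

```shell
#!/bin/sh
# Line taken verbatim from the answer.
PAM_LISTFILE_LINE='session [success=1 new_authtok_reqd=ok default=ignore] pam_listfile.so item=user sense=allow file=/etc/custom-su-session'

# Print file $1 with PAM_LISTFILE_LINE inserted before the first
# session line that loads pam_systemd.so.
insert_listfile_line() {
    awk -v new="$PAM_LISTFILE_LINE" '
        !done && /^-?session[ \t].*pam_systemd\.so/ { print new; done = 1 }
        { print }
    ' "$1"
}

# Print file $1 with "session include system-auth" switched to
# "session include custom-su-session"; other lines are left alone.
switch_su_session_include() {
    sed -E 's/^(session[[:space:]]+include[[:space:]]+)system-auth[[:space:]]*$/\1custom-su-session/' "$1"
}
```

A cautious way to use these is to write the output to a temporary file, compare it with the original using diff, and only then copy it into place, keeping a root shell open in case the PAM change locks out new logins.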
