Skip to content

Add memory watchdog#33

Merged
EvilOlaf merged 6 commits intomainfrom
watchdog-1
Jan 5, 2026
Merged

Add memory watchdog#33
EvilOlaf merged 6 commits intomainfrom
watchdog-1

Conversation

@EvilOlaf
Copy link
Owner

@EvilOlaf EvilOlaf commented Jan 5, 2026

SCUM server is known to be poorly designed and suffering from memory leaks. That's the reason why most public servers are restarted frequently to do sort of garbage collection.

This watchdog will check if the system memory runs critically low and (attempts to) perform a grateful shutdown in order to prevent data loss due to out-of-memory kill.

untested

Summary by CodeRabbit

  • New Features
    • Added automatic memory monitoring with configurable threshold and check interval; when the configured threshold is exceeded the server initiates a graceful shutdown to reduce memory-related incidents.
    • Exposed new runtime configuration options for memory threshold and check interval so behavior can be adjusted without code changes.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

Warning

Rate limit exceeded

@EvilOlaf has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7dd04ad and 91999ac.

📒 Files selected for processing (1)
  • start-server.sh

Walkthrough

Adds a memory watchdog: two environment variables in docker-compose.yml (MEMORY_THRESHOLD_PERCENT, MEMORY_CHECK_INTERVAL) and a background monitoring loop in start-server.sh that checks system memory periodically and sends SIGINT to the SCUM process when the threshold is exceeded.

Changes

Cohort / File(s) Change Summary
Memory watchdog config
docker-compose.yml
Added MEMORY_THRESHOLD_PERCENT=95 and MEMORY_CHECK_INTERVAL=60 to the scum-server environment block.
Memory watchdog runtime
start-server.sh
Added MEMORY_THRESHOLD and MEMORY_CHECK_INTERVAL parsing/validation and a background loop that checks memory usage and sends SIGINT to SCUM_PID when threshold is exceeded; supports disabling via threshold ≤ 0 and logs warnings for invalid values.

Sequence Diagram

sequenceDiagram participant Start as start-server.sh participant Watchdog as Memory Watchdog Loop participant System as System Memory (free) participant SCUM as SCUM_PID Process participant Kernel as Kernel (signals) Start->>SCUM: launch SCUM server (record SCUM_PID) Start->>Watchdog: start watchdog (threshold, interval) [background] loop every MEMORY_CHECK_INTERVAL Watchdog->>System: query memory usage System-->>Watchdog: memory percent alt usage >= threshold Watchdog->>Watchdog: log threshold exceeded Watchdog->>Kernel: send SIGINT to SCUM_PID Kernel-->>SCUM: deliver SIGINT SCUM->>SCUM: perform graceful shutdown SCUM-->>Watchdog: process exits Watchdog->>Watchdog: stop loop else usage < threshold Note over Watchdog: sleep until next check end end 
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • add proper shutdown handling #26: Implements graceful shutdown handling and stop_grace_period configuration in docker-compose—both PRs modify start-server.sh shutdown behavior and docker-compose shutdown-related settings.

Poem

🐰 I nibble logs and count the heap,
A careful hop when bytes run deep,
If memory climbs above the line,
I tap SIGINT, a gentle sign,
And watch the server sleep, all fine.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add memory watchdog' directly and clearly summarizes the main change—introducing a memory monitoring component to the codebase. It is concise, specific, and accurately reflects the primary focus of the changeset across both modified files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Fix all issues with AI Agents 🤖
In @start-server.sh: - Around line 95-96: The env vars MEMORY_THRESHOLD (from MEMORY_THRESHOLD_PERCENT) and CHECK_INTERVAL (from MEMORY_CHECK_INTERVAL) are not validated and non-numeric values will break the numeric comparisons and sleep; add validation after their assignment: check each with a regex like [[ "$VAR" =~ ^[0-9]+$ ]] and ensure the value is greater than zero, log a warning and fall back to the safe defaults (e.g., 95 for MEMORY_THRESHOLD and 60 for CHECK_INTERVAL) or exit with an error if you prefer strictness; update references to MEMORY_THRESHOLD and CHECK_INTERVAL so subsequent uses (the -gt comparison and sleep) always receive valid positive integers. - Line 102: The MEM_USAGE assignment currently computes used/total (using $3) which counts cache/buffers; change it to compute available/total by using the "available" column from free (the field for "available" rather than the used field) so MEM_USAGE reflects reclaimable memory; update the awk/field expression in the MEM_USAGE assignment (the line that sets MEM_USAGE) to divide available by total instead of used by total and keep the same formatting/rounding. 
🧹 Nitpick comments (3)
docker-compose.yml (1)

11-12: Consider lowering the memory threshold for earlier intervention.

A 95% memory threshold leaves minimal safety margin; the system may already be under pressure or triggering OOM conditions by that point. For more reliable prevention of OOM kills and data loss, consider lowering the default to 85-90%.

🔎 Suggested adjustment
- - MEMORY_THRESHOLD_PERCENT=95 + - MEMORY_THRESHOLD_PERCENT=85 - MEMORY_CHECK_INTERVAL=60
start-server.sh (2)

100-110: Consider adding watchdog cleanup when the server process dies.

The watchdog continues running even if SCUMServer.exe exits naturally or crashes. While the overhead is minimal, it's cleaner to terminate the watchdog when its target process is no longer alive.

🔎 Add process liveness check
 ( while true; do + # Exit watchdog if target process is gone + if ! kill -0 "$SCUM_PID" 2>/dev/null; then + echo "Memory watchdog: target process no longer running, exiting" + break + fi +  MEM_USAGE=$(free | awk '/Mem/{printf("%.0f"), $3/$2*100}') if [ "$MEM_USAGE" -ge "$MEMORY_THRESHOLD" ]; then echo "Memory watchdog triggered at ${MEM_USAGE}% memory usage! Attempt graceful shutdown to prevent data loss..."

103-107: Remove unnecessary verification request; however, ensure watchdog waits for graceful shutdown like the shutdown() function does.

The memory watchdog sends SIGINT to the server, which is the same signal used by the shutdown() function that already handles graceful shutdown. However, the watchdog breaks immediately after sending the signal without waiting for the process to actually terminate, unlike shutdown() which waits up to 60 seconds using kill -0 checks. Consider adding a similar wait loop to the watchdog to ensure the server persists data before the script continues.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ee6481 and 1d4c0f6.

📒 Files selected for processing (2)
  • docker-compose.yml
  • start-server.sh
Repository owner deleted a comment from coderabbitai bot Jan 5, 2026
Repository owner deleted a comment from coderabbitai bot Jan 5, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @start-server.sh: - Around line 111-116: MEM_USAGE assignment can be empty/non-numeric if free fails; before comparing to MEMORY_THRESHOLD in the watchdog loop validate that MEM_USAGE is a non-empty numeric value (e.g., test with a regex or shell arithmetic check) and if invalid set a safe fallback value or log an error and continue the loop rather than performing the numeric comparison. If validation fails, write a clear error/warning (including the raw free output if helpful), skip the kill/exit branch for that iteration, and ensure SCUM_PID handling still uses the existing kill -INT ... || true behavior so the watchdog doesn't crash. Use the existing MEM_USAGE, MEMORY_THRESHOLD and SCUM_PID variables to locate where to add the check and logging. 
♻️ Duplicate comments (2)
start-server.sh (2)

98-101: Missing upper bound validation for MEMORY_THRESHOLD.

The validation checks if MEMORY_THRESHOLD is numeric but doesn't verify it's within a valid percentage range (0-100). Invalid values like 150 or 200 would pass validation and cause incorrect behavior.

🔎 Proposed validation enhancement
 if ! [[ "$MEMORY_THRESHOLD" =~ ^[0-9]+$ ]]; then echo "WARNING: Invalid MEMORY_THRESHOLD_PERCENT value '$MEMORY_THRESHOLD_PERCENT'. Forcing default: 95" MEMORY_THRESHOLD=95 +elif [ "$MEMORY_THRESHOLD" -gt 100 ]; then + echo "WARNING: MEMORY_THRESHOLD_PERCENT must be between 0-100 (got: $MEMORY_THRESHOLD). Forcing default: 95" + MEMORY_THRESHOLD=95 fi

111-112: CRITICAL: Memory calculation logic is inverted.

The current implementation calculates available memory percentage ($7/$2*100) but then checks if it's <= MEMORY_THRESHOLD (95). This means the watchdog triggers when the system has 95% or less memory available—in other words, when memory usage is only 5% or more. This is backwards and will cause immediate false-positive shutdowns when the system has plenty of free memory.

Expected behavior: MEMORY_THRESHOLD=95 should trigger when memory usage (not available) reaches 95%.

Current behavior: Triggers when available memory drops to 95% or below (= usage exceeds 5%).

🔎 Proposed fix

Option 1 (Recommended): Calculate used percentage instead of available percentage:

 while true; do - MEM_USAGE=$(free | awk '/Mem/{printf("%.0f"), $7/$2*100}') - if [ "$MEM_USAGE" -le "$MEMORY_THRESHOLD" ]; then + # Calculate percentage of memory used (inverse of available) + MEM_USAGE=$(free | awk '/Mem/{printf("%.0f"), 100-($7/$2*100)}') + if [ "$MEM_USAGE" -ge "$MEMORY_THRESHOLD" ]; then echo "Memory watchdog triggered at ${MEM_USAGE}% memory usage! Attempt graceful shutdown to prevent data loss..."

Option 2: Keep available percentage but fix the comparison:

 while true; do - MEM_USAGE=$(free | awk '/Mem/{printf("%.0f"), $7/$2*100}') - if [ "$MEM_USAGE" -le "$MEMORY_THRESHOLD" ]; then - echo "Memory watchdog triggered at ${MEM_USAGE}% memory usage! Attempt graceful shutdown to prevent data loss..." + # Calculate percentage of available memory + MEM_AVAIL=$(free | awk '/Mem/{printf("%.0f"), $7/$2*100}') + if [ "$MEM_AVAIL" -le "$((100 - MEMORY_THRESHOLD))" ]; then + echo "Memory watchdog triggered: only ${MEM_AVAIL}% memory available! Attempt graceful shutdown to prevent data loss..."
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d4c0f6 and 7dd04ad.

📒 Files selected for processing (1)
  • start-server.sh
@EvilOlaf EvilOlaf merged commit 05acb0d into main Jan 5, 2026
4 checks passed
@EvilOlaf EvilOlaf deleted the watchdog-1 branch January 5, 2026 13:37
@coderabbitai coderabbitai bot mentioned this pull request Jan 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant