Issue Description
When using healthchecks in Podman 5.x, we’ve observed that the internal health log grows continuously (into the thousands of entries) and is never pruned of older records. In our tests, the State.Health.Log field in the container’s inspect output eventually contains over 12,000 records and keeps growing over time. This contrasts with Podman 4.x, which typically keeps only ~5 log entries. Furthermore, running top on the host shows unusually high memory usage by the /usr/bin/podman healthcheck process over time. These symptoms suggest a memory leak tied to Podman’s healthcheck mechanism in version 5.x.
Steps to reproduce the issue
- Healthcheck Configuration: Use a healthcheck configuration identical to the one that worked in Podman 4.x. For example:
  "Healthcheck": { "Test": [ "CMD", "curl", "-f", "http://agent:8080/health" ], "Interval": 30000000000, "Timeout": 10000000000, "Retries": 5 }
- Run the Container: Start a container with this configuration on Podman 5.x.
- Monitor Health Log: After the container has been running for a while, run podman inspect and check the State.Health.Log field. In Podman 5.x it accumulates records continuously (e.g., over 12,000 entries) rather than being capped; Podman 4.x shows only about 5 entries.
- Observe Memory Usage: Use monitoring tools (e.g., top) to observe memory usage. There is a significant and continuous increase in memory consumption, particularly in kernel memory (kmalloc-2k and kmalloc-4k slabs). The high usage by the healthcheck process in top appears only intermittently; we are running 8 containers.
Describe the results you received
When using healthchecks in Podman 5.x, we’ve observed that the internal health log grows continuously instead of being capped at a few entries (as seen in Podman 4.x). In our tests, the State.Health.Log field in the container’s inspect output eventually contains over 12,000 records, compared with the ~5 entries seen in version 4.x. This uncontrolled log growth correlates with a continuous increase in memory usage.
Describe the results you expected
Memory usage should not continuously increase, and the health log should be capped at a limited number of entries, as in Podman 4.x.
podman info output
host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: b3f4044f63d830049366c05304a1d5d558571e85'
  cpuUtilization:
    idlePercent: 76.81
    systemPercent: 6.73
    userPercent: 16.46
  cpus: 2
  databaseBackend: sqlite
  distribution:
    distribution: ol
    variant: server
    version: "9.5"
  eventLogger: file
  freeLocks: 2026
  hostname: k-jambunatha-tf64-ecp-edge-multi-int-openstack-perf-1771036--ed
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 2001
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 2002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.0-304.171.4.1.el9uek.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 809750528
  memTotal: 3803951104
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-1.el9_5.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.el9.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.16.1-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/user/2002/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240806.gee36266-2.el9.x86_64
    version: |
      pasta 0^20240806.gee36266-2.el9.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/2002/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.el9.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 2469085184
  swapTotal: 4194299904
  uptime: 312h 40m 36.00s (Approximately 13.00 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - container-registry.oracle.com
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 10
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 40961572864
  graphRootUsed: 2026479616
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 10
  runRoot: /run/user/2002/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1735903242
  BuiltTime: Fri Jan 3 06:20:42 2025
  GitCommit: ""
  GoVersion: go1.22.9 (Red Hat 1.22.9-2.el9_5)
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2

Podman in a container
No
Privileged Or Rootless
None
Upstream Latest Release
No
Additional environment details
podman --version
podman version 5.2.2
Additional information