@chrischdi
Member

What this PR does / why we need it:

High-level description:

  • KCP: block scale-up/scale-down/rollout/remediation if we know that a version upgrade from the topology is pending
  • Topology: no longer block propagating changes while the ControlPlane is scaling

Note: This change only affects topology based Clusters.

More detailed description:

Changes to KCP:

  • Adds a preflight check to KCP for topology-based clusters that checks whether an upgrade is pending propagation (Cluster.spec.topology.version != KCP.spec.version)
  • From KCP's perspective, it should be better to roll forward with the new desired spec instead of first continuing with the old (probably broken) KCP spec
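
To illustrate the condition described above, here is a minimal sketch of such a preflight check. It is not the code from this PR: the types `Cluster`, `Topology`, and `KubeadmControlPlane` are trimmed-down hypothetical stand-ins for the real API types, `topologyUpgradePending` is an assumed name, and the plain string comparison simply mirrors the `!=` in the description (the actual implementation may compare parsed versions).

```go
package main

// Topology is a hypothetical stand-in for Cluster.spec.topology.
type Topology struct {
	Version string // Cluster.spec.topology.version
}

// Cluster is a hypothetical, trimmed-down Cluster object.
type Cluster struct {
	Topology *Topology // nil for Clusters that are not topology based
}

// KubeadmControlPlane is a hypothetical, trimmed-down KCP object.
type KubeadmControlPlane struct {
	Version string // KCP.spec.version
}

// topologyUpgradePending reports whether the topology asks for a version
// that has not been propagated down to KCP yet. While it returns true,
// KCP would hold off on scale-up, scale-down, rollout, and remediation.
func topologyUpgradePending(cluster *Cluster, kcp *KubeadmControlPlane) bool {
	// Only topology based Clusters are affected by the check.
	if cluster == nil || cluster.Topology == nil || kcp == nil {
		return false
	}
	// Assumption: a simple inequality, as written in the PR description.
	return cluster.Topology.Version != kcp.Version
}
```

With a Cluster at `v1.32.0` and a KCP still at `v1.31.0` the sketch reports a pending upgrade; for a Cluster without a topology it never blocks.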

Changes to the topology controller:

  • Relax computeControlPlaneVersion so that changes can be propagated down while a Control Plane is scaling.
    • Note: this check was only used for ControlPlane providers which implement replicas.
    • The ControlPlane provider itself should know better whether it should first finish scaling or roll out the new desired state instead.
    • If this check had been kept, the above preflight check could have led to deadlocks, e.g. when:
      • a CP machine was marked for remediation while the Cluster's .spec.topology.version was set to a newer one
      • a scale-up or scale-down was triggered and the Cluster's .spec.topology.version was updated immediately afterwards
      • pretty sure there are others too
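
The deadlock described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not the controllers' real logic: `controlPlaneState`, `propagateVersion`, `reconcileKCP`, and `converges` are hypothetical names, and the flag `holdWhileScaling` stands for the old behaviour of not propagating the version while the Control Plane is scaling.

```go
package main

// controlPlaneState is a hypothetical, simplified view of what the two
// controllers look at.
type controlPlaneState struct {
	TopologyVersion string // Cluster.spec.topology.version
	KCPVersion      string // KCP.spec.version
	Scaling         bool   // a scale-up, scale-down, or remediation is outstanding
}

// propagateVersion models the topology controller. With holdWhileScaling
// set (old behaviour) it leaves KCP untouched while scaling is in progress.
func propagateVersion(s controlPlaneState, holdWhileScaling bool) controlPlaneState {
	if holdWhileScaling && s.Scaling {
		return s
	}
	s.KCPVersion = s.TopologyVersion
	return s
}

// reconcileKCP models KCP with the new preflight check: while an upgrade
// is pending, the outstanding scaling operation is blocked.
func reconcileKCP(s controlPlaneState) controlPlaneState {
	if s.TopologyVersion != s.KCPVersion {
		return s // preflight check fails, nothing happens
	}
	s.Scaling = false // scaling (or remediation) can complete
	return s
}

// converges runs both controllers for a bounded number of rounds and
// reports whether the system reaches a settled state.
func converges(s controlPlaneState, holdWhileScaling bool) bool {
	for i := 0; i < 10; i++ {
		s = reconcileKCP(propagateVersion(s, holdWhileScaling))
		if !s.Scaling && s.KCPVersion == s.TopologyVersion {
			return true
		}
	}
	return false
}
```

Starting from a scaling Control Plane with a newer topology version, the model never settles when `holdWhileScaling` is true (each controller waits for the other) and settles in the first round when it is false.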

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

/area provider/control-plane-kubeadm

@k8s-ci-robot k8s-ci-robot added area/provider/control-plane-kubeadm Issues or PRs related to KCP cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 4, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 4, 2025
@chrischdi chrischdi added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 4, 2025
Member

@sbueringer sbueringer left a comment

Just some nits

Member

@fabriziopandini fabriziopandini left a comment

Overall sgtm, only one small suggestion on top of @sbueringer's comments

Member

@sbueringer sbueringer left a comment

Last nit

Member

@fabriziopandini fabriziopandini left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2cc20dafe39490dd8ec3481a4ec49b13ec6f4f71

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@sbueringer
Member

Thx!

/lgtm
/approve

/hold
Similar to the MS PR, will think about the big picture and then merge in a bit

/test pull-cluster-api-e2e-main

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2d5643c166c8a6e8167c4b91e31a772dcc980356

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 6, 2025
@chrischdi chrischdi force-pushed the pr-kcp-block-pending-upgrade branch from bbfe7ba to 8f81a31 Compare March 6, 2025 10:29
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@k8s-ci-robot k8s-ci-robot requested a review from sbueringer March 6, 2025 10:29
@sbueringer
Member

@chrischdi Did you intentionally revert the last commit?

@chrischdi
Member Author

/test pull-cluster-api-e2e-main

@sbueringer
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2d5643c166c8a6e8167c4b91e31a772dcc980356

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 7, 2025
@k8s-ci-robot k8s-ci-robot requested a review from sbueringer March 7, 2025 10:58
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 7, 2025
@k8s-ci-robot
Contributor

@chrischdi: The following test failed; say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-apidiff-main | 505b552 | link | false | /test pull-cluster-api-apidiff-main |

@sbueringer
Member

Thanks!

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 7, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 53b9c80552cb18a26e808fb1ca620ed3e11571b5

@k8s-ci-robot k8s-ci-robot merged commit 3a8728f into kubernetes-sigs:main Mar 7, 2025
17 of 18 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.10 milestone Mar 7, 2025
cprivitere pushed a commit to cprivitere/cluster-api that referenced this pull request Mar 18, 2025
…ubernetes-sigs#11927)
* kcp: add preflight check for pending version upgrade from topology
* topology: propagate changes to CP even when scaling to prevent deadlocks
* kcp: adjust predicates to reconcile the event of cluster version changes
* review fixes
* fixup
* review fix
* drop IsScaling for kcp in upgradetracker
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/provider/control-plane-kubeadm: Issues or PRs related to KCP
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
  • tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.

4 participants