Add per-browser CDP accessibility tree viewport collapse #4133

@magreenblatt

Problem

AI agents using Playwright's ariaSnapshot() or CDP Accessibility.getFullAXTree() receive the entire accessibility tree, including hundreds of off-screen nodes. This wastes serialization time and consumes AI context budget on nodes the agent can't see or interact with.

Solution

Add a CefBrowserSettings.ax_viewport_collapse setting (experimental) that collapses off-screen nodes in CDP accessibility tree serialization:

  • Off-screen landmarks/headings: serialized as summaries (role + name only, empty childIds)
  • Off-screen interactive/structural nodes: pruned from the response
  • In-viewport nodes: fully serialized with all children and properties
  • position:fixed/sticky descendants: correctly detected as in-viewport even when their parent is off-screen
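The four rules above can be sketched on a toy tree. This is an illustrative model only, not the CEF implementation: the `Node` type, the `in_viewport` flag, the `SUMMARY_ROLES` set, and the choice to emit an in-viewport descendant of a pruned parent as a standalone entry are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed role set: the proposal does not enumerate which roles are summarized.
SUMMARY_ROLES = {"banner", "navigation", "main", "contentinfo",
                 "complementary", "region", "search", "form", "heading"}


@dataclass
class Node:
    """Hypothetical stand-in for an accessibility node."""
    node_id: int
    role: str
    name: str
    in_viewport: bool  # assumed to be precomputed, including fixed/sticky cases
    properties: dict = field(default_factory=dict)
    children: list[Node] = field(default_factory=list)


def collapse(root: Node) -> list[dict]:
    """Flatten the tree into a CDP-like node list, collapsing off-screen nodes."""
    out: list[dict] = []

    def visit(node: Node) -> bool:
        """Emit `node` if it survives; return True when it was emitted."""
        if node.in_viewport:
            # In-viewport: full entry, childIds limited to surviving children.
            entry = {"nodeId": node.node_id, "role": node.role,
                     "name": node.name, "properties": node.properties,
                     "childIds": []}
            out.append(entry)
            for child in node.children:
                if visit(child):
                    entry["childIds"].append(child.node_id)
            return True

        emitted = node.role in SUMMARY_ROLES
        if emitted:
            # Off-screen landmark/heading: role + name only, empty childIds.
            out.append({"nodeId": node.node_id, "role": node.role,
                        "name": node.name, "childIds": []})
        # Keep walking so in-viewport descendants (e.g. position:fixed)
        # are still emitted even though this node is off-screen.
        for child in node.children:
            visit(child)
        return emitted

    visit(root)
    return out
```

In this model, an off-screen `generic` container holding a fixed-position button drops out of the output while the button itself is kept. How the real serializer attaches such a node to the tree is not stated in the proposal.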

The setting is per-browser and can be toggled at runtime via CefBrowserHost::SetAxViewportCollapse(). Platform screen readers (NVDA, JAWS, VoiceOver) are completely unaffected — they use a separate code path. CDP nodesUpdated events are suppressed when active to maintain tree consistency. queryAXTree is left unfiltered so agents can still locate off-screen elements for scroll targeting.
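Because queryAXTree stays unfiltered, an agent could use it to find an element that the collapsed tree pruned. A minimal sketch follows, assuming a hypothetical `send(method, params)` callable that forwards a CDP command and returns its result dict; the transport and the helper name are not part of the proposal. The CDP methods and field names used (`DOM.getDocument`, `Accessibility.queryAXTree`, `backendDOMNodeId`) are from the public protocol.

```python
from typing import Callable, Optional

# Hypothetical transport: sends one CDP command and returns its result dict.
Send = Callable[[str, dict], dict]


def find_backend_node_id(send: Send, role: str, name: str) -> Optional[int]:
    """Look up a node by role and accessible name via the unfiltered query."""
    # queryAXTree needs a DOM node as the subtree root; use the document.
    root = send("DOM.getDocument", {})["root"]
    result = send("Accessibility.queryAXTree", {
        "nodeId": root["nodeId"],
        "role": role,
        "accessibleName": name,
    })
    for node in result.get("nodes", []):
        backend_id = node.get("backendDOMNodeId")
        if backend_id is not None:
            return backend_id
    return None
```

The returned id can then be passed to DOM.scrollIntoViewIfNeeded as `backendNodeId`.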

Agentic workflow

With the setting enabled, an AI agent's typical interaction loop becomes:

  1. Get collapsed tree: getFullAXTree returns in-viewport nodes fully serialized, off-screen landmarks/headings as role-and-name summaries, and everything else pruned. The agent sees a compact representation of the page.
  2. Interact with visible content: buttons, inputs, and links in the viewport are fully described.
  3. Discover off-screen sections: collapsed landmark summaries (e.g. navigation: "Footer Nav") tell the agent what exists below the fold without serializing the full subtree.
  4. Scroll to a section: the agent passes the summary node's backendDOMNodeId to DOM.scrollIntoViewIfNeeded to bring it into view.
  5. Re-get tree: the scrolled-to section is now fully serialized; previously visible content that scrolled out is collapsed.
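The loop above might look roughly like this from the agent side. It is a sketch under assumptions: `send(method, params)` is a hypothetical callable that forwards a CDP command and returns its result dict, and the helper names are made up for illustration. The CDP methods and the `role`/`name`/`backendDOMNodeId` fields are from the public protocol.

```python
from typing import Callable, Optional

# Hypothetical transport: sends one CDP command and returns its result dict.
Send = Callable[[str, dict], dict]


def ax_value(node: dict, key: str) -> str:
    """Read the `value` of an AXValue field such as `role` or `name`."""
    return (node.get(key) or {}).get("value", "")


def find_node(nodes: list, role: str, name: str) -> Optional[dict]:
    """Return the first node matching role and name, or None."""
    for node in nodes:
        if ax_value(node, "role") == role and ax_value(node, "name") == name:
            return node
    return None


def scroll_to_section(send: Send, role: str, name: str) -> list:
    """Steps 1, 3, 4 and 5: fetch the tree, find a summary, scroll, refetch."""
    nodes = send("Accessibility.getFullAXTree", {})["nodes"]
    target = find_node(nodes, role, name)
    if target is None or "backendDOMNodeId" not in target:
        return nodes  # nothing to scroll to; keep the current snapshot
    send("DOM.scrollIntoViewIfNeeded",
         {"backendNodeId": target["backendDOMNodeId"]})
    # After scrolling, the section should come back fully serialized.
    return send("Accessibility.getFullAXTree", {})["nodes"]
```

Step 2 (interacting with visible content) is left out because it depends on the agent's own tooling.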

No changes are required in Playwright, MCP servers, or agent code — only the CEF embedder enables the setting.

Verification

  • 12 ceftests covering: default/disabled/enabled states, heading levels, childId filtering, position:fixed visibility, nested landmarks, runtime toggle, all-in-viewport no-op, queryAXTree passthrough, full agentic scroll workflow, and zoom-level coordinate correctness
  • Manual verification via cef/tools/debug/ax_viewport_collapse/verify_viewport_collapse.py
  • cefclient --ax-viewport-collapse enables the setting for manual testing

Metadata

Labels

agentic (Related to supporting AI workflows), enhancement (Enhancement request)
