Commonly I see Dataset.count throughout codebases in 3 scenarios:
- logging
log.info("this ds has ${dataset.count} rows")
- branching
if (dataset.count > 0) do x else do y
- force a cache
dataset.persist.count
Does calling count in any of these scenarios prevent the query optimizer from building the most efficient DAG, by forcing eager evaluation prematurely?
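
For reference, here is a minimal sketch of the three scenarios in a form that can be run locally, with explain calls added so the plans can be inspected before and after caching. It assumes a local SparkSession and a hypothetical dataset built with spark.range; the names CountScenarios and run are only illustrative and are not from any real codebase.

```scala
import org.apache.spark.sql.SparkSession

object CountScenarios {
  // Returns the row count so a caller can check the result.
  def run(): Long = {
    // Local session for illustration only (assumption: Spark is on the classpath).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("count-scenarios")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical dataset: the filter is a lazy transformation, no job runs yet.
    val dataset = spark.range(0, 1000).filter($"id" % 2 === 0)

    // Prints the parsed, analyzed, optimized and physical plans without running a job.
    dataset.explain(true)

    // Scenario 1: logging. count is an action, so this line runs a job.
    val rows = dataset.count()
    println(s"this ds has $rows rows")

    // Scenario 2: branching. This is a second count on the same uncached dataset.
    if (dataset.count() > 0) println("do x") else println("do y")

    // Scenario 3: force a cache. persist only marks the dataset for caching;
    // the count that follows is the action that materializes it.
    val cached = dataset.persist()
    cached.count()

    // Physical plan after caching, for comparison with the plan printed above.
    cached.explain()

    cached.unpersist()
    spark.stop()
    rows
  }

  def main(args: Array[String]): Unit = run()
}
```

Comparing the two explain outputs, together with the job list in the Spark UI, should show how many jobs each count triggers and whether the plan changes once the dataset is persisted.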