Suppose you have a linear regression model $Y = \beta_0 + \beta_1 T + \beta_2 X + \dotsb + \epsilon$, where $Y$ is the weight of livestock, $T \in \{0,1\}$ represents treatment assignment (e.g., group A versus group B, or placebo versus growth hormone), and $X$ is a control variable such as initial weight.
Objective of causal inference: obtaining an unbiased, efficient estimate of the marginal effect of a treatment on the outcome, typically measured by $\beta_1$, the coefficient of $T$ in a model for $Y$. In particular, we want evidence on whether $\beta_1 \neq 0$. This has little to do with $R^2$. Under a correct experimental design and model specification, it is possible to have a very large $\beta_1$ but a very small $R^2$, meaning that many other determinants of $Y$ remain unaccounted for in the error term. Conversely, it is possible to have a very small $\beta_1$ but a very large $R^2$, meaning that $Y$ has only a few determinants and they are mostly captured by $T$ and $X$. A small $R^2$ invalidates neither the experimental design nor the model specification, as long as $\beta_1$ is unbiased.
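A quick simulation illustrates the first scenario (a minimal numpy sketch; the effect size, noise level, and sample size are made-up assumptions, not from the discussion above): a true treatment effect of 5 is recovered accurately even though a noisy outcome keeps $R^2$ small.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: true treatment effect beta1 = 5, but a large error
# variance (many unmodeled determinants of Y) keeps R^2 small.
T = rng.integers(0, 2, n)            # random treatment assignment
X = rng.normal(50, 5, n)             # initial weight
eps = rng.normal(0, 40, n)           # everything else affecting Y
Y = 10 + 5 * T + 0.8 * X + eps

# OLS via least squares (design matrix: intercept, T, X)
Z = np.column_stack([np.ones(n), T, X])
beta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)

resid = Y - Z @ beta_hat
r2 = 1 - resid.var() / Y.var()

print(round(beta_hat[1], 2))   # close to the true effect of 5
print(round(r2, 3))            # small: most variation sits in eps
```

The treatment estimate is unbiased and clearly nonzero even though the model explains only a sliver of the outcome's variance.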
Benefits of adding more covariates or control variables: smaller standard errors and possible correction for confounding in observational studies. With more predictors, $R^2$ increases and the residual standard error usually decreases. Because every coefficient's standard error is proportional to the residual standard error, the standard error of the treatment coefficient shrinks with additional covariates. Holding the point estimate of the treatment coefficient constant, a smaller standard error corresponds to a larger $t$ or $z$ statistic and a smaller $p$ value. This means a causal effect could appear nonsignificant when important covariates are missing but significant once they are included. Adding a covariate is necessary if it affects both the treatment and the outcome. For example, $X$ is the initial weight, which of course influences the final weight $Y$. If hormone use $T$ is more likely among animals with lower $X$, making $X$ and $T$ negatively correlated, then omitting $X$ causes $\beta_1$ to absorb both the effect of $T$ and the portion of the effect of $X$ that is correlated with $T$, returning an estimate smaller than the unbiased one.
Caveats of adding more covariates: avoid variables that distort what $\beta_1$ is supposed to measure. Recall that $\beta_1$ should capture the marginal effect on $Y$ of switching $T$ from 0 to 1 while holding $X$ constant. If $T$ cannot switch between 0 and 1 without $X$ changing as well, the model is misspecified. This happens when $X$ is either a cause or an effect of $T$. For example, if $X$ is a screening procedure and only animals passing the screening receive any growth hormone, then $X$ is a cause of $T$; if $X$ is body length, which increases along with weight and can be affected by $T$, then both $X$ and $Y$ are effects of $T$. In the former case, $X$ should be removed from the equation and used as an instrumental variable for $T$; see "instrumental variable estimator." In the latter case, $X$ is a mediator and should be removed from the equation and analyzed separately under a different topic (mediation analysis). These decisions are made based on $\beta_1$, not on $R^2$. See "What variables to include/exclude when estimating causal relationships using regression."
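The "effect of $T$" case can be sketched as follows (a hypothetical mediator setup with invented numbers): length $M$ is itself raised by the hormone, so conditioning on $M$ blocks the indirect path through $M$ and understates the total effect of $T$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Hypothetical "bad control": M (length) is affected by the hormone T,
# and M in turn affects the outcome Y.
T = rng.integers(0, 2, n)
M = 100 + 3 * T + rng.normal(0, 2, n)            # post-treatment mediator
Y = 10 + 2 * T + 1.0 * M + rng.normal(0, 5, n)   # total effect of T = 2 + 3 = 5

def ols(design, y):
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    return b

b_good = ols(np.column_stack([np.ones(n), T]), Y)     # M excluded
b_bad = ols(np.column_stack([np.ones(n), T, M]), Y)   # M included

print(round(b_good[1], 2))  # close to the total effect of 5
print(round(b_bad[1], 2))   # only the direct effect (~2); indirect path blocked
```

With $M$ excluded, $\beta_1$ recovers the full causal effect; with $M$ included, $\beta_1$ no longer measures what we set out to estimate, regardless of how much $R^2$ improved.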
Key assumption for an unbiased treatment effect: $T$ is independent of the error term $\epsilon$. If $T$ is assigned randomly, as in a randomized, double-blind, controlled trial, this usually follows from the experimental design rather than from any modeling effort. However, treatment-effect heterogeneity can be present even under random assignment, in which case functions, polynomials, or interactions involving $T$ should be included. If omitted, these $T$-related terms are absorbed into $\epsilon$, making $T$ and $\epsilon$ correlated and biasing the effect estimate. When $T$ is binary, an interaction between $T$ and $X$ may be necessary if the effect of $T$ on $Y$ varies with $X$. For example, if the growth hormone accelerates weight gain among female animals but inhibits growth among male ones, omitting the interaction between hormone $T$ and sex $X$ may yield a $\beta_1$ estimate near zero and miss the important discovery that the effect is negative among males and positive among females.
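The sex-by-hormone example can be simulated directly (effect sizes of $+4$ for females and $-4$ for males are arbitrary assumptions): the pooled model averages the opposite effects to roughly zero, while the interaction model recovers both.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# Hypothetical heterogeneity: hormone effect is +4 for females (S=1)
# and -4 for males (S=0), so the pooled effect averages to about zero.
T = rng.integers(0, 2, n)
S = rng.integers(0, 2, n)                      # sex indicator
effect = np.where(S == 1, 4.0, -4.0)
Y = 10 + effect * T + 2 * S + rng.normal(0, 5, n)

def ols(design, y):
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    return b

b_pool = ols(np.column_stack([np.ones(n), T, S]), Y)         # no interaction
b_int = ols(np.column_stack([np.ones(n), T, S, T * S]), Y)   # with T*S term

print(round(b_pool[1], 2))            # near zero: opposite effects cancel
print(round(b_int[1], 2))             # effect among males (S=0), near -4
print(round(b_int[1] + b_int[3], 2))  # effect among females (S=1), near +4
```

The pooled $\beta_1$ is statistically indistinguishable from zero even though the hormone has a strong effect on every animal, which is exactly the misspecification the interaction term repairs.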
Overall, a good causal inference study requires careful experimental design and correct model specification, both of which are assessed by the properties of $\beta_1$ rather than by the size of $R^2$.