The main decision is which time scale to use: time since the origin of the recurrent event process (a diagnosis, an intervention, or birth), or time since the previous visit. These two approaches are called calendar time and gap time. As a general rule, a question that is easy to answer on one scale is not necessarily easy to answer on the other.
For calendar time, the canonical framework is this: each individual has a counting process $N(t)$, the number of visits up to time $t$. The intensity of this process is denoted $\lambda(t)$; it has to satisfy certain conditions, but for the most part you can treat it like the hazard function from classical survival analysis. A common choice is the proportional form $\lambda(t | x_i) = \lambda_0(t) e^{\beta' x_i}$, where $t$ is time since origin. For an individual with events at $t_1 < t_2 < \dots < t_n$ and follow-up until $\tau > t_n$, you can estimate such a model in R by taking
coxph(Surv(tstart, tstop, status) ~ x) with tstart = c(0, t_1, ..., t_n), tstop = c(t_1, ..., t_n, tau) and status = c(1, 1, ..., 1, 0). Here status is 1 when tstop corresponds to an event and 0 for the last interval, which ends at tau, the end of follow-up. You should also add + cluster(id) to the formula to get robust standard errors that account for the repeated rows per individual.
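To make the data layout concrete, here is a minimal sketch in R. The event times, the binary covariate x, the follow-up time tau and the simulated data set are all made up for illustration; only coxph, Surv and cluster come from the survival package.

```r
library(survival)

# Counting-process rows for one individual (hypothetical numbers):
# events at 1.2, 3.4 and 4.0, follow-up ends at tau = 5.
ev  <- c(1.2, 3.4, 4.0)
tau <- 5
one <- data.frame(id = 1,
                  tstart = c(0, ev),
                  tstop  = c(ev, tau),
                  status = c(rep(1, length(ev)), 0))  # last row is censored
print(one)

# Simulated data set (an assumption for illustration): constant baseline
# intensity 0.5 and true coefficient 0.7 for a binary covariate x.
set.seed(1)
n    <- 200
x    <- rbinom(n, 1, 0.5)
rate <- 0.5 * exp(0.7 * x)
d <- do.call(rbind, lapply(seq_len(n), function(i) {
  k  <- rpois(1, rate[i] * tau)   # number of events on (0, tau)
  ev <- sort(runif(k, 0, tau))    # given k, event times are uniform on (0, tau)
  data.frame(id = i, x = x[i],
             tstart = c(0, ev), tstop = c(ev, tau),
             status = c(rep(1, k), 0))
}))

# Calendar-time fit (often called the Andersen-Gill model),
# with robust standard errors clustered on the individual.
fit <- coxph(Surv(tstart, tstop, status) ~ x + cluster(id), data = d)
summary(fit)
```

Every individual contributes one row per event plus one final censored row, so an individual with no events still appears once, with the interval (0, tau) and status 0.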
The model implies that the number of events in a given period $(t_a, t_b)$ is Poisson distributed with expectation $\int_{t_a}^{t_b} \lambda(t)dt$, analogous to the cumulative hazard. So on this scale, questions about the number of events in a time window are easy to answer.
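Written out for the proportional form of the intensity, the Poisson statement reads as follows. The symbol $\Lambda_0$ for the cumulative baseline intensity is notation introduced here, not part of the model specification above.

$$
P\{N(t_b) - N(t_a) = k \mid x_i\} = \frac{\mu^k e^{-\mu}}{k!}, \qquad
\mu = e^{\beta' x_i} \int_{t_a}^{t_b} \lambda_0(t)\,dt = e^{\beta' x_i}\{\Lambda_0(t_b) - \Lambda_0(t_a)\}.
$$

For example, the probability of no events in the window is $e^{-\mu}$.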
The cool thing is that you can easily incorporate individual-level random effects, called frailties, with the specification + frailty(id). This is used quite often as a variance reduction technique. You can also use it for prediction, although that requires a bit more work and thinking.
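A rough sketch of a frailty fit, under assumptions of my own: the data are simulated with a gamma-distributed individual effect z (mean 1, variance 0.5), and the formula uses frailty(id) on its own, without cluster(id). All numbers and variable names are hypothetical.

```r
library(survival)

set.seed(2)
n   <- 200
tau <- 5
x   <- rbinom(n, 1, 0.5)
z   <- rgamma(n, shape = 2, rate = 2)   # frailty: mean 1, variance 0.5
rate <- 0.5 * z * exp(0.7 * x)          # individual-specific constant intensity

d <- do.call(rbind, lapply(seq_len(n), function(i) {
  k  <- rpois(1, rate[i] * tau)
  ev <- sort(runif(k, 0, tau))
  data.frame(id = i, x = x[i],
             tstart = c(0, ev), tstop = c(ev, tau),
             status = c(rep(1, k), 0))
}))

# Calendar-time model with a frailty term per individual
# (frailty() uses a gamma distribution by default)
fit_fr <- coxph(Surv(tstart, tstop, status) ~ x + frailty(id), data = d)
print(fit_fr)
```

The printed output includes the estimated variance of the random effect, which gives a quick impression of how much heterogeneity between individuals is left after accounting for x.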
The second option is to use the gap time scale. The gaps are tstop - tstart from the calendar time case. The model is then something like $\phi(w | x_i) = \phi_0(w) e^{\beta'x_i}$, where $w$ is time since the previous visit. If the gaps for an individual are gaps = c(w_1, ..., w_n, tau - t_n), where the last gap is censored at the end of follow-up, then you would fit this as coxph(Surv(gaps, status) ~ x), again with + cluster(id) and the status variable as before. Essentially this is the same problem as clustered survival data.
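The gap time fit can be sketched like this. The simulated data set, the covariate x and the column name gap are assumptions for illustration; the only step specific to this scale is computing tstop - tstart.

```r
library(survival)

# Simulated counting-process data (hypothetical): constant baseline
# intensity 0.5, true coefficient 0.7 for a binary covariate x.
set.seed(3)
n   <- 200
tau <- 5
x   <- rbinom(n, 1, 0.5)
rate <- 0.5 * exp(0.7 * x)
d <- do.call(rbind, lapply(seq_len(n), function(i) {
  k  <- rpois(1, rate[i] * tau)
  ev <- sort(runif(k, 0, tau))
  data.frame(id = i, x = x[i],
             tstart = c(0, ev), tstop = c(ev, tau),
             status = c(rep(1, k), 0))
}))

# Gap time: the clock restarts at every event.
d$gap <- d$tstop - d$tstart

# Same status variable as before; the last gap per individual is censored.
fit_gap <- coxph(Surv(gap, status) ~ x + cluster(id), data = d)
summary(fit_gap)
```

Because the simulated baseline is constant, this fit and the calendar time fit target the same coefficient here; with a time-varying baseline they would generally answer different questions.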
The model implies that the individual gets "restarted" after every visit. Hence it is easy to predict "survival" probabilities (the probability of not having another visit within some time). With a little math you can also get the conditional version: given that an individual has already "survived" up to some time since the last visit, the probability of surviving a further period is the ratio of the two survival probabilities.
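The conditional survival calculation can be written out as follows. Here $W$ is the gap time, $S(w \mid x_i)$ its survival function and $\Phi_0(w) = \int_0^w \phi_0(u)\,du$ the cumulative baseline hazard; these symbols are notation introduced for this sketch.

$$
P(W > w + s \mid W > w, x_i) = \frac{S(w + s \mid x_i)}{S(w \mid x_i)}
= \exp\left[-e^{\beta' x_i}\{\Phi_0(w + s) - \Phi_0(w)\}\right].
$$

So the probability of no visit in the next $s$ time units, given that $w$ time units have already passed since the last visit, only needs the increment of the cumulative baseline hazard between $w$ and $w + s$.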
Here you can use the +frailty option to account for individual heterogeneity as well.
It is common to use gap times and include a covariate such as the previous number of events (or log(previous number of events)), or to stratify on the previous number of events so that you get different $\phi_0$'s for the time between the first and the second visit, the second and the third, etc. Here, however, you should really be able to defend your choices, as you are de facto altering the time scale of the model.
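Both variants can be sketched in R as below. The simulated data, the column names enum, prev and enum_c, and the decision to pool all gaps from the fourth onwards into one stratum are assumptions made for illustration, not recommendations.

```r
library(survival)

# Simulated counting-process data (hypothetical), as a stand-in for real data.
set.seed(4)
n   <- 200
tau <- 5
x   <- rbinom(n, 1, 0.5)
rate <- 0.5 * exp(0.7 * x)
d <- do.call(rbind, lapply(seq_len(n), function(i) {
  k  <- rpois(1, rate[i] * tau)
  ev <- sort(runif(k, 0, tau))
  data.frame(id = i, x = x[i],
             tstart = c(0, ev), tstop = c(ev, tau),
             status = c(rep(1, k), 0))
}))
d$gap <- d$tstop - d$tstart

# Rows are ordered in time within id, so the row index is the gap number.
d$enum   <- ave(d$id, d$id, FUN = seq_along)  # 1 = gap before the first event
d$prev   <- d$enum - 1                        # previous number of events
d$enum_c <- pmin(d$enum, 4)                   # pool sparse late gaps (assumption)

# Variant 1: previous number of events as a covariate
fit_prev <- coxph(Surv(gap, status) ~ x + prev + cluster(id), data = d)

# Variant 2: separate baseline hazard for each gap number
fit_str <- coxph(Surv(gap, status) ~ x + strata(enum_c) + cluster(id), data = d)

print(fit_prev)
print(fit_str)
```

Note that prev is 0 for the first gap, so a log transform of it would need some adjustment before it could be used directly.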
In general, with the calendar time approach you have quick access to the distribution of the number of events in a given time window, and with the gap time scale you have easy access to the distribution of the gaps themselves, but you cannot easily get the best of both worlds, at least not in nice closed forms (you can achieve many things by simulation though). The only case where the two coincide is when $\lambda_0$ (or $\phi_0$) is constant. In that case, the number of events in any time interval is Poisson (with the same expectation for intervals of the same length) and the gaps follow an exponential distribution. This is usually seen as a very strong assumption.
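The constant-intensity case is easy to check by simulation. In this sketch the rate, the window length and the number of replications are arbitrary choices: gaps are drawn from an exponential distribution, and the resulting counts in (0, tau) should then look Poisson, with mean and variance both close to rate * tau.

```r
set.seed(42)
rate  <- 2      # constant intensity (hypothetical)
tau   <- 10     # length of the observation window
n_sim <- 2000   # number of simulated individuals

counts <- replicate(n_sim, {
  gaps <- rexp(100, rate)      # far more gaps than needed to cover (0, tau)
  sum(cumsum(gaps) <= tau)     # number of events that fall inside the window
})

# For a Poisson count both should be close to rate * tau = 20
mean(counts)
var(counts)
```

The same approach extends to non-constant $\phi_0$: simulate gaps from the fitted gap time model and count events per window, which is one way to get calendar time quantities out of a gap time model.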
A book that is arguably the best introduction to the analysis of recurrent events is this one. The authors emphasise that the choice of time scale may depend a lot on the problem at hand. Their general point seems to be that for incident events that do not alter the process itself (like warranty repairs on cars, or myocardial infarctions), calendar time is the most useful. On the other hand, gap time scales are most useful when you want to predict the time to the next event, and are most natural when the unit of interest undergoes some intervention at every event (like a car repair, or a transplant).