Questions tagged [zero-inflation]
Excessive 0's in a variable compared to a specified reference distribution. Regression approaches include zero-inflated models and hurdle (2-part) models. For count data, zero-inflated and hurdle models based on Poisson or negative binomial distributions are common (ZIP/ZINB and HP/HNB).
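To make the distinction between the two model families concrete, here is a minimal sketch of the probability mass functions for the Poisson case. The function names (`zip_pmf`, `hurdle_poisson_pmf`) and the parameter names (`pi` for the structural-zero probability, `p0` for the hurdle's zero probability) are illustrative assumptions, not taken from any particular package.

```python
import math

def poisson_pmf(k, lam):
    # Ordinary Poisson probability P(Y = k) with mean lam.
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    # Zero-inflated Poisson: with probability pi the outcome is a
    # structural zero; otherwise it is drawn from Poisson(lam),
    # which can itself produce zeros.
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_poisson_pmf(k, lam, p0):
    # Hurdle Poisson: zeros occur with probability p0; positive counts
    # come from a zero-truncated Poisson(lam), rescaled to mass 1 - p0.
    if k == 0:
        return p0
    return (1 - p0) * poisson_pmf(k, lam) / (1 - math.exp(-lam))
```

In the zero-inflated form, zeros come from two sources (the structural part and the count distribution), whereas in the hurdle form all zeros come from the binary part alone.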
658 questions
0 votes
0 answers
29 views
How to model 0 vs non-0 and split positive values in a hurdle model with glmmTMB in R?
I know that in glmmTMB in R, using a hurdle model, it is possible to model the 0 vs non-0 part. For example, in my current model: ...
0 votes
0 answers
44 views
Are there clustering algorithms or preprocessing strategies tailored for zero-inflated and continuous data types?
I am currently working on a project where I need to assign customers across N recipes before A/B testing such that KPIs for each customer are balanced across recipes (to reduce pre-test bias). Dataset ...
4 votes
1 answer
155 views
Why do large negative coefficients in zi, dispersion, or random effects suggest unnecessary components?
I tried to fit a model which didn't converge: ...
1 vote
0 answers
88 views
Why are the confidence intervals for predictions from a binomial model different from the confidence intervals for predictions from a hurdle model?
I used the R package glmmTMB to analyze a dataset using a binomial model and a hurdle model, then used the package ggeffects to generate predictions from both models. In glmmTMB, binomial models ...
0 votes
0 answers
31 views
Why is the ones trick making my hurdle model crash in JAGS?
I'm running a hurdle model in JAGS but it randomly crashes for reasons I cannot determine. The model describes the presence/absence of a species and, when the species is present, its biomass. It uses ...
3 votes
1 answer
81 views
Interpreting hurdle model output with splines
I am trying to determine which variables affect the likelihood of lumpfish eating salmon lice and which variables predict the number of lice eaten. My data are highly zero-inflated, so I decided to ...
4 votes
1 answer
234 views
What does it mean that the zero part of a hurdle model has a "negative binomial" distribution?
In a hurdle model, we have a Count part, which models the probability of a positive outcome. For instance, we can use a truncated Poisson regression (truncated because the raw Poisson regression may ...
1 vote
0 answers
72 views
PERMANOVA or GLMM?
I have a data set looking at the effect of 2 different treatments on shrub growth (percentage cover). This has been measured over 5 months and 3 species have established themselves. The data set of ...
1 vote
0 answers
43 views
Model label enrichment using post-prediction signals
I'm tackling a real-life problem, described here as an analogy. Problem description: let's say I'm gold mining, where the problem setting is that I have to predict the gold volume given a location (on ...
4 votes
1 answer
77 views
Can you specify a Generalized Poisson Hurdle Model (GPHM) in glmmTMB?
Can glmmTMB specify GPHMs? I read that it can include "truncated_genpois", which I assume is a GPHM? For example, ...
2 votes
1 answer
313 views
How to interpret hurdle model in glmmTMB - do you need to subset your data first?
I am running a model to determine the probability of seedlings occurring in each sub-plot based on several responses. However, DHARMa tests tell me my data are under-dispersed. I want to run a hurdle ...
1 vote
0 answers
56 views
How to model endogeneity in a zero-inflated process
I have a multivariate dataset consisting of household level variables such as education, location of the household, occupation, income, consumption etc. The regressand variable is number of social ...
3 votes
0 answers
64 views
What clustering methods handle zero-inflated and continuous variables together?
I'm relatively new to data science and currently working on a project to group global cities based on exposure to various climate hazards. I've sourced climate data from GCMs participating in CMIP6 as ...
0 votes
0 answers
46 views
Hurdle modelling of heavy tails for continuous variables
In a standard hurdle model (Wikipedia), the decision to participate would be associated with some strictly positive outcomes (e.g. a positive count if Poisson) and non-participation would necessarily ...
0 votes
0 answers
64 views
Impact of Excessive Zeros in miRNA Data on PCA and LDA
I am working on a case-cohort (~ case-control, but putting all cases in the subcohort) study evaluating miRNA markers. The variables of interest are continuous quantitative measures of miRNA ...