8 events
when what by license comment
Nov 5 at 13:01 vote accept HansKloss
Nov 4 at 3:29 answer added Thomas Lumley timeline score: 2
Nov 3 at 15:03 comment added HansKloss @PBulls Using the offset this way is standard practice in life insurance GLMs. We want a model that, after transforming back from log space, predicts the claim count as: claim_count_expected_by_industry * coeff_factor1 * coeff_factor2... And yes, our goal is to model incidence rates on subgroups of the insured population. Since the datasets can be small and therefore not always credible, the 'grand mean' estimated from them is also in question.
Nov 3 at 14:57 comment added HansKloss @Dave Yes, the industry table is a prior belief, as you said above. It is also needed because each row represents a different number of insured lives. The model setup in R is: observed_claim_count ~ factors... + 1, offset=log(claim_count_implied_by_industry_table), family=poisson(), ...
Nov 3 at 12:57 comment added Dave Do you want to use the industry table as part of a prior distribution, kind of like “if the data don’t refute the established ideas, go with the established ideas”?
Nov 3 at 12:52 comment added PBulls Neither the intercept nor the offset is penalized by default. As the links explain, this is because you usually want a model with no other parameters to predict the grand mean; otherwise the parametrization introduces additional sensitivities (e.g. 'adding a constant $c$ to all $y$ would not simply result in a shift of all predictions by $c$'). I'm not sure that you're using this offset in its intended way though: the offset $\log(\eta)$ makes your outcome $y/\eta$, i.e. a rate rather than a count. This would suggest your data is not proportional to your reference...
Nov 3 at 11:38 comment added Dave What do you mean that you use the industry table as an offset term?
Nov 3 at 11:01 history asked HansKloss CC BY-SA 4.0
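
For reference, the offset setup discussed in the comments can be written out as follows. This is a sketch under assumed notation that does not appear in the thread: $y_i$ is the observed claim count in row $i$, $e_i$ is the claim count implied by the industry table, $x_{ij}$ are the factor covariates, and $\beta_0$, $\beta_j$ are the fitted coefficients.

$$
\log \mathbb{E}[y_i] = \log(e_i) + \beta_0 + \sum_j \beta_j x_{ij}
\quad\Longleftrightarrow\quad
\mathbb{E}[y_i] = e_i \, e^{\beta_0} \prod_j e^{\beta_j x_{ij}}
$$

Under this reading, the coefficient on the offset is fixed at 1, so the model effectively describes the ratio $\mathbb{E}[y_i]/e_i$ (observed relative to expected claims), and each $e^{\beta_j x_{ij}}$ acts as a multiplicative factor on the industry expectation.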