Questions tagged [rare-events]
Situations where a level of a categorical variable occurs very rarely (e.g., a rare disease). This can be a problem, especially when the variable is the response variable in a model.
180 questions
3 votes
1 answer
115 views
Binary logistic regression with zero events
Before analysing, I decided to treat a variable as both continuous and binary; this is a common approach in the literature for this variable. I'd like to know if the outcome differs between my three ...
0 votes
0 answers
37 views
Binary repeated measures outcome with rare events?
I have a binary repeated measures outcome with rare events. In particular, when comparing the outcome between different groups, the odds ratios can sometimes blow up to infinity due to sparsity/rare ...
5 votes
1 answer
236 views
Meta-analysis for one-sample proportion with 0 events in some studies?
I am doing a meta-analysis for a one-sample proportion where some of the studies have 0 events. My understanding of the statistical literature is that traditional meta-analysis methods that require a ...
2 votes
1 answer
197 views
Relationship between return period and probability
As I understand it, the return period of an event (such as an earthquake or a flood) is the average time between two consecutive occurrences of that event. The probability of occurrence of an event ...
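A minimal sketch of the textbook relation between return period and probability, offered as an illustration and not as the asker's own formula (the excerpt above is truncated). It assumes time is split into years, that years are independent, and that the event has the same probability 1/T of occurring in any given year. The function names are hypothetical.

```python
def annual_probability(return_period_years: float) -> float:
    """Per-year probability implied by a return period of T years (p = 1/T)."""
    if return_period_years < 1:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period_years


def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """P(at least one event in n years) = 1 - (1 - 1/T)^n, assuming independent years."""
    p = annual_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


if __name__ == "__main__":
    # A "100-year" event over a 100-year horizon is likely but far from certain.
    print(round(prob_at_least_one(100, 100), 3))
```

Under these assumptions a 100-year event has about a 63% chance of occurring at least once in any 100-year window, which is why the return period should not be read as a guaranteed waiting time.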
1 vote
0 answers
77 views
Estimating a proportion from repeated measurements
I'm working on a simple descriptive study of a case series with a rare skin disease. The aim is to describe vitamin A deficiency in the cohort. We have 12 patients (8 male, 4 female), with a total of ...
5 votes
2 answers
557 views
Detect rare high-value measurements in a series of measurements
We take a measurement on 1000 samples to detect whether a chemical element A is present, and for each measurement, two cases can occur: the element A is not present, and the values we get are a "...
0 votes
0 answers
205 views
Interpreting hazard ratio in table
This poster has been making headlines lately due to its subject and conclusions. Specifically, I'm interested in understanding how the hazard ratios are being calculated in this table: For example, ...
0 votes
1 answer
95 views
Restricted mean survival time for rare events
I like the restricted mean survival time (and the difference in RMST between cohorts) in survival analyses. However, it does not really convey the effect of an exposure on rare outcomes. For instance, ...
5 votes
0 answers
113 views
Is there an English translation of Ladislaus Bortkiewicz’s book “The Law of Small Numbers” (1898)?
I finished reading Deaths by horsekick in the Prussian army – and other ‘Never Events’ in large organisations. It led me down a rabbit hole of thinking about Poisson point processes modelling demand ...
3 votes
2 answers
272 views
Is the Kaplan-Meier estimator appropriate when I have observed only one event?
Let's say I have a database with 21 patients and one of these patients has died. In such a scenario, can we apply overall survival analysis using the Kaplan-Meier model or not? That is, does it ...
1 vote
0 answers
134 views
Test set creation for a rare category classifier
I want to make a classifier for a very rare category. The base rate in a random sample is about 0.01%, estimated from finding about 10 positive examples using a zero-shot classifier on 100,000 ...
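To make the uncertainty in such a base rate concrete, here is a minimal sketch of a Wilson score interval for a proportion, using only the figures quoted in the excerpt (about 10 positives in 100,000). It is an illustration, not the asker's method. It assumes the 100,000 items are a simple random sample and that the 10 positives are counted without error, which ignores any misses or false alarms from the zero-shot classifier. The function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(events: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p_hat = events / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clip to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(10, 100_000)
    print(f"{low:.6%} to {high:.6%}")
```

With so few positives the interval is wide relative to the point estimate (roughly 0.005% to 0.018% here), which matters when deciding how many items a test set needs in order to contain enough positive examples.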
1 vote
3 answers
811 views
Poisson regression for rare events?
Poisson regression is commonly used to analyse count data. However, when we deal with rare events it does not seem to be appropriate any more. At least, graphical criteria to assess the model fit like ...
4 votes
2 answers
203 views
Classification error when estimating population size of rare phenomena
I need to understand how a particular statistical challenge has been formally recognised or is commonly described in the literature, and what the best academic resources are that discuss it. Here's the ...
2 votes
1 answer
197 views
Difference-in-difference regression design
I'm working with a large panel dataset tracking various units (i-dimension) over an extended period (t-dimension). These units are classified as either 'blue' or 'brown', and some (not all) of these ...
3 votes
1 answer
99 views
Resulting beta distribution from two different samples
Let’s say each sample consists of 300 units inspected for defects. I have historic data from 100 samples in the past that give me an idea of what I expect the defect rate to be. I have a new batch to ...