Timeline for How to re-mean a vector of probabilities, without having values beyond the [0, 1] bounds?

Current License: CC BY-SA 4.0

21 events

when	what		by	license	comment
Nov 27, 2024 at 20:12	answer	added	jblood94		timeline score: 1
Nov 20, 2024 at 16:51	comment	added	Mark White		In summary, I'm acknowledging other solutions do exist (I am looking into a handful), but my specific question was on this particular approach, which I will be using to compare to other methods.
Nov 20, 2024 at 16:49	comment	added	Mark White		I've got probabilistic scores on folks before an event. After the event, I know what the true rate was for a specific group. The idea is to adjust these a priori scores such that their group mean is equal to what we know is the truth after the event. This is only one solution I'm considering for how to adjust toward the actual rate; thus, my question here deliberately asks about a specific solution because I was getting tripped up on implementation. The optimization on minimizing squared difference for various possible adjustments in logit space answered precisely what I was trying to do.
Nov 20, 2024 at 15:42	comment	added	Friede		In a comment below the accepted answer, you mentioned that you have edited an additional constraint on the ordering of $x_i$ into your question. I am interested in your motivation. Would it be possible to share the underlying story?
Nov 20, 2024 at 4:00	history	became hot network question
Nov 20, 2024 at 2:31	history	edited	Mark White	CC BY-SA 4.0	included an additional condition that needs to be met
Nov 19, 2024 at 22:11	comment	added	Glen_b		I get the feeling this is an XY problem. It might help to explain the underlying problem this "remeaning" is attempting to solve, so that we can help identify good ways to solve the actual problem rather than solving the "problem" with your attempted solution to your original problem.
Nov 19, 2024 at 21:23	comment	added	Sycorax♦		Each $X$ has the same $p$. So the regression has $n$ copies of $p$ as the outcome. Or, if you like, $p_i=p$.
Nov 19, 2024 at 21:21	comment	added	whuber♦		The constraint on the mean is a 1D constraint on a countably infinite space of functions, so those requesting additional desiderata are putting it mildly! There's a huge array of solutions.
Nov 19, 2024 at 21:10	comment	added	Nuclear Hoagie		"However, this will force values to be below 0 or above 1, so not representing probabilities anymore" - I don't see any particular reason why the end result should represent probabilities. The simple fact that the transformation puts everything in the range [0, 1] says nothing about whether those values have any probabilistic interpretation whatsoever. Say you have a weather forecast with the probability of rain for the next week, and you want to adjust it by some value p. The "re-meaned" values aren't calibrated probabilities and say little about the actual chance of rain.
Nov 19, 2024 at 20:52	vote	accept	Mark White
Nov 19, 2024 at 20:47	answer	added	Nathan Wycoff		timeline score: 8
Nov 19, 2024 at 20:44	comment	added	Mark White		@Sycorax I misspoke in the earlier comment—how do you re-arrange it to get the values of $a$ and $b$, with a known $p$? I see how you could solve it using linear regression, but usually the other side of the equation would have a subscript $i$ as well, whereas here it is a fixed value. How does one solve for $a$ and $b$ then?
Nov 19, 2024 at 20:27	comment	added	Mark White		Apologies if I'm being dense, but what's tripping me up there is that $p$ is a constant while $x$ varies. So how does one do linear regression there? I.e., with the example I gave in R below, `lm(plogis(p) ~ x)` throws an error because the lengths of `p` and `x` are different.
Nov 19, 2024 at 20:22	comment	added	Sycorax♦		Minimizing $\epsilon_i$ in $\sigma^{-1}(p) = ax_i + b + \epsilon_i$ is just linear regression...
Nov 19, 2024 at 20:19	comment	added	Mark White		@Sycorax that would work, but how does one re-arrange what you have there to solve for the average instead of solve for $p$?
Nov 19, 2024 at 20:16	comment	added	Sycorax♦		Setting all $x=p$ has the mean $p$, but presumably you also want to preserve some other qualities of the data, such as their relative ordering. So a completely general solution is $p = \sigma(ax+b)$ where $a >0$ and $b$ is any real number and $\sigma(y) = \frac{1}{1 + \exp(-y)}$. You can adjust $a,b$ to get any mean $ 0 < p <1$ you desire.
Nov 19, 2024 at 20:08	history	edited	Mark White	CC BY-SA 4.0	add additional details about what I'm looking for
Nov 19, 2024 at 20:05	comment	added	Mark White		Instead of de-meaning (making the mean be 0) or mean-centering (making the mean be 1), I want to rescale the variable such that the mean is equal to some arbitrary number, $p$. As I mention above: What I would like to do is go from $x$ to $x'$ where $\bar{x'} = p$.
Nov 19, 2024 at 20:03	comment	added	Dave		What do you mean by "re-mean"?
Nov 19, 2024 at 19:56	history	asked	Mark White	CC BY-SA 4.0
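
The comments above describe adjusting scores on the logit scale, $x'_i = \sigma(a\,\sigma^{-1}(x_i) + b)$ with $a > 0$, so that the adjusted mean equals a target $p$ while every value stays inside $(0, 1)$ and the ordering is preserved. A minimal Python sketch of one such adjustment follows. It is not necessarily the method used in either answer: it assumes the slope is fixed at $a = 1$ and searches only for the shift $b$ by bisection. The function name `remean`, the search bracket of ±50, and the tolerance are hypothetical choices made for illustration.

```python
import math


def sigmoid(y):
    """Inverse logit, written to avoid overflow for large |y|."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def logit(q):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(q / (1.0 - q))


def remean(x, p, tol=1e-12):
    """Shift scores in logit space so that their mean equals p.

    Assumption (not from the source): slope a = 1, only the shift b is tuned.
    The mean of sigmoid(logit(x_i) + b) increases monotonically in b,
    so bisection on b finds the target mean.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if any(not 0.0 < xi < 1.0 for xi in x):
        raise ValueError("all scores must lie strictly between 0 and 1")

    z = [logit(xi) for xi in x]
    lo, hi = -50.0, 50.0  # hypothetical bracket; widen for extreme scores
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_at_mid = sum(sigmoid(zi + mid) for zi in z) / len(z)
        if mean_at_mid < p:
            lo = mid
        else:
            hi = mid
    b = (lo + hi) / 2.0
    return [sigmoid(zi + b) for zi in z]


# Example: move four scores so that their mean becomes 0.5.
scores = [0.1, 0.2, 0.3, 0.6]
adjusted = remean(scores, 0.5)
print(adjusted, sum(adjusted) / len(adjusted))
```

Because a common shift in logit space is a strictly increasing transformation, the relative ordering of the scores is unchanged. As several commenters point out, matching the mean does not by itself make the adjusted values calibrated probabilities.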
