
Questions tagged [zipf]

0 votes
0 answers
82 views

I have a dataset of 40 million discrete values, whose histogram follows a Zipf distribution with the following statistical parameters: Minimum = 1, Maximum = 1738, Mean = 2.16, STD = 16.50, P95 = 4, ...
MHT
1 vote
1 answer
75 views

Been wracking my brain over this one but can't quite get it. I want a distribution for the population size of randomly-chosen cities, where I can assume the city's rank obeys Zipf's law with a known ...
famulare
1 vote
0 answers
25 views

From a total of $N$ words I have the following dataset, where the first column gives the rank and the second the frequency. For example $$\begin{array}{cc} 1 & 4300 \\ 2 & 3100 \\ 3 & ...
Indominus
0 votes
1 answer
288 views

I'm having trouble understanding why I get radically different results if I try to find the parameter of a Zipf distribution when I use the methods proposed by Clauset et al. (2009) as opposed to ...
MarcoLin8
1 vote
1 answer
284 views

I have several ranking distributions and would, for each one, like to fit a Zipf distribution and estimate the goodness of fit relative to some standard benchmark. With the Matlab code below, I ...
z8080
6 votes
1 answer
1k views

Zipf's law states that in a text a few words occur very often, while many words hardly ever occur. For text, Zipf's law sets $s = 1$ in the Zipf distribution defined by: $$f(k; s, N) = \frac{k^{...
Slim Shady
0 votes
1 answer
274 views

I have been trying to develop a method to calculate the sample size that maximizes the power of the KS test (approximately 0.8) on an underlying Zipf distribution. I have tried estimating the power by performing simulations: <...
kaisar.dauletbek
2 votes
0 answers
118 views

I am studying a system of cities where the largest city appears in many respects to be an outlier. The distribution of city sizes, in any country, is often claimed to follow Zipf's law. According to ...
Jesper for President
1 vote
0 answers
46 views

Say I have a large corpus of p words, and the constant C (with f × r = C) equals p × (1/10). How do you go ...
Rahul Dev
0 votes
1 answer
268 views

I'm trying to compare different approaches to rank predictions. I have the ground truth distribution $P$ (discrete, zeta distribution) and two or more distributions ($Q, Q', Q'', Q'''$ in this case) I'...
MrAkroMenToS
3 votes
1 answer
2k views

I need a simple but clear idea of the discrete Pareto distribution vs. the Zipf distribution, and of power laws vs. Zipf's law. (Are they similar? How do they relate to each other?) Wikipedia definitions do not ...
Dovini Jayasinghe
0 votes
1 answer
452 views

I need to model the popularity of some requested files from a library with a Zipf distribution, and I want to simulate it in MATLAB. I don't know what effect the parameter s has on my result. For ...
Bonnie
1 vote
0 answers
734 views

I would like to know the practical threshold for TF-IDF (just like the practical p-value cutoff of 0.1 or 0.05 in hypothesis tests). I looked for it in some previous posts, and some people ...
H42
10 votes
2 answers
2k views

I'm attempting to find out whether some highly skewed data are drawn from a power law distribution, following the popular paper by Clauset, Shalizi and Newman, 2009. Clauset et al. use the Kolmogorov-...
JaydenM-C
2 votes
1 answer
1k views

I am fetching trending topics from social media where the frequency of likes is said to follow a Zipf-Mandelbrot distribution; i.e., some of the posts will have a high number of likes and some other ...
Bruno Cortez
