-1
$\begingroup$

I have a column named 'production_output' in my dataset whose values are integers ranging from 2.300000e+01 to 1.110055e+08.

I want to split the values in this column into buckets based on percentile ranges, say [0, 25, 50, 75, 100], and get the number of values that fall into each of these buckets.

How do I do this using Python data-science packages?

$\endgroup$
5
  • 1
    $\begingroup$ You're asking how to compute a histogram. $\endgroup$ Commented Apr 29, 2018 at 20:51
  • 1
    $\begingroup$ pandas.DataFrame.quantile $\endgroup$ Commented Apr 29, 2018 at 21:49
  • $\begingroup$ @kbrose, you are correct. What I'm essentially asking about is the background computation done for a histogram. I'm asking because I don't see a way to pass parameters to the histogram function to get these counts. Or am I missing something here? $\endgroup$ Commented Apr 30, 2018 at 6:45
  • $\begingroup$ To whoever down-voted this question without a reason: why down-vote without giving any reason? What a way to help your community members! If you can't help, at least get out of the way of others answering this question. $\endgroup$ Commented Apr 30, 2018 at 9:37
  • 1
    $\begingroup$ I did not downvote, but your question does not show much research. I’m not saying you didn’t research, but there’s no evidence in your question. Maybe it would have been improved by you saying what you tried, what you googled already, etc. $\endgroup$ Commented Apr 30, 2018 at 12:38

1 Answer

1
$\begingroup$

Use numpy.histogram to count the values that fall into each bin.

https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.histogram.html

Use numpy.percentile to get the bin edges you desire.

https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.percentile.html
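
Putting the two together, here is a minimal sketch. The DataFrame `df` and its random sample data are assumptions made up for illustration; only the column name 'production_output' and the percentile list come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real dataset (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"production_output": rng.integers(23, 111_005_500, size=1000)}
)

values = df["production_output"].to_numpy()

# Percentile ranges from the question.
percentiles = [0, 25, 50, 75, 100]

# Bin edges: the value of the column at each requested percentile.
edges = np.percentile(values, percentiles)

# Count how many values fall between consecutive edges.
# Every bin is half-open [lo, hi) except the last, which includes its right edge.
counts, _ = np.histogram(values, bins=edges)

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:>14.1f} to {hi:>14.1f}: {n} values")
```

Because the edges are themselves percentiles, the counts should come out roughly equal unless many values are tied at an edge. If you need unequal buckets, change the list of percentiles, for example to [0, 10, 50, 90, 100].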

$\endgroup$
