Fast ceiling of an integer division in C / C++

Question

Given integer values x and y, C and C++ both return as the quotient q = x/y the floor of the floating point equivalent. I'm interested in a method of returning the ceiling instead. For example, ceil(10/5)=2 and ceil(11/5)=3.

The obvious approach involves something like:

q = x / y; if (q * y < x) ++q;

This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?
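To make the concern about floating-point casts concrete, here is a minimal sketch (the function names are hypothetical, and it assumes IEEE-754 doubles with the default rounding mode). It compares the integer-only baseline from the question with a cast-to-double version on an input that does not fit in a double's 53-bit significand:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer-only baseline from the question: divide, multiply back, compare.
constexpr std::uint64_t ceil_div_baseline(std::uint64_t x, std::uint64_t y) {
    return x / y + (x / y * y < x ? 1 : 0);
}

// The floating-point route. Large 64-bit inputs are rounded when they are
// converted to double, before the division even happens.
inline std::uint64_t ceil_div_via_double(std::uint64_t x, std::uint64_t y) {
    return static_cast<std::uint64_t>(
        std::ceil(static_cast<double>(x) / static_cast<double>(y)));
}

// Small self-check; call it from main() if desired.
inline void check_precision_loss() {
    const std::uint64_t x = 9007199254740993ULL;  // 2^53 + 1
    assert(ceil_div_baseline(x, 1) == x);
    // Assumes IEEE-754 binary64: 2^53 + 1 is converted to 2^53.
    assert(ceil_div_via_double(x, 1) == x - 1);
}
```

For small operands both versions agree, so the cast is mainly a correctness risk for wide integer types, in addition to any speed cost.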

the divide instruction often returns both quotient and remainder at the same time so there's no need to multiply, just q = x/y + (x % y != 0); is enough
– phuclv, Commented Jan 25, 2014 at 11:17
@LưuVĩnhPhúc Seriously you need to add that as the answer. I just used that for my answer during a codility test. It worked like a charm though I am not certain how the mod part of the answer works but it did the job.
– Zachary Kraus, Commented Aug 26, 2014 at 0:56
@AndreasGrapentin the answer below by Miguel Figueiredo was submitted nearly a year before Lưu Vĩnh Phúc left the comment above. While I understand how appealing and elegant Miguel's solution is, I'm not inclined to change the accepted answer at this late date. Both approaches remain sound. If you feel strongly enough about it, I suggest you show your support by up-voting Miguel's answer below.
– andand, Commented Aug 26, 2014 at 2:51
Strange, I have not seen any sane measurement or analysis of the proposed solutions. You talk about speed on near-the-bone, but there is no discussion of architectures, pipelines, branching instructions and clock cycles.
– Rado, Commented Dec 18, 2016 at 19:35

Ganesh Kamath - 'Code Frenzy' · Accepted Answer · 2022-08-18 12:44:38Z

558

For positive numbers, where q is the ceiling of x divided by y:

unsigned int x, y, q;

To round up ...

q = (x + y - 1) / y;

or (avoiding overflow in x+y)

q = 1 + ((x - 1) / y); // if x != 0
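A minimal sketch of both forms as functions, with compile-time checks that show where each one breaks down. The names ceil_div_add and ceil_div_sub are made up for illustration, and y is assumed to be non-zero:

```cpp
#include <climits>

// Round-up division for unsigned operands, y != 0.
// Caveat: x + y - 1 can wrap around when x is close to UINT_MAX.
constexpr unsigned ceil_div_add(unsigned x, unsigned y) {
    return (x + y - 1) / y;
}

// Variant whose intermediate value cannot wrap, valid only for x != 0
// (for x == 0 the unsigned x - 1 wraps to UINT_MAX).
constexpr unsigned ceil_div_sub(unsigned x, unsigned y) {
    return 1 + (x - 1) / y;
}

static_assert(ceil_div_add(10, 5) == 2, "exact division stays exact");
static_assert(ceil_div_add(11, 5) == 3, "inexact division rounds up");
static_assert(ceil_div_sub(11, 5) == 3, "both forms agree here");
// UINT_MAX + 2 - 1 wraps to 0, so the first form returns 0 here.
static_assert(ceil_div_add(UINT_MAX, 2) == 0, "first form wraps");
static_assert(ceil_div_sub(UINT_MAX, 2) == UINT_MAX / 2 + 1, "second form does not");
```

Which form fits depends on what is known about x: the first needs x + y - 1 to stay in range, the second needs x to be non-zero.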

edited Aug 18, 2022 at 12:44

Ganesh Kamath - 'Code Frenzy'


answered Apr 30, 2010 at 14:19

Sparky



10 Comments

David Thornley Over a year ago

@bitc: For negative numbers, I believe C99 specifies round-to-zero, so x/y is the ceiling of the division. C90 didn't specify how to round, and I don't think the current C++ standard does either.

Mashmagar Over a year ago

See Eric Lippert's post: stackoverflow.com/questions/921180/c-round-up/926806#926806

Jørgen Fogh Over a year ago

Note: This might overflow. q = ((long long)x + y - 1) / y will not. My code is slower though, so if you know that your numbers will not overflow, you should use Sparky's version.

Omry Yadan Over a year ago

The second one has a problem where x is 0. ceil(0/y) = 0 but it returns 1.

jamesmstone Over a year ago

@OmryYadan would x == 0 ? 0 : 1 + ((x - 1) / y) resolve this safely and efficiently?
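The two fixes suggested in these comments can be sketched as follows. The function names are hypothetical, and the widened version uses unsigned long long rather than the long long from the comment, on the assumption that it is wider than unsigned int on the target:

```cpp
#include <climits>

static_assert(sizeof(unsigned long long) > sizeof(unsigned),
              "this sketch assumes a wider type is available");

// Widening: the sum is formed in the wider type, so x + y - 1 cannot wrap.
// The result always fits back into unsigned because ceil(x / y) <= x for y >= 1.
constexpr unsigned ceil_div_wide(unsigned x, unsigned y) {
    return static_cast<unsigned>(
        (static_cast<unsigned long long>(x) + y - 1) / y);
}

// Guarded form of the second formula: handle x == 0 explicitly.
constexpr unsigned ceil_div_guarded(unsigned x, unsigned y) {
    return x == 0 ? 0 : 1 + (x - 1) / y;
}

static_assert(ceil_div_wide(UINT_MAX, 2) == UINT_MAX / 2 + 1, "no wraparound");
static_assert(ceil_div_guarded(0, 7) == 0, "zero is handled");
static_assert(ceil_div_guarded(15, 7) == 3, "rounds up");
```

The guard adds a comparison, so whether it is "efficient" in the sense of the question would have to be measured on the target compiler.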


Yun · Accepted Answer · 2023-05-30 12:18:39Z

143

For positive numbers:

q = x/y + (x % y != 0);
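As a quick sanity check, here is the same expression wrapped in a function (ceil_div_rem is a made-up name; unsigned operands and y != 0 are assumed). Unlike the x + y - 1 form, it has no intermediate sum that could wrap, and it handles x == 0 without a special case:

```cpp
#include <climits>

// Quotient plus 1 when there is a non-zero remainder.
constexpr unsigned ceil_div_rem(unsigned x, unsigned y) {
    return x / y + (x % y != 0);
}

static_assert(ceil_div_rem(10, 5) == 2, "exact division");
static_assert(ceil_div_rem(11, 5) == 3, "rounds up");
static_assert(ceil_div_rem(0, 5) == 0, "zero dividend");
static_assert(ceil_div_rem(UINT_MAX, 2) == UINT_MAX / 2 + 1, "no wraparound");
```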

edited May 30, 2023 at 12:18

Yun


answered Feb 14, 2013 at 15:52

Miguel Figueiredo


6 Comments

phuclv Over a year ago

most common architecture's divide instruction also includes remainder in its result so this really needs only one division and would be very fast

Jan Schultke Over a year ago

This is the most elegant solution. I have extended it to negative numbers and floor / up rounding too.

v.oddou Over a year ago

@phuclv it appears that MSVC is not able to exploit the x86 idiv remainder trick, ending up calling it twice even with /O2. godbolt proof: godbolt.org/z/YM16j3xes . gcc generates a nice minimal assembly with one idiv a testne and an add. clang too, one idiv only, a cmp and a weird sbb eax,-1

v.oddou Over a year ago

More on this matter: it appears that it's a regression that happened between MSVC versions 19.29 (VS 16.11) and 19.30 (VS 17).

Henrik Alsing Friberg Mar 7 at 12:12

@v.oddou MSVC treats x and y as volatile and generates correct code for that case. Turn them into function arguments and the assembly is good with only one idiv.


Tatsuyuki Ishi · Accepted Answer · 2018-01-08 09:44:43Z

Sparky's answer is one standard way to solve this problem, but as I also wrote in my comment, you run the risk of overflows. This can be solved by using a wider type, but what if you want to divide long longs?

Nathan Ernst's answer provides one solution, but it involves a function call, a variable declaration and a conditional, which makes it no shorter than the OP's code and probably even slower, because it is harder to optimize.

My solution is this:

q = (x % y) ? x / y + 1 : x / y;

It should be slightly faster than the OP's code, because the compiler can see that the modulo and the division use the same operands and can perform both with a single instruction on the processor. At least gcc 4.4.1 performs this optimization with the -O2 flag on x86.

In theory the compiler might inline the function call in Nathan Ernst's code and emit the same thing, but gcc didn't do that when I tested it. This might be because it would tie the compiled code to a single version of the standard library.

As a final note, none of this matters on a modern machine, except if you are in an extremely tight loop and all your data is in registers or the L1-cache. Otherwise all of these solutions will be equally fast, except for possibly Nathan Ernst's, which might be significantly slower if the function has to be fetched from main memory.
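Since the motivation here is dividing long longs, where no wider type is available, the same expression can be written once for any integer type. This is only a sketch: ceil_div_pos is a hypothetical name, and it assumes x >= 0 and y > 0:

```cpp
#include <limits>
#include <type_traits>

// Ceiling division for non-negative x and positive y, any integer type.
template <typename T>
constexpr T ceil_div_pos(T x, T y) {
    static_assert(std::is_integral<T>::value, "integer types only");
    return (x % y) ? x / y + 1 : x / y;
}

// Works at the top of the long long range, where x + y - 1 would overflow.
static_assert(ceil_div_pos<long long>(std::numeric_limits<long long>::max(), 2) ==
                  std::numeric_limits<long long>::max() / 2 + 1,
              "no overflow near the maximum");
static_assert(ceil_div_pos(11, 5) == 3, "rounds up");
```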

There was an easier way to fix overflow, simply reduce y/y: q = (x > 0)? 1 + (x - 1)/y: (x / y);
No, it does not. As I explained in the answer, the % operator is free when you already perform the division.
Then q = x / y + (x % y > 0); is easier than ? : expression?
It depends on what you mean by "easier." It may or may not be faster, depending on how the compiler translates it. My guess would be slower but I would have to measure it to be sure.
I don't see how adding a branch instruction should make this faster, actually.

Nathan Ernst · Accepted Answer · 2010-04-30 14:39:21Z

You could use the div function in cstdlib to get the quotient & remainder in a single call and then handle the ceiling separately, as in the code below:

#include <cstdlib>
#include <iostream>

int div_ceil(int numerator, int denominator)
{
    std::div_t res = std::div(numerator, denominator);
    return res.rem ? (res.quot + 1) : res.quot;
}

int main(int, const char**)
{
    std::cout << "10 / 5 = " << div_ceil(10, 5) << std::endl;
    std::cout << "11 / 5 = " << div_ceil(11, 5) << std::endl;
    return 0;
}

As an interesting case of the double bang, you could also return res.quot + !!res.rem; :)
Doesn't ldiv always promote the arguments into long long's? And doesn't that cost anything, up-casting or down-casting?
@einpoklum: std::div is overloaded for int, long, long long and intmax_t (the latter two since C++11); whether it internally promotes would be an implementation detail (and I can't see a strong reason for why they wouldn't implement it independently for each). ldiv promotes, but std::div shouldn't need to.
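To illustrate the two comments above, here is a sketch that uses the long long overload of std::div (available since C++11) together with the double-bang form. The name div_ceil_ll is made up, and like the answer's div_ceil it is only meant for a non-negative numerator and a positive denominator:

```cpp
#include <cassert>
#include <cstdlib>

// Ceiling division via std::div; assumes numerator >= 0 and denominator > 0.
inline long long div_ceil_ll(long long numerator, long long denominator) {
    // Overload resolution picks the long long version, which returns std::lldiv_t.
    const std::lldiv_t res = std::div(numerator, denominator);
    return res.quot + !!res.rem;  // !!res.rem is 1 for any non-zero remainder
}

// Small self-check; call it from main() if desired.
inline void check_div_ceil_ll() {
    assert(div_ceil_ll(10, 5) == 2);
    assert(div_ceil_ll(11, 5) == 3);
}
```

Whether this compiles to a library call or gets inlined is up to the implementation, as discussed in the other answers.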

cubuspl42 · Accepted Answer · 2021-10-25 12:05:31Z

17

There's a solution for both positive and negative x but only for positive y with just 1 division and without branches:

int div_ceil(int x, int y) { return x / y + (x % y > 0); }

Note: if x is positive, then division rounds toward zero, and we should add 1 if the remainder is not zero.

If x is negative, then division rounds toward zero, which is what we need, and we will not add anything because x % y is not positive.
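A few compile-time checks make the two cases concrete. This sketch copies the function with constexpr added so the checks can run at compile time, and it assumes C++11 or later, where division truncates toward zero and the remainder takes the sign of the dividend. The last check steps outside the stated precondition to show why y must be positive:

```cpp
// Same expression as above, marked constexpr for the checks below.
constexpr int div_ceil(int x, int y) { return x / y + (x % y > 0); }

static_assert(div_ceil(7, 2) == 4, "3.5 rounds up to 4");
static_assert(div_ceil(-7, 2) == -3, "-3.5 rounds up to -3");
static_assert(div_ceil(-8, 2) == -4, "exact division is unchanged");
static_assert(div_ceil(0, 2) == 0, "zero dividend");
// Precondition violated (y < 0): -7 / -2 truncates to 3 and -7 % -2 is -1,
// so nothing is added and the result is 3 rather than the ceiling 4.
static_assert(div_ceil(-7, -2) == 3, "not the ceiling for negative y");
```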

edited Oct 25, 2021 at 12:05

cubuspl42


answered Jun 13, 2015 at 23:06

RiaD


5 Comments

Wolf Over a year ago

interesting, because there are common cases with y being constant

M.kazem Akhgary Over a year ago

mod requires division so it's not just 1 division here, but maybe the compiler can optimize two similar divisions into one.

cubuspl42 Over a year ago

This comment implies that modern architectures can divide and calculate the modulo with one instruction. That still requires a smart compiler, of course.

RARE Kpop Manifesto Sep 16 at 19:03

@RiaD : if both x and y were negative, both { x, y } < -1, but not divisible, your approach would end up being either truncated or floored division, because (x % y) < 0 even though the quotient is >= 0.

RiaD Sep 18 at 15:43

I explicitly wrote that it's only for positive y

Ben Voigt · Accepted Answer · 2010-05-01 04:38:45Z

How about this? (requires y non-negative, so don't use this in the rare case where y is a variable with no non-negativity guarantee)

q = (x > 0)? 1 + (x - 1)/y: (x / y);

I reduced y/y to one, eliminating the term x + y - 1 and with it any chance of overflow.

I avoid x - 1 wrapping around when x is an unsigned type and contains zero.

For signed x, negative and zero still combine into a single case.

Probably not a huge benefit on a modern general-purpose CPU, but this would be far faster in an embedded system than any of the other correct answers.
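Here is a small sketch of that expression as a function for signed int, with checks at the edge of the range. The name ceil_div_nowrap is hypothetical and y > 0 is assumed:

```cpp
#include <climits>

// For x > 0 use 1 + (x - 1) / y; otherwise truncating division is already
// the ceiling (given y > 0). No intermediate value exceeds x in magnitude.
constexpr int ceil_div_nowrap(int x, int y) {
    return (x > 0) ? 1 + (x - 1) / y : x / y;
}

static_assert(ceil_div_nowrap(7, 2) == 4, "rounds up");
static_assert(ceil_div_nowrap(0, 5) == 0, "zero takes the else branch");
static_assert(ceil_div_nowrap(-7, 2) == -3, "negative x: truncation is the ceiling");
static_assert(ceil_div_nowrap(INT_MAX, 2) == INT_MAX / 2 + 1, "no overflow at INT_MAX");
static_assert(ceil_div_nowrap(INT_MAX, 1) == INT_MAX, "no overflow with y == 1");
```

For signed x the else branch is not always zero, as the -7 case shows; it is only always zero when x is unsigned.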

Your else will always return 0, no need to calculate anything.

Greg Kramida · Accepted Answer · 2020-05-31 22:13:27Z

9

I would have rather commented but I don't have a high enough rep.

As far as I am aware, for positive arguments and a divisor which is a power of 2, this is the fastest way (tested in CUDA):

// example: y = 8
q = (x >> 3) + !!(x & 7);

For generic positive arguments only, I tend to do it like so:

q = x/y + !!(x % y);
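The y = 8 example generalizes to any divisor of the form 2^k. A sketch in plain C++ (not CUDA-specific), where ceil_div_pow2 and the shift parameter k are names introduced here for illustration and k is assumed to be smaller than the bit width of unsigned:

```cpp
// Ceiling of x / 2^k: shift for the quotient, mask for the remainder.
constexpr unsigned ceil_div_pow2(unsigned x, unsigned k) {
    return (x >> k) + ((x & ((1u << k) - 1u)) != 0);
}

static_assert(ceil_div_pow2(16, 3) == 2, "16 / 8 is exact");
static_assert(ceil_div_pow2(17, 3) == 3, "17 / 8 rounds up");
static_assert(ceil_div_pow2(17, 3) == 17 / 8 + !!(17 % 8), "matches the generic form");
```

When y is a compile-time constant power of two, compilers will often produce shift-and-mask code from the generic form as well, so it may be worth checking the generated assembly before hand-optimizing.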

edited May 31, 2020 at 22:13

Greg Kramida


answered Feb 11, 2019 at 3:08

OffBy0x01


2 Comments

Greg Kramida Over a year ago

It would be interesting to see how q = x/y + !!(x % y); stacks up against q = x/y + (x % y == 0); and the q = (x + y - 1) / y; solutions performance-wise in contemporary CUDA.

Sean W Over a year ago

seems like q = x/y + (x % y == 0); should be q = x/y + (x % y != 0); instead

Eliahu Aaron · Accepted Answer · 2019-07-01 17:02:41Z

5

This works for positive or negative numbers:

q = x / y + ((x % y != 0) ? !((x > 0) ^ (y > 0)) : 0);

If there is a remainder, checks to see if x and y are of the same sign and adds 1 accordingly.

edited Jul 1, 2019 at 17:02

Eliahu Aaron


answered Mar 14, 2014 at 22:45

Mark Conway


2 Comments

Alan Kałuża Over a year ago

Doesn't work with a negative x and a positive y.

RARE Kpop Manifesto Sep 16 at 19:17

!((x > 0) ^ (y > 0)) - what a convoluted way of saying ( x <= 0 )^( 0 < y ) - you're essentially trying to say "sign matches", or XNOR - so just invert one side of the xor equation, then you can skip the logical negate altogether

evoskuil · Accepted Answer · 2021-06-15 12:03:40Z

For signed or unsigned integers.

q = x / y + !(((x < 0) != (y < 0)) || !(x % y));

For signed dividends and unsigned divisors.

q = x / y + !((x < 0) || !(x % y));

For unsigned dividends and signed divisors.

q = x / y + !((y < 0) || !(x % y));

For unsigned integers.

q = x / y + !!(x % y);

Zero divisor fails (as with a native operation). Cannot cause overflow.
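The fully signed form can be checked against all four sign combinations with a short sketch. The function name ceil_div_signed is made up here. As far as I know, INT_MIN / -1 still overflows exactly as the native division does, so "cannot cause overflow" refers to the added correction term only:

```cpp
// Add 1 only when the signs match (true quotient is positive) and the
// division is inexact; otherwise truncation is already the ceiling.
constexpr int ceil_div_signed(int x, int y) {
    return x / y + !(((x < 0) != (y < 0)) || !(x % y));
}

static_assert(ceil_div_signed(7, 2) == 4, "+/+ : 3.5 -> 4");
static_assert(ceil_div_signed(-7, 2) == -3, "-/+ : -3.5 -> -3");
static_assert(ceil_div_signed(7, -2) == -3, "+/- : -3.5 -> -3");
static_assert(ceil_div_signed(-7, -2) == 4, "-/- : 3.5 -> 4");
static_assert(ceil_div_signed(8, -2) == -4, "exact division is unchanged");
static_assert(ceil_div_signed(0, -2) == 0, "zero dividend");
```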

Corresponding floored and modulo constexpr implementations here, along with templates to select the necessary overloads (as full optimization and to prevent mismatched sign comparison warnings):

https://github.com/libbitcoin/libbitcoin-system/wiki/Integer-Division-Unraveled

Community · Accepted Answer · 2017-05-23 12:02:44Z

simplified generic form,

int div_up(int n, int d) {
    return n / d + (((n < 0) ^ (d > 0)) && (n % d));
}
// i.e. +1 iff (not exact int && positive result)
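One way to gain confidence in a sign-handling expression like this is to compare it against the floating-point ceiling over a small range, where doubles represent every operand exactly. A sketch, with count_mismatches as a hypothetical helper:

```cpp
#include <cassert>
#include <cmath>

int div_up(int n, int d) {
    return n / d + (((n < 0) ^ (d > 0)) && (n % d));
}

// Counts the (n, d) pairs in [-limit, limit] with d != 0 for which div_up
// disagrees with ceil(n / d) computed in double precision.
int count_mismatches(int limit) {
    int mismatches = 0;
    for (int n = -limit; n <= limit; ++n) {
        for (int d = -limit; d <= limit; ++d) {
            if (d == 0) continue;  // division by zero is excluded
            const int expected =
                static_cast<int>(std::ceil(static_cast<double>(n) / d));
            if (div_up(n, d) != expected) ++mismatches;
        }
    }
    return mismatches;
}
```

Calling count_mismatches(50) should return 0. The floating-point reference is only trustworthy for small magnitudes, so this is a test aid rather than a replacement for the integer expression.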

For a more generic answer, C++ functions for integer division with well defined rounding strategy

qwr · Accepted Answer · 2025-10-24 02:27:20Z

With the usual caveats about profiling if this really matters (it won't unless you're doing this A LOT):

As @phuclv says, on modern processors the quotient and remainder are calculated in one instruction. All of these assume unsigned numbers and do not worry about overflow. With x86-64 GCC -O3,

unsigned int f(unsigned int x, unsigned int y) { return x / y + (x % y != 0); }

produces

mov eax, edi
xor edx, edx    # zero edx
div esi         # divides edx:eax (x) by esi (y)
                # eax = quotient, edx = remainder
cmp edx, 1      # set CF = (edx - 1 < 0), i.e. edx == 0
sbb eax, -1     # eax -= CF - 1, i.e. eax += 1 - CF, no branch
ret

https://godbolt.org/z/4Gsn3Kj5s

unsigned int f(unsigned int x, unsigned int y) { return (x + y - 1) / y; }

is clever and uses lea to do the addition and subtraction

lea eax, [rsi-1+rdi]
xor edx, edx
div esi
ret

https://godbolt.org/z/9sfsc1Wa5

For 64-bit inputs, the results are similar but with 64-bit registers instead.

I would guess LEA is faster than CMP/SBB as LEA is a fast instruction, but I didn't benchmark anything.

In a deleted answer, @Matt suggests the remainder increment version is faster, but his g++ compile command didn't include an optimization flag, which is suspect.
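For anyone who wants to measure rather than guess, a rough timing harness could look like the sketch below. Everything here (ceil_rem, ceil_add, Timing, time_loop, the divisor pattern) is made up for illustration, it should be built with optimization enabled, and a loop like this mostly measures division throughput on one particular machine, so treat any numbers with caution:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// The two forms compared in this answer.
inline unsigned ceil_rem(unsigned x, unsigned y) { return x / y + (x % y != 0); }
inline unsigned ceil_add(unsigned x, unsigned y) { return (x + y - 1) / y; }

struct Timing {
    std::uint64_t checksum;  // sum of all results, so the work has a visible use
    double nanoseconds;      // wall-clock time for the whole loop
};

// Runs f over a fixed sequence of inputs and reports the elapsed time.
template <typename F>
Timing time_loop(F f, unsigned iterations) {
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (unsigned i = 1; i <= iterations; ++i) {
        sum += f(i, i % 97u + 1u);  // divisors cycle through 1..97
    }
    const auto stop = std::chrono::steady_clock::now();
    return Timing{sum, std::chrono::duration<double, std::nano>(stop - start).count()};
}

// Both forms must agree on these inputs (no wraparound occurs for them).
inline void check_same_results() {
    const Timing a = time_loop([](unsigned x, unsigned y) { return ceil_rem(x, y); }, 100000u);
    const Timing b = time_loop([](unsigned x, unsigned y) { return ceil_add(x, y); }, 100000u);
    assert(a.checksum == b.checksum);
}
```

Comparing the nanoseconds fields over many repeated runs gives a first impression; a dedicated benchmarking library would handle warm-up and variance more carefully.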

user2165 · Accepted Answer · 2019-06-09 08:19:12Z

-4

Compile with O3; the compiler performs the optimization well.

q = x / y; if (x % y) ++q;

edited Jun 9, 2019 at 8:19

user2165


answered Jun 9, 2019 at 6:27

dhb

