84

Given an NxN binary matrix (containing only 0's or 1's), how can we go about finding largest rectangle containing all 0's?

Example:

 I 0 0 0 0 1 0 0 0 1 0 0 1 II->0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 <--IV 0 0 1 0 0 0 IV 

For the above example, it is a 6×6 binary matrix. the return value in this case will be Cell 1:(2, 1) and Cell 2:(4, 4). The resulting sub-matrix can be square or rectangular. The return value can also be the size of the largest sub-matrix of all 0's, in this example 3 × 4.

4
  • 1
    Please consider changing the accepted answer to J.F. Sebastian's answer, which is now correct and has optimal complexity. Commented Jan 15, 2011 at 4:18
  • 1
    Please check very similar (I'd say duplicate) questions: stackoverflow.com/questions/7770945/… , stackoverflow.com/a/7353193/684229 . The solution is O(n). Commented Mar 29, 2012 at 20:15
  • I'm trying to do the same with a rectangle oriented in any direction. see question: stackoverflow.com/questions/22604043/… Commented Mar 24, 2014 at 8:16
  • @TMS It's actually the other way around. Those questions are duplicates of this one. Commented May 24, 2015 at 0:46

9 Answers 9

48

Here's a solution based on the "Largest Rectangle in a Histogram" problem suggested by @j_random_hacker in the comments:

[Algorithm] works by iterating through rows from top to bottom, for each row solving this problem, where the "bars" in the "histogram" consist of all unbroken upward trails of zeros that start at the current row (a column has height 0 if it has a 1 in the current row).

The input matrix mat may be an arbitrary iterable e.g., a file or a network stream. Only one row is required to be available at a time.

#!/usr/bin/env python from collections import namedtuple from operator import mul Info = namedtuple('Info', 'start height') def max_size(mat, value=0): """Find height, width of the largest rectangle containing all `value`'s.""" it = iter(mat) hist = [(el==value) for el in next(it, [])] max_size = max_rectangle_size(hist) for row in it: hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)] max_size = max(max_size, max_rectangle_size(hist), key=area) return max_size def max_rectangle_size(histogram): """Find height, width of the largest rectangle that fits entirely under the histogram. """ stack = [] top = lambda: stack[-1] max_size = (0, 0) # height, width of the largest rectangle pos = 0 # current position in the histogram for pos, height in enumerate(histogram): start = pos # position where rectangle starts while True: if not stack or height > top().height: stack.append(Info(start, height)) # push elif stack and height < top().height: max_size = max(max_size, (top().height, (pos - top().start)), key=area) start, _ = stack.pop() continue break # height == top().height goes here pos += 1 for start, height in stack: max_size = max(max_size, (height, (pos - start)), key=area) return max_size def area(size): return reduce(mul, size) 

The solution is O(N), where N is the number of elements in a matrix. It requires O(ncols) additional memory, where ncols is the number of columns in a matrix.

Latest version with tests is at https://gist.github.com/776423

Sign up to request clarification or add additional context in comments.

15 Comments

Good try, but this fails max_size([[0,0,0,0,1,1,1], [0,0,0,0,0,0,0], [0,0,0,1,1,1,1], [0,0,1,1,1,1,1]] + [[1,0,1,1,1,1,1]] * 3), returning (2, 4) when there is a 3x3 square in the top left.
The basic problem is that it's not always sufficient to track just (several) largest-area rectangles of neighbouring points as you're doing here. The only O(N) algorithm that I know to be correct works by iterating through rows from top to bottom, for each row solving this problem: stackoverflow.com/questions/4311694/…, where the "bars" in the "histogram" consist of all unbroken upward trails of zeros that start at the current row (a column has height 0 if it has a 1 in the current row).
@j_random_hacker: I've updated my answer to use "histogram"-based algorithm.
This looks great, however, I'm trying to actually FIND the largest rectangle (as in, return the coordinates). This algorithm will reliably return the area, but once I know that, how would a person discover that the location of a 3 column x 2 row rectangle, with its upper-left corner at [3, 5] (for example)?
where does one get the bouding column info? (left or right column of the rectangle?). We can get width and height from max_rectangle_size, and the bottom row from the for row in it: iteration, but I can't find the bounding column info.
|
30

Please take a look at Maximize the rectangular area under Histogram and then continue reading the solution below.

Traverse the matrix once and store the following; For x=1 to N and y=1 to N F[x][y] = 1 + F[x][y-1] if A[x][y] is 0 , else 0 Then for each row for x=N to 1 We have F[x] -> array with heights of the histograms with base at x. Use O(N) algorithm to find the largest area of rectangle in this histogram = H[x] From all areas computed, report the largest. 

Time complexity is O(N*N) = O(N²) (for an NxN binary matrix)

Example:

Initial array F[x][y] array 0 0 0 0 1 0 1 1 1 1 0 1 0 0 1 0 0 1 2 2 0 2 1 0 0 0 0 0 0 0 3 3 1 3 2 1 1 0 0 0 0 0 0 4 2 4 3 2 0 0 0 0 0 1 1 5 3 5 4 0 0 0 1 0 0 0 2 6 0 6 5 1 For x = N to 1 H[6] = 2 6 0 6 5 1 -> 10 (5*2) H[5] = 1 5 3 5 4 0 -> 12 (3*4) H[4] = 0 4 2 4 3 2 -> 10 (2*5) H[3] = 3 3 1 3 2 1 -> 6 (3*2) H[2] = 2 2 0 2 1 0 -> 4 (2*2) H[1] = 1 1 1 1 0 1 -> 4 (1*4) The largest area is thus H[5] = 12 

4 Comments

nice explanation with example
are you sure this is O(N*N)? There are two passes over the whole matrix, but my impression is this is O(N).
very nice explanation.. :) I wish, you would have explained "Maximize the rectangular area under Histogram" too.. :D
To make it more clear. The solution is O(N*N), where N is the number of items in a row/col since the question states the input is NxN in size. If N was the total number of items in the input, then it is O(N)
14

Here is a Python3 solution, which returns the position in addition to the area of the largest rectangle:

#!/usr/bin/env python3 import numpy s = '''0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0''' nrows = 6 ncols = 6 skip = 1 area_max = (0, []) a = numpy.fromstring(s, dtype=int, sep=' ').reshape(nrows, ncols) w = numpy.zeros(dtype=int, shape=a.shape) h = numpy.zeros(dtype=int, shape=a.shape) for r in range(nrows): for c in range(ncols): if a[r][c] == skip: continue if r == 0: h[r][c] = 1 else: h[r][c] = h[r-1][c]+1 if c == 0: w[r][c] = 1 else: w[r][c] = w[r][c-1]+1 minw = w[r][c] for dh in range(h[r][c]): minw = min(minw, w[r-dh][c]) area = (dh+1)*minw if area > area_max[0]: area_max = (area, [(r-dh, c-minw+1, r, c)]) print('area', area_max[0]) for t in area_max[1]: print('Cell 1:({}, {}) and Cell 2:({}, {})'.format(*t)) 

Output:

area 12 Cell 1:(2, 1) and Cell 2:(4, 4) 

1 Comment

Works great! I made a Fortran version out from this and compiled it to use in Python, as traversing a large array in Python like this is painfully slow.
5

Here is J.F. Sebastians method translated into C#:

private Vector2 MaxRectSize(int[] histogram) { Vector2 maxSize = Vector2.zero; int maxArea = 0; Stack<Vector2> stack = new Stack<Vector2>(); int x = 0; for (x = 0; x < histogram.Length; x++) { int start = x; int height = histogram[x]; while (true) { if (stack.Count == 0 || height > stack.Peek().y) { stack.Push(new Vector2(start, height)); } else if(height < stack.Peek().y) { int tempArea = (int)(stack.Peek().y * (x - stack.Peek().x)); if(tempArea > maxArea) { maxSize = new Vector2(stack.Peek().y, (x - stack.Peek().x)); maxArea = tempArea; } Vector2 popped = stack.Pop(); start = (int)popped.x; continue; } break; } } foreach (Vector2 data in stack) { int tempArea = (int)(data.y * (x - data.x)); if(tempArea > maxArea) { maxSize = new Vector2(data.y, (x - data.x)); maxArea = tempArea; } } return maxSize; } public Vector2 GetMaximumFreeSpace() { // STEP 1: // build a seed histogram using the first row of grid points // example: [true, true, false, true] = [1,1,0,1] int[] hist = new int[gridSizeY]; for (int y = 0; y < gridSizeY; y++) { if(!invalidPoints[0, y]) { hist[y] = 1; } } // STEP 2: // get a starting max area from the seed histogram we created above. // using the example from above, this value would be [1, 1], as the only valid area is a single point. // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3. // Note that at this step, the heigh fo the found rectangle will always be 1 because we are operating on // a single row of data. Vector2 maxSize = MaxRectSize(hist); int maxArea = (int)(maxSize.x * maxSize.y); // STEP 3: // build histograms for each additional row, re-testing for new possible max rectangluar areas for (int x = 1; x < gridSizeX; x++) { // build a new histogram for this row. the values of this row are // 0 if the current grid point is occupied; otherwise, it is 1 + the value // of the previously found historgram value for the previous position. // What this does is effectly keep track of the height of continous avilable spaces. // EXAMPLE: // Given the following grid data (where 1 means occupied, and 0 means free; for clairty): // INPUT: OUTPUT: // 1.) [0,0,1,0] = [1,1,0,1] // 2.) [0,0,1,0] = [2,2,0,2] // 3.) [1,1,0,1] = [0,0,1,0] // // As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous // free space. for (int y = 0; y < gridSizeY; y++) { if(!invalidPoints[x, y]) { hist[y] = 1 + hist[y]; } else { hist[y] = 0; } } // find the maximum size of the current histogram. If it happens to be larger // that the currently recorded max size, then it is the new max size. Vector2 maxSizeTemp = MaxRectSize(hist); int tempArea = (int)(maxSizeTemp.x * maxSizeTemp.y); if (tempArea > maxArea) { maxSize = maxSizeTemp; maxArea = tempArea; } } // at this point, we know the max size return maxSize; } 

A few things to note about this:

  1. This version is meant for use with the Unity API. You can easily make this more generic by replacing instances of Vector2 with KeyValuePair. Vector2 is only used for a convenient way to store two values.
  2. invalidPoints[] is an array of bool, where true means the grid point is "in use", and false means it is not.

Comments

3

Solution with space complexity O(columns) [Can be modified to O(rows) also] and time complexity O(rows*columns)

public int maximalRectangle(char[][] matrix) { int m = matrix.length; if (m == 0) return 0; int n = matrix[0].length; int maxArea = 0; int[] aux = new int[n]; for (int i = 0; i < n; i++) { aux[i] = 0; } for (int i = 0; i < m; i++) { for (int j = 0; j < n; j++) { aux[j] = matrix[i][j] - '0' + aux[j]; maxArea = Math.max(maxArea, maxAreaHist(aux)); } } return maxArea; } public int maxAreaHist(int[] heights) { int n = heights.length; Stack<Integer> stack = new Stack<Integer>(); stack.push(0); int maxRect = heights[0]; int top = 0; int leftSideArea = 0; int rightSideArea = heights[0]; for (int i = 1; i < n; i++) { if (stack.isEmpty() || heights[i] >= heights[stack.peek()]) { stack.push(i); } else { while (!stack.isEmpty() && heights[stack.peek()] > heights[i]) { top = stack.pop(); rightSideArea = heights[top] * (i - top); leftSideArea = 0; if (!stack.isEmpty()) { leftSideArea = heights[top] * (top - stack.peek() - 1); } else { leftSideArea = heights[top] * top; } maxRect = Math.max(maxRect, leftSideArea + rightSideArea); } stack.push(i); } } while (!stack.isEmpty()) { top = stack.pop(); rightSideArea = heights[top] * (n - top); leftSideArea = 0; if (!stack.isEmpty()) { leftSideArea = heights[top] * (top - stack.peek() - 1); } else { leftSideArea = heights[top] * top; } maxRect = Math.max(maxRect, leftSideArea + rightSideArea); } return maxRect; } 

But I get Time Limite exceeded excpetion when I try this on LeetCode. Is there any less complex solution?

Comments

2

I propose a O(nxn) method.

First, you can list all the maximum empty rectangles. Empty means that it covers only 0s. A maximum empty rectangle is such that it cannot be extended in a direction without covering (at least) one 1.

A paper presenting a O(nxn) algorithm to create such a list can be found at www.ulg.ac.be/telecom/rectangles as well as source code (not optimized). There is no need to store the list, it is sufficient to call a callback function each time a rectangle is found by the algorithm, and to store only the largest one (or choose another criterion if you want).

Note that a proof exists (see the paper) that the number of largest empty rectangles is bounded by the number of pixels of the image (nxn in this case).

Therefore, selecting the optimal rectangle can be done in O(nxn), and the overall method is also O(nxn).

In practice, this method is very fast, and is used for realtime video stream analysis.

Comments

2

Here is a version of jfs' solution, which also delivers the position of the largest rectangle:

from collections import namedtuple from operator import mul Info = namedtuple('Info', 'start height') def max_rect(mat, value=0): """returns (height, width, left_column, bottom_row) of the largest rectangle containing all `value`'s. Example: [[0, 0, 0, 0, 0, 0, 0, 0, 3, 2], [0, 4, 0, 2, 4, 0, 0, 1, 0, 0], [1, 0, 1, 0, 0, 0, 3, 0, 0, 4], [0, 0, 0, 0, 4, 2, 0, 0, 0, 0], [0, 0, 0, 2, 0, 0, 0, 0, 0, 0], [4, 3, 0, 0, 1, 2, 0, 0, 0, 0], [3, 0, 0, 0, 2, 0, 0, 0, 0, 4], [0, 0, 0, 1, 0, 3, 2, 4, 3, 2], [0, 3, 0, 0, 0, 2, 0, 1, 0, 0]] gives: (3, 4, 6, 5) """ it = iter(mat) hist = [(el==value) for el in next(it, [])] max_rect = max_rectangle_size(hist) + (0,) for irow,row in enumerate(it): hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)] max_rect = max(max_rect, max_rectangle_size(hist) + (irow+1,), key=area) # irow+1, because we already used one row for initializing max_rect return max_rect def max_rectangle_size(histogram): stack = [] top = lambda: stack[-1] max_size = (0, 0, 0) # height, width and start position of the largest rectangle pos = 0 # current position in the histogram for pos, height in enumerate(histogram): start = pos # position where rectangle starts while True: if not stack or height > top().height: stack.append(Info(start, height)) # push elif stack and height < top().height: max_size = max(max_size, (top().height, (pos - top().start), top().start), key=area) start, _ = stack.pop() continue break # height == top().height goes here pos += 1 for start, height in stack: max_size = max(max_size, (height, (pos - start), start), key=area) return max_size def area(size): return size[0] * size[1] 

Comments

1

An appropriate algorithm can be found within Algorithm for finding the largest inscribed rectangle in polygon (2019).

I implemented it in python:

import largestinteriorrectangle as lir import numpy as np grid = np.array([[0, 0, 0, 0, 1, 0], [0, 0, 1, 0, 0, 1], [0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0]], "bool") grid = ~grid lir.lir(grid) # [1, 2, 4, 3] 

the result comes as x, y, width, height

Comments

0

To be complete, here's the C# version which outputs the rectangle coordinates. It's based on dmarra's answer but without any other dependencies. There's only the function bool GetPixel(int x, int y), which returns true when a pixel is set at the coordinates x,y.

 public struct INTRECT { public int Left, Right, Top, Bottom; public INTRECT(int aLeft, int aTop, int aRight, int aBottom) { Left = aLeft; Top = aTop; Right = aRight; Bottom = aBottom; } public int Width { get { return (Right - Left + 1); } } public int Height { get { return (Bottom - Top + 1); } } public bool IsEmpty { get { return Left == 0 && Right == 0 && Top == 0 && Bottom == 0; } } public static bool operator ==(INTRECT lhs, INTRECT rhs) { return lhs.Left == rhs.Left && lhs.Top == rhs.Top && lhs.Right == rhs.Right && lhs.Bottom == rhs.Bottom; } public static bool operator !=(INTRECT lhs, INTRECT rhs) { return !(lhs == rhs); } public override bool Equals(Object obj) { return obj is INTRECT && this == (INTRECT)obj; } public bool Equals(INTRECT obj) { return this == obj; } public override int GetHashCode() { return Left.GetHashCode() ^ Right.GetHashCode() ^ Top.GetHashCode() ^ Bottom.GetHashCode(); } } public INTRECT GetMaximumFreeRectangle() { int XEnd = 0; int YStart = 0; int MaxRectTop = 0; INTRECT MaxRect = new INTRECT(); // STEP 1: // build a seed histogram using the first row of grid points // example: [true, true, false, true] = [1,1,0,1] int[] hist = new int[Height]; for (int y = 0; y < Height; y++) { if (!GetPixel(0, y)) { hist[y] = 1; } } // STEP 2: // get a starting max area from the seed histogram we created above. // using the example from above, this value would be [1, 1], as the only valid area is a single point. // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3. // Note that at this step, the heigh fo the found rectangle will always be 1 because we are operating on // a single row of data. Tuple<int, int> maxSize = MaxRectSize(hist, out YStart); int maxArea = (int)(maxSize.Item1 * maxSize.Item2); MaxRectTop = YStart; // STEP 3: // build histograms for each additional row, re-testing for new possible max rectangluar areas for (int x = 1; x < Width; x++) { // build a new histogram for this row. the values of this row are // 0 if the current grid point is occupied; otherwise, it is 1 + the value // of the previously found historgram value for the previous position. // What this does is effectly keep track of the height of continous avilable spaces. // EXAMPLE: // Given the following grid data (where 1 means occupied, and 0 means free; for clairty): // INPUT: OUTPUT: // 1.) [0,0,1,0] = [1,1,0,1] // 2.) [0,0,1,0] = [2,2,0,2] // 3.) [1,1,0,1] = [0,0,1,0] // // As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous // free space. for (int y = 0; y < Height; y++) { if (!GetPixel(x, y)) { hist[y]++; } else { hist[y] = 0; } } // find the maximum size of the current histogram. If it happens to be larger // that the currently recorded max size, then it is the new max size. Tuple<int, int> maxSizeTemp = MaxRectSize(hist, out YStart); int tempArea = (int)(maxSizeTemp.Item1 * maxSizeTemp.Item2); if (tempArea > maxArea) { maxSize = maxSizeTemp; maxArea = tempArea; MaxRectTop = YStart; XEnd = x; } } MaxRect.Left = XEnd - maxSize.Item1 + 1; MaxRect.Top = MaxRectTop; MaxRect.Right = XEnd; MaxRect.Bottom = MaxRectTop + maxSize.Item2 - 1; // at this point, we know the max size return MaxRect; } private Tuple<int, int> MaxRectSize(int[] histogram, out int YStart) { Tuple<int, int> maxSize = new Tuple<int, int>(0, 0); int maxArea = 0; Stack<Tuple<int, int>> stack = new Stack<Tuple<int, int>>(); int x = 0; YStart = 0; for (x = 0; x < histogram.Length; x++) { int start = x; int height = histogram[x]; while (true) { if (stack.Count == 0 || height > stack.Peek().Item2) { stack.Push(new Tuple<int, int>(start, height)); } else if (height < stack.Peek().Item2) { int tempArea = (int)(stack.Peek().Item2 * (x - stack.Peek().Item1)); if (tempArea > maxArea) { YStart = stack.Peek().Item1; maxSize = new Tuple<int, int>(stack.Peek().Item2, (x - stack.Peek().Item1)); maxArea = tempArea; } Tuple<int, int> popped = stack.Pop(); start = (int)popped.Item1; continue; } break; } } foreach (Tuple<int, int> data in stack) { int tempArea = (int)(data.Item2 * (x - data.Item1)); if (tempArea > maxArea) { YStart = data.Item1; maxSize = new Tuple<int, int>(data.Item2, (x - data.Item1)); maxArea = tempArea; } } return maxSize; } 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.