
I need to solve the following competitive programming problem from coderun.ru:

Strength of a Pyramid Top
The pyramid consists of n horizontal layers of blocks: the first layer contains n blocks, the second layer contains n−1 blocks, and so on, up to the n-th layer, which consists of a single block. Each block has a certain strength, expressed as a positive integer (determined by various factors such as material type, density, etc.).

Since the first layer of the pyramid is already built, it remains unchanged. Each of the following n−1 layers is of one of two types: min or max, and the blocks on it are arranged as follows:

For a min layer: the strength of each block equals the minimum of the strengths of the two blocks beneath it.
For a max layer: the strength of each block equals the maximum of the strengths of the two blocks beneath it.
You are required to determine the strength of the block at the top of the pyramid, given the structure of the first layer and the type of each of the following n−1 layers.

Input format

The first line contains a single integer n (1 ≤ n ≤ 1,000,000) — the number of layers in the pyramid.

The second line contains n positive integers r1,r2,…,rn (1 ≤ ri ≤ n) — the strengths of the blocks in the first layer.

The third line contains n−1 characters from the set {M,m}, describing the layers. The i-th character describes the (i+1)-th layer:

M — a max layer
m — a min layer

Output format

A single integer — the strength of the block on the n-th (top) level.

Limits: the time limit is 4 seconds; the memory limit is 512 MB.

Examples

Test 1: Input

7
3 1 4 5 6 2 1
MmmMmM

Output

5

Test 2: Input

2
2 1
m

Output

1

I've written C++ code that builds a vector of doubly linked lists (v) from the monotonic runs of the elements (non-decreasing vs. decreasing). The idea is that for each linked list we can safely remove either its tail or its head, depending on whether the current operation is min or max.

If the next level is created as the minimum of two blocks from the lower level:

- remove the tails of non-decreasing linked lists (dir == 1),
- remove the heads of decreasing linked lists (dir == 0).

If the next level is created as the maximum of two blocks from the lower level:

- remove the heads of non-decreasing linked lists (dir == 1),
- remove the tails of decreasing linked lists (dir == 0).

We should also merge adjacent linked lists if:

- both are decreasing and the head of the right list is less than or equal to the tail of the left list, or
- both are non-decreasing and the head of the right list is greater than or equal to the tail of the left list.
- the pair of adjacent linked lists each contain only one distinct value (e.g. {{2,2,2}; {3,3}} becomes {2,2,2,3,3}).

This way, the overall time complexity of the algorithm should be about O(n log n), which should fit within the 4-second time limit when n is up to 1,000,000.

However, this solution doesn't fit within the time limit.

Could you please either point out the sub-optimalities of my solution/code or suggest a better one?

Here is my code:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Node {
    int data;
    Node* next;
    Node* prev;
    Node(int val) : data(val), next(nullptr), prev(nullptr) {}
};

class LinkedList {
private:
    Node* head;
    Node* tail;
    int length;
    bool dir;  // true = non-decreasing run, false = decreasing run

public:
    LinkedList() : head(nullptr), tail(nullptr), length(0), dir(true) {}
    LinkedList(const LinkedList&) = delete;
    LinkedList& operator=(const LinkedList&) = delete;

    LinkedList(LinkedList&& other) noexcept {
        head = other.head;
        tail = other.tail;
        length = other.length;
        dir = other.dir;
        other.head = other.tail = nullptr;
        other.length = 0;
    }

    LinkedList& operator=(LinkedList&& other) noexcept {
        if (this != &other) {
            clear_safe();
            head = other.head;
            tail = other.tail;
            length = other.length;
            dir = other.dir;
            other.head = other.tail = nullptr;
            other.length = 0;
        }
        return *this;
    }

    void setDir(bool d) { dir = d; }
    bool getDir() const { return dir; }

    void append(int val) {
        Node* newNode = new Node(val);
        if (!head) {
            head = tail = newNode;
        } else {
            tail->next = newNode;
            newNode->prev = tail;
            tail = newNode;
        }
        length++;
    }

    void prepend(int val) {
        Node* newNode = new Node(val);
        if (!head) {
            head = tail = newNode;
        } else {
            newNode->next = head;
            head->prev = newNode;
            head = newNode;
        }
        length++;
    }

    void pop_head() {
        if (!head) return;
        Node* toDelete = head;
        head = head->next;
        if (head) head->prev = nullptr;
        else tail = nullptr;
        delete toDelete;
        length--;
    }

    void pop_tail() {
        if (!tail) return;
        Node* toDelete = tail;
        tail = tail->prev;
        if (tail) tail->next = nullptr;
        else head = nullptr;
        delete toDelete;
        length--;
    }

    void clear_safe() { while (head) pop_head(); }

    Node* getHead() const { return head; }
    Node* getTail() const { return tail; }
    int getLength() const { return length; }
    int getHeadData() const { return head ? head->data : -1; }
    int getTailData() const { return tail ? tail->data : -1; }

    void merge(LinkedList&& other) {
        if (!other.head) return;
        if (!head) {
            head = other.head;
            tail = other.tail;
            length = other.length;
            dir = other.dir;
        } else {
            tail->next = other.head;
            other.head->prev = tail;
            tail = other.tail;
            length += other.length;
        }
        other.head = other.tail = nullptr;
        other.length = 0;
    }

    ~LinkedList() { clear_safe(); }
};

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(0);

    int n;
    cin >> n;

    // Split the bottom layer into monotonic runs.
    vector<LinkedList> v;
    LinkedList temp;
    bool flag = true;  // current run is non-decreasing
    int inter;
    cin >> inter;
    if (n == 1) {
        cout << inter << endl;
        return 0;
    }
    temp.append(inter);
    cin >> inter;
    if (inter < temp.getTail()->data) {
        flag = false;
        temp.setDir(false);
    }
    temp.append(inter);
    for (int i = 0; i < n - 2; i++) {
        cin >> inter;
        if (flag) {
            if (inter >= temp.getTail()->data) {
                temp.append(inter);
            } else {
                v.push_back(std::move(temp));
                temp = LinkedList();
                temp.setDir(false);
                flag = false;
                temp.append(inter);
            }
        } else {
            if (inter < temp.getTail()->data) {
                temp.append(inter);
            } else {
                v.push_back(std::move(temp));
                temp = LinkedList();
                temp.setDir(true);
                flag = true;
                temp.append(inter);
            }
        }
    }
    v.push_back(std::move(temp));

    string s;
    cin >> s;
    for (char op : s) {
        // Resolve the boundary between each pair of adjacent runs
        // by extending one of them with the boundary value.
        if (v.size() > 1) {
            for (int i = 0; i < v.size() - 1; i++) {
                int left_tail = v[i].getTailData();
                int right_head = v[i + 1].getHeadData();
                if (op == 'm') {
                    if (left_tail < right_head) v[i].append(left_tail);
                    else v[i + 1].prepend(right_head);
                } else {
                    if (left_tail >= right_head) v[i].append(left_tail);
                    else v[i + 1].prepend(right_head);
                }
            }
        }

        // Pop one element per run, from the head or the tail,
        // depending on the run direction and the layer type.
        vector<LinkedList> non_empty;
        for (auto& run : v) {
            if (run.getLength() > 0) {
                if (op == 'M') {
                    if (run.getDir()) run.pop_head();
                    else run.pop_tail();
                } else {
                    if (run.getDir()) run.pop_tail();
                    else run.pop_head();
                }
                if (run.getLength() > 0) non_empty.push_back(std::move(run));
            }
        }
        if (non_empty.empty()) break;
        v = move(non_empty);

        // Merge adjacent runs that are (or became) compatible.
        vector<LinkedList> merged;
        merged.push_back(std::move(v[0]));
        for (int i = 1; i < v.size(); i++) {
            LinkedList& current = v[i];
            LinkedList& last = merged.back();
            bool canMerge = false;
            if (last.getDir() == current.getDir()) {
                if (last.getDir()) {
                    if (last.getTailData() <= current.getHeadData()) canMerge = true;
                } else {
                    if (last.getTailData() >= current.getHeadData()) canMerge = true;
                }
            } else if (last.getHeadData() == last.getTailData()) {
                // A constant run can adopt either direction.
                if (current.getDir() && last.getTailData() <= current.getHeadData()) {
                    canMerge = true;
                    last.setDir(true);
                } else if (!current.getDir() && last.getTailData() >= current.getHeadData()) {
                    canMerge = true;
                    last.setDir(false);
                } else if (current.getHeadData() == last.getTailData()) {
                    canMerge = true;
                    if (last.getTailData() <= current.getHeadData()) last.setDir(true);
                    else last.setDir(false);
                }
            }
            if (canMerge) last.merge(std::move(current));
            else merged.push_back(std::move(current));
        }
        v = move(merged);
    }

    cout << v[0].getHeadData() << endl;
    return 0;
}
```

As an FYI: I've also tried to solve this problem as minimax optimization with alpha-beta pruning, but since the mins and maxes do not necessarily alternate, that defeats the whole idea of minimax optimization. Besides, because the pairs overlap (each block affects both the left and the right block on the level above), no pruning actually happens, though I gave it a spin. Also, alpha-beta pruning does not guarantee quick execution.

Additionally, I considered dynamic programming, but it would require O(n^2) time and hence won't work here due to the time limit (1,000,000^2 is too many iterations for 4 seconds).

  • Where on coderun.ru is this? (Commented Aug 29 at 10:46)
  • Please reply... (Commented Aug 31 at 16:14)
  • I see you've visited and replied elsewhere. Why are you ignoring me? (Commented Sep 4 at 10:48)
  • I have an O(n) solution that I would've liked to test there before I post it. Sad that you didn't want it... (Commented Sep 15 at 8:24)

3 Answers


General Observations

Writing your own linked list implementation may not perform as well as the standard library version; compiler and library writers do their best to optimize their implementations.

The fact that the code allocates and deletes memory node by node could be costing additional time. Some other standard library feature, such as a vector or an array, might perform much better; an array would be the fastest if you can use it. Using iterators rather than indexes could also improve the performance of each loop.
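This answer suggests a vector or an array; since the OP's runs pop elements from both ends, std::deque is another standard container worth noting here, as it offers O(1) pops at either end without a per-node allocation. A minimal sketch of one run under that substitution (the names are illustrative, not from the OP's code):

```cpp
#include <deque>

// One monotonic "run" of strengths, replacing the hand-rolled doubly
// linked list; pop_front()/pop_back() stand in for pop_head()/pop_tail().
struct Run {
    std::deque<int> vals;  // block strengths, left to right
    bool nonDecreasing;    // plays the role of the OP's dir flag
};
```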

If you are worried about time, it might be better to put the data directly into the proper data structures rather than repeatedly going over it multiple times; at least, that seems to be what is going on with temp and v.

The code appears to ignore all the C++ best practices:

  • The use of the non-standard header #include <bits/stdc++.h>.
  • The use of using namespace std;.
  • There are no functions or methods besides the main program, and it is 135 lines long with no vertical spacing between logic blocks.
  • In the main program many of the variables have single-character names, which makes it more difficult to understand what the code is doing.
  • If the code were self-documenting (good variable names and good function names) there might be no need for comments, but right now the lack of comments is a problem.

Because the code is written without functions, there is no automated way (e.g. a profiler's per-function report) to find where the bottlenecks may be.

The code is not portable, and it is barely readable, which means it can't be maintained.

Type Mismatch

The std::vector::size() function returns std::size_t, not int; this can cause issues. They are unlikely in this case, but using a wrongly sized variable could also cost performance.
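For example, the boundary loop over adjacent runs could use std::size_t so the index type matches what size() returns; a minimal sketch:

```cpp
// The i + 1 < v.size() form also avoids the unsigned wrap-around that
// v.size() - 1 would produce if the vector were ever empty.
for (std::size_t i = 0; i + 1 < v.size(); ++i) {
    int left_tail  = v[i].getTailData();
    int right_head = v[i + 1].getHeadData();
    // ... extend the left or the right run, as before ...
}
```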

  • Unfortunately, #include <bits/stdc++.h> and using namespace std; are part of the C++ submission template on some coding challenge websites. (Commented Aug 28 at 13:57)
  • @MartinR True, and definitely unfortunate. (Commented Aug 28 at 14:39)
  • There is a reason why I created my own doubly linked list struct: I need a direction property called Dir (non-descending, i.e. 1, vs. descending, i.e. 0) as well as a couple of methods related to that property, namely getDir to check whether the list is descending and setDir to assign a new value to the Dir property. I could have used pair<std::list, bool> and stored the direction in the second value of that pair, but I figured that it would increase the space load, which is unnecessary. And the functions were not used to speed up the execution, though I'm not sure if it helps. (Commented Aug 28 at 15:17)
  • @Stony: "[using a] pair<std::list, bool> and [storing] the direction in the second value of that pair […] would increase the space load" I seriously doubt that: Measure! (Commented Aug 28 at 20:06)
  • @greybeard - I've tried the updated code with pair{std::list, bool} instead of the custom LinkedList struct, and it does work slower and hence also didn't fit. (Commented Aug 29 at 14:58)

Review comment

Though it may not seem like much, a 'pyramid' with (maximally) 1e6 "layers" defines the nature of its layers with 1e6 - 1 'M's and 'm's. There's no need to preload ALL those definitions (into ANOTHER container).

Notice that a strong 'clue' has been given by the task designer. The value of any 'strength' is, maximally, the number of bricks in the bottom layer: maximally 1e6. All of those values appear on ONE line, and each can be as many as 7 digits wide. In a relatively extreme case, that second line of "values to use" could be on the order of (6+1) * 1e6 = seven million digits and spaces long (the +1 for the space between strengths). The next line, the "in flight" series of 'M'/'m' layer definitions, is at most 1e6 characters and FOLLOWS the longer line of "bottom layer strengths" (i.e. multiple digits per brick versus one single character per layer). The designer has given the clue that each 'M'/'m' can be read in and used during processing, instead of as part of some preparation pass in the code.

Shift the cin so that the 'M'/'m' nature of the current layer is retrieved only when needed for that layer's iteration. (The bottom layer has infinite support, of course.)
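A sketch of that shift (the function name is illustrative): since operator>> skips whitespace and reads one character at a time, the 'M'/'m' characters can be pulled from the stream exactly when each layer is built:

```cpp
#include <iostream>

// Read each layer's type only when that layer is about to be computed,
// instead of buffering the whole n - 1 character definition string.
void buildLayers(int n /* , plus whatever working buffers are used */) {
    for (int layer = 2; layer <= n; ++layer) {
        char type;
        std::cin >> type;  // 'M' or 'm' for this layer, fetched on demand
        // ... collapse the current layer according to 'type' ...
    }
}
```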

Recognise that (re-)instantiating vector<LinkedList> merged; (and any others) inside a loop does not come for 'free'. "Under the hood", those objects grow (or shrink) dynamically, too.

Analogy: 'stock' cars have their performance 'improved' by "experts" who understand their operation and 're-craft' their subsystems for optimisation; perhaps jettisoning the spare tire and the rear seat will lighten the vehicle for better performance at Sunday's race meet-up. In other words, the 'convenience' and 'robustness' of a standard library object risk introducing unseen, slightly costly, and in this case unnecessary 'flexibility' that contributes to slowing down completion of the task.

A linked list is a wonderful "workhorse" data model; flexible, dynamic, and easy to code. In use, however, linked lists are likely to be "cache-unfriendly": each useful item of "payload" data is welded to at least one 'pointer' value. This "bulking up" of each item reduces the density of useful data values. Result: more 'administrative' time spent fetching more lines into the cache. In this problem, each 4-byte int is accompanied by the OP's two doubly-linked-list pointers; that is, 2 x 8 = 16 extra bytes per node.

Additionally, the code's 'traversal while computing and storing' mimics pouring a liquid from one container into another. This could pour "back and forth" using fewer resources, at the cost of requiring a "parity awareness" (akin to what the OP calls dir). Using two simple, contiguous arrays, one for reference values and the other for recording results, would be much simpler to understand and to code. Toggling between the two (reversing their roles) requires only something like:

```cpp
// 'skeletal' demo
int parity = 0;
static int arr[2][1000000];  // an integral constant is required here ('1e6' is
                             // a double and won't compile); 'static' keeps the
                             // ~8 MB off the program stack (see the NB below)

// fill arr[0] with 'bottom layer' values
// ...
// fetch this layer's type: 'M' or 'm'
int *src = &arr[parity][0];
parity = !parity;            // 0 ==>> 1 and 1 ==>> 0
int *dst = &arr[parity][0];
// ...
// read "pairs" from *src
// compute and assign to *dst
// incr both ptrs and continue loop until done last brick on this layer
// loop to begin computing next layer
// ...
// have determined value of top brick
```

NB: Although shown as nested, each looping block should be in its own simple and easy-to-understand function: 'A' calls its worker 'B', 'B' calls its worker 'C'. Strive to keep things simple and modular...

NB: Consideration of "the program's stack" must not be overlooked. Massive arrays declared on the stack may risk triggering the dreaded "stack overflow".

Afterthought: using a 'doubly linked list' may have been getting off to a bad start (unless that was part of the purpose of completing this challenge).

An alternative would be to use the OP's LL, but as a "circular buffer" variation with singly linked nodes. One traversal of the original ring of nodes, using a simple counter, could halt at the intended location (or its predecessor). A small bit of pointer manipulation could 'snip' the stale 'tail' node from the ring, and then the next traversal could begin. In practice, the direction of processing would never need to reverse. (Simpler.) The ultimate end would be detectable when the 'ring' has been pared down to one node whose pointer points to itself. Following a variation of the insight noted by @jb_dk, each node and its adjacent node are combined to determine the new value of the current node, then the 'ribosome' moves on to make the next node its current node, and so on. Use of C++ library container classes may be overkill for this function; sometimes the simplest solution can be the best solution.
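A minimal sketch of that ring idea (the node type and function name are illustrative, not from the OP's code): each pass updates m - 1 bricks in place, reading the layer's type on demand, and then snips the single stale node out of the ring:

```cpp
#include <algorithm>
#include <iostream>

struct RNode { int val; RNode* next; };  // singly linked ring node

// Collapse a ring of n nodes (the bottom layer) down to the top brick.
int collapseRing(RNode* cur, int n) {
    for (int m = n; m > 1; --m) {          // m bricks on the current layer
        char op;
        std::cin >> op;                    // this layer's 'M'/'m', on demand
        for (int i = 0; i < m - 1; ++i) {  // update m - 1 bricks in place
            cur->val = (op == 'M') ? std::max(cur->val, cur->next->val)
                                   : std::min(cur->val, cur->next->val);
            if (i < m - 2) cur = cur->next;
        }
        RNode* stale = cur->next;          // the one brick with no block above
        cur->next = stale->next;           // 'snip' it out of the ring
        delete stale;
        cur = cur->next;                   // first brick of the next layer
    }
    return cur->val;                       // ring pared down to one node
}
```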


Suggested improvement

Read about "Sobel Filters" used in image processing.

The parallel with your project is strong.

In brief, 2D "arrays" are used to store individual "pixel" values.
Traversing the array, the code uses "neighbouring" pixel values to 'tweak' the value of each pixel, one by one.

An unchanging array size needs ONE allocation of a working memory buffer.
Dynamic allocation can be relatively expensive.
Find and use an existing paradigm as a 'model' for your objective.
In your case, it seems only two "levels" are needed.
(You may get away with two or three working buffers,
depending on how you redesign the internal data representation.)

Familiarise yourself with "cache hits and misses".
Anything you can do to improve the chance that, having used one parcel of data, the next parcel is already available in the cache will increase performance.
E.g. if presented with a 100K-'brick' problem where the data for 20 'bricks' fits in one cache line, then increasing that "density" to 25 'bricks' per cache line could achieve something approaching a 25% speed improvement. These are made-up values, used only to illustrate the concept of designing toward "cache friendly" (i.e. faster) code.

Another analogue would be "Conway's Game of Life".


The first question to ask oneself is, "What exists that I can use as a good model for this task I'm undertaking?"

Perhaps the next question is, "What is most important about this project?"
When used as they were intended to be used, the "generic" and reliable classes and functions offered by the language's libraries are suitable for most cases. When one wants or needs, for instance, "fast-fast", DIY code meticulously designed, coded, and tested might be worth the time required for its development.

  • To your points about linked lists, the OP may wish to peruse this Stack Overflow question and its answers: stackoverflow.com/questions/34170566/… (Commented Aug 29 at 17:15)

A better solution is to treat the problem as a pure math problem first and a programming task second.
So first look at the logic of the problem and derive a simpler algorithm that doesn't keep all the bricks of all the layers in memory at the same time.
Then implement the optimized algorithm so that it completes in 4 seconds.

Somehow, you will need to load the input in less than 4 seconds and find the result in the leftover time.

Given the structure of the input data, perhaps use the bits from the m/M row to determine whether the final result will be the min of all bottom bricks (BBs), the max of all BBs, or something else.
Then apply that conclusion to the loop over the numbers row. This should greatly reduce the memory needed.

Example reasoning: if the 1-brick layer is m, only the min of the 2-brick layer matters; if it is M, only the max.
If the 2-brick layer has the same type, the conclusion extends to the 3-brick layer as well.
If the tip is m and the 2-brick layer is M, then the formula is min(max(a, b), max(b, c)) = max(b, min(a, c)).
In the opposite case it is max(min(a, b), min(b, c)) = min(b, max(a, c)).
Iterate the argument all the way down to the bottom layer, and read in the strengths as needed, to create an algorithm.
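The identities are easier to trust after a quick brute-force check; this throwaway snippet (not part of any suggested solution) exhaustively verifies both forms over a small range:

```cpp
#include <algorithm>
#include <cassert>
using std::max;
using std::min;

int main() {
    // min(max(a,b), max(b,c)) == max(b, min(a,c)), and its dual.
    for (int a = 0; a < 6; ++a)
        for (int b = 0; b < 6; ++b)
            for (int c = 0; c < 6; ++c) {
                assert(min(max(a, b), max(b, c)) == max(b, min(a, c)));
                assert(max(min(a, b), min(b, c)) == min(b, max(a, c)));
            }
    return 0;
}
```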

A pure programming approach would be to create a streaming but direct algorithm: load the m/M bits into a compact bit array (N bits) and keep an array of the last brick strength per layer (N values).
As each strength number is loaded, update the array.
At the end, the entry for the tip holds the result.
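A minimal, runnable sketch of that "last brick per layer" idea (variable names are illustrative). Because the m/M line follows the strengths in this problem's input, the sketch buffers both and then replays the strengths through the update; each incoming strength ripples up one diagonal of the pyramid, so the total work is about N*N/2 min/max operations, as noted below:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    vector<int> r(n);
    for (int& x : r) cin >> x;  // bottom-layer strengths
    string s;                   // s[k-1] is the type of (0-indexed) layer k
    if (n > 1) cin >> s;

    // last[k] = most recently computed brick on layer k.
    vector<int> last(n);
    for (int j = 0; j < n; ++j) {
        int cur = r[j];       // newest brick, starting on the bottom layer
        int prev = last[0];   // the brick to its left, saved before overwriting
        last[0] = cur;
        for (int k = 1; k <= j; ++k) {
            cur = (s[k - 1] == 'M') ? max(prev, cur) : min(prev, cur);
            prev = last[k];
            last[k] = cur;
        }
    }
    cout << last[n - 1] << '\n';  // the tip
    return 0;
}
```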

An even simpler programming approach would be to keep an array of the strengths in the current layer (initially the input values), and then apply each layer type in place, overwriting all but the last array element, which is logically discarded but not deallocated.
In the end the array will contain one valid brick strength (the result) and N-1 stale values describing the bricks on the ending slope of the pyramid.
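A sketch of this in-place variant (again illustrative, not the OP's code); after pass k, the first n - k entries hold layer k + 1, and the trailing entries are the stale "ending slope" values:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    vector<int> a(n);
    for (int& x : a) cin >> x;  // layer 1 strengths
    string s;
    if (n > 1) cin >> s;        // layer types, bottom-up

    for (int k = 1; k < n; ++k) {
        bool isMax = (s[k - 1] == 'M');
        // Left to right, so a[i + 1] still holds the previous layer's value.
        for (int i = 0; i + k < n; ++i)
            a[i] = isMax ? max(a[i], a[i + 1]) : min(a[i], a[i + 1]);
    }
    cout << a[0] << '\n';       // a[0] is the top brick; the rest is stale
    return 0;
}
```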

To understand this kind of thinking read the first few chapters of "The Art of Computer Programming" by Donald Knuth and do a few of the exercises.

Overall, these suggestions should reduce the problem from O(N^2) to O(N) in memory, and perhaps in calculation time (my last two suggestions still require about N*N/2 min or max operations).
In contrast, your code would seem to allocate a lot of extra space for the strengths of all approximately N*N/2 bricks. For N = 1e6, that is 5e11 ints of memory, plus all your unneeded pointers.

Your mistakes are not uncommon.
In particular, some versions of Microsoft Windows used code like yours to decide which updates to install and in what order, causing the update management process to run for a very long time on every Windows computer in the world every month.

  • (Please check whether the line breaks I made effective are indeed intended.) (Commented Aug 29 at 7:01)
  • Thanks for the recommendation, but my original code only stores one layer of bricks, so space consumption has never been a problem. The only issue is the time complexity, which as per your recommendation stays at around O(n^2) and hence won't fit into 4 seconds when n = 1e6. (Commented Aug 29 at 11:06)
  • If you were just working on a single array describing a single layer, why were you faffing about with linked lists? Also, you need to do more work on reducing the problem mathematically to avoid O(n^2) comparisons and O(n^2) assignments. @greybeard: I think you broke up some paragraphs that contained chained arguments acting as single text blocks within the larger answer; I mashed those argument sentences back into single paragraphs to mark them as a logical unit of thought. (Commented Sep 4 at 15:51)
  • By all means, put together again what you think belongs together: it's your answer, after all. (Most simply by deleting one of the trailing blanks. I commented the edit with the rationale I used; the wording in the post editor help used to be less clear than I find it now.) (Commented Sep 4 at 17:03)
