Return to Question

Bumped by Community user

occurred Feb 20, 2021 at 18:03

Bumped by Community user

occurred Oct 23, 2020 at 18:00

Added comment about the language the codec is being implemented

edited Sep 23, 2020 at 17:45

Cristian Maruan Bosin

21
2

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force the friendly power-of-2 sizes.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

Edited 9/23

Just as an extra comment, this solution is being implemented in C++.

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force the friendly power-of-2 sizes.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force the friendly power-of-2 sizes.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

Edited 9/23

Just as an extra comment, this solution is being implemented in C++.

Removed confusing part of the question.

edited Sep 23, 2020 at 0:09

Cristian Maruan Bosin

21
2

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force the friendly power-of-2 sizes because $2^{15}$ is much bigger than $7^5$.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force power-of-2 sizes because $2^{15}$ is much bigger than $7^5$.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force the friendly power-of-2 sizes.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!

asked Sep 22, 2020 at 19:23

Cristian Maruan Bosin

21
2

Fast algorithm for n-dimensional DCT

I need to implement an encoder which compresses a 5-dimensional structure of 10 bits values. Each dimension has between 4 and 12 elements. If a dimension ever has more than 12 elements, it is partitioned in half. So far, I have been using a separable DCT to do that, but my implementation is very slow: I have implemented a naive DCT transform, which has complexity of $O(N^2)$. I have tried some fast algorithms, but they seem to be slower than the naive approach: a naive DCT for 6 elements is faster the Fast version, but for $N = 4$ and $8$, the fast approach is definitely faster. Also, I cannot force power-of-2 sizes because $2^{15}$ is much bigger than $7^5$.

I remember the Cooley–Tukey FFT algorithm decomposes a sequence of size $N$ into their factors $N_1$ and $N_2$ such that $N = N_1N_2$. I also know that there are specialized algorithms for 2D DCT due to their use in video and image codecs.

My question is: Is there any reference about a fast multidimensional DCT or even a "reversed" Cooley-Tukey for DCT of which I could transform my 5D array into 1D and use a fast DCT?

Exrta bits: We are researching not only the DCT-II but the whole family of trigonometric transforms., insights about any of the DST and DCT are welcome!