Real to complex #21

Lagrang3 · 2021-08-09T10:11:32Z

In this PR I will push the boost backend routines to perform real to complex transforms.

Lagrang3 · 2021-08-09T10:19:47Z

include/boost/math/fft/bsl_backend.hpp

@@ -138,11 +139,96 @@
    std::size_t my_size{};
  };

+


The BSL rfft backend still needs to be implemented.

Lagrang3 · 2021-08-09T10:24:19Z

include/boost/math/fft/algorithms.hpp

  template<class complex_value_type>
-  void complex_dft_power2(
+  void complex_dft_power2_dif(


According to the GSL documentation, the rfft can be derived from the decimation-in-time (DIT) complex FFT.
The default complex FFT for powers of 2 is implemented as decimation-in-frequency (DIF), so I wrote another complex FFT as DIT to test the algorithm before writing the rfft.
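
For readers who want to see the DIT ordering concretely, here is a minimal sketch of an in-place radix-2 DIT complex FFT. It is not the code from this PR: the name `dit_fft_sketch`, the use of `std::complex<double>`, and the sign convention (-1 for forward) are assumptions made for illustration only.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not the PR's implementation.
// In-place radix-2 decimation-in-time FFT; x.size() must be a power of two.
// sign = -1 gives the forward transform, sign = +1 the unnormalized backward one.
inline void dit_fft_sketch(std::vector<std::complex<double>>& x, int sign)
{
  const std::size_t n = x.size();
  // DIT starts by permuting the input into bit-reversed order.
  for (std::size_t i = 1, j = 0; i < n; ++i)
  {
    std::size_t bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) std::swap(x[i], x[j]);
  }
  const double pi = std::acos(-1.0);
  // Then butterflies combine blocks of length 2, 4, ..., n.
  for (std::size_t len = 2; len <= n; len <<= 1)
  {
    const std::complex<double> w_len =
        std::polar(1.0, sign * 2.0 * pi / static_cast<double>(len));
    for (std::size_t i = 0; i < n; i += len)
    {
      std::complex<double> w{1.0, 0.0};
      for (std::size_t k = 0; k < len / 2; ++k)
      {
        const std::complex<double> u = x[i + k];
        const std::complex<double> v = x[i + k + len / 2] * w;
        x[i + k] = u + v;
        x[i + k + len / 2] = u - v;
        w *= w_len;
      }
    }
  }
}
```

Calling `dit_fft_sketch(x, -1)` on the samples 1, 2, 3, 4 should give 10, -2+2i, -2, -2-2i up to rounding.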

Lagrang3 · 2021-08-09T10:26:21Z

include/boost/math/fft/fftw_backend.hpp

@@ -574,28 +574,14 @@
    // -> size(out) >= N
    {
      const std::size_t N = size();
-      vector_t<real_value_type> tmp(out,out+N);


The algorithm that we use for the rfft for powers of 2 works naturally in-place if we use the halfcomplex representation used by FFTW.

Lagrang3 · 2021-08-09T10:27:05Z

include/boost/math/fft/gsl_backend.hpp

@@ -174,6 +174,34 @@ namespace fft { namespace detail {
      gsl_fft_halfcomplex_wavetable *halfcomplex_wtable;
      gsl_fft_real_workspace        *real_wspace;

+      void pack_halfcomplex(real_value_type* out) const


The algorithm that we use for the rfft for powers of 2 works naturally in-place if we use the halfcomplex representation used by FFTW. Hence we have to pack and unpack the halfcomplex array returned by GSL to make it compatible with our representation.
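
To make the reordering concrete, here is a rough sketch of the repacking step. It assumes the layouts as I understand them from the GSL and FFTW documentation: GSL's mixed-radix real transform interleaves the parts as r0, r1, i1, r2, i2, ..., while FFTW's halfcomplex format stores r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1. The function name `gsl_to_fftw_halfcomplex` is hypothetical and this is not the `pack_halfcomplex` member from the diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: reorder GSL's mixed-radix halfcomplex layout
//   r0, r1, i1, r2, i2, ..., (r_{n/2} if n is even)
// into FFTW's halfcomplex layout
//   r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1.
inline std::vector<double> gsl_to_fftw_halfcomplex(const std::vector<double>& gsl)
{
  const std::size_t n = gsl.size();
  std::vector<double> fftw(n);
  if (n == 0) return fftw;
  fftw[0] = gsl[0];                       // coefficient 0 is purely real
  for (std::size_t k = 1; 2 * k < n; ++k)
  {
    fftw[k]     = gsl[2 * k - 1];         // real part of coefficient k
    fftw[n - k] = gsl[2 * k];             // imaginary part of coefficient k
  }
  if (n % 2 == 0) fftw[n / 2] = gsl[n - 1]; // purely real Nyquist term
  return fftw;
}
```

The inverse direction would apply the same index map the other way around.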

Wow, I appreciate your comments about what's going on in the code.

It's a bit unrelated to this comment, but I noticed that we could replace 2*boost::math::constants::pi<RealType>() with boost::math::constants::two_pi<RealType>(). This could eliminate 1 ULP of error.

Lagrang3 · 2021-08-09T10:28:35Z

include/boost/math/fft/real_algorithms.hpp

+  }
+
+  template <class T>
+  void real_dft_power2(const T *in_first, const T *in_last, T* out, int sign)


Here is the real-to-halfcomplex rfft for powers of 2. Changing the sign of the transform only reverses the sign of the imaginary part of the result.

This algorithm uses half the memory and half the computation of the complex FFT of the same size.
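
As a reference point for the sign remark, here is a brute-force O(n^2) sketch of a real-to-halfcomplex transform. It is only an illustration under assumptions: the name `real_to_halfcomplex_naive` is hypothetical, the output uses the FFTW-style layout r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1, and sign = -1 is taken to mean the forward transform.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical O(n^2) reference, not the PR's fast routine.
// Real input -> halfcomplex output r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1.
inline std::vector<double> real_to_halfcomplex_naive(const std::vector<double>& in, int sign)
{
  const std::size_t n = in.size();
  std::vector<double> out(n, 0.0);
  if (n == 0) return out;
  const double pi = std::acos(-1.0);
  // Only coefficients 0..n/2 are computed; the rest follow by symmetry.
  for (std::size_t k = 0; 2 * k <= n; ++k)
  {
    double re = 0.0, im = 0.0;
    for (std::size_t j = 0; j < n; ++j)
    {
      const double phi = sign * 2.0 * pi * static_cast<double>((j * k) % n)
                         / static_cast<double>(n);
      re += in[j] * std::cos(phi);
      im += in[j] * std::sin(phi);
    }
    out[k] = re;
    // Coefficient 0 and the Nyquist term have no imaginary part to store.
    if (k != 0 && 2 * k != n) out[n - k] = im;
  }
  return out;
}
```

For the input 1, 2, 3, 4 the two signs should produce the same real entries and imaginary entries of opposite sign.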

real-to-halfcomplex rfft for powers of 2

Cool. I'll give it a try as soon as I get a chance. Thanks Eduardo!

Lagrang3 · 2021-08-09T12:21:20Z

Hi @cosurgi and @ckormanyos. The following error message printed by b2 appears in the CI test and my local runs:

error: No best alternative for ./fft_compile
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched
error: No best alternative for ./fft_allocator
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched
error: No best alternative for ./fft_pmr_allocator
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched
error: No best alternative for ./fft_iterators
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched
error: No best alternative for ./fft_non-complex
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched
error: No best alternative for ./fft_auxiliary_functions
    next alternative: required properties: <link>static
        matched
    next alternative: required properties: <link>static
        matched

Do you know what it could be?

ckormanyos · 2021-08-09T14:20:10Z

know what could it be

I think it's some missing linker properties in the build rules. John is a lot better with jam files and I'm really not very good at it. I will try and figure out what's going on to give the linker what it needs/expects.

But you got green in CI nonetheless, right?

Cc: @Lagrang3

Lagrang3 · 2021-08-09T14:24:17Z

Yes, I've got green. But those tests are being skipped.

Lagrang3 · 2021-08-10T08:13:12Z

include/boost/math/fft/real_algorithms.hpp

+      composite sizes.
+    */
+    using allocator_type = allocator_t;
+    using ComplexType = ::boost::multiprecision::complex<T>; 


Notice here that the real-to-halfcomplex routine for composite sizes is an adaptation of the complex Cooley-Tukey decimation-in-time FFT. The algorithm takes advantage of complex arithmetic, hence we need to build a complex type from the real T, for inner calculations only. This is related to issue #16.

Good note. Thanks for pointing this out.

Lagrang3 · 2021-08-10T08:15:38Z

include/boost/math/fft/real_algorithms.hpp

+      for (long i = 0; i < n; i += len)
+      {
+        //std::cout << "    i = " << i << "\n";
+        for(long k=0;2*k<=len_old;++k)


Here lies the optimization of real-to-halfcomplex. In the complex-to-complex routine this for loop would be for(long k=0;k<len_old;++k). Here instead we use only half of the values of k, because the other half is redundant.
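
The redundancy comes from the conjugate symmetry of the spectrum of a real signal: coefficient n-k is the complex conjugate of coefficient k. A small sketch of how the full spectrum could be rebuilt from the stored half follows. The helper name `expand_halfcomplex` is hypothetical, and the FFTW-style layout r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1 is assumed.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: rebuild all n complex coefficients from the n reals
// stored as r0, r1, ..., r_{n/2}, i_{(n+1)/2-1}, ..., i1.
inline std::vector<std::complex<double>> expand_halfcomplex(const std::vector<double>& hc)
{
  const std::size_t n = hc.size();
  std::vector<std::complex<double>> full(n);
  // Same bound as in the loop above: only k with 2*k <= n carry information.
  for (std::size_t k = 0; 2 * k <= n && k < n; ++k)
  {
    const bool purely_real = (k == 0) || (2 * k == n);
    const std::complex<double> xk(hc[k], purely_real ? 0.0 : hc[n - k]);
    full[k] = xk;
    if (k != 0) full[n - k] = std::conj(xk); // the redundant half
  }
  return full;
}
```

With the halfcomplex array 10, -2, -2, 2 this should return 10, -2+2i, -2, -2-2i.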

Lagrang3 · 2021-08-10T08:20:01Z

Still to be done:

(done) the halfcomplex-to-real for composite sizes,
the real-to-halfcomplex for prime sizes,
the halfcomplex-to-real for prime sizes,
test the correctness,
(maybe for another PR) provide a halfcomplex class to hide the representation details.

Lagrang3 · 2021-08-10T15:12:58Z

include/boost/math/fft/real_algorithms.hpp

@@ -351,7 +351,7 @@
            const T *in_first, 
            const T *in_last, 
            T* out, 
-            int sign,
+            int /*sign*/,


The sign for real transforms is not important. The default behavior is that the real-to-complex is a forward transform.

Lagrang3 · 2021-08-10T15:14:25Z

include/boost/math/fft/real_algorithms.hpp

@@ -487,6 +487,137 @@
      real_dft_composite_outofplace(in_first,in_last,out,sign,alloc);
  }

+
+  template <class T, class allocator_t>
+  void real_inverse_dft_composite_outofplace(


The halfcomplex-to-real is again the reverse operation of the real-to-halfcomplex.

Lagrang3 · 2021-08-10T15:17:17Z

include/boost/math/fft/real_algorithms.hpp

+      }
+    }
+
+    std::vector<T,allocator_type> tmp(out,out+n);


Because the input is const, we need scratch space to perform the "bit-reversal". In the real-to-halfcomplex function this was not necessary, because the reversal was performed starting with the input data.

Lagrang3 · 2021-08-10T15:19:48Z

include/boost/math/fft/real_algorithms.hpp

+  }
+
+  template<class T,class Allocator_t>
+  void real_inverse_dft_composite_inplace(


These helper functions appear very often, maybe we could wrap them into a single unit.

definitely. Maybe even in a separate header?

ckormanyos · 2021-08-10T16:06:30Z

Hi @Lagrang3 just did a simple ::template dot syntax change to get CI going green


cosurgi · 2021-08-10T18:15:29Z

@ckormanyos concluding the CI: we now have full tests for both gcc (including quadmath) and clang? Do we need MSVC, or something else?

ckormanyos · 2021-08-10T18:45:41Z

concluding the CI: we now have full tests for both gcc (including quadmath) and clang? Do we need MSVC, or something else?

Hi @cosurgi yes, definitely. Great point.

I would like to put in some BSL-backend tests for MSVC 14.0 and 14.2 into CI. I believe the BSL stuff is the only kind of backend that MSVC is capable of compiling (and running). I will probably end up making 2 new MSVC-only jobs for 14.0 and 14.2, as these have different standards adherence for the test matrices.

Can you suggest any file(s) for BSL-only tests that can be built and run in MSVC? Or do I need to configure a few new ones myself?

cosurgi · 2021-08-10T18:51:19Z

I think that test/fft_compile.cpp is a good starting point. It doesn't have (yet) the float128 or MPFR.
Then test/fft_correctedness.cpp once the first one works. Once you start it I will join you there trying to add the files example/fft*pp with MPFR etc. I would prefer that you start modifying the jamfile, it's still my weak point.

cosurgi · 2021-08-10T18:54:33Z

Ahh, by BSL-only, you mean no float128, no MPFR?

EDIT: In that case I guess the solution is to add #ifdef MPFR_ALLOWED or something like that, and pass the defines from the jamfile?
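
A rough sketch of what such a guard could look like in a test file. Everything here is hypothetical: the macro name MPFR_ALLOWED is only the placeholder suggested above, and `enabled_test_types` is an illustrative function, not something in the repository. The jamfile would then pass the macro with a `<define>MPFR_ALLOWED` requirement on the toolsets that have MPFR.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration: list which value types a test would cover,
// depending on a macro passed from the build system.
inline std::vector<std::string> enabled_test_types()
{
  // Built-in floating-point types are always available.
  std::vector<std::string> types{"float", "double", "long double"};
#ifdef MPFR_ALLOWED
  // Only compiled when the build defines MPFR_ALLOWED (assumed macro name).
  types.push_back("mpfr_float");
#endif
  return types;
}
```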

ckormanyos · 2021-08-10T19:00:18Z

Ahh, by BSL-only, you mean no float128, no MPFR

Ughhh. I have to think a bit.

On MSVC CI we will definitely not have:

GSL, FFTW
MPFR, GMP
quadmath (i.e., boost::multiprecision::float128)

On MSVC CI we might have:

an MPIR (Win* version of GMP) configuration available

On MSVC CI we definitely do have:

the three built-in is_floating_point types
cpp_dec_float, cpp_bin_float

Cc: @cosurgi and @Lagrang3

cosurgi · 2021-08-10T19:15:57Z

Ouch. Not fun. I guess that even getting test/fft_compile.cpp to work will be quite an achievement. First we could comment out MPFR and GSL there and only fundamental types with BSL-only are left.

ckormanyos · 2021-08-10T20:27:51Z

getting test/fft_compile.cpp to work

Yes. I intend to look into this shortly. It shouldn't be that bad if we cut out the fftw_- and gsl_-backends and configure the float128 out of the relevant header files.

What remains, ... you may fairly enough ask? Pretty much plain bsl_-backend stuff for built-in floating-point types and Boost's header-only wide-floats such as cpp_dec_float and cpp_bin_float.

I'll see how your new work(s) look in the world of MSVC... and report back with details.

Cc: @cosurgi and @Lagrang3

ckormanyos · 2021-08-10T20:30:17Z

Hi @Lagrang3 and @cosurgi aside from MSVC potential configuration(s), is this PR (looking very good now) ready to merge to develop in our fork?

Lagrang3 · 2021-08-11T05:10:38Z

Hi @Lagrang3 and @cosurgi aside from MSVC potential configuration(s), is this PR (looking very good now) ready to merge to develop in our fork?

As of the latest commit 0f7e425, the BSL real-to-complex and complex-to-real transforms are fully working for any size. The complexity is always O(n log n), even for prime sizes. The real-data optimization, i.e. half of the computational cost, applies only to composite sizes. I would like to think about how to do the same kind of optimization for Rader's algorithm (prime sizes).
Anyway, I think that up to here we have a good achievement. As a side note, consider that the GSL transforms are O(n^2) for prime sizes.

Let me just push a test for correctness and then we can merge.

ckormanyos · 2021-08-11T05:26:36Z

the BSL real-to-complex and complex-to-real transforms are fully working for any size. The complexity is always O(n log n), even for prime sizes. The real-data optimization, i.e. half of the computational cost, applies only to composite sizes. I would like to think about how to do the same kind of optimization for Rader's algorithm (prime sizes).
I think anyways up to here we have a good achievement.

Thank you Eduardo! This is an awesome achievement @Lagrang3

Let me just push a test for correctness and then we can merge.

Perfect. Right after your push, if you give me a brief chance, I'll try to add some MSVC CI tests. If that's too time-consuming, we will just stick with GCC/clang CI and I can work out a few selected MSVC tests later.


Lagrang3 · 2021-08-11T06:13:31Z

include/boost/math/fft/fftw_backend.hpp

+      const std::size_t N = size();
+      const real_value_type inv_N = real_value_type{1.0}/N;
+      for(unsigned int i=0;i<N;++i)
+        out[i] *= inv_N;


When transforming halfcomplex-to-real, both GSL and FFTW return the original real array scaled by a factor of N. This is in line with the fact that the backward FFT is the inverse of the forward FFT only up to a 1/N factor. But the actual implementation of the halfcomplex-to-real (at least in BSL) is the reversed graph of the real-to-halfcomplex, and the 1/N factor does not appear. I cannot see a good reason why the halfcomplex-to-real should not be the inverse linear map of the real-to-halfcomplex. In BSL this comes naturally. With GSL and FFTW I need to multiply by 1/N.

UPD. I realized that the halfcomplex-to-real in BSL has multiplications by inverse factors of N all over the place, so it doesn't actually come naturally as I said before. @cosurgi @ckormanyos: if you agree, I will change the halfcomplex-to-real to return the inverse of the real-to-halfcomplex times N, just to save some multiplications and to be in tune with the FFTW and GSL conventions.

@Lagrang3 yes, this is a good idea. Better to make sure that all backends produce the same kind of output. That avoids frustrating users who have nice working code and simply decide to switch to a different backend.

Having to put an if somewhere in the user code to multiply by 1/N, depending on the type of backend, would not be good.
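
To illustrate the convention being discussed, here is a small sketch with a brute-force complex DFT: an unnormalized forward transform followed by an unnormalized backward transform returns the input times N, and one extra pass multiplying by 1/N recovers it. The names `dft_unnormalized` and `inverse_scaled` are hypothetical and this is not how any of the backends are implemented.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<double>>;

// Hypothetical O(n^2) unnormalized DFT; sign = -1 forward, +1 backward.
inline cvec dft_unnormalized(const cvec& x, int sign)
{
  const std::size_t n = x.size();
  const double pi = std::acos(-1.0);
  cvec y(n);
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t j = 0; j < n; ++j)
      y[k] += x[j] * std::polar(1.0, sign * 2.0 * pi
                  * static_cast<double>((j * k) % n) / static_cast<double>(n));
  return y;
}

// Backward transform followed by the 1/N rescaling, so that it is the
// true inverse of dft_unnormalized(x, -1).
inline cvec inverse_scaled(const cvec& X)
{
  cvec x = dft_unnormalized(X, +1);
  if (x.empty()) return x;
  const double inv_n = 1.0 / static_cast<double>(x.size());
  for (auto& v : x) v *= inv_n;
  return x;
}
```

Whichever convention is chosen, the point raised above is that every backend should apply the same one.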

Lagrang3 · 2021-08-11T06:16:24Z

test/fft_real_to_complex.cpp

+  {
+    test_r2c<fftw_rfft<double>>(n,8);
+    test_r2c<gsl_rfft<double>>(n,16);
+    test_r2c<bsl_rfft<double>>(n,4);


Notice that the BSL precision is better than GSL, and as good as the FFTW.

Lagrang3 · 2021-08-11T06:18:42Z

@ckormanyos @cosurgi, this PR is ready to merge.

ckormanyos · 2021-08-11T06:31:42Z

test/fft_real_to_complex.cpp

+  {
+    test_r2c<fftw_rfft<double>>(n,8);
+    test_r2c<gsl_rfft<double>>(n,16);
+    test_r2c<bsl_rfft<double>>(n,4);


need to multiply times 1/N

I am not totally certain about this, but I believe this is simply (or can be considered) a matter of convention. We might eventually want to scale internally in BSL in order to have the same output behavior for real-to-halfcomplex. But, again, I think this can all be a documented convention.

BSL precision is better than GSL, and as good as the FFTW

Great job! That is a really difficult one to achieve, since FFTW is known for its speed, robustness and precision. You have really created the foundation of a nice body of work/utilities for powerful FFT math, Eduardo!

Nice!

ckormanyos · 2021-08-11T06:37:56Z

I wonder what @cosurgi thinks about 1/N scaling?

ckormanyos · 2021-08-11T08:15:21Z

this PR is ready to merge

Done. This is a really nice job @Lagrang3. Compliments again, Eduardo.

I will augment CI where possible/helpful in a branch of my own.

cosurgi · 2021-08-11T14:31:34Z

I wonder what @cosurgi thinks about 1/N scaling?

I think that the optimal approach is user convenience: make sure that all backends behave in exactly the same way.

If this choice makes some algorithms not-optimal (e.g. a wasted multiplication) then provide a bool template parameter for a particular backend, which would not waste a multiplication and then would not be in sync with the other backends.
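
A minimal sketch of what such a compile-time switch might look like. This is purely hypothetical: `finish_inverse` and the `Normalize` parameter are illustrative names, not an existing or planned interface of the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical policy switch, not an existing Boost.Math interface.
// Normalize == true : apply the 1/N factor so all backends agree.
// Normalize == false: skip the extra multiplications and return raw output.
template <bool Normalize = true>
void finish_inverse(std::vector<double>& out)
{
  if (Normalize && !out.empty())
  {
    const double inv_n = 1.0 / static_cast<double>(out.size());
    for (double& v : out) v *= inv_n;
  }
}
```

The default keeps the behavior uniform across backends, and a user who wants to save the multiplications would opt out explicitly with `finish_inverse<false>(out)`.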

ckormanyos · 2021-08-11T15:12:02Z

augment CI where possible/helpful

One single fft_compile.cpp configured for MSVC only as described above and now in develop is running in CI on MSVC 14.0 and 14.2. Gives us at least some sanity checks in MSVC world.

@cosurgi

cosurgi · 2021-08-11T15:39:28Z

@ckormanyos great! I will try to add a couple more tests.

cosurgi · 2021-08-11T15:48:46Z

@ckormanyos I opened #26 for experimenting.

Lagrang3 added 2 commits August 9, 2021 12:06

bsl real to halfcomplex powers of 2

4c3ee6a

Merge branch 'eduardo_r2r' into real-to-complex

95b09d1

Lagrang3 added 4 commits August 9, 2021 14:22

added the raw halfcomplex-to-real routine for powers of 2

ab2fe9e

added brute force real-to-halfcomplex algorithm

350a421

added brute force halfcomplex-to-real

f9ecb7a

added fixed size 2 real transform

e02dd98

added real-to-halfcomplex for composite sizes

8dcb6c1

added halfcomplex-to-real for composite size

5b1b809

Repair template rebind alloc syntax

e8a0160

Lagrang3 added 2 commits August 10, 2021 19:56

::template

295744e

Merge branch 'real-to-complex' of github.com:BoostGSoC21/math into re…

0f7e425

…al-to-complex

Lagrang3 added 2 commits August 11, 2021 08:04

halfcomplex-to-real in the fftw and gsl backends:

9cd754e

in our convention halfcomplex-to-real is the inverse map of the real-to-halfcomplex. There's no good reason/advantage I can think of for neglecting the 1/N factor.

tests for correctness of the real-to-halfcomplex and inverse

53e9caf

ckormanyos approved these changes Aug 11, 2021

ckormanyos merged commit 118d059 into develop Aug 11, 2021

Lagrang3 deleted the real-to-complex branch August 14, 2021 07:55
