Introduction of homogeneous boundary conditions

Sine and cosine transforms to solve homogeneous Neumann and Dirichlet problems:

At this point only periodic FFT solvers are available to solve the Poisson problem. Thus we can only recover velocity from vorticity on a periodic domain.

The idea is to implement 3 types of cosine and sine transforms to be able to solve problems with other (homogeneous) boundary conditions on each axis:

Dirichlet-Dirichlet
Dirichlet-Neumann
Neumann-Dirichlet
Neumann-Neumann

In order to do this, we will need to implement \texttt{DCT-I}, \texttt{II} and \texttt{III} and \texttt{DST-I}, \texttt{II} and \texttt{III} on the top of existing FFT libraries. \texttt{FFTW} and \texttt{FFTPACK} already implement those tranforms. In particular, \;\texttt{pyFFTW} has already support of those transform but it has not been merged into the master, see the \texttt{R2R-try-two} issue on the main \texttt{pyFFTW} repository.

The idea for the other FFT backends is to transform all those six problems into a classical Fourier transform of the same size by the way of \mathcal{O}(N) pre and post-processing of input and output data.

The only exceptions are \texttt{DCT-I} and \texttt{DST-I} which require \mathcal{O}(2Nlog(2N)) compute for \mathcal{O}(N) inputs instead of \mathcal{O}(Nlog(N)) for the \texttt{II} and \texttt{III} variants. Efficient \mathcal{O}(Nlog(N)) algorithms exist for those cases but yield \mathcal{O}(\sqrt{N}) error instead of the classical \mathcal{O}(\sqrt{log(N)} achieved via classical radix decomposition. Thus we will trade some performances on \texttt{DCT-I} and \texttt{DST-I} to get comparable accuracy between all implementations. This is the same approach as \texttt{FFTW}. The same goes for \texttt{DCT-IV} and \texttt{DST-IV} but we do not plan to use those transforms. See https://www.dsprelated.com/showthread/comp.dsp/7328-1.php and http://www.fftw.org/fftw-paper-ieee.pdf for the discussion.

References on how to compute real to real transforms from classical FFTs:

S. C. Chan and K. L. Ho, "Direct methods for computing discrete sinusoidal transforms," IEE Proceedings F 137 (6), 433-442 (1990).
Zhongde Wang, "On computing the discrete Fourier and cosine transforms" IEEE Trans. Acoust. Speech Sig. Proc. ASSP-33 (4),1341-1344 (1985).
S. C. Chan and K. L. Ho, "Fast algorithms for computing the discrete cosine transform" IEEE Trans. Circuits Systems II: Analog & Digital Sig. Proc. 39 (3), 185-190 (1992)
J. C. Schatzman, "Accuracy of the discrete Fourier transform and the fast Fourier transform," SIAM J. Scientific Computing, vol. 17, no. 5, pp. 1150-1166, 1996.

New boundary conditions

This issue will deal with the following boundaries:

Symmetric (no slip)
Outflow

Those two new boundary conditions impose homogeneous Neumann or Dirichlet conditions on all scalar fields. Fixed wall boundary conditions are not compatible with the spectral Poisson solver for the velocity (because this breaks the boundary conditions on the vorticity).

Ghost exchange and ghost accumulation on non-periodic domain boundaries

In order to enforce the two new boundary conditions we need to extend \texttt{ghost\_exchange} and \texttt{ghost\_accumulate} methods:

\texttt{SYMMETRIC} (no-slip, wall-like):
- Let d be the direction that is normal to the considered boundary plane (here at the left of the box domain).
- In this boundary plane we have V_d=0 and \frac{\partial V_i}{\partial x_d}=0\;\;\forall i \neq d.
- For all scalars (other than vorticity components) we impose by default \texttt{HOMOGENEOUS\_DIRICHLET} boundary conditions ie. S=0 in the boundary plane (same as V_d).
- In the vicinity of the boundary layer we have V_d \simeq 0 and S \simeq 0.
- ghost_exchange:
  - Here we will exchange ghosts by antisymmetry to cancel all even derivative stencils. We also force the boundary value to be zero (normally it is already the case).
  - For a 2-ghosts layer [S_{-2}\;S_{-1}\;S_0\;S_1\;S_2] becomes [-S_{2}\;-S_{1}\;0\;S_1\;S_2].
  - This implies S=0 and \Delta S=0 on symmetric plane boundaries (ie. no diffusion towards the boundary layer, up to fourth-order stencil, which is the maximum given two ghosts).
- ghost_accumulate:
  - The previous point implies that close the the boundary plane we have V_{d} = \left[-V_{d,2}\;-V_{d,1}\;0\;V_{d,1}\;V_{d,2} \right] and S = [-S_{2}\;-S_{1}\;0\;S_1\;S_2].
  - This means that the scalar quantities are annihilated on the boundary plane.
  - We do not advect scalars from outer ghosts and we throw away everything that is remeshed to the external ghost layer (maybe some epsilons for high CFL ?). If this is not sufficient (ie. not conservative for high CFL numbers) we might need to perform symmetric accumulation from outer to inner ghosts. This would be as if the particles bounced on the boundary wall while being advected.
  - For non-zero relative scalar velocity V_d^r, we consider that nothing can enter the domain at a wall (ie. S_{in}=0, see outflow boundary condition). This means \texttt{ghost\_accumulate} is a noop in a first approximation.
\texttt{OUTFLOW}:
- Let d be the direction that is normal to the considered boundary plane (here at the left of the box domain).
- In this boundary plane we have \frac{\partial V_d}{\partial x_d}=0 and V_i=0\;\;\forall i \neq d.
- For all scalars (other than vorticity components) we impose by default \texttt{HOMOGENEOUS\_NEUMANN} boundary conditions ie. $\frac{\partial S}{\partial x_d}=0$ in the boundary plane (same as $V_d$).
- In the vicinity of the boundary layer we have $V_d = V_{d,0} + \mathcal{O}(dx^2)$ and $S = S_0 + \mathcal{O}(dx^2)$.
- ghost_exchange:
  - Here we will exchange ghosts by symmetry to cancel all odd derivative stencils.
  - For a 2-ghosts layer $[S_{-2}\;S_{-1}\;S_0\;S_1\;S_2]$ becomes $[S_{2}\;S_{1}\;S_{0}\;S_1\;S_2]$.
  - This implies $\frac{\partial S}{\partial x_d}=0$ on outflow plane boundaries up to fourth-order stencil.
- ghost_accumulate:
  - The previous point implies that close the the boundary plane we have $V_{d} = \left[V_{d,2}\;V_{d,1}\;V_{d,0}\;V_{d,1}\;V_{d,2} \right]$ and $S = [S_{2}\;S_{1}\;S_0\;S_1\;S_2]$.
  - This means that the advected scalar quantities can enter or get out of the domain (with a practically constant velocity).
  - Case $(V_{d,0}-V_d^r) \cdot n \geq = 0$: Scalar is going out, we throw away everything that is remeshed to the external ghost layer (we take into account relative advection velocity $V_d^r$).
  - Case $(V_{d,0}-V_d^r) \cdot n \le = 0$: Scalar is going in, we consider a constant scalar quantity $S_{in}$ that is advected at constant velocity $V_b = V_{d,0}-V_d^r$. In a first approximation we will add $S_{in}$ to the $G=\lfloor \frac{V_b\;dt}{dx} \rfloor$ first inner compute points and the good fraction to the $G+1^{th}$ inner compute point.
  - This cannot be implemented in $\texttt{ghost\_accumulate}$ so this will be managed by advection operators. This entails that $\texttt{ghost\_accumulate}$ is a noop.

Spectral transform mapping

Nomenclature:

PER = Periodic boundary condition
HDIR = Homogeneous Dirichlet boundary condition
HNEU = Homogeneous Neumann boundary condition
EVEN INVERSE = Inverse transform used for even derivatives (or integrations)
ODD INVERSE = Inverse transform used for odd derivatives (or integrations)

As we can now specify 5 different boundary conditions per axis we now have to handle:

5^2 = 25 cases in 2D
5^3 = 125 cases in 3D

In fact we have less choices because $\texttt{PERIODIC}$ boundary conditions should be on last transformed axes:

We can not apply a real to real transform to complex data
(R2R-axes, R2C-axe, C2C-axes) -> compute -> (C2C-axes, C2R-axe, R2R-axes)
In 2D we have: 4^2 + 4^1 + 4^0 = (4^3 -1)/3 = 21 cases
In 3D we have: (4^4 -1)/3 = 85 cases

The other cases can be achieved via a change of axes prior to problem setup.

2D Example: Compute the vorticity from velocity (using spectral methods)

Warning: Here we use the (X,Y) notation, in the library boundary conditions are set up as (Y,X) so everything is reversed.

Exactly the same boundary conditions are present when solving the velocity from vorticity using the Poisson solver.

Tasks

/!\ Multithreaded FFTW should use the OpenMP backend (not the classical default threads libraries). This can be enforced in the setup.py of pyFFTW.

pip2 uninstall pyfftw
git clone https://github.com/drwells/pyFFTW 
cd pyFFTW
git checkout r2r-try-two
sed -i 's/\(fftw3[fl]\?_\)threads/\1omp/g' setup.py
pip2 install .

Edited Dec 01, 2018 by Jean-Baptiste Keck

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information