2
$\begingroup$

I have a solution vector

$$\begin{equation}H^{(\text{LR})}:= \frac{1}{\Delta x^{2}}\sum\limits_{n=1}^{N}\sum\limits_{m=1}^{N}\frac{u_{n}u_{n+m} + v_{n}v_{n+m}}{m^{1+\alpha}}\end{equation}$$

where $u_k$ is the real part of a solution of ODEs (resp. $v_k$). Here, the index is periodic, e.g, $n+N \equiv n$.

I want to compute this function for every time-step in my numerical scheme, e.g, $10^{5}$ time steps. My attempt takes longer than actually solving the system of ODEs that this Hamiltonian is for. I've already tried vectorizing the code.. what else can I do?

MWE:

%% HAMILTONIAN MONITORING
u = rand(10000,1024);
tic
u_real = real(u);  % Extract real part
v_imag = imag(u);  % Extract imaginary part
N = width(u);      % Number of spatial points
T = length(u);    % Number of time steps

% Precompute the denominator terms for efficiency
alpha = 1;  % Adjust alpha based on problem
m_vals = (1:N); % Vector of m values
denominator = m_vals.^(1+alpha);  % Vectorized power computation

% Preallocate Hamiltonian vector
HAMILTONIAN = zeros(T, 1);

% Compute Hamiltonian for each time step
for t = 1:T   
    % Compute u_n^2 + v_n^2 for all n (vectorized)
    %u_sq = u_real(t,:).^2;
    %v_sq = v_imag(t,:).^2;
    
    % Compute the sum for all n
    sum_over_n = 0;
    
    % Compute H_temp in a vectorized manner
    for n = 1:N
        % Index shift using mod, avoiding explicit loops
        idx = mod(n + m_vals', N); % Pre-compute the indices
        
        idx(find(idx == 0)) = N; % There is no 0-index. 0-index corresponds to the Nth index. 

        H_summands = (u_real(t,n) * u_real(t,idx)' + v_imag(t,n) * v_imag(t,idx)') ./ denominator';
        
        % Accumulate the sum over n
        sum_over_n = sum_over_n + sum(H_summands);
    end
    
    % Store in Hamiltonian vector
    HAMILTONIAN(t) = sum_over_n;
end

MWE using MATLAB 2024a

$\endgroup$
6
  • 1
    $\begingroup$ That WE is very far from being M. $\endgroup$ Commented Feb 26 at 20:55
  • $\begingroup$ @FedericoPoloni I could remove the various comments, but the data needed to compute the Hamitlonian comes from solving a system of ODEs. $\endgroup$ Commented Feb 26 at 21:01
  • 1
    $\begingroup$ In the comments of your previous related question I gave a representation of this double sum in terms of a quadratic form and a circulant matrix. Circulant matrices can be multiplied very fast using FFTs $\endgroup$ Commented Feb 26 at 22:25
  • 1
    $\begingroup$ In particular, the double sum for a single variable is given by $u^T Au$ (or $\frac{1}{2}u^T(A+A^T)u$), where $A$ is the circulant matrix formed by the vector $a = \frac{1}{\Delta x^2}(N^{-(1+\alpha)}, 1^{-(1+\alpha)},\dots, (N-1)^{-(1+\alpha)})^T$ $\endgroup$ Commented Feb 26 at 22:28
  • $\begingroup$ Is there a reason why you can't omit the ODE solving part and simply give us u = randn(N, 1) or u = transpose(1:n) in the MWE? Any timing will likely not depend on the content of those vectors. $\endgroup$ Commented Feb 26 at 23:10

1 Answer 1

9
$\begingroup$

As a starting point, lets strip down your sum to remove redundant complexity: $$H':= \sum\limits_{n=1}^{N}\sum\limits_{m=1}^{N}\frac{u_{n}u_{n+m}}{m^{1+\alpha}}$$ If you can solve this, returning to the original involves summing the results of two identical problems ($u,v$) and reapplying global scale factors.

Summing over ${u_nu_{n+m}}$ caught my eye as related to the autocorrelation of $u$ (cross-correlation with itself), e.g. $$(u\star{}u)[m] = \sum_{n=1}^{N}{u_nu_{n+m}}$$ (Note that we're really summing over the entire 'universe' of $n$ values, which just happens to be a finite set here; you can choose any set of indices uniquely covering the space, like $0:N-1$ or $1:N$)

The nice things about cross-correlations are

  1. Cross-correlations are closely related to convolutions (a minus sign difference in the formula)
  2. Convolutions (both discrete sums and continuous integrals) have nice properties relating to the Fourier Transform
  3. We can compute (specific) Fourier Transforms absurdly efficiently by using the Fast Fourier Transform

The Convolution Theorem (#2) tells us $$\mathcal{F}\{u\star{}u\}(f)=\mathcal{F}\{u\}(f)\cdot(\mathcal{F}\{u\})^*(f)=\Big\vert\mathcal{F}\{u\}(f)\Big\vert^2$$ (The sign flipping (#1) between correlation and convolution paired with properties of the Discrete Fourier Transform for real signals leads to the complex conjugate) This leaves $$u\star{}u=\mathcal{F}^{-1}\Big\{\Big\vert\mathcal{F}\{u\}(f)\Big\vert^2\Big\}\tag{1}\label{1}$$

With this direction in mind, we can rearrange our simplified $H'$ to leverage this property. $$H'= \sum\limits_{m=1}^{N}\frac{1}{m^{1+\alpha}}\sum\limits_{n=1}^{N}u_{n}u_{n+m}$$

Now we see that inner sum nicely matching our Convolution Theorem formula $\eqref{1}$, and we're almost ready to write some simple Matlab code. One nuance is that Discrete Fourier Transform is often defined in zero based indexing (even in Matlab's fft effectively does this), and the vector $u\star{}u$ routinely starts at lag $m=0$, not $m=1$. Because $m$ in that context is a shift, not just an index, this discrepancy can't just be brushed aside citing differences in indexing conventions. The fact that we'll be using these initial-zero fft implementations means we'll need to do some compensating to artificially start at $m=1$.

Okay, let's pull all this together (with #3) into some code:

abs2 = @(x) x .* conj(x);
autoCorr_u = ifft(abs2(fft(u))); % apply eq 1
shiftedAutoCorr_u = circshift(autoCorr_u,-1); % account for array starting at m=1
mList = 1:N; % use same dim as u, i.e. transpose if needed
H_prime = sum( shiftedAutoCorr_u ./ mList.^(1+alpha) );

Note that the circular symmetry/cyclic indexing properties of your problem made using the fft function easier than normal. Cases lacking that finite-domain-periodicity usually require some compensation like zero padding to hide the fact that the fft is fundamentally periodic and cyclic.

In Matlab, you can abuse vectorization to have your many time steps increment along a separate dimension, and run all these operations with basically the same code presented.

The interplay between correlations/convolutions and the Fourier transform is an exceptionally powerful tool, and is well worth being attentive to opportunities to exploit it.

$\endgroup$

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.