
I have the output of numpy.fft.fft, calculated from 15000 samples at a rate of 500 Hz, giving me bins of width $\frac{1}{30}$ Hz. Rather than having 15000 bins, I'd rather have, say, 50 or 100. I've thought of a couple of methods for merging the data into new bins, either summing or averaging the values (shown below), but I'm not sure what kind of difference these methods would make.

Alternatively, I guess I could use the inverse FFT and regenerate the spectrum with a smaller window (roughly sketched after the data-generation snippet below)?

Here's a code snippet to generate similar data:

import numpy as np

n_samples = 15000
sampling_freq = 500                  # Hz
sampling_rate = 1 / sampling_freq    # sample spacing in seconds
dummy_data = np.arange(n_samples)
nyquist = 0.5 * sampling_freq        # 250 Hz
fft_freqs = np.fft.fftfreq(n=n_samples, d=sampling_rate)  # bins 1/30 Hz apart
fft = np.fft.fft(dummy_data)
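
For the inverse-FFT idea I mentioned above, I imagine something like the following (just a sketch; n_keep is a made-up name, and I'm assuming it's acceptable to re-transform only the first n_keep reconstructed samples, which widens the bins to sampling_freq / n_keep Hz over the same frequency range):

# Sketch of the inverse-FFT alternative (illustrative only)
n_keep = 100
time_signal = np.fft.ifft(fft)                  # back to the time domain
small_fft = np.fft.fft(time_signal[:n_keep])    # shorter window -> coarser bins
small_freqs = np.fft.fftfreq(n=n_keep, d=sampling_rate)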

And here's an example of how I'm generating new bins, for both summing and averaging:

# Bounds of the new binning (± Nyquist)
min_freq = -1 * nyquist
max_freq = nyquist

# Number of bins for each of the positive / negative halves
# Total bins is 2*num_bins+1, to ensure a bin at 0Hz
num_bins = 10

# Split the fftfreq output into negative and non-negative halves, then
# concatenate so that frequencies (and their FFT values) are in ascending order
pos_mask = np.where(fft_freqs >= 0)
neg_mask = np.where(fft_freqs < 0)
neg_freqs = fft_freqs[neg_mask]
pos_freqs = fft_freqs[pos_mask]
neg_ffts = fft[neg_mask]
pos_ffts = fft[pos_mask]
sorted_ffts = np.concatenate((neg_ffts, pos_ffts))
sorted_freqs = np.concatenate((neg_freqs, pos_freqs))

bin_size = max_freq / num_bins
half_bin_size = bin_size / 2

# Bin edges: num_bins edges below -half_bin_size, num_bins above +half_bin_size,
# plus the central bin [-half_bin_size, +half_bin_size] around 0Hz
bin_border = np.concatenate([
    np.linspace(min_freq, -1 * half_bin_size, num_bins + 1, endpoint=True),
    np.linspace(half_bin_size, max_freq, num_bins + 1, endpoint=True)
])
new_bin_centers = (bin_border[1:] + bin_border[:-1]) / 2
print(len(new_bin_centers))  # 2*num_bins + 1

# Finds where sorted freqs change from one bin to the next
bin_border_idxs = np.searchsorted(sorted_freqs, bin_border)

# Number of elements in each new bin
bin_lens = np.diff(bin_border_idxs)

# Generate new bins (empty bins become NaN, since reduceat would otherwise
# return a neighbouring value for them)
# Sum
new_bin_vals_sum = np.where(bin_lens == 0, np.nan,
                            np.add.reduceat(sorted_ffts, bin_border_idxs[:-1]))
# Average
new_bin_vals_avg = np.where(bin_lens == 0, np.nan,
                            np.add.reduceat(sorted_ffts, bin_border_idxs[:-1]) / bin_lens)
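
As a quick check of what the re-binned output looks like (printing magnitudes only, since the combined values are still complex):

# Print the new bin centres alongside the magnitudes of the combined values
for centre, v_sum, v_avg in zip(new_bin_centers, new_bin_vals_sum, new_bin_vals_avg):
    print(f"{centre:8.2f} Hz  sum={np.abs(v_sum):14.2f}  avg={np.abs(v_avg):14.2f}")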

To clarify: I'm not trying to expand the frequency range by changing the bin size, or anything like that. I also only have the FFT output, not the raw time-domain data. I want to get a better understanding of how best to manipulate the data, what impact it may have, and whether there are any other considerations I may have missed.

  • Hi and welcome to our forum. The answer to your question really depends on what you are planning to do with the data. Re-binning will result in a loss of information, and what information to retain depends a lot on what your application needs. For example, in audio it's common to reduce an FFT into third-octave or octave spectra. This may or may not be a good fit for your requirements. Commented Mar 8 at 12:08

1 Answer

  • When combining FFT bins, you're essentially reducing the frequency resolution of your data. Your original data has a resolution of 1/30 Hz per bin, and you're creating wider bins that each contain multiple original bins. So if you sum two adjacent bins, the resulting bin covers a 1/15 Hz range; if you sum every 300 bins, you'll get 50 bins, each representing a 10 Hz range.

  • Both summing and averaging are valid approaches, but they have different interpretations (see the small sketch after this list):

    • Summing preserves the total power in each frequency range
    • Averaging gives you the mean amplitude in each frequency range
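
A minimal sketch of that distinction, assuming the values being combined are power-spectrum values ($|X|^2$) rather than the raw complex FFT output (the 12-sample signal and group size of 4 are arbitrary):

import numpy as np

# Toy example: 12 original bins combined into 3 groups of 4
power = np.abs(np.fft.fft(np.random.randn(12)))**2
groups = power.reshape(3, 4)

summed = groups.sum(axis=1)     # total power in each wider bin
averaged = groups.mean(axis=1)  # mean power per original bin in each group

print(summed.sum(), power.sum())  # summing keeps the grand total intact
print(averaged * 4 - summed)      # averaging only rescales by the group size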
