Spectrogram Parameters in Raven Workbench
Spectrogram parameters determine how audio signals are transformed into the visual frequency-over-time representation known as a spectrogram. This document explains the available settings, their technical impact on resolution, and how to choose the best configuration for your data.
This applies to Raven Expedition and Raven Annotate.
Accessing Spectrogram Parameters
You can open the Configure Spectrogram Parameters dialog in several ways:
- Toolbar: Click the Gear icon (Configure Spectrogram Parameters) located in the Color Toolbar.
- Keyboard Shortcut: Press Ctrl+G (or Cmd+G on macOS).
- View Menu: Go to the View menu and select Spectrogram Parameters… (if available in your Raven version).
Parameter Definitions
1. Window Type
Before performing a Discrete Fourier Transform (DFT), Raven applies a "window function" to each segment of audio. This tapers the edges of the segment to reduce "spectral leakage" (the appearance of energy at frequencies that aren't actually present in the signal).
- Hann: A balanced, general-purpose window (default).
- Hamming: Similar to Hann; provides a slightly narrower mainlobe and a lower first sidelobe, but its sidelobes decay more slowly with distance from the mainlobe.
- Blackman: High sidelobe rejection, useful for detecting weak signals near strong ones.
- Kaiser: An adjustable window where the Beta parameter controls the trade-off between mainlobe width and sidelobe levels.
- Rectangular: No tapering (boxcar). Maximizes frequency resolution but introduces significant leakage.
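To make the leakage differences concrete, the sketch below measures how much energy from a single tone spills into a bin 20 bins away from the peak, for each window type. It uses NumPy's window functions, which are an assumption here and may not match Raven's internal implementations exactly. The tone frequency (halfway between two bins, the worst case for leakage), the 20-bin offset, and the Kaiser Beta of 8 are arbitrary illustrative choices.

```python
import numpy as np

def leakage_db(window, freq_bins=100.5, offset=20):
    """Spectrum level (dB relative to the peak) `offset` bins above the peak.

    The test tone sits at a non-integer bin position so that it does not
    line up with a DFT bin, which is when leakage is most visible.
    """
    n = len(window)
    t = np.arange(n)
    tone = np.sin(2 * np.pi * freq_bins * t / n)
    spectrum = np.abs(np.fft.rfft(tone * window))
    peak_bin = int(np.argmax(spectrum))
    return 20 * np.log10(spectrum[peak_bin + offset] / spectrum[peak_bin])

n = 1024  # window size in samples (illustrative)
windows = {
    "rectangular": np.ones(n),
    "hann": np.hanning(n),
    "hamming": np.hamming(n),
    "blackman": np.blackman(n),
    "kaiser(beta=8)": np.kaiser(n, 8.0),  # Beta value chosen arbitrarily
}

results = {name: leakage_db(w) for name, w in windows.items()}
for name, level in results.items():
    print(f"{name:>15}: {level:7.1f} dB")
```

More negative values mean less leakage at that distance from the tone. The rectangular window should show the highest level, which is why weak signals near strong ones are easier to see with a tapered window.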
2. Window Size
The number of audio samples used for each individual transform.
- Larger Window: More samples per transform. Increases Frequency Resolution but decreases Time Resolution.
- Smaller Window: Fewer samples per transform. Increases Time Resolution but decreases Frequency Resolution.
3. DFT Size
The number of points used in the FFT algorithm.
- The DFT size must be greater than or equal to the Window Size.
- If the DFT size is larger than the window size, Raven "zero-pads" the windowed data. This results in an interpolated spectrum with more frequency bins, which can make the spectrogram appear smoother (though it does not add "true" resolution).
- Raven supports DFT sizes that are powers of two (e.g., 512, 1024, 2048).
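The effect of zero-padding can be checked with a short NumPy sketch. The sample rate, window size, DFT size, and 3 kHz test tone below are illustrative values, and `np.fft.rfft` stands in for whatever transform Raven uses internally.

```python
import numpy as np

sample_rate = 48_000   # Hz (illustrative)
window_size = 512      # samples per windowed segment
dft_size = 2048        # power of two, >= window_size

# One windowed segment of a 3 kHz tone.
t = np.arange(window_size) / sample_rate
segment = np.sin(2 * np.pi * 3000 * t) * np.hanning(window_size)

unpadded = np.fft.rfft(segment)            # DFT size == window size
padded = np.fft.rfft(segment, n=dft_size)  # zeros appended up to dft_size

print("bins without padding:", len(unpadded), "spacing:", sample_rate / window_size, "Hz")
print("bins with padding:   ", len(padded), "spacing:", sample_rate / dft_size, "Hz")

# Every 4th bin of the padded spectrum matches the unpadded spectrum:
# the extra bins are interpolated values, not new information.
same_at_shared_bins = np.allclose(padded[::4], unpadded)
print("padded spectrum agrees with unpadded at shared bins:", same_at_shared_bins)
```

The bin spacing shrinks with the larger DFT size, but the width of each spectral peak is still set by the 512-sample window.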
4. Hop Size / Overlap
The "step" between successive windows.
- Hop Size: The number of samples the window moves forward for each step.
- Overlap: The percentage of the window that is shared with the previous step (e.g., a 50% overlap means the hop size is half the window size).
- Impact: Smaller hop sizes (higher overlap) produce more time-steps, resulting in a smoother-looking spectrogram along the time axis at the cost of increased computation time.
- Recommendation: A 50% overlap is generally a good starting point, providing a balance between visual appearance and performance.
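The relationship between overlap, hop size, and the number of time-steps can be sketched as follows. The function names are hypothetical, and the frame count assumes only complete windows are used, which may differ from how Raven handles the start and end of a file.

```python
def hop_from_overlap(window_size, overlap_percent):
    """Hop size in samples for a given window size and overlap percentage."""
    hop = round(window_size * (1 - overlap_percent / 100))
    return max(1, hop)  # the window must advance by at least one sample

def frame_count(num_samples, window_size, hop_size):
    """Number of complete windows that fit in the signal (assumed edge handling)."""
    if num_samples < window_size:
        return 0
    return 1 + (num_samples - window_size) // hop_size

# One second of audio at 48 kHz with a 1024-sample window.
num_samples = 48_000
for overlap in (0, 50, 75, 90):
    hop = hop_from_overlap(1024, overlap)
    frames = frame_count(num_samples, 1024, hop)
    print(f"overlap {overlap:2d}% -> hop {hop:4d} samples, {frames} frames")
```

Going from 50% to 75% overlap roughly doubles the number of transforms to compute, which is the computation cost mentioned above.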
Resolution and Trade-offs
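The fundamental trade-off in spectrogram analysis is governed by the time-frequency uncertainty principle: you cannot have perfect resolution in both time and frequency simultaneously. A longer window sharpens frequency detail but smears events in time, and a shorter window does the opposite.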
Frequency Resolution
Frequency resolution is the ability to distinguish two closely spaced frequencies.
- Formula: Resolution (Hz) = Sample Rate / Window Size
- Example: At a 48,000 Hz sample rate and a 1024-sample window, your resolution is ~46.9 Hz. Increasing the window size to 2048 improves resolution to ~23.4 Hz.
Time Resolution
Time resolution is the ability to distinguish two closely spaced events in time.
- Formula: Resolution (seconds) = Window Size / Sample Rate
- Example: At 48,000 Hz and a 1024-sample window, each vertical slice represents ~21.3 ms. Decreasing the window size to 512 improves time resolution to ~10.7 ms.
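Both formulas can be evaluated side by side to see the trade-off numerically. This is a minimal sketch using only the two formulas from this section; the function name is hypothetical, and the figures ignore the additional widening that each window type introduces.

```python
def spectrogram_resolution(sample_rate_hz, window_size):
    """Return (frequency resolution in Hz, time resolution in ms)."""
    freq_res_hz = sample_rate_hz / window_size
    time_res_ms = 1000 * window_size / sample_rate_hz
    return freq_res_hz, time_res_ms

sample_rate = 48_000
for size in (256, 512, 1024, 2048, 4096):
    freq_res, time_res = spectrogram_resolution(sample_rate, size)
    print(f"window {size:4d}: {freq_res:7.2f} Hz, {time_res:6.2f} ms")
```

Doubling the window size halves the frequency figure and doubles the time figure, so their product stays constant. That constant product is the trade-off described at the start of this section.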
Choosing the Best Settings
The "best" settings depend entirely on the characteristics of the sound you are analyzing.
1. Identify Your Target Signal
- Tonal / Constant Frequency (e.g., whale moans, pure whistles): Priority is Frequency Resolution. Use a larger window size (e.g., 2048 or 4096) to see the precise frequency and harmonics.
- Transient / Rapidly Changing (e.g., clicks, drum beats, rapid trills): Priority is Time Resolution. Use a smaller window size (e.g., 256 or 512) to avoid blurring the events together in time.
2. Consider the Sample Rate
The same window size (in samples) behaves differently at different sample rates.
- High Sample Rate (e.g., 192 kHz): A 1024-sample window is very short in time (~5.3 ms), favoring time resolution.
- Low Sample Rate (e.g., 8 kHz): A 1024-sample window is quite long (~128 ms), favoring frequency resolution.
3. General Recommendations
A 50% overlap is generally a good starting point for all sound types, providing a balance between visual appearance and performance.
| Sound Type | Sample Rate | Priority | Recommended Window Size (samples) | Recommended Window Size (ms) |
|---|---|---|---|---|
| Bird Song (Tonal) | 48 kHz | Frequency | 512 – 1024 | 11 – 21 ms |
| Marine Mammal (Moans/Calls) | 8 kHz | Frequency | 256 – 512 | 32 – 64 ms |
| Human Speech | 44.1 kHz | Balanced | 512 | 12 ms |
| Marine Mammal Clicks, Ultrasonic Bat Calls | 200 kHz | Time | 256 – 512 | 1 – 3 ms |
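If your recording uses a sample rate that is not in the table, you can work backwards from the window duration in milliseconds. The helper below is a hypothetical sketch that picks the power-of-two window size closest to a target duration. It restricts itself to powers of two only because the sizes in the table are powers of two, not because Raven is known to require this for the window size.

```python
import math

def window_size_for_duration(sample_rate_hz, target_ms):
    """Power-of-two window size (in samples) closest to target_ms.

    "Closest" is measured on a log scale, so 400 samples rounds to 512
    rather than 256.
    """
    ideal_samples = sample_rate_hz * target_ms / 1000
    if ideal_samples <= 0:
        raise ValueError("sample rate and target duration must be positive")
    exponent = max(0, round(math.log2(ideal_samples)))
    return 2 ** exponent

# Aim for roughly 21 ms windows at several sample rates.
for rate in (8_000, 44_100, 48_000, 96_000, 192_000):
    size = window_size_for_duration(rate, 21)
    actual_ms = 1000 * size / rate
    print(f"{rate:6d} Hz -> {size:4d} samples ({actual_ms:.1f} ms)")
```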
Tip: If your spectrogram looks "blurry" vertically, increase the window size. If it looks "blurry" horizontally (events smeared together), decrease the window size.