What are wavelets?

Wavelets are mathematical functions that decompose a signal into frequency components while preserving time information. A Fourier transform tells you only which frequencies are present; a wavelet transform tells you both which frequencies are present and when they occur. This makes wavelets well suited to non-stationary signals such as financial time series, audio, and sensor data.

What is the difference between VectorWave and IronWave?

VectorWave is our Java wavelet library (Java 25+), with optional SIMD acceleration via the Vector API. It is suited to JVM-based applications and enterprise systems. IronWave is our Rust wavelet library, optimized for ultra-low-latency applications with sub-microsecond performance; it targets high-frequency trading and embedded systems. Both provide the same core wavelet transforms (DWT, MODWT, SWT, CWT) with production-grade reliability.

When should I use MODWT instead of DWT?

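Use MODWT (Maximal Overlap Discrete Wavelet Transform) when you need shift-invariance: circularly shifting the input shifts the coefficients by the same amount instead of changing their values. This is critical for financial time series analysis, volatility estimation, and any application where the alignment of features matters. MODWT also works with any signal length, whereas a standard J-level DWT needs a length divisible by 2^J, and it keeps one coefficient per sample at every level, which gives better statistical properties at the cost of redundancy. Use DWT when you need compression or the most compact representation, as it is critically sampled with no redundancy.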
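
The shift-invariance point can be made concrete with a single-level Haar filter. The sketch below is not VectorWave or IronWave code; `ShiftDemo` and its methods are hypothetical names used only for illustration. It compares a decimated Haar detail step (DWT-style) with an undecimated, circular one (MODWT-style, using the 1/2 filter scaling).

```java
import java.util.Arrays;

/** Illustration only: level-1 Haar details, decimated vs. undecimated. */
final class ShiftDemo {

    /** Decimated (DWT-style) Haar details: one coefficient per pair of samples. */
    static double[] dwtDetail(double[] x) {
        double[] d = new double[x.length / 2];
        for (int k = 0; k < d.length; k++) {
            d[k] = (x[2 * k] - x[2 * k + 1]) / Math.sqrt(2.0);
        }
        return d;
    }

    /** Undecimated (MODWT-style) Haar details with circular boundary handling. */
    static double[] modwtDetail(double[] x) {
        int n = x.length;
        double[] w = new double[n];
        for (int t = 0; t < n; t++) {
            w[t] = (x[t] - x[(t - 1 + n) % n]) / 2.0;
        }
        return w;
    }

    /** Circularly delays x by one sample. */
    static double[] shiftRight(double[] x) {
        int n = x.length;
        double[] y = new double[n];
        for (int t = 0; t < n; t++) {
            y[(t + 1) % n] = x[t];
        }
        return y;
    }

    public static void main(String[] args) {
        double[] x = {0, 0, 0, 1, 1, 0, 0, 0}; // a short pulse at indices 3 and 4
        System.out.println("DWT   original: " + Arrays.toString(dwtDetail(x)));
        System.out.println("DWT   shifted : " + Arrays.toString(dwtDetail(shiftRight(x))));
        System.out.println("MODWT original: " + Arrays.toString(modwtDetail(x)));
        System.out.println("MODWT shifted : " + Arrays.toString(modwtDetail(shiftRight(x))));
    }
}
```

With the pulse at indices 3 and 4, the decimated step reports two nonzero details. Delayed by one sample, the pulse falls entirely inside one pair and every decimated detail is zero, while the undecimated details simply move by one sample.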

Which wavelet should I use for financial analysis?

For financial time series, Daubechies Db4 with MODWT is the most common choice, offering a good balance of time and frequency localization. Use Haar for jump detection and microstructure analysis. For smoother trend analysis, consider Symlets (Sym4), which are more nearly symmetric and therefore closer to linear phase. For volatility clustering detection, Morlet wavelets with CWT provide rich time-frequency information.

How do I get started with wavelet analysis?

Start with our Quick Start guides: for Java applications, see the VectorWave Quick Start at https://docs.morphiqlabs.com/docs/vectorwave/getting-started/quick-start. For Rust applications, see the IronWave Quick Start at https://docs.morphiqlabs.com/docs/ironwave/getting-started/quick-start. Both guides walk you through installation, basic transforms, and common use cases like denoising and multi-scale analysis.

What is wavelet denoising and how does it work?

Wavelet denoising removes noise from signals while preserving important features like edges and trends. It works by transforming the signal into wavelet coefficients, applying a threshold to remove small coefficients (which typically represent noise), and reconstructing the signal. Common thresholding methods include Universal, SURE, Minimax, and BayesShrink. VectorWave and IronWave both provide built-in denoising functions with multiple threshold options.
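
The three steps (transform, threshold, reconstruct) can be sketched in a few lines. This is a minimal illustration, not the VectorWave or IronWave API: `HaarDenoiseSketch` is a hypothetical class, it uses a single-level Haar transform for brevity, and it pairs soft thresholding with the Universal threshold sigma * sqrt(2 ln N), where sigma is estimated from the detail coefficients as median(|d|) / 0.6745.

```java
import java.util.Arrays;

/** Illustration only: single-level Haar denoising with a soft threshold. */
final class HaarDenoiseSketch {
    private static final double SQRT2 = Math.sqrt(2.0);

    /** Soft thresholding: shrink toward zero by t, zeroing anything within [-t, t]. */
    static double soft(double c, double t) {
        return Math.signum(c) * Math.max(Math.abs(c) - t, 0.0);
    }

    /** Noise sigma estimated as median(|d|) / 0.6745 from the detail coefficients. */
    static double sigmaFromDetails(double[] d) {
        double[] abs = new double[d.length];
        for (int i = 0; i < d.length; i++) {
            abs[i] = Math.abs(d[i]);
        }
        Arrays.sort(abs);
        int m = abs.length / 2;
        double median = (abs.length % 2 == 1) ? abs[m] : 0.5 * (abs[m - 1] + abs[m]);
        return median / 0.6745;
    }

    /** Denoises an even-length signal; a negative threshold selects the Universal rule. */
    static double[] denoise(double[] x, double threshold) {
        if (x.length < 2 || x.length % 2 != 0) {
            throw new IllegalArgumentException("length must be even and at least 2");
        }
        int half = x.length / 2;
        double[] a = new double[half];
        double[] d = new double[half];
        for (int k = 0; k < half; k++) {           // 1. forward Haar step
            a[k] = (x[2 * k] + x[2 * k + 1]) / SQRT2;
            d[k] = (x[2 * k] - x[2 * k + 1]) / SQRT2;
        }
        double t = threshold >= 0
                ? threshold
                : sigmaFromDetails(d) * Math.sqrt(2.0 * Math.log(x.length));
        double[] y = new double[x.length];
        for (int k = 0; k < half; k++) {
            double dk = soft(d[k], t);             // 2. shrink the detail coefficient
            y[2 * k] = (a[k] + dk) / SQRT2;        // 3. inverse Haar step
            y[2 * k + 1] = (a[k] - dk) / SQRT2;
        }
        return y;
    }
}
```

A threshold of zero reproduces the input exactly, and a very large threshold replaces each pair of samples with its average. Practical denoisers use several decomposition levels and a smoother wavelet than Haar.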

Streaming Wavelet Transform and Denoising

This document provides detailed technical information about VectorWave's streaming implementations, including architecture, performance characteristics, and usage guidelines.

Where SIMD lives

SIMD/Vector API acceleration is available via the optional vectorwave-extensions module (Java 25 + incubator). The core streaming implementations are scalar Java 25. For SIMD‑accelerated batch and streaming facades, add the extensions dependency and run with --add-modules jdk.incubator.vector --enable-preview.

Table of Contents

Overview
Architecture
Streaming Denoiser
Performance Analysis
Implementation Details
Configuration Guide
Best Practices
Troubleshooting

Overview

VectorWave provides comprehensive streaming support for real-time signal processing applications. The streaming implementations are designed with three key principles:

Bounded Memory: O(1) memory complexity regardless of stream length
Low Latency: Sub-microsecond per-sample processing for real-time applications (Fast implementation; see Implementation Comparison)
Flexible Trade-offs: Configurable quality vs. performance characteristics

Key Features

Real-time wavelet transforms with configurable block sizes
Zero-copy streaming implementation with ring buffer
Dual-implementation streaming denoiser (Fast vs. Quality)
Adaptive threshold and noise estimation
Overlap-add processing with multiple window functions
Multi-level streaming decomposition
Memory pooling for reduced GC pressure
Reactive streams using Java Flow API

Architecture

Component Hierarchy

StreamingWaveletTransform (interface)
├── StreamingWaveletTransformImpl (basic streaming)
├── OptimizedStreamingWaveletTransform (zero-copy with ring buffer)
├── SlidingWindowTransform (continuous sliding window)
└── MultiLevelStreamingTransform (multi-level decomposition)

StreamingDenoiserStrategy (interface)
├── FastStreamingDenoiser (real-time optimized)
└── QualityStreamingDenoiser (quality optimized)

Supporting Components:
├── RingBuffer (lock-free SPSC buffer)
├── StreamingRingBuffer (windowed ring buffer with overlap)
├── OverlapBuffer (overlap-add processing)
├── NoiseEstimator (adaptive noise estimation)
├── StreamingThresholdAdapter (adaptive thresholding)
└── SharedMemoryPoolManager (memory management)

Data Flow

Input Samples → Buffer → Window → Transform → Threshold → Inverse → Output
                  ↑                     ↓
                  └─── Overlap-Add ←────┘

Streaming Denoiser

Implementation Comparison

Feature	Fast Implementation	Quality Implementation
Latency	0.35-0.70 µs/sample	0.2-11.4 µs/sample
Throughput	1.37-2.69 M samples/s	0.088-5.0 M samples/s
Memory	~22 KB	~26 KB
SNR vs Batch	-4.5 to -10.5 dB	+1.5 to +7.3 dB
Real-time	Always	Only without overlap
Best For	Audio, sensors, trading	Scientific, medical, offline

Selection Comparison

Implementation	When to Use	Key Features
StreamingWaveletTransformImpl	General streaming	Standard overlap support
OptimizedStreamingWaveletTransform	Performance critical	Zero-copy, ring buffer
FastStreamingDenoiser	Real-time denoising	Ultra-low latency
QualityStreamingDenoiser	Quality priority	Enhanced SNR

Factory Pattern

The StreamingDenoiserFactory provides three selection modes:

// Explicit selection
StreamingDenoiserStrategy fast = StreamingDenoiserFactory.create(
    StreamingDenoiserFactory.Implementation.FAST, config);

StreamingDenoiserStrategy quality = StreamingDenoiserFactory.create(
    StreamingDenoiserFactory.Implementation.QUALITY, config);

// Automatic selection based on configuration
StreamingDenoiserStrategy auto = StreamingDenoiserFactory.create(config);

Automatic Selection Logic

The factory uses these criteria for AUTO mode:

Overlap + Adaptive: Selects FAST (real-time priority)
Block size < 256: Selects FAST (low latency priority)
No overlap: Selects QUALITY (both can be real-time)
Default: FAST (real-time priority)
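
The rules above can be read as an ordered decision list. The sketch below is an assumption about how they combine (checked top to bottom as listed); it is not the factory's actual source, and `AutoSelectionSketch` and its parameters are hypothetical names.

```java
/** Illustration only: one possible ordering of the documented AUTO rules. */
final class AutoSelectionSketch {
    enum Implementation { FAST, QUALITY }

    static Implementation select(int blockSize, double overlapFactor, boolean adaptiveThreshold) {
        boolean hasOverlap = overlapFactor > 0.0;
        if (hasOverlap && adaptiveThreshold) {
            return Implementation.FAST;      // overlap + adaptive: real-time priority
        }
        if (blockSize < 256) {
            return Implementation.FAST;      // small blocks: low-latency priority
        }
        if (!hasOverlap) {
            return Implementation.QUALITY;   // no overlap: both can be real-time
        }
        return Implementation.FAST;          // default: real-time priority
    }

    public static void main(String[] args) {
        System.out.println(select(512, 0.0, false));
        System.out.println(select(128, 0.0, false));
        System.out.println(select(512, 0.5, true));
    }
}
```

Under this ordering, QUALITY is chosen only for blocks of 256 samples or more with no overlap; every other combination falls through to FAST.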

Zero-Copy Streaming Implementation

OptimizedStreamingWaveletTransform

The OptimizedStreamingWaveletTransform provides true zero-copy processing using a lock-free ring buffer:

// Basic usage
OptimizedStreamingWaveletTransform transform = new OptimizedStreamingWaveletTransform(
    wavelet, PaddingStrategies.PERIODIC, blockSize
);

// With overlap configuration (a separate instance, so it needs its own variable name)
OptimizedStreamingWaveletTransform overlappingTransform = new OptimizedStreamingWaveletTransform(
    wavelet,
    PaddingStrategies.PERIODIC,
    blockSize,
    0.5,  // 50% overlap
    8     // buffer capacity multiplier
);

Key Benefits

Zero-Copy Processing: Uses WaveletTransform.forward(double[], int, int) to process array slices directly
50% Memory Bandwidth Reduction: Eliminates array copying during transforms
Lock-Free Ring Buffer: Single-producer, single-consumer design for minimal overhead
Configurable Overlap: 0-100% overlap support for time-frequency trade-offs
Automatic Backpressure: Exponential backoff when buffer is full
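
For readers unfamiliar with the single-producer, single-consumer pattern, here is a minimal sketch of such a ring buffer. It is not VectorWave's `RingBuffer`; `SpscRingSketch` is a hypothetical class written for illustration, and it copies samples out on `poll` for simplicity rather than exposing slices for zero-copy processing.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustration only: a minimal SPSC ring buffer of doubles. */
final class SpscRingSketch {
    private final double[] data;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next index to read (consumer-owned)
    private final AtomicLong tail = new AtomicLong(); // next index to write (producer-owned)

    SpscRingSketch(int capacity) {
        if (capacity <= 0 || (capacity & (capacity - 1)) != 0) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        data = new double[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the buffer is full. */
    boolean offer(double value) {
        long t = tail.get();
        if (t - head.get() == data.length) {
            return false;
        }
        data[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile write publishes the sample to the consumer
        return true;
    }

    /** Consumer side: copies up to max samples into dst and returns the count. */
    int poll(double[] dst, int offset, int max) {
        long h = head.get();
        int n = (int) Math.min((long) max, tail.get() - h);
        for (int i = 0; i < n; i++) {
            dst[offset + i] = data[(int) ((h + i) & mask)];
        }
        head.set(h + n); // volatile write frees the slots for the producer
        return n;
    }
}
```

Because each index is written by exactly one thread, no locks or compare-and-swap loops are needed. A producer that sees `offer` return false can wait and retry with increasing delays, which is the idea behind the backoff behaviour listed above.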

Performance Characteristics

Metric	Value
Latency	< 0.5 µs/sample
Memory bandwidth	50% reduction vs copying
GC pressure	Minimal (buffer reuse)
Thread safety	SPSC lock-free
Overlap support	0-100%

Performance Analysis

Latency Breakdown

For a typical streaming denoiser processing cycle:

Operation	Fast (µs)	Quality (µs)
Buffer management	0.05-0.10	0.10-0.20
Windowing	0.00	0.05-2.00
Forward transform	0.15-0.30	0.30-0.60
Thresholding	0.05-0.10	0.10-0.20
Inverse transform	0.10-0.20	0.20-0.40
Total	0.35-0.70	0.75-3.40

Memory Usage

Memory allocation per instance:

FastStreamingDenoiser:
- Input buffer: blockSize × 8 bytes
- Transform buffers: 2 × blockSize × 8 bytes
- Statistics: ~200 bytes
- Noise estimator: blockSize × 8 bytes
- Total: ~22 KB (for blockSize=256)

QualityStreamingDenoiser:
- Extended buffers: 1.5 × blockSize × 8 bytes
- Overlap buffers: 2 × blockSize × 8 bytes
- Window cache: ~2 KB
- Additional: ~2 KB
- Total: ~26 KB (for blockSize=256)
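
To see how the itemized buffers scale with block size, the listed formulas can be evaluated directly. This is only arithmetic on the items above, under the assumption that each buffer holds 8-byte doubles; `MemoryEstimateSketch` is a hypothetical helper. The quoted totals are larger than the itemized sums, and this document does not break down the difference.

```java
/** Illustration only: sums the itemized buffer sizes listed above. */
final class MemoryEstimateSketch {
    private static final int BYTES_PER_DOUBLE = 8;

    static long fastItemizedBytes(int blockSize) {
        long input = (long) blockSize * BYTES_PER_DOUBLE;
        long transform = 2L * blockSize * BYTES_PER_DOUBLE;
        long statistics = 200;
        long noiseEstimator = (long) blockSize * BYTES_PER_DOUBLE;
        return input + transform + statistics + noiseEstimator;
    }

    static long qualityItemizedBytes(int blockSize) {
        long extended = (long) (1.5 * blockSize) * BYTES_PER_DOUBLE;
        long overlap = 2L * blockSize * BYTES_PER_DOUBLE;
        long windowCache = 2 * 1024;
        long additional = 2 * 1024;
        return extended + overlap + windowCache + additional;
    }

    public static void main(String[] args) {
        for (int blockSize : new int[] {256, 1024, 4096}) {
            System.out.printf("blockSize=%d fast=%d bytes quality=%d bytes%n",
                    blockSize, fastItemizedBytes(blockSize), qualityItemizedBytes(blockSize));
        }
    }
}
```

For blockSize=256 the itemized entries come to 8,392 bytes (Fast) and 11,264 bytes (Quality), so treat the per-instance totals as measured figures and the formulas as a guide to how usage grows with block size.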

Quality Metrics

SNR improvement compared to noisy input:

Noise Level	Fast	Quality	Batch (reference)
0.1 (low)	+2.5 dB	+7.0 dB	+12.0 dB
0.3 (medium)	+1.0 dB	+5.5 dB	+8.0 dB
0.5 (high)	-0.5 dB	+3.0 dB	+5.0 dB
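
The document does not state how these figures were measured. A common definition, assumed here, compares each signal against a clean reference: SNR = 10 log10(signal power / error power), and the improvement is the SNR of the denoised output minus the SNR of the noisy input. `SnrSketch` is a hypothetical helper showing that calculation.

```java
/** Illustration only: SNR improvement against a clean reference signal. */
final class SnrSketch {

    /** SNR in dB of an estimate, measured against the clean reference. */
    static double snrDb(double[] clean, double[] estimate) {
        double signalPower = 0.0;
        double errorPower = 0.0;
        for (int i = 0; i < clean.length; i++) {
            signalPower += clean[i] * clean[i];
            double e = estimate[i] - clean[i];
            errorPower += e * e;
        }
        return 10.0 * Math.log10(signalPower / errorPower);
    }

    /** Positive values mean the denoised output is closer to the reference than the input was. */
    static double improvementDb(double[] clean, double[] noisy, double[] denoised) {
        return snrDb(clean, denoised) - snrDb(clean, noisy);
    }

    public static void main(String[] args) {
        double[] clean = {1, 1, 1, 1};
        double[] noisy = {1.1, 0.9, 1.1, 0.9};
        double[] denoised = {1.01, 0.99, 1.01, 0.99};
        System.out.printf("improvement: %.2f dB%n", improvementDb(clean, noisy, denoised));
    }
}
```

A negative improvement, as in the Fast column at the highest noise level, means the output is further from the reference than the noisy input was.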

Implementation Details

Fast Implementation

// Simplified processing flow
void processFast(double[] samples) {
    // 1. Buffer samples
    buffer.add(samples);
    
    // 2. When block is full
    if (buffer.isFull()) {
        double[] block = buffer.getBlock();
        
        // 3. Direct transform (no windowing)
        TransformResult result = transform.forward(block);
        
        // 4. Apply threshold
        threshold(result.getCoefficients());
        
        // 5. Inverse transform
        double[] denoised = transform.inverse(result);
        
        // 6. Emit result
        publisher.submit(denoised);
    }
}
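
The `publisher.submit(...)` call in the flow above matches the shape of the JDK's `java.util.concurrent.SubmissionPublisher`. Assuming that is the publisher type (this document does not say so explicitly), a consumer could receive blocks as sketched below. `FlowSketch` and `CollectingSubscriber` are hypothetical names, and no VectorWave classes are used.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

/** Illustration only: receiving emitted blocks through the Java Flow API. */
final class FlowSketch {

    /** Collects every emitted block and signals when the stream ends. */
    static final class CollectingSubscriber implements Flow.Subscriber<double[]> {
        final List<double[]> blocks = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);                  // pull one block at a time (backpressure)
        }

        @Override public void onNext(double[] block) {
            blocks.add(block);
            subscription.request(1);
        }

        @Override public void onError(Throwable t) { done.countDown(); }

        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes the given blocks and returns them in the order the subscriber saw them. */
    static List<double[]> publishAndCollect(double[][] blocks) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        try (SubmissionPublisher<double[]> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (double[] b : blocks) {
                publisher.submit(b);       // blocks the caller if the subscriber falls far behind
            }
        }                                  // close() signals onComplete after pending items
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("stream did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.blocks;
    }
}
```

Requesting one item at a time keeps the subscriber from being flooded; a consumer that can keep up may request larger batches.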

Quality Implementation

// Simplified processing flow with overlap
void processQuality(double[] samples) {
    // 1. Add to overlap buffer
    overlapBuffer.add(samples);
    
    // 2. Process overlapping blocks
    while (overlapBuffer.hasBlock()) {
        double[] extendedBlock = overlapBuffer.getExtendedBlock();
        
        // 3. Apply window function
        applyWindow(extendedBlock);
        
        // 4. Transform extended block
        TransformResult result = transform.forward(extendedBlock);
        
        // 5. Apply threshold with smoothing
        adaptiveThreshold(result.getCoefficients());
        
        // 6. Inverse transform
        double[] denoised = transform.inverse(result);
        
        // 7. Overlap-add reconstruction
        double[] output = overlapBuffer.overlapAdd(denoised);
        
        // 8. Emit result
        publisher.submit(output);
    }
}
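
The windowing and overlap-add steps rely on the shifted windows summing to a constant, so that overlapping blocks recombine without amplitude ripple. The sketch below demonstrates that property for a periodic Hann window at 50% overlap, with an identity step standing in for transform, threshold, and inverse. It is a hypothetical illustration (`OverlapAddSketch` is not a VectorWave class), and the library's actual window handling may differ.

```java
/** Illustration only: 50% overlap-add with a periodic Hann window. */
final class OverlapAddSketch {

    /** Periodic Hann window; two copies offset by n/2 sum to 1 at every sample. */
    static double[] hann(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / n));
        }
        return w;
    }

    /** Windows each block, applies an identity "processing" step, and overlap-adds. */
    static double[] process(double[] x, int blockSize) {
        int hop = blockSize / 2;
        double[] w = hann(blockSize);
        double[] out = new double[x.length];
        for (int start = 0; start + blockSize <= x.length; start += hop) {
            for (int i = 0; i < blockSize; i++) {
                double processed = w[i] * x[start + i]; // a real denoiser would transform here
                out[start + i] += processed;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = new double[32];
        for (int i = 0; i < x.length; i++) {
            x[i] = Math.sin(0.3 * i) + 2.0;
        }
        double[] y = process(x, 8);
        System.out.printf("x[10]=%.6f y[10]=%.6f%n", x[10], y[10]);
    }
}
```

Samples covered by two blocks are reconstructed to within rounding error. The first and last half-blocks are covered by only one window and are attenuated, which is why streaming implementations need warm-up and flush handling at the edges.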

Adaptive Processing

Both implementations support adaptive noise estimation and thresholding:

// Noise estimation (MAD-based)
double estimateNoise(double[] coefficients) {
    double[] sorted = Arrays.copyOf(coefficients, coefficients.length);
    Arrays.sort(sorted);
    double median = sorted[sorted.length / 2];
    
    double[] deviations = new double[coefficients.length];
    for (int i = 0; i < coefficients.length; i++) {
        deviations[i] = Math.abs(coefficients[i] - median);
    }
    
    Arrays.sort(deviations);
    double mad = deviations[deviations.length / 2];
    
    return mad / 0.6745; // Gaussian normalization
}

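// Adaptive threshold with smoothing
double adaptThreshold(double newEstimate) {
    // Rising estimates use the attack setting, falling ones the release setting
    boolean isIncreasing = newEstimate > currentThreshold;
    double rate = isIncreasing ? attackTime : releaseTime;
    // Exponential smoothing: move a fraction `rate` of the way toward the new estimate
    currentThreshold = currentThreshold + rate * (newEstimate - currentThreshold);
    return currentThreshold;
}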
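
The smoothing step is a one-pole (exponential) filter with separate coefficients for rising and falling estimates. The self-contained sketch below shows its behaviour; `ThresholdSmootherSketch` is a hypothetical class, and restricting the coefficients to (0, 1] is an assumption of the sketch, made because a coefficient above 1 would overshoot the new estimate in this formula. How the library maps its `attackTime` and `releaseTime` settings onto the rate is not stated here.

```java
/** Illustration only: asymmetric exponential smoothing of a threshold. */
final class ThresholdSmootherSketch {
    private final double attack;   // coefficient used when the estimate rises
    private final double release;  // coefficient used when the estimate falls
    private double current;

    ThresholdSmootherSketch(double attack, double release, double initial) {
        if (attack <= 0 || attack > 1 || release <= 0 || release > 1) {
            throw new IllegalArgumentException("coefficients must be in (0, 1]");
        }
        this.attack = attack;
        this.release = release;
        this.current = initial;
    }

    /** Moves the threshold a fraction of the way toward the new estimate. */
    double update(double newEstimate) {
        double rate = newEstimate > current ? attack : release;
        current += rate * (newEstimate - current);
        return current;
    }

    public static void main(String[] args) {
        ThresholdSmootherSketch s = new ThresholdSmootherSketch(0.1, 0.5, 1.0);
        System.out.println(s.update(2.0)); // rises by 10% of the gap
        System.out.println(s.update(0.0)); // falls by 50% of the gap
    }
}
```

With these example coefficients the threshold follows rising noise estimates slowly and falling ones quickly; swapping the two values inverts that behaviour.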

Configuration Guide

Basic Configuration

StreamingDenoiserConfig config = new StreamingDenoiserConfig.Builder()
    .wavelet(Daubechies.DB4)           // Wavelet type
    .blockSize(256)                    // Power of 2
    .overlapFactor(0.5)                // 0.0 to 0.875
    .thresholdMethod(ThresholdMethod.UNIVERSAL)
    .thresholdType(ThresholdType.SOFT)
    .build();

Advanced Configuration

StreamingDenoiserConfig config = new StreamingDenoiserConfig.Builder()
    .wavelet(Symlet.SYM8)
    .blockSize(512)
    .overlapFactor(0.75)
    .levels(3)                         // Multi-level decomposition
    .thresholdMethod(ThresholdMethod.SURE)
    .thresholdType(ThresholdType.HARD)
    .adaptiveThreshold(true)           // Enable adaptation
    .attackTime(0.1)                   // Fast attack
    .releaseTime(0.5)                  // Slower release
    .useSharedMemoryPool(true)         // Share memory pool
    .noiseBufferFactor(4)              // Noise estimation buffer
    .build();

Configuration Parameters

Parameter	Range	Default	Description
`blockSize`	32-8192	256	Processing block size (must be power of 2)
`overlapFactor`	0.0-0.875	0.0	Overlap between blocks
`levels`	1-10	1	Decomposition levels
`adaptiveThreshold`	true/false	false	Enable adaptive thresholding
`attackTime`	0.01-1.0	0.1	Threshold increase rate
`releaseTime`	0.01-10.0	0.5	Threshold decrease rate
`noiseBufferFactor`	1-32	4	Buffer size for noise estimation
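
The documented ranges can be checked before building a configuration. The sketch below is a hypothetical pre-check (`ConfigValidationSketch` is not part of VectorWave, and the builder may well perform its own validation); it encodes only the ranges from the table above.

```java
/** Illustration only: validates values against the documented parameter ranges. */
final class ConfigValidationSketch {

    /** True for 1, 2, 4, 8, ...; a power of two has exactly one bit set. */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    static void validate(int blockSize, double overlapFactor, int levels) {
        if (blockSize < 32 || blockSize > 8192 || !isPowerOfTwo(blockSize)) {
            throw new IllegalArgumentException("blockSize must be a power of 2 in [32, 8192]: " + blockSize);
        }
        if (overlapFactor < 0.0 || overlapFactor > 0.875) {
            throw new IllegalArgumentException("overlapFactor must be in [0.0, 0.875]: " + overlapFactor);
        }
        if (levels < 1 || levels > 10) {
            throw new IllegalArgumentException("levels must be in [1, 10]: " + levels);
        }
    }

    public static void main(String[] args) {
        validate(256, 0.5, 3);             // passes silently
        try {
            validate(300, 0.5, 3);         // not a power of 2
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```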

Best Practices

1. Choose the Right Implementation

// Pseudocode: pick an implementation from the application's requirements

// Real-time audio processing
if (latencyBudgetMicros < 1_000) { // budget under 1 ms
    use(Implementation.FAST);
}

// Scientific analysis: denoising quality matters more than latency
if (qualityMattersMoreThanLatency) {
    use(Implementation.QUALITY);
}

// General purpose
use(Implementation.AUTO); // Let factory decide

2. Optimize Block Size

Smaller blocks (64-256): Lower latency, reduced frequency resolution
Medium blocks (256-1024): Balanced performance
Larger blocks (1024-4096): Better frequency resolution, higher latency
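
Block size also sets a floor on latency that is independent of processing speed: a block cannot be transformed until its last sample has arrived. A minimal sketch of that arithmetic follows. The sample rate is a hypothetical input that this document does not otherwise specify, and `BlockLatencySketch` is an illustrative name.

```java
/** Illustration only: time spent waiting for a block to fill. */
final class BlockLatencySketch {

    /** Time needed to collect one full block, in milliseconds. */
    static double bufferingLatencyMs(int blockSize, double sampleRateHz) {
        return 1000.0 * blockSize / sampleRateHz;
    }

    public static void main(String[] args) {
        for (int blockSize : new int[] {64, 256, 1024, 4096}) {
            System.out.printf("blockSize=%d -> %.2f ms at 48 kHz%n",
                    blockSize, bufferingLatencyMs(blockSize, 48_000.0));
        }
    }
}
```

At 48 kHz, for example, a 256-sample block takes about 5.3 ms to fill and a 4096-sample block about 85 ms, which usually dominates the per-sample processing cost.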

3. Configure Overlap Appropriately

// No overlap for lowest latency
.overlapFactor(0.0)

// 50% overlap for smooth reconstruction
.overlapFactor(0.5)

// 75% overlap for highest quality (may prevent real-time)
.overlapFactor(0.75)
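
Overlap raises the amount of work per input sample, which is why high overlap can prevent real-time operation. The sketch below assumes the hop size is blockSize × (1 − overlapFactor); the library's exact rounding is not documented here, and `OverlapCostSketch` is a hypothetical helper.

```java
/** Illustration only: hop size and relative work for a given overlap. */
final class OverlapCostSketch {

    /** Number of new samples consumed per block (the hop size). */
    static int hopSize(int blockSize, double overlapFactor) {
        return (int) Math.round(blockSize * (1.0 - overlapFactor));
    }

    /** How many times each input sample is transformed, relative to no overlap. */
    static double workFactor(int blockSize, double overlapFactor) {
        return (double) blockSize / hopSize(blockSize, overlapFactor);
    }

    public static void main(String[] args) {
        for (double overlap : new double[] {0.0, 0.5, 0.75, 0.875}) {
            System.out.printf("overlap=%.3f hop=%d work=%.1fx%n",
                    overlap, hopSize(256, overlap), workFactor(256, overlap));
        }
    }
}
```

With a 256-sample block, 50% overlap doubles the number of transforms per input sample and 75% overlap quadruples it.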

4. Memory Pool Usage

// Multiple instances sharing memory
for (int i = 0; i < numChannels; i++) {
    configs[i] = new StreamingDenoiserConfig.Builder()
        .useSharedMemoryPool(true)  // Share pool
        .build();
}

// Single instance or isolation needed
config = new StreamingDenoiserConfig.Builder()
    .useSharedMemoryPool(false)  // Dedicated pool
    .build();

5. Resource Management

// Always use try-with-resources
try (StreamingDenoiserStrategy denoiser = StreamingDenoiserFactory.create(config)) {
    // Process data
    denoiser.process(samples);
} // Automatic cleanup

// Or explicit cleanup when try-with-resources is not practical
denoiser.close(); // Releases memory pool

6. Performance Monitoring

// Monitor performance metrics
StreamingStatistics stats = denoiser.getStatistics();
if (stats.getMaxProcessingTime() > targetLatency) {
    logger.warn("Processing time exceeded target: {} > {}",
        stats.getMaxProcessingTime(), targetLatency);
}

// Check buffer levels
if (denoiser.getBufferLevel() > blockSize * 0.8) {
    logger.warn("Buffer filling up, possible overload");
}

Troubleshooting

Common Issues

1. High Latency Spikes

Symptoms: Occasional processing delays exceeding target

Causes:

GC pauses
Thread contention
CPU throttling

Solutions:

// Use memory pooling
.useSharedMemoryPool(true)

// Reduce block size
.blockSize(128)

// Disable overlap
.overlapFactor(0.0)

// Use FAST implementation
Implementation.FAST

2. Poor Denoising Quality

Symptoms: Insufficient noise reduction or signal distortion

Causes:

Block boundary artifacts
Inappropriate threshold
Wrong wavelet choice

Solutions:

// Use QUALITY implementation
Implementation.QUALITY

// Enable overlap
.overlapFactor(0.5)

// Use adaptive thresholding
.adaptiveThreshold(true)

// Choose appropriate wavelet
.wavelet(Daubechies.DB4) // Good general purpose

3. Memory Issues

Symptoms: OutOfMemoryError or high GC activity

Causes:

Too many instances
Large block sizes
Memory leaks

Solutions:

// Share memory pools
.useSharedMemoryPool(true)

// Ensure proper cleanup
try (StreamingDenoiserStrategy denoiser = ...) {
    // Use denoiser
} // Automatic cleanup

// Monitor active instances
SharedMemoryPoolManager.getInstance().getActiveUserCount()

Performance Tuning Checklist

Profile First: Use benchmarks to establish baseline
Start Simple: Begin with FAST implementation, no overlap
Measure Impact: Monitor latency and quality metrics
Adjust Gradually: Change one parameter at a time
Validate Results: Ensure quality meets requirements

Debug Output

Enable detailed logging for troubleshooting:

// Expected performance profile for the current configuration
double latency = denoiser.getPerformanceProfile().expectedLatencyMicros();
double snr = denoiser.getPerformanceProfile().expectedSNRImprovement();
long memory = denoiser.getPerformanceProfile().memoryUsageBytes();
logger.debug("Expected latency: {} µs, expected SNR improvement: {}, memory: {} bytes",
    latency, snr, memory);

// Log statistics
StreamingStatistics stats = denoiser.getStatistics();
logger.debug("Samples: {}, Blocks: {}, Avg time: {} ms, Max time: {} ms",
    stats.getSamplesProcessed(),
    stats.getBlocksEmitted(),
    stats.getAverageProcessingTime(),
    stats.getMaxProcessingTime());

// Monitor adaptive parameters
logger.debug("Noise level: {}, Threshold: {}",
    denoiser.getCurrentNoiseLevel(),
    denoiser.getCurrentThreshold());

