Thumbnails

RawCull processes Sony ARW (Alpha Raw) image files through two mechanisms:

  1. Thumbnail Generation: Creates optimized 2048×1372 thumbnails for the culling UI
  2. Embedded Preview Extraction: Extracts full-resolution JPEG previews from ARW metadata for detailed inspection

Both systems integrate with a hierarchical two-tier caching architecture (RAM → Disk) to minimize repeated file processing. The system has been refactored to maximize memory utilization and minimize unnecessary evictions.


Thumbnail Specifications

Standard Dimensions

All thumbnails are created at a fixed size to ensure consistent performance and caching:

Property             | Value
──────────────────────────────────────────────────────────────────
Width                | 2048 pixels
Height               | 1372 pixels
Aspect Ratio         | ~1.49:1 (rectangular)
Color Space          | RGBA
Cost Per Pixel       | 6 bytes (configurable 4–8)
Memory Per Thumbnail | 16.86 MB base + ~15% overhead = ~19.4 MB
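
The memory figure in the table follows from simple arithmetic. The sketch below reproduces it; the function name and the overheadFactor parameter are illustrative assumptions rather than RawCull's actual API, and a factor of about 1.15 is what reproduces the ~19.4 MB total.

```swift
import Foundation

/// Estimated cache cost of one thumbnail, in bytes.
/// `overheadFactor` is an assumed multiplier, not a value taken from RawCull.
func thumbnailCost(width: Int, height: Int, costPerPixel: Int,
                   overheadFactor: Double = 1.0) -> Int {
    let base = width * height * costPerPixel
    return Int((Double(base) * overheadFactor).rounded())
}

let baseCost = thumbnailCost(width: 2048, height: 1372, costPerPixel: 6)
let costWithOverhead = thumbnailCost(width: 2048, height: 1372, costPerPixel: 6,
                                     overheadFactor: 1.15)

print(baseCost)          // 16859136 bytes, i.e. 16.86 MB
print(costWithOverhead)  // 19388006 bytes, i.e. about 19.4 MB
```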

Why 2048×1372?

Original ARW dimensions:  8640×5760 pixels (typical Sony Alpha)
                            ↓
            Downsampled by factor of ~4.2x
                            ↓
        2048×1372 thumbnails
                            ↓
    Perfect balance:
    - Large enough for detail recognition
    - Small enough for reasonable memory footprint
    - Maintains original aspect ratio

ARW File Format

Structure

Sony ARW files are TIFF-based containers with multiple embedded images:

ARW File (TIFF-based)
├── Index 0: Small thumbnail (≤256×256px)
├── Index 1: Preview JPEG (variable resolution)
├── Index 2: Maker Notes & EXIF Data
└── Index 3+: Raw Sensor Data

Image Discovery

The extraction system uses CGImageSource to enumerate all images:

let imageCount = CGImageSourceGetCount(imageSource)

for index in 0 ..< imageCount {
    let properties = CGImageSourceCopyPropertiesAtIndex(imageSource, index, nil)
    let width = getWidth(from: properties)
    let isJPEG = detectJPEGFormat(properties)
}

JPEG Detection

Identifies JPEG payloads using two markers:

  1. JFIF Dictionary: Presence of kCGImagePropertyJFIFDictionary
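  2. TIFF Compression Tag: Compression value of 6 (TIFF 6.0 JPEG)

// Assumes properties has already been bridged to [CFString: Any]
let tiffDict = properties[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
let hasJFIF = (properties[kCGImagePropertyJFIFDictionary] as? [CFString: Any]) != nil
let compression = tiffDict?[kCGImagePropertyTIFFCompression] as? Int
let isJPEG = hasJFIF || (compression == 6)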

Dimension Extraction

Retrieves image dimensions from multiple sources in priority order:

1. Root Properties: kCGImagePropertyPixelWidth
2. EXIF Dictionary: kCGImagePropertyExifPixelXDimension
3. TIFF Dictionary: kCGImagePropertyTIFFImageWidth
4. Fallback: Return nil if none available
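
The priority order above amounts to a chain of optional fallbacks. As a minimal sketch, the struct below is a hypothetical stand-in for the three ImageIO property sources (it is not a RawCull type), and the width values are made up for illustration.

```swift
/// Hypothetical, simplified stand-in for the ImageIO property dictionaries.
struct ImageWidthSources {
    var rootPixelWidth: Int?       // root properties
    var exifPixelXDimension: Int?  // EXIF dictionary
    var tiffImageWidth: Int?       // TIFF dictionary
}

/// Returns the first available width in priority order, or nil if none exists.
func resolveWidth(_ sources: ImageWidthSources) -> Int? {
    sources.rootPixelWidth ?? sources.exifPixelXDimension ?? sources.tiffImageWidth
}

let fromRoot = resolveWidth(ImageWidthSources(
    rootPixelWidth: 8640, exifPixelXDimension: 1616, tiffImageWidth: nil))
let fromExif = resolveWidth(ImageWidthSources(
    rootPixelWidth: nil, exifPixelXDimension: 1616, tiffImageWidth: 160))
let missing = resolveWidth(ImageWidthSources(
    rootPixelWidth: nil, exifPixelXDimension: nil, tiffImageWidth: nil))

print(fromRoot as Any, fromExif as Any, missing as Any)
```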

Thumbnail Creation Pipeline

Source File Processing

When a user opens a RawCull project with ARW files:

ARW File (10-30 MB)
    ↓
[RAW Decoder]
    - Load raw sensor data
    - Apply Bayer demosaicing
    - Color correction
    ↓
Full Resolution Image (RGB, 3 bytes/pixel)
    ↓
[Resize Engine]
    - Maintain aspect ratio
    - Bilinear or Lanczos filtering
    ↓
2048 × 1372 RGBA Thumbnail
    - 16.86 MB uncompressed
    - 6 bytes/pixel (including alpha)

Extraction Process

private nonisolated func extractSonyThumbnail(
    from url: URL,
    maxDimension: CGFloat,  // 2048 for standard size
    qualityCost: Int = 6     // Configurable 4-8 bytes/pixel
) async throws -> CGImage

Phase 1: Image Source Creation

let options = [kCGImageSourceShouldCache: false] as CFDictionary
guard let source = CGImageSourceCreateWithURL(url as CFURL, options) else {
    throw ThumbnailError.invalidSource
}

  • Opens ARW file via ImageIO
  • kCGImageSourceShouldCache: false prevents intermediate caching

Phase 2: Thumbnail Generation

let thumbOptions: [CFString: Any] = [
    kCGImageSourceCreateThumbnailFromImageAlways: true,
    kCGImageSourceCreateThumbnailWithTransform: true,
    kCGImageSourceThumbnailMaxPixelSize: maxDimension,
    kCGImageSourceShouldCacheImmediately: false
]

guard var image = CGImageSourceCreateThumbnailAtIndex(
    source, 0, thumbOptions as CFDictionary
) else {
    throw ThumbnailError.generationFailed
}

Option                                       | Value | Purpose
─────────────────────────────────────────────────────────────────────────────────────────────
kCGImageSourceCreateThumbnailFromImageAlways | true  | Always create, even if embedded exists
kCGImageSourceCreateThumbnailWithTransform   | true  | Apply EXIF orientation
kCGImageSourceThumbnailMaxPixelSize          | 2048  | Constrains to 2048×1372
kCGImageSourceShouldCacheImmediately         | false | We manage caching

Phase 3: Quality Enhancement (Optional)

If the cost per pixel (the qualityCost parameter) is not 6, the image is re-rendered with the matching interpolation quality:

let qualityMapping: [Int: CGInterpolationQuality] = [
    4: .low,
    5: .low,
    6: .medium,   // Default, balanced
    7: .high,
    8: .high
]

Phase 4: Return Thread-Safe Image

return image  // CGImage is Sendable, safe for actor boundary

CGImage is returned (not NSImage) because it is Sendable and can cross actor boundaries safely.

Phase 5: Storage (in Actor Context)

let nsImage = NSImage(cgImage: image, size: NSSize(...))
storeInMemoryCache(nsImage, for: url)  // RAM cache immediately

Task.detached(priority: .background) { [image] in
    await self.diskCache.save(image, for: url)
}

Two-Tier Cache

Cache Tiers

┌─────────────────────────────────────────────┐
│          Thumbnail Requested                │
└────────────────┬────────────────────────────┘
                 │
                 ▼
        ┌────────────────────┐
        │  Memory Cache?     │
        │  (NSCache)         │
        └────────┬───────────┘
                 │
       ┌─────────┴──────────┐
       │ HIT (70.2%)        │ MISS (29.8%)
       ▼                    ▼
    Return from       Disk Cache?
    Memory            (FileSystem)
                           │
                    ┌──────┴──────┐
                    │ HIT          │ MISS
                    │ (29.8%)      │
                    ▼              ▼
                 Read from     Decompress
                 Disk, Add     Original ARW,
                 to Memory     Create Thumbnail

    Performance: ~instant     ~50-200 ms    ~200-500 ms
                 (in-memory)  (disk I/O)    (CPU-bound)
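
The lookup order in the diagram can be sketched without any ImageIO dependency. In the illustration below, plain dictionaries stand in for NSCache and the file system, and String values stand in for images; TwoTierCache and its methods are hypothetical names, not RawCull types.

```swift
/// Simplified two-tier cache: RAM first, then disk, then regeneration.
final class TwoTierCache {
    private var memory: [String: String] = [:]  // stand-in for NSCache
    private var disk: [String: String] = [:]    // stand-in for the disk cache
    private(set) var generated = 0              // how often the slow path ran

    func value(for key: String, generate: (String) -> String) -> String {
        if let hit = memory[key] { return hit }  // Tier 1 hit
        if let hit = disk[key] {                 // Tier 2 hit
            memory[key] = hit                    // promote to RAM
            return hit
        }
        let fresh = generate(key)                // miss in both tiers
        generated += 1
        memory[key] = fresh
        disk[key] = fresh
        return fresh
    }

    /// Simulates eviction under memory pressure.
    func evictFromMemory(_ key: String) { memory[key] = nil }

    func isInMemory(_ key: String) -> Bool { memory[key] != nil }
}

let tiers = TwoTierCache()
let makeThumbnail: (String) -> String = { "thumb(\($0))" }

_ = tiers.value(for: "a.ARW", generate: makeThumbnail)  // generated
_ = tiers.value(for: "a.ARW", generate: makeThumbnail)  // RAM hit
tiers.evictFromMemory("a.ARW")
_ = tiers.value(for: "a.ARW", generate: makeThumbnail)  // disk hit, promoted

print(tiers.generated)  // 1
```

Only the first request pays the generation cost; after eviction, the disk tier serves the request and repopulates the memory tier.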

Tier 1: RAM Cache (NSCache)

Managed by SharedMemoryCache actor with dynamic configuration:

let memoryCache = NSCache<NSURL, DiscardableThumbnail>()
memoryCache.totalCostLimit = dynamicLimit  // Based on system RAM
memoryCache.countLimit = 10_000             // High; memory is limiting factor

Characteristics:

  • Eviction: NSCache removes thumbnails automatically once the cost limit is exceeded (roughly least-recently-used; the exact order is not documented)
  • Protocol: DiscardableThumbnail implements NSDiscardableContent for OS-level memory reclamation
  • Thread-Safe: NSCache provides built-in synchronization
  • Cost-Aware: Limits are based on pixel memory, not item count
  • Hit Rate: 70.2% (observed in typical workflows)
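
One way a RAM-based limit could be derived is sketched below. The fraction parameter, the memoryCacheLimit function, and the CachedThumbnail placeholder are assumptions for illustration; the source does not state how RawCull computes dynamicLimit.

```swift
import Foundation

/// Derives a cost limit from installed RAM.
/// `fraction` is a hypothetical tuning knob, clamped to 0...1.
func memoryCacheLimit(physicalMemory: UInt64, fraction: Double) -> Int {
    let clamped = min(max(fraction, 0.0), 1.0)
    return Int(Double(physicalMemory) * clamped)
}

// Placeholder value type; RawCull's DiscardableThumbnail is not reproduced here.
final class CachedThumbnail: NSObject {}

let systemRAM = ProcessInfo.processInfo.physicalMemory
let thumbnailCache = NSCache<NSString, CachedThumbnail>()
thumbnailCache.totalCostLimit = memoryCacheLimit(physicalMemory: systemRAM, fraction: 0.5)
thumbnailCache.countLimit = 10_000

// Each entry is charged its estimated byte cost (~19.4 MB at 6 bytes/pixel).
thumbnailCache.setObject(CachedThumbnail(), forKey: NSString(string: "a.ARW"), cost: 19_400_000)

print(thumbnailCache.totalCostLimit)
```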

Tier 2: Disk Cache

// Location: ~/.RawCull/thumbcache/[projectID]/
// Format: JPEG compressed at 0.7 quality
// Size: 3-5 MB per thumbnail (82-91% compression)

Characteristics:

  • Hit Rate: 29.8% (complements memory cache)
  • Latency: 50-200 ms (disk I/O + decompression)
  • Persistence: Survives app restart
  • Automatic Promotion: Disk hits loaded to memory for next access

Disk cache representation formats:

Format | Size   | Advantages
───────────────────────────────────────────────────────────
PNG    | 3-5 MB | Lossless, fast decode
HEIF   | 2-4 MB | Better compression, hardware acceleration
JPEG   | 1-2 MB | Fastest, good for fast browsing

Storage location: ~/.RawCull/thumbcache/[projectID]/

Embedded Preview Extraction

For detailed inspection, RawCull can extract full-resolution JPEG previews directly from ARW metadata, providing superior quality compared to generated thumbnails.

Selection Strategy

The system selects the widest JPEG from all images embedded in the ARW:

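var targetIndex = 0   // Index of the widest JPEG found so far
var targetWidth = 0   // Its width in pixels

for index in 0 ..< imageCount {
    let properties = CGImageSourceCopyPropertiesAtIndex(imageSource, index, nil)
    if let width = getWidth(from: properties), detectJPEGFormat(properties) {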
        if width > targetWidth {
            targetIndex = index
            targetWidth = width
        }
    }
}

Sony typically stores higher-quality previews at later indices, but comparing widths rather than relying on position selects the highest-resolution preview regardless of where it is stored.

Thumbnail vs. Full Preview

Aspect          | Thumbnail                                         | Full Preview
──────────────────────────────────────────────────────────────────────────────────────────────────────
Source          | Generic ImageIO (may use embedded or generate)    | ARW embedded JPEG specifically
Quality Control | Parameter-driven (cost per pixel)                 | Full resolution preservation
Downsampling    | Automatic via kCGImageSourceThumbnailMaxPixelSize | Conditional, only if needed
Use Case        | Culling grid, rapid browsing                      | Detailed inspection, full-screen
Performance     | Fast (200-500 ms)                                 | Medium (500 ms–2 s with decode)

Downsampling Decision

let maxPreviewSize: CGFloat = fullSize ? 8640 : 4320

if CGFloat(embeddedJPEGWidth) > maxPreviewSize {
    // Downsample to reasonable size
} else {
    // Use original size (never upscale)
}

  • If embedded JPEG is larger than target: downsample to preserve memory
  • If embedded JPEG is smaller: preserve original (never upscale)
  • fullSize=true: 8640px threshold (professional workflows)
  • fullSize=false: 4320px threshold (balanced quality/performance)
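
The rules above reduce to taking the smaller of the embedded width and the threshold. The helper below is a hypothetical sketch of that decision, assuming the downsample target is the threshold itself; previewWidth is not a RawCull function.

```swift
/// Width at which the preview would be shown: capped at the threshold, never upscaled.
func previewWidth(embeddedWidth: Int, fullSize: Bool) -> Int {
    let threshold = fullSize ? 8640 : 4320
    return min(embeddedWidth, threshold)
}

print(previewWidth(embeddedWidth: 8640, fullSize: false))  // 4320 (downsampled)
print(previewWidth(embeddedWidth: 1616, fullSize: false))  // 1616 (kept as-is)
print(previewWidth(embeddedWidth: 8640, fullSize: true))   // 8640 (kept as-is)
```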

Resizing Implementation

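private func resizeImage(_ image: CGImage, maxPixelSize: CGFloat) -> CGImage? {
    let scale = min(maxPixelSize / CGFloat(image.width), maxPixelSize / CGFloat(image.height))
    guard scale < 1.0 else { return image }  // Already small enough; never upscale

    let newWidth = Int((CGFloat(image.width) * scale).rounded())
    let newHeight = Int((CGFloat(image.height) * scale).rounded())
    // Assumed context setup: 8-bit RGBA in device RGB
    guard let context = CGContext(
        data: nil, width: newWidth, height: newHeight,
        bitsPerComponent: 8, bytesPerRow: 0,
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    ) else { return nil }

    // Draw into the new context with .high interpolation
    context.interpolationQuality = .high
    context.draw(image, in: CGRect(x: 0, y: 0, width: newWidth, height: newHeight))
    return context.makeImage()
}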

JPEG Export

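@concurrent
nonisolated func save(image: CGImage, originalURL: URL) async {
    // Saves alongside original ARW as .jpg at maximum quality (1.0)
    let options: [CFString: Any] = [
        kCGImageDestinationLossyCompressionQuality: 1.0
    ]
    // Assumed completion of the elided body: write next to the ARW via ImageIO
    let outputURL = originalURL.deletingPathExtension().appendingPathExtension("jpg")
    guard let destination = CGImageDestinationCreateWithURL(
        outputURL as CFURL, "public.jpeg" as CFString, 1, nil
    ) else { return }
    CGImageDestinationAddImage(destination, image, options as CFDictionary)
    CGImageDestinationFinalize(destination)
}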

Concurrency Model

Actor-Based Architecture

All extraction systems use Swift actors for thread-safe state:

actor ScanAndCreateThumbnails { }
actor ExtractSonyThumbnail { }
actor ExtractEmbeddedPreview { }
actor DiskCacheManager { }

Benefits:

  • Serial execution prevents data races
  • State mutations are automatically serialized
  • No manual locks required
  • Safe concurrent calls from multiple views

Isolated State

actor ScanAndCreateThumbnails {
    private var successCount = 0
    private var processingTimes: [TimeInterval] = []
    private var totalFilesToProcess = 0
    private var preloadTask: Task<Int, Never>?
}
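
To illustrate why isolated counters such as successCount need no locks, here is a minimal, self-contained sketch. ProcessingStats and recordConcurrently are hypothetical names used only for this example; they mirror the pattern rather than RawCull's actual actors.

```swift
/// Actor-isolated counter: mutations are serialized by the actor.
actor ProcessingStats {
    private var successCount = 0

    func recordSuccess() { successCount += 1 }
    func total() -> Int { successCount }
}

/// Increments the counter from many concurrent child tasks.
func recordConcurrently(times: Int) async -> Int {
    let stats = ProcessingStats()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<times {
            group.addTask { await stats.recordSuccess() }
        }
        // The group waits for all child tasks before returning.
    }
    return await stats.total()
}

let recorded = await recordConcurrently(times: 1_000)
print(recorded)  // 1000
```

Every increment is counted even though the child tasks run in parallel, because each call to recordSuccess() executes on the actor one at a time.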

Concurrent Extraction Without Isolation Violation

ImageIO operations are nonisolated to avoid blocking the actor:

@concurrent
nonisolated func extractSonyThumbnail(from url: URL, maxDimension: CGFloat) async throws -> CGImage {
    try await Task.detached(priority: .userInitiated) {
        let source = CGImageSourceCreateWithURL(url as CFURL, options)
        // ...
    }.value
}

Cancellation Support

func cancelPreload() {
    preloadTask?.cancel()
    preloadTask = nil
}
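
Calling cancel() only sets a flag; the running work has to check it. The sketch below shows one cooperative pattern, with processUntilCancelled as a hypothetical stand-in for the preload loop and Task.yield() standing in for real extraction work.

```swift
/// Processes items until finished or cancelled; returns how many completed.
func processUntilCancelled(_ items: [Int]) async -> Int {
    var completed = 0
    for _ in items {
        if Task.isCancelled { break }  // cooperative cancellation check
        await Task.yield()             // stand-in for real per-file work
        completed += 1
    }
    return completed
}

let demoPreloadTask = Task { await processUntilCancelled(Array(0..<1_000)) }
demoPreloadTask.cancel()
let completedCount = await demoPreloadTask.value

// How many items finished depends on timing, but never more than the total.
print(completedCount <= 1_000)  // true
```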

Error Handling

Extraction Errors

enum ThumbnailError: Error {
    case invalidSource
    case generationFailed
    case decodingFailed
}

Error Recovery

Batch Processing (non-fatal — continues to next file):

do {
    let cgImage = try await ExtractSonyThumbnail().extractSonyThumbnail(from: url, ...)
    storeInMemoryCache(cgImage, for: url)
} catch {
    Logger.process.warning("Failed to extract \(url.lastPathComponent): \(error)")
}

On-Demand Requests (returns nil; UI shows placeholder):

func thumbnail(for url: URL, targetSize: Int) async -> CGImage? {
    do { return try await resolveImage(for: url, targetSize: targetSize) }
    catch { return nil }
}

Performance Characteristics

Typical Timings (Apple Silicon, 40-50 ARW files, 16 GB Mac)

Operation                       | Duration   | Notes
──────────────────────────────────────────────────────────────────────────────
File discovery                  | <100 ms    | Non-recursive enumeration
Thumbnail generation (1st pass) | 5-20 s     | Full extraction
Thumbnail generation (2nd pass) | <500 ms    | All from RAM cache
Disk cache promotion            | 100-500 ms | Load + store to RAM
Embedded preview extraction     | 500 ms–2 s | JPEG decode + optional resize
Single thumbnail generation     | 200-500 ms | CPU-bound ARW decode/resize
JPEG export                     | 100-300 ms | Disk write + finalize

Memory Usage per Configuration

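Scenario      | Cache Allocation | Thumbnail Capacity | Hit Rate | Use Case
──────────────────────────────────────────────────────────────────────────────────
Light editing | 5 GB             | ~257               | 60-70%   | Casual culling
Production    | 10 GB            | ~515               | 70-75%   | Typical workflow
Professional  | 16 GB            | ~824               | 75-80%   | Large batches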
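
The capacity column is the cache allocation divided by the per-thumbnail cost. A short sketch, assuming decimal gigabytes and the ~19.4 MB figure at 6 bytes per pixel (thumbnailCapacity is an illustrative name):

```swift
/// Number of whole thumbnails that fit in a cache of the given size.
func thumbnailCapacity(cacheBytes: Int, bytesPerThumbnail: Int) -> Int {
    precondition(bytesPerThumbnail > 0, "thumbnail cost must be positive")
    return cacheBytes / bytesPerThumbnail
}

let bytesPerThumbnail = 19_400_000  // ~19.4 MB

for gigabytes in [5, 10, 16] {
    let capacity = thumbnailCapacity(cacheBytes: gigabytes * 1_000_000_000,
                                     bytesPerThumbnail: bytesPerThumbnail)
    print(gigabytes, capacity)
}
// 5 257
// 10 515
// 16 824
```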

Quality/Performance Tradeoff

Cost Per Pixel | Memory Per Image | 10 GB Capacity | Quality      | Speed
───────────────────────────────────────────────────────────────────────
4 bytes        | ~15 MB           | ~667           | Good         | Fast
6 bytes        | ~19.4 MB         | ~515           | Excellent    | Balanced
8 bytes        | ~25.8 MB         | ~387           | Outstanding  | Slower

Concurrency Impact

Processor Cores | Max Concurrent Tasks | Benefit
───────────────────────────────────────────────
4-core Mac      | 8 tasks              | 2-3x faster
8-core Mac      | 16 tasks             | 4-6x faster
10-core Mac     | 20 tasks             | 6-8x faster
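
The table implies a limit of two tasks per core. One common way to enforce such a limit is a task group that starts a new child task only when another finishes. The sketch below assumes that approach; mapBounded is a hypothetical helper, not RawCull's implementation.

```swift
import Foundation

/// Runs `work` over `items` with at most `limit` child tasks in flight.
/// Results are returned in the same order as `items`.
func mapBounded<T: Sendable, R: Sendable>(
    _ items: [T],
    limit: Int,
    work: @escaping @Sendable (T) async -> R
) async -> [R] {
    await withTaskGroup(of: (Int, R).self) { group -> [R] in
        var results = [R?](repeating: nil, count: items.count)
        var next = 0

        // Seed the group up to the concurrency limit.
        while next < min(max(1, limit), items.count) {
            let index = next
            group.addTask {
                let value = await work(items[index])
                return (index, value)
            }
            next += 1
        }

        // Each time one task finishes, start the next pending item.
        while let (index, value) = await group.next() {
            results[index] = value
            if next < items.count {
                let pending = next
                group.addTask {
                    let value = await work(items[pending])
                    return (pending, value)
                }
                next += 1
            }
        }
        return results.compactMap { $0 }
    }
}

// Two tasks per core, matching the ratio in the table above.
let maxConcurrentTasks = ProcessInfo.processInfo.activeProcessorCount * 2
let doubled = await mapBounded(Array(1...10), limit: maxConcurrentTasks) { $0 * 2 }
print(doubled)  // [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```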

Data Flow Summary

User initiates bulk thumbnail load
    ↓
[ScanAndCreateThumbnails.preloadCatalog()]
    ├─ Discover files (non-recursive)
    ├─ For each file (concurrency controlled):
    │   ├─ Check RAM cache
    │   │   ✓ HIT (70%): Return immediately
    │   │   ✗ MISS (30%):
    │   ├─ Check disk cache
    │   │   ✓ HIT: Load and promote to RAM
    │   │   ✗ MISS:
    │   ├─ Extract thumbnail:
    │   │   ├─ Open ARW via ImageIO
    │   │   ├─ Generate 2048×1372 thumbnail
    │   │   ├─ Apply quality enhancement (optional)
    │   │   └─ Wrap in NSImage
    │   ├─ Store in RAM (immediate)
    │   └─ Schedule async disk save (background)
    └─ Return success count

On detailed inspection:
    ↓
[JPGPreviewHandler.handle(file)]
    ├─ Check if JPG exists
    │   ✓ YES: Load and display
    │   ✗ NO:
    ├─ Call ExtractEmbeddedPreview
    │   ├─ Find all images in ARW
    │   ├─ Identify widest JPEG
    │   ├─ Decide: downsample or original?
    │   ├─ Decode JPEG
    │   └─ Return CGImage
    └─ Display full preview

Apple Frameworks Used

Framework    | Key APIs                          | Purpose
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ImageIO      | CGImageSource, CGImageDestination | Image decoding, thumbnail generation, embedded preview extraction
CoreGraphics | CGContext, CGImage                | Rendering, resizing, interpolation
AppKit       | NSImage                           | Display-ready images
Foundation   | URL, ProcessInfo, NSCache         | File operations, system memory query, in-memory cache
Concurrency  | actors, task groups, async/await  | Safe parallel processing
CryptoKit    | Insecure.MD5                      | Disk cache filename generation
OSLog        | Logger                            | Diagnostics and monitoring

Last modified March 14, 2026: update (6f708a2)