CHOI HONGSU

Selective Bloom RendererFeature

 

URP Bloom isolated per layer: RT memory reduced by about 90% and pixel processing reduced by 35–40%

  • −90%: VFX FPS RT memory (~5–10 MB → ~0.5 MB)
  • −35–40%: pixel throughput (at 1080p)
  • Layer isolation: Bloom does not affect UI, characters, or background

Problem

Approach

Custom Selective Bloom RendererFeature implementation

Chosen

Pros

  • Per-layer isolation possible, direct control of RT / pixel cost

Cons

  • Maintenance is on us, must keep up with URP version upgrades

Tuning the URP default Bloom

Pros

  • Battle-tested implementation, no maintenance burden

Cons

  • No layer isolation, memory / pixel cost unchanged

Why the custom Selective Bloom RendererFeature: the requirement is clearly "Bloom only on effects", so the full-screen mip-chain structure is overkill. The minimal pipeline that fits: render only the BloomFX layer to a separate RT via LayerMask → 2-pass blur → additive composite.

Architecture

Implementation

  1. Step 1: RendererFeature with layer-isolated rendering

    The LayerMask filter excludes everything outside the BloomFX layer from the bloom RT. The number of downsample steps (downsampleSteps) is adjustable from 0 to 3 in the inspector.
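
    As a rough illustration of what downsampleSteps buys, the sketch below computes the RT resolution and the memory for a two-target ping-pong pair at 1080p. It assumes each step halves both dimensions and that the targets use an 8-byte-per-pixel format (for example RGBA half-float); neither detail is stated in the text, and the function names are hypothetical.

```python
def downsampled_size(width, height, steps):
    """Halve each dimension once per step (assumption), never below 1 texel."""
    return max(1, width >> steps), max(1, height >> steps)


def ping_pong_bytes(width, height, steps, bytes_per_pixel=8, rt_count=2):
    """Total bytes for the blur targets.

    bytes_per_pixel=8 assumes an RGBA half-float format; the source does not
    name the actual RT format, so treat this as a placeholder.
    """
    w, h = downsampled_size(width, height, steps)
    return w * h * bytes_per_pixel * rt_count


for steps in range(4):  # inspector range 0 to 3, as described in the text
    w, h = downsampled_size(1920, 1080, steps)
    mb = ping_pong_bytes(1920, 1080, steps) / (1024 * 1024)
    print(f"steps={steps}: {w}x{h}, {mb:.2f} MB for two RTs")
```

    Under these assumptions, three steps give two 240x135 targets at roughly 0.49 MB in total, which is the same order as the ~0.5 MB figure quoted in this write-up. A different RT format or step count would shift the number.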

  2. Step 2: Separable 2-pass Blur

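    A 3-tap kernel is applied as two separable passes, one vertical and one horizontal. Each additional iteration widens the effective blur radius. Two RTs are ping-ponged to keep memory use minimal.

    The offset vector (_BlurOffset) is computed once in the vertex shader and passed to the fragment shader via V2F, which reduces per-fragment cost.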
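
    The ping-pong structure can be sketched on the CPU as follows. This is a minimal model, not the shader itself: the kernel weights (0.25, 0.5, 0.25), the fixed one-texel offset, and clamp-to-edge sampling are assumptions, and the function names are hypothetical.

```python
def blur_pass(src, dst, dx, dy, weights=(0.25, 0.5, 0.25)):
    """One 3-tap pass along (dx, dy): reads src, writes dst, clamps at edges."""
    h, w = len(src), len(src[0])
    for y in range(h):
        for x in range(w):
            total = 0.0
            for tap, wgt in zip((-1, 0, 1), weights):
                sx = min(max(x + tap * dx, 0), w - 1)
                sy = min(max(y + tap * dy, 0), h - 1)
                total += wgt * src[sy][sx]
            dst[y][x] = total


def separable_blur(image, iterations):
    """Ping-pong between two buffers: horizontal a -> b, then vertical b -> a."""
    a = [row[:] for row in image]
    b = [[0.0] * len(row) for row in image]
    for _ in range(iterations):
        blur_pass(a, b, 1, 0)  # horizontal pass
        blur_pass(b, a, 0, 1)  # vertical pass
    return a  # the vertical pass always writes back into a
```

    Only two buffers exist regardless of the iteration count, which mirrors the two-RT ping-pong described above. Each iteration spreads a bright texel one texel further in every direction.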

  3. Step 3: Additive Composition + HDR Tonemapping Control

    • Preserves color ratios via a BT.709 luminance-based threshold
    • _HighlightIntensity controls peak intensity (squared curve)
    • _WhiteOut makes only very bright regions burn out to white (white-hot)
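
A luminance-based threshold that keeps color ratios intact, followed by an additive composite, might look like the sketch below. The BT.709 weights are the standard ones; the exact threshold formula and the function names are assumptions. _HighlightIntensity and _WhiteOut are deliberately not modeled, because the text names them without giving their formulas.

```python
BT709 = (0.2126, 0.7152, 0.0722)  # Rec. 709 luma weights for linear RGB


def luminance(rgb):
    """Weighted sum of the linear RGB channels."""
    return sum(w * c for w, c in zip(BT709, rgb))


def bright_pass(rgb, threshold):
    """Scale the whole color by the luminance excess over the threshold.

    All three channels share one scale factor, so the R:G:B ratios of the
    input are preserved. This formula is an assumption, not the shader's.
    """
    lum = luminance(rgb)
    if lum <= threshold or lum <= 0.0:
        return (0.0, 0.0, 0.0)
    scale = (lum - threshold) / lum
    return tuple(c * scale for c in rgb)


def additive_composite(scene_rgb, bloom_rgb, intensity=1.0):
    """Add the (blurred) bloom color on top of the scene color."""
    return tuple(s + intensity * b for s, b in zip(scene_rgb, bloom_rgb))
```

Thresholding each channel independently would instead shift hues toward the dominant channel, which is the artifact a luminance-based threshold avoids.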

Validation

Comparison at 1080p

| Item | URP Default Bloom | Selective Bloom |
|---|---|---|
| Pixel throughput | ~4.0M | ~2.6M (−35%) |
| Blur sample count | 9-tap dual filter | 3-tap × 2-pass = 6-tap |
| RT count | 6+ (mip chain) | 2 (ping-pong) |
| RT memory | ~5–10 MB | ~0.5 MB (−90%) |
| Composite scope | Full screen | Limited to the BloomFX layer |
| Downsample steps | 5+, fixed | 0–3, adjustable in the inspector |
| Material parameters | A subset such as Threshold·Intensity | Threshold·Highlight·WhiteOut·blur weights and more, directly exposed |

Tradeoffs & Future Work

Tradeoffs

  • Forced layer separation: objects that should receive Bloom must be assigned to the dedicated BloomFX layer. Migrating an existing project incurs the cost of reorganizing layers.
  • Additional DrawRenderers: when many objects sit on the BloomFX layer, the number of extra draw calls rises. This is negligible for effects on the order of tens per screen.
  • No mip chain: very large bloom radii require increasing blurIterations, and cost scales linearly with the iteration count. The approach is better suited to point-light and effect emission than to wide-area glow.
  • Responsibility for URP version upgrades: the feature must be maintained in-house when APIs such as Blitter.BlitCameraTexture or ScriptableRenderPass change.
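
The linear scaling in the "No mip chain" point can be made concrete with a back-of-the-envelope model. It assumes a fixed one-texel blur offset and that each downsample step halves the resolution; both are assumptions, and the helper names are hypothetical.

```python
def blur_cost(rt_width, rt_height, iterations, taps=3, passes=2):
    """Texture fetches for the whole blur: taps x passes per texel per iteration."""
    return rt_width * rt_height * taps * passes * iterations


def radius_in_screen_pixels(iterations, downsample_steps):
    """Approximate blur reach in full-resolution pixels.

    With a fixed one-texel offset (assumption), each iteration widens the
    footprint by one texel per side, and one texel spans 2**steps screen pixels.
    """
    return iterations * (1 << downsample_steps)


# Example: a 480x270 target (1080p with 2 downsample steps)
for iterations in (1, 2, 4, 8):
    fetches = blur_cost(480, 270, iterations)
    reach = radius_in_screen_pixels(iterations, 2)
    print(f"iterations={iterations}: {fetches} fetches, ~{reach} px reach")
```

In this model, doubling the reach means doubling the fetch count. A mip chain reaches the same radius by blurring progressively smaller targets, which is why it handles wide-area glow more cheaply.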

Conclusion

URP default Bloom is tuned for cinematic and console scenarios where the whole screen emits light. For mobile scenarios that only need effect emission, that full-screen pipeline is overkill, and a custom RendererFeature with layer isolation and a minimal-pass structure is a better fit.

Selective Bloom RendererFeature · FlashGambit · Choi Hongsu · Hongsu