WebXR Hit Testing & Depth Sensing
For AR experiences to feel real, virtual objects need to interact with the physical world. They should sit on tables, be occluded by walls, and respond when the user taps a surface. WebXR provides two APIs for this: Hit Testing (where is the user pointing?) and Depth Sensing (what is the shape of the real world?).
Hit Testing Fundamentals
Hit testing answers a simple question: given a ray from the user's controller or gaze, where does it intersect a real-world surface?
Requesting Hit Test Support
const session = await navigator.xr.requestSession('immersive-ar', {
requiredFeatures: ['hit-test']
});
Creating a Hit Test Source
There are two types of hit test sources:
Non-transient: tied to a fixed space, such as the viewer space. Best for gaze-based interaction:
const hitTestSource = await session.requestHitTestSource({
space: viewerReferenceSpace
});
Transient: tied to transient input sources (for example a screen tap or trigger press) that match a given profile. Results are reported per input source while that input is active:
const transientSource = await session.requestHitTestSourceForTransientInput({
  profile: 'generic-trigger',
  offsetRay: new XRRay(
    { x: 0, y: 0, z: 0, w: 1 },  // origin, relative to the input source
    { x: 0, y: 0, z: -1, w: 0 }  // direction: -Z = forward
  )
});
The offsetRay parameter lets you adjust where the hit test ray originates relative to the input source. For controllers, the default is along the controller's pointing direction. For hands, you might want to offset to the palm or index finger tip.
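As a minimal sketch of building such an offset, the helper below prepares the two arguments that `new XRRay(origin, direction)` expects: an origin with `w = 1` and a unit-length direction with `w = 0`. The name `makeOffsetRayInit` and the example offsets are hypothetical, chosen only for illustration.

```typescript
// Plain-object shape matching DOMPointInit; no WebXR types are needed to build it.
interface PointInit { x: number; y: number; z: number; w: number; }

interface OffsetRayInit { origin: PointInit; direction: PointInit; }

// Hypothetical helper: prepares the arguments for `new XRRay(origin, direction)`.
function makeOffsetRayInit(
  origin: [number, number, number],
  direction: [number, number, number]
): OffsetRayInit {
  const [dx, dy, dz] = direction;
  const length = Math.hypot(dx, dy, dz);
  if (length === 0) throw new Error('Ray direction must be non-zero');
  return {
    // Points use w = 1 so that translations apply to them
    origin: { x: origin[0], y: origin[1], z: origin[2], w: 1 },
    // Directions use w = 0 and are normalized to unit length
    direction: { x: dx / length, y: dy / length, z: dz / length, w: 0 },
  };
}

// Example: start the ray 5 cm in front of the input source, angled 45 degrees downward.
const rayInit = makeOffsetRayInit([0, 0, -0.05], [0, -1, -1]);
// Inside a session: new XRRay(rayInit.origin, rayInit.direction)
```

Keeping the offset as plain data makes it easy to store per input profile and to unit test outside an XR session.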
Performing Hit Tests
function onXRFrame(time: number, frame: XRFrame) {
// Non-transient hit results
const nonTransientResults = frame.getHitTestResults(hitTestSource);
for (const result of nonTransientResults) {
const pose = result.getPose(referenceSpace);
if (pose) {
// pose.transform.position = intersection point, expressed in referenceSpace
// pose.transform.orientation = quaternion whose +Y axis follows the surface normal
renderHitMarker(pose.transform.position);
}
}
// Transient hit results
const transientResults = frame.getHitTestResultsForTransientInput(transientSource);
for (const result of transientResults) {
const inputSource = result.inputSource;
const pose = result.results[0]?.getPose(referenceSpace);
if (pose) {
// Intersection tied to this specific input source
handleInputInteraction(inputSource, pose);
}
}
}
Each hit test result contains:
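- getPose(referenceSpace): returns the pose of the intersection (or null). Its position is the intersection point and its orientation encodes the surface normal.
- inputSource: on transient results only, the input source that produced the hits.
- results: on transient results only, the list of hit test results for that input source.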
Interpreting Results
function renderHitMarker(position: DOMPointReadOnly, normal?: DOMPointReadOnly) {
marker.position.set(position.x, position.y, position.z);
if (normal) {
// Orient marker to match surface normal
const up = new THREE.Vector3(0, 1, 0);
// setFromUnitVectors expects unit-length vectors, so normalize the input
const surfaceNormal = new THREE.Vector3(normal.x, normal.y, normal.z).normalize();
const quat = new THREE.Quaternion().setFromUnitVectors(up, surfaceNormal);
marker.quaternion.copy(quat);
}
marker.visible = true;
}
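The normal argument has to come from somewhere. A hit test pose carries an orientation quaternion rather than a normal vector, so one option is to rotate the +Y axis by that quaternion. The sketch below assumes the hit test convention that the pose's +Y axis points along the surface normal; `surfaceNormalFromOrientation` is a hypothetical helper name.

```typescript
interface Quat { x: number; y: number; z: number; w: number; }
interface Vec3 { x: number; y: number; z: number; }

// Rotates the unit vector (0, 1, 0) by a unit quaternion.
// The result is the second column of the equivalent rotation matrix.
function surfaceNormalFromOrientation(q: Quat): Vec3 {
  return {
    x: 2 * (q.x * q.y - q.w * q.z),
    y: 1 - 2 * (q.x * q.x + q.z * q.z),
    z: 2 * (q.y * q.z + q.w * q.x),
  };
}

// A floor hit typically has an identity orientation, giving a normal of (0, 1, 0).
const floorNormal = surfaceNormalFromOrientation({ x: 0, y: 0, z: 0, w: 1 });
```

In a frame callback the input would be `pose.transform.orientation`, and the resulting x, y, z values can be passed on as the marker's normal.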
Depth Sensing
While hit testing tells you where surfaces are at specific points, depth sensing gives you a full depth map of the environment.
Requesting Depth
const session = await navigator.xr.requestSession('immersive-ar', {
requiredFeatures: ['depth-sensing'],
depthSensing: {
usagePreference: ['cpu-optimized', 'gpu-optimized'],
dataFormatPreference: ['luminance-alpha', 'float32']
}
});
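The preferences above are only a request: the runtime picks one usage and one data format, and the session reports the choice through its `depthUsage` and `depthDataFormat` attributes. A small sketch of branching on that choice follows. The `DepthPlan` shape and `planDepthAccess` name are hypothetical, and the function takes a plain object so it can be exercised without a live session.

```typescript
type DepthUsage = 'cpu-optimized' | 'gpu-optimized';
type DepthDataFormat = 'luminance-alpha' | 'float32';

// The two session attributes this sketch reads (an XRSession satisfies this shape).
interface DepthSessionInfo {
  depthUsage: DepthUsage;
  depthDataFormat: DepthDataFormat;
}

interface DepthPlan {
  path: 'cpu' | 'gpu';
  bytesPerSample: 2 | 4;
  needsUnpack: boolean; // true when a shader must reassemble a two-channel value
}

function planDepthAccess(session: DepthSessionInfo): DepthPlan {
  const isPacked = session.depthDataFormat === 'luminance-alpha';
  const onGpu = session.depthUsage === 'gpu-optimized';
  return {
    path: onGpu ? 'gpu' : 'cpu',
    // 'luminance-alpha' stores 16 bits per sample, 'float32' stores 32 bits
    bytesPerSample: isPacked ? 2 : 4,
    needsUnpack: isPacked && onGpu,
  };
}

const plan = planDepthAccess({ depthUsage: 'gpu-optimized', depthDataFormat: 'luminance-alpha' });
```

Deciding this once after session creation avoids calling the CPU accessor on a GPU-optimized session, or the reverse.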
Reading the Depth Buffer
Depth information is available per-view (left and right eyes):
function onXRFrame(time: number, frame: XRFrame) {
  // Views come from the viewer pose, not from the frame itself
  const viewerPose = frame.getViewerPose(referenceSpace);
  if (!viewerPose) return;
  for (const view of viewerPose.views) {
    // CPU path: only valid when the session selected 'cpu-optimized' usage
    const depthInfo = frame.getDepthInformation(view);
    if (!depthInfo) continue;
    // Buffer dimensions
    const width = depthInfo.width;   // e.g., 256
    const height = depthInfo.height; // e.g., 176
    // getDepthInMeters(x, y) takes normalized view coordinates in [0, 1]
    // and returns one distance in meters; (0.5, 0.5) is the center of the view
    const centerDepth = depthInfo.getDepthInMeters(0.5, 0.5);
    // For bulk access, depthInfo.data holds the raw buffer;
    // multiply each raw value by depthInfo.rawValueToMeters to get meters
    // To turn a sample into a world position, unproject it with
    // view.projectionMatrix and then apply view.transform
  }
}
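For bulk CPU access, the raw buffer has to be decoded before use. The sketch below assumes the data comes from `depthInfo.data` together with `depthInfo.rawValueToMeters`, and that the format is the one reported by the session: 16-bit unsigned values for 'luminance-alpha', 32-bit floats for 'float32'. The function name `depthBufferToMeters` is hypothetical.

```typescript
type DepthFormat = 'luminance-alpha' | 'float32';

// Converts a raw CPU depth buffer into per-pixel distances in meters.
function depthBufferToMeters(
  data: ArrayBuffer,
  format: DepthFormat,
  rawValueToMeters: number
): Float32Array {
  // 'luminance-alpha' is read as unsigned 16-bit integers, 'float32' as floats
  const raw = format === 'float32' ? new Float32Array(data) : new Uint16Array(data);
  const meters = new Float32Array(raw.length);
  for (let i = 0; i < raw.length; i++) {
    meters[i] = raw[i] * rawValueToMeters;
  }
  return meters;
}

// Example with a fake 2x2 buffer in millimeters (rawValueToMeters = 0.001).
const fakeRaw = new Uint16Array([1000, 2500, 0, 65535]);
const fakeMeters = depthBufferToMeters(fakeRaw.buffer, 'luminance-alpha', 0.001);
```

The result is row-major with `width * height` entries, so pixel (u, v) sits at index `v * width + u`. Buffer coordinates can differ from view coordinates; `depthInfo.normDepthBufferFromNormView` describes the mapping.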
CPU vs GPU Depth Access
| Method | Access Pattern | Best For |
|---|---|---|
| XRFrame.getDepthInformation(view), then getDepthInMeters(x, y) | CPU, one number per sample | Spatial queries, physics |
| XRFrame.getDepthInformation(view), then data and rawValueToMeters | CPU, raw ArrayBuffer | Bulk processing of the whole buffer |
| XRWebGLBinding.getDepthInformation(view), then texture | GPU, WebGL texture | Occlusion rendering, shader effects |
Converting Depth to World Space
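The depth buffer is addressed in pixel coordinates and stores distances, not positions. Converting a sample to world space means unprojecting the pixel through the inverse of the view's projection matrix and then applying the view transform. The function below assumes depthInMeters is a Float32Array already filled with per-pixel depths in meters, and that buffer pixels map directly onto normalized view coordinates (otherwise apply depthInfo.normDepthBufferFromNormView first):
function depthUVToWorld(
  u: number, v: number,
  depthInMeters: Float32Array,
  width: number, height: number,
  view: XRView
): THREE.Vector3 | null {
  const depth = depthInMeters[v * width + u];
  if (!(depth > 0)) return null; // rejects zero, negative, and NaN samples
  // Normalized device coordinates at the pixel center; row 0 is the top, so flip Y
  const ndcX = ((u + 0.5) / width) * 2 - 1;
  const ndcY = 1 - ((v + 0.5) / height) * 2;
  // Unproject a point on the near plane into view space
  const invProj = new THREE.Matrix4().fromArray(view.projectionMatrix).invert();
  const viewPos4 = new THREE.Vector4(ndcX, ndcY, -1, 1).applyMatrix4(invProj);
  viewPos4.multiplyScalar(1 / viewPos4.w);
  // Scale along the ray until the point lies `depth` meters in front of the camera.
  // This treats depth as distance along the view's -Z axis, not along the ray.
  const viewPos = new THREE.Vector3(viewPos4.x, viewPos4.y, viewPos4.z);
  viewPos.multiplyScalar(depth / -viewPos.z);
  // view.transform maps view space into the reference space, so use its matrix directly
  return viewPos.applyMatrix4(new THREE.Matrix4().fromArray(view.transform.matrix));
}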
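For a standard perspective projection the matrix inverse is not strictly needed. The following sketch does the same unprojection in closed form, without Three.js, under two assumptions: the projection matrix is a column-major perspective matrix as WebXR provides, and depth is measured along the view's -Z axis. The function names are hypothetical.

```typescript
type Vec3Tuple = [number, number, number];

// Unprojects an NDC coordinate plus a depth in meters into view space.
// From ndcX = (P[0] * x + P[8] * z) / -z with z = -depth, solve for x (same for y).
function unprojectToViewSpace(
  ndcX: number, ndcY: number, depthMeters: number,
  projection: ArrayLike<number>
): Vec3Tuple {
  const x = (depthMeters * (ndcX + projection[8])) / projection[0];
  const y = (depthMeters * (ndcY + projection[9])) / projection[5];
  return [x, y, -depthMeters];
}

// Applies a column-major 4x4 rigid transform (such as view.transform.matrix) to a point.
function transformPoint(m: ArrayLike<number>, p: Vec3Tuple): Vec3Tuple {
  const [x, y, z] = p;
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

// Example: symmetric 90 degree field of view, aspect ratio 1 (P[0] = P[5] = 1).
const exampleProjection = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, -1.002, -1, 0, 0, -0.2, 0];
// A sample at the right edge of the view, 2 m away, lies 2 m to the right.
const exampleViewPos = unprojectToViewSpace(1, 0, 2, exampleProjection);
```

Because the near and far planes drop out of the formula, this form also avoids the precision loss that inverting the full matrix can introduce.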
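Some runtimes may expose a convenience method for this. getDepthInWorldPosition is not part of the WebXR Depth Sensing Module, so feature-detect it before calling:
const worldPos = 'getDepthInWorldPosition' in depthInfo
  ? (depthInfo as any).getDepthInWorldPosition(u, v)
  : null;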
Occlusion Rendering
Occlusion is what makes AR look realistic — virtual objects disappear behind real surfaces. Without occlusion, objects always render on top of the real world, breaking the illusion.
GPU-Based Occlusion with Depth Texture
The most performant approach samples the depth texture in the fragment shader:
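// Fragment shader
uniform sampler2D uDepthTexture;
uniform mat4 uProjectionMatrix;
uniform mat4 uViewMatrix;
uniform float uRawValueToMeters; // from depthInfo.rawValueToMeters
varying vec3 vWorldPosition;
void main() {
  // Fragment position in the depth camera's view space and clip space
  vec4 viewPos = uViewMatrix * vec4(vWorldPosition, 1.0);
  vec4 clipPos = uProjectionMatrix * viewPos;
  vec3 ndc = clipPos.xyz / clipPos.w;
  // Convert to UV coordinates. Depth rows are assumed to start at the top of
  // the view, so V is flipped; apply normDepthBufferFromNormView if needed.
  vec2 uv = vec2(ndc.x * 0.5 + 0.5, 0.5 - ndc.y * 0.5);
  // Sample the real-world depth in meters. This assumes a 'float32' texture;
  // 'luminance-alpha' splits the value across two channels and must be unpacked.
  float realDepth = texture2D(uDepthTexture, uv).r * uRawValueToMeters;
  // Fragment distance in meters (view space looks down -Z).
  // ndc.z is nonlinear and cannot be compared with a metric depth.
  float fragDepth = -viewPos.z;
  // If the fragment is more than 5 mm behind the real surface, discard it
  if (fragDepth > realDepth + 0.005) {
    discard;
  }
  // Otherwise render normally
  gl_FragColor = vec4(1.0);
}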
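The shader's decision can be mirrored on the CPU, which is handy for unit testing the bias value or for debugging a single pixel. The sketch below is an illustration rather than part of any API: it shows one way to reassemble a 'luminance-alpha' sample from its two normalized channels (low byte in luminance, high byte in alpha) and the depth comparison itself. Treating a depth of zero as "no data" is an assumption.

```typescript
// Rebuilds a 16-bit raw depth value from normalized texture channels in [0, 1]
// and converts it to meters. Luminance carries the low byte, alpha the high byte.
function unpackLuminanceAlpha(
  luminance: number, alpha: number, rawValueToMeters: number
): number {
  return (luminance * 255 + alpha * 255 * 256) * rawValueToMeters;
}

// Returns true when a virtual fragment lies behind the real surface.
// Both depths are distances in meters in front of the camera.
function isOccluded(
  fragDepthMeters: number,
  realDepthMeters: number,
  biasMeters: number = 0.005
): boolean {
  // Assumption: zero or invalid depth means the sensor has no data for this pixel
  if (!(realDepthMeters > 0)) return false;
  return fragDepthMeters > realDepthMeters + biasMeters;
}

// Raw value 1000 (0x03E8) at 1 mm per unit is a surface 1 m away.
const wallDepth = unpackLuminanceAlpha(0xe8 / 255, 0x03 / 255, 0.001);
const hidden = isOccluded(1.5, wallDepth); // object 1.5 m away, wall at 1 m
```

The bias keeps surfaces that touch real geometry, such as an object resting on a table, from flickering due to depth noise.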
Three.js Integration
For Three.js, create a custom shader material:
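const occlusionMaterial = new THREE.ShaderMaterial({
  uniforms: {
    uDepthTexture: { value: depthTexture },
    // Projection and view matrices of the XR view the depth texture belongs to
    uProjectionMatrix: { value: new THREE.Matrix4() },
    uViewMatrix: { value: new THREE.Matrix4() },
    // Update each frame from depthInfo.rawValueToMeters
    uRawValueToMeters: { value: 1.0 }
  },
  vertexShader: `
    varying vec3 vWorldPosition;
    void main() {
      vec4 worldPos = modelMatrix * vec4(position, 1.0);
      vWorldPosition = worldPos.xyz;
      gl_Position = projectionMatrix * viewMatrix * worldPos;
    }
  `,
  fragmentShader: `
    uniform sampler2D uDepthTexture;
    uniform mat4 uProjectionMatrix;
    uniform mat4 uViewMatrix;
    uniform float uRawValueToMeters;
    varying vec3 vWorldPosition;
    void main() {
      vec4 color = vec4(0.5, 0.8, 1.0, 1.0);
      vec4 viewPos = uViewMatrix * vec4(vWorldPosition, 1.0);
      vec4 clipPos = uProjectionMatrix * viewPos;
      vec3 ndc = clipPos.xyz / clipPos.w;
      // Depth rows are assumed to start at the top of the view, so flip V
      vec2 uv = vec2(ndc.x * 0.5 + 0.5, 0.5 - ndc.y * 0.5);
      // Outside the depth image there is nothing to compare against
      if (uv.x < 0.0 || uv.x > 1.0 || uv.y < 0.0 || uv.y > 1.0) {
        gl_FragColor = color;
        return;
      }
      // Assumes a 'float32' depth texture; 'luminance-alpha' must be unpacked first
      float realDepth = texture2D(uDepthTexture, uv).r * uRawValueToMeters;
      // Compare metric distances; view space looks down -Z
      float fragDepth = -viewPos.z;
      if (fragDepth > realDepth + 0.005) discard;
      gl_FragColor = color;
    }
  `,
  transparent: true
});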
Performance of Occlusion
| Approach | Quality | Performance | Complexity |
|---|---|---|---|
| Depth texture shader | Good | ✅ Best | Medium |
| CPU depth read + stencil | Medium | ❌ Slow | Low |
| Mesh reconstruction | Excellent | ✅ Good (once built) | High |
The depth texture approach is the recommended starting point — good visual quality with minimal overhead.
Mesh Reconstruction
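Some runtimes (Quest 3, ARKit) provide mesh reconstruction, a full triangle mesh of the environment. The snippet below assumes the draft WebXR mesh detection module, requested with the 'mesh-detection' feature; property names may differ between browsers and versions:
// Check for mesh reconstruction support
const meshSet = frame.detectedMeshes;
if (meshSet) {
  for (const mesh of meshSet) {
    const geometry = new THREE.BufferGeometry();
    geometry.setAttribute('position',
      new THREE.BufferAttribute(mesh.vertices, 3)
    );
    // setIndex needs a BufferAttribute (or a plain array), not a bare typed array
    geometry.setIndex(new THREE.BufferAttribute(mesh.indices, 1));
    // Vertices are relative to mesh.meshSpace; position the object using
    // frame.getPose(mesh.meshSpace, referenceSpace)
    // Use for occlusion, physics, or navigation
  }
}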
Mesh reconstruction is the most accurate form of environment understanding, but it's also the most computationally expensive. Use it sparingly and with geometry Level of Detail (LOD).
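One way to keep the cost down is to rebuild geometry only for meshes that actually changed. The sketch below assumes each mesh object is stable across frames and exposes a `lastChangedTime` timestamp, as the draft mesh detection module describes; both the attribute and the `MeshRebuildTracker` class are assumptions for illustration.

```typescript
// Minimal shape this sketch needs from a detected mesh (assumed attribute name).
interface MeshLike { lastChangedTime: number; }

// Tracks which meshes are new, changed, or gone since the previous frame.
class MeshRebuildTracker<M extends MeshLike> {
  private seen = new Map<M, number>();

  update(meshes: M[]): { changed: M[]; removed: M[] } {
    const current = new Set<M>(meshes);
    const changed: M[] = [];
    meshes.forEach((mesh) => {
      // New meshes have no entry; updated meshes have a stale timestamp
      if (this.seen.get(mesh) !== mesh.lastChangedTime) {
        this.seen.set(mesh, mesh.lastChangedTime);
        changed.push(mesh);
      }
    });
    const removed: M[] = [];
    this.seen.forEach((_time, mesh) => {
      if (!current.has(mesh)) removed.push(mesh);
    });
    removed.forEach((mesh) => this.seen.delete(mesh));
    return { changed, removed };
  }
}

// Usage: pass the frame's meshes as an array each frame, then rebuild only
// `changed` and dispose of the geometry belonging to `removed`.
const tracker = new MeshRebuildTracker<MeshLike>();
```

On most frames both lists are empty, so the per-frame cost reduces to a map lookup per mesh.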
Practical Combination: Placement + Occlusion
The real power comes from combining all these features:
class ARPlacementSystem {
  private hitTestSource: XRHitTestSource;
  private depthTexture: WebGLTexture | null = null;
  async placeObject(frame: XRFrame): Promise<boolean> {
    // 1. Find surface via hit test
    const results = frame.getHitTestResults(this.hitTestSource);
    if (results.length === 0) return false;
    const pose = results[0].getPose(referenceSpace);
    if (!pose) return false;
    // 2. Update occlusion now: a frame is only usable during its own callback,
    // so it must not be touched after the await below
    this.enableOcclusion(frame);
    // 3. Place the object (requires the 'anchors' feature)
    const anchor = await frame.createAnchor(pose.transform, referenceSpace);
    this.attachObjectToAnchor(anchor);
    return true;
  }
  private enableOcclusion(frame: XRFrame) {
    // Views come from the viewer pose, not from the frame itself
    const viewerPose = frame.getViewerPose(referenceSpace);
    if (!viewerPose) return;
    for (const view of viewerPose.views) {
      // CPU path; with 'gpu-optimized' usage, call
      // XRWebGLBinding.getDepthInformation(view) instead
      const depthInfo = frame.getDepthInformation(view);
      if (depthInfo) {
        // Update the depth texture uniform
        // ... (GPU depth texture binding via XRWebGLBinding)
      }
    }
  }
}
Series Cross-References
- Building a WebXR App with Three.js — Full application scaffold with hit test and depth integration
- WebXR Anchors & Plane Detection — Combine hit tests with anchors for persistent placement
- WebXR Hand Tracking — Use hand gestures as hit test input sources