Raytracing in Curved Spacetime: The Camera
In flat-space ray tracing, you fire rays through a screen and call it a day. In curved spacetime, even defining “which direction is forward” requires building a local reference frame from scratch. Here’s how.
This is Part 3 of the series. Part 1 covered the basic setup; Part 2 covered the difficulties of tracing straight lines in non-Cartesian coordinates. Now for one of the key elements: the camera.
In Euclidean space, setting up rays is straightforward: fire them from an origin through a screen and associate each one with a pixel (see Generating Camera Rays). For a GR raytracer, there are two complications:
- The spacetime is curved.
- We want to model physical entities like photons.
How to set up the camera
The geodesic equation tells us how a photon travels once it is moving. But we still need to answer a prior question: what are the initial conditions? In other words, which rays does the camera fire, and in which directions?
The local observer frame
The geodesic equation is written in global coordinates. These are coordinates that cover the whole spacetime. A camera, however, is a local device. It sits at one point in spacetime and measures directions relative to itself. We therefore need a way to translate between the global coordinate description and what the observer locally perceives.
This is done via a tetrad, which is a set of four orthonormal vectors $\{e_t, e_x, e_y, e_z\}$ attached, in our case, to the observer's position. Think of it as a local coordinate system carried by the observer, analogous to setting up a small flat patch of space around you. The vector $e_t$ points along the observer's worldline (their notion of "forward in time"), while $e_x$ (right), $e_y$ (up), and $e_z$ (forward) span the local spatial directions.
Orthonormality here is defined with respect to the spacetime metric $g_{\mu\nu}$:
\[g_{\mu\nu}\, (e_a)^\mu (e_b)^\nu = \eta_{ab}\]where $\eta_{ab}$ is the flat Minkowski metric. This means that locally, at the observer’s position, everything looks like flat spacetime, which is precisely the equivalence principle.
The tetrad depends on both the position in spacetime and the observer’s state of motion. For a different spacetime geometry (Schwarzschild, Kerr, …) the tetrad takes a different form, but its role is always the same: bridge between the curved global picture and the observer’s flat local picture.
Example: stationary observer in flat spacetime
The simplest case is a stationary observer in flat spacetime using Cartesian coordinates. The four-velocity is just
\[u^\mu = (1, 0, 0, 0)\]and a natural choice for the spatial tetrad vectors is
\[e_x^\mu = (0, 1, 0, 0), \quad e_y^\mu = (0, 0, 1, 0), \quad e_z^\mu = (0, 0, 0, 1).\]Each spatial vector is orthogonal to the four-velocity, $e_x^\mu u_\mu = 0$, and they are orthonormal among themselves.
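The orthonormality condition is easy to verify numerically. Here is a minimal NumPy sketch for the stationary flat-space tetrad above, checking $g_{\mu\nu}\,(e_a)^\mu (e_b)^\nu = \eta_{ab}$ in the $(+,-,-,-)$ convention used throughout (variable names are illustrative):

```python
import numpy as np

# Minkowski metric in the (+, -, -, -) signature used in the text.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Tetrad of a stationary observer in flat Cartesian coordinates:
# rows are e_t, e_x, e_y, e_z, which is just the identity here.
tetrad = np.eye(4)

# Orthonormality: g_{mu nu} (e_a)^mu (e_b)^nu must equal eta_{ab}.
gram = tetrad @ eta @ tetrad.T
assert np.allclose(gram, eta)
```

The same check works unchanged for any tetrad in any metric: replace `eta` in the middle of the product by the metric evaluated at the observer's position.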
Generating rays
The approach here follows Seeing relativity I. In flat space, a pinhole camera works by imagining a screen in front of the camera. Each pixel on the screen corresponds to a direction, and a ray is fired along that direction. We can do the same here, but in the observer's local frame.
For a pixel at position $(i', j')$ on the screen (pixel coordinates centered on the optical axis and scaled by the field of view), we construct a spacelike vector pointing toward that pixel:
\[\mathbf{w} = e_z + i'\, e_x + j'\, e_y\]We then map the screen coordinates to a direction on the unit sphere via stereographic projection. This is a natural choice because it maps circles on the sky to circles on the screen: the black hole silhouette, for example, always appears circular regardless of where it sits in the frame (see Sec. II of Seeing relativity I). Following Eq. 5, we construct a unit spacelike direction:
\[N = -e_z + \frac{2\,\mathbf{w}}{-\mathbf{w} \cdot \mathbf{w}}\]Verify: $N \cdot N = -1$, i.e., $N$ is a unit spacelike vector. To trace this pixel, we fire a ray in direction $N$. Its 4-momentum is
\[p^\mu = (e_t)^\mu + N^\mu\]which is indeed null: $p \cdot p = e_t \cdot e_t + 2\,e_t \cdot N + N \cdot N = 1 + 0 + (-1) = 0$. This is the ray’s initial 4-momentum, future-directed and pointing outward from the camera.
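The whole construction — pixel vector, stereographic projection, null momentum — fits in a few lines. A sketch in NumPy, using the flat-space tetrad as a stand-in for a general one (`pixel_momentum` and `minkowski_dot` are illustrative names, not from the source):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # (+, -, -, -) signature

def minkowski_dot(a, b):
    """Inner product a . b = eta_{mu nu} a^mu b^nu."""
    return a @ eta @ b

def pixel_momentum(ip, jp, tetrad):
    """Initial null 4-momentum for screen coordinates (i', j').

    `tetrad` has rows (e_t, e_x, e_y, e_z), orthonormal w.r.t. eta.
    """
    e_t, e_x, e_y, e_z = tetrad
    w = e_z + ip * e_x + jp * e_y        # spacelike vector toward the pixel
    N = -e_z + 2.0 * w / (-minkowski_dot(w, w))  # stereographic projection
    return e_t + N                       # null: p.p = 1 + 0 + (-1) = 0

p = pixel_momentum(0.3, -0.2, np.eye(4))
assert abs(minkowski_dot(p, p)) < 1e-12  # the ray is indeed null
```

In a curved spacetime the only change is that the tetrad rows hold the global coordinate components of the observer's frame vectors, so `p` comes out directly in global coordinates, ready for the integrator.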
The interactive visualization below shows how the pixel grid maps to directions on the unit sphere, seen head-on along the camera’s forward axis. Notice how equal-size pixels on the sensor do not map to equal solid angles on the sky: the dots crowd together near the edges of the sphere.
These components $p^\mu$, expressed in the global coordinate system, are the initial conditions fed into the geodesic equation. From here the numerical integrator takes over and traces the photon’s path through the curved spacetime.
Orientation and motion
Two more physical ingredients are worth noting.
The camera can be oriented in any direction by rotating the spatial tetrad vectors $e_x$, $e_y$, $e_z$ before computing the rays. This works the same way as rotating a camera in flat space.
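For instance, a yaw (look left/right) is a rotation of $e_x$ and $e_z$ in their shared spatial plane, leaving $e_t$ and $e_y$ alone. A sketch, with `yaw_tetrad` as an illustrative name; any spatial rotation preserves the orthonormality of the tetrad:

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def yaw_tetrad(tetrad, a):
    """Rotate the camera by angle a about its local y-axis.

    Only the spatial vectors e_x and e_z mix; e_t and e_y are untouched.
    """
    e_t, e_x, e_y, e_z = tetrad
    c, s = np.cos(a), np.sin(a)
    return np.stack([e_t, c * e_x - s * e_z, e_y, s * e_x + c * e_z])

tet = yaw_tetrad(np.eye(4), 0.4)
assert np.allclose(tet @ eta @ tet.T, eta)  # rotation preserves orthonormality
```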
If the observer is moving relative to the coordinate frame, the tetrad must be Lorentz-boosted accordingly. A moving observer sees a different sky than a stationary one: stars appear shifted toward the direction of motion, an effect known as relativistic aberration. The boost is applied to all four tetrad vectors before ray generation, so this effect is automatically included.
Example: observer moving along the $z$-axis
Consider an observer moving along the $z$-axis with velocity $v$. The Lorentz boost gives the four-velocity
\[u^\mu = (\gamma, 0, 0, \gamma v), \quad \gamma = \frac{1}{\sqrt{1 - v^2}}.\]The spatial tetrad vectors $e_x$ and $e_y$ are unaffected by the boost, but $e_z$ picks up a timelike component:
\[e_x^\mu = (0, 1, 0, 0), \quad e_y^\mu = (0, 0, 1, 0), \quad e_z^\mu = (\gamma v, 0, 0, \gamma).\]Checking: $e_z^\mu u_\mu = 0$ and $e_z^\mu (e_z)_\mu = -1$ still hold. The mixing of the time and space components in $e_z$ is what produces aberration: rays that were straight ahead for the stationary observer are now tilted in the boosted frame.
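The boosted tetrad can be checked the same way as the stationary one. A sketch (units with $c = 1$, `boosted_tetrad` is an illustrative name):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def boosted_tetrad(v):
    """Tetrad of an observer moving with speed v along z (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    e_t = np.array([gamma, 0.0, 0.0, gamma * v])  # four-velocity
    e_x = np.array([0.0, 1.0, 0.0, 0.0])
    e_y = np.array([0.0, 0.0, 1.0, 0.0])
    e_z = np.array([gamma * v, 0.0, 0.0, gamma])  # picks up a time component
    return np.stack([e_t, e_x, e_y, e_z])

tet = boosted_tetrad(0.5)
assert np.allclose(tet @ eta @ tet.T, eta)  # still orthonormal after the boost
```

Feeding this tetrad into the ray-generation step automatically tilts the rays toward the direction of motion, which is exactly the aberration described above.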
Initial conditions in the Schwarzschild geometry
With the general framework in place, let’s apply it to the geometry we actually care about. For the concrete case of Schwarzschild spacetime, we need to specify an explicit tetrad. The Schwarzschild metric in spherical coordinates $(t, r, \theta, \varphi)$ is
\[\mathrm{d}s^2 = f(r)\,\mathrm{d}t^2 - \frac{\mathrm{d}r^2}{f(r)} - r^2\mathrm{d}\theta^2 - r^2\sin^2\theta\,\mathrm{d}\varphi^2, \qquad f(r) = 1 - \frac{2M}{r}.\]A first guess for the tetrad would be to simply normalise the coordinate basis, which gives a timelike vector proportional to $\partial_t$ and spatial vectors proportional to $\partial_r$, $\partial_\theta$, $\partial_\varphi$. This works, but only outside the horizon where $f(r) > 0$. Inside the horizon $f(r)$ changes sign, so $\partial_t$ becomes spacelike and $\partial_r$ becomes timelike. The tetrad no longer has the right signature.
A better choice, following Seeing relativity I, is the tetrad associated with a freely falling observer starting from rest at infinity with zero angular momentum. This is known as the rain frame (Seeing relativity I calls it a Zero Angular Momentum Observer, or ZAMO). Its explicit components in Schwarzschild coordinates are
\[e_t^\mu = \left(\frac{1}{f(r)},\ -\sqrt{\frac{2M}{r}},\ 0,\ 0\right), \qquad e_r^\mu = \left(-\frac{\sqrt{2M/r}}{f(r)},\ 1,\ 0,\ 0\right),\] \[e_\theta^\mu = \left(0,\ 0,\ \frac{1}{r},\ 0\right), \qquad e_\varphi^\mu = \left(0,\ 0,\ 0,\ \frac{1}{r\sin\theta}\right).\]In practice the observer placed in front of the scene is not necessarily freely falling. The rain-frame tetrad therefore serves as a reference frame: the actual observer velocity is applied on top of it via a Lorentz boost, exactly as described in the moving observer example above. The boost transforms the freely-falling frame into the observer’s own frame before the rays are generated.
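One can verify numerically that the rain-frame tetrad is orthonormal with respect to the Schwarzschild metric, and that — unlike the naive normalised coordinate basis — it keeps the right signature inside the horizon. A sketch in geometric units with $M = 1$ (function names are illustrative):

```python
import numpy as np

def schwarzschild_metric(r, theta, M=1.0):
    """Metric g_{mu nu} in (t, r, theta, phi), (+, -, -, -) signature."""
    f = 1.0 - 2.0 * M / r
    return np.diag([f, -1.0 / f, -r**2, -(r * np.sin(theta)) ** 2])

def rain_tetrad(r, theta, M=1.0):
    """Rain-frame tetrad; rows are (e_t, e_r, e_theta, e_phi)."""
    f = 1.0 - 2.0 * M / r
    s = np.sqrt(2.0 * M / r)
    return np.array([
        [1.0 / f, -s, 0.0, 0.0],
        [-s / f, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0 / r, 0.0],
        [0.0, 0.0, 0.0, 1.0 / (r * np.sin(theta))],
    ])

eta = np.diag([1.0, -1.0, -1.0, -1.0])
for r in (6.0, 1.2):  # outside and inside the horizon at r = 2M
    g = schwarzschild_metric(r, np.pi / 3)
    tet = rain_tetrad(r, np.pi / 3)
    assert np.allclose(tet @ g @ tet.T, eta)
```

The factors of $1/f$ blow up only at $r = 2M$ itself (a coordinate artifact of Schwarzschild coordinates); on either side of the horizon the tetrad is well defined and orthonormal.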