Projection Explorer

Beyond the Visible

00 Why This Exists

This tool makes information loss visible. The parallel coordinates panel shows every dimension the projection discards; switch from three principal components to two and watch 11% of the explained variance disappear from the readout while the point cloud barely changes shape. The spatial minimap shows where projected clusters actually sit on the ground. The morphing animation between PCA and t-SNE makes the geometric tradeoff tangible: watch global structure warp as local neighborhoods tighten. And the essay you are reading now is woven into the interface, with links that drive the tool as you read.

The purpose is pedagogical, not analytical. This is a tool for understanding what projection does to data: what it preserves, what it destroys, and why the choice of projection is never neutral. It was built for an earth science education context, where the student who learns to ask "what are we not seeing?" will carry that question into every dataset, every map, and every mapping they encounter afterward.

01 The Signature of a Soybean

A single pixel from the Indian Pines scene is not a color. It is 200 numbers: reflectance measurements spanning wavelengths from deep violet to shortwave infrared, most of them invisible to the human eye. Each number records how much light at that wavelength bounced off a 20-meter patch of ground in northwestern Indiana. After removing bands corrupted by atmospheric water absorption, 177 channels remain in the interactive.

A soybean plot, a cornfield, a stand of oaks, an asphalt road, and a gravel rooftop all have unique signatures, reflecting different amounts at different wavelengths. These differences are subtle and spread across all 200 measurements. No single band separates corn from soybeans. The information lives in the pattern across all of them simultaneously.

You cannot look at 200 dimensions. You have to choose which three to show, and that choice destroys everything else. This tool makes that destruction visible, and reversible.

02 Why You Can't Just Look

Two hundred axes produce 19,900 possible pairwise scatter plots. You could spend a week inspecting them. Or you could ask a more precise question: which single view captures the most structure?
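The pair count is a binomial coefficient, so it is easy to check. A minimal sketch using only Python's standard library; the variable names are illustrative and not taken from the tool:

```python
from math import comb

n_bands = 200  # number of spectral axes quoted in the essay

# Every unordered pair of bands is one 2D scatter plot.
pairwise_plots = comb(n_bands, 2)

# Every unordered triple of bands is one raw 3D view.
triple_views = comb(n_bands, 3)

print(pairwise_plots)  # 19900
print(triple_views)    # 1313400
```

The count of raw 3D views grows even faster than the count of 2D plots, which is why inspecting them one by one does not scale.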

That question is answered by principal component analysis (PCA). Switch to PCA and the tool finds the three directions through 200-dimensional space along which the data varies most. These directions, the principal components, are not wavelength bands. They are weighted combinations of all 200 bands, computed from the covariance structure of the data itself.

The variance metric tells you the cost: if three components capture 90.1% of the variance, you are ignoring 9.9%. Whether that 9.9% matters depends on what you are looking for. If the spectral difference between corn and soybeans lives entirely in that discarded fraction, the projection has erased the distinction you came to find.
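To make the variance readout concrete, here is a minimal sketch of PCA by eigendecomposition of the covariance matrix, assuming NumPy is available. The data is synthetic and the function name `pca` is illustrative; this is not the tool's own implementation, which the essay does not describe.

```python
import numpy as np

def pca(X, n_components=3):
    """PCA via eigendecomposition of the covariance matrix.

    X is an (n_samples, n_features) array. Returns the projected scores,
    the component directions (as rows), and the fraction of total variance
    captured by each kept component.
    """
    Xc = X - X.mean(axis=0)                 # center each band
    cov = np.cov(Xc, rowvar=False)          # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # re-sort: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components].T
    scores = Xc @ components.T
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, components, explained

# Synthetic stand-in for spectra: 500 "pixels", 20 correlated "bands".
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 4))
mixing = rng.normal(size=(4, 20))
X = latent @ mixing + 0.3 * rng.normal(size=(500, 20))

scores, components, explained = pca(X, n_components=3)
kept = explained.sum()
print(f"kept {kept:.1%} of the variance, discarded {1 - kept:.1%}")
```

The kept and discarded fractions always sum to one, which is the bookkeeping behind a readout such as 90.1% kept and 9.9% ignored.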

03 The Axes Nobody Drew

PC1 of the Indian Pines scene is not "Band 47." It is a direction, a specific linear combination of all 200 bands, that maximizes the spread of the data. For vegetation scenes, it tends to capture overall brightness and the vegetation red-edge contrast. PC2 might separate soil moisture levels. PC3 might isolate mineral absorption features. The interpretation depends on the data.

Try swapping which components map to X, Y, and Z. Watch the points morph smoothly from one arrangement to another. Each arrangement is a different shadow of the same 200-dimensional object. The morphing makes the shadow-casting visible: you are watching the projection hyperplane rotate.
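Swapping components amounts to choosing different columns of the component scores, and a morph can be approximated by interpolating between two such layouts. The sketch below assumes plain linear interpolation of point positions, which is a simplification: the tool may animate differently, and a straight-line blend is not a strict rotation of the projection plane. The function names and the toy scores are made up for illustration.

```python
def pick_axes(scores, axes):
    """Map three chosen component indices to (x, y, z) for every point."""
    return [tuple(row[i] for i in axes) for row in scores]

def morph(start, end, t):
    """Linearly interpolate between two 3D layouts; t runs from 0 to 1."""
    return [
        tuple(a + t * (b - a) for a, b in zip(p, q))
        for p, q in zip(start, end)
    ]

# Toy component scores: each row is one point, each column one component.
scores = [
    (2.0, 0.5, -1.0, 0.25),
    (-1.5, 1.0, 0.0, -0.75),
    (0.5, -2.0, 1.5, 0.5),
]

layout_a = pick_axes(scores, (0, 1, 2))  # PC1, PC2, PC3 on X, Y, Z
layout_b = pick_axes(scores, (0, 1, 3))  # put PC4 on Z instead of PC3

halfway = morph(layout_a, layout_b, 0.5)
print(halfway[0])  # (2.0, 0.5, -0.375)
```

Only the coordinate whose component changed moves during the morph; the others stay fixed, which is why some swaps look like a gentle shear and others like a complete rearrangement.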

04 What the Variance Explains

The variance metric reads 90.1% and the projection looks clean: tight clusters, clear separation. The temptation is to stop there.

High explained variance is reassuring, but it can mislead. Two clusters can overlap completely in the top three principal components and separate cleanly in PC7. Variance measures spread, not separation.
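A small synthetic experiment illustrates the difference between spread and separation. The sketch below assumes NumPy and builds two classes whose only difference sits in a low-variance dimension. The data, the `separation` score (gap between class means in units of pooled standard deviation), and all names are illustrative choices, not part of the tool.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_class = 200
labels = np.repeat([0, 1], n_per_class)

# Ten synthetic dimensions. Both classes share the same noise everywhere
# except dimension 6, where their means differ.
stds = np.array([10.0, 8.0, 6.0, 3.0, 2.5, 2.0, 0.1, 0.3, 0.3, 0.3])
X = rng.normal(size=(2 * n_per_class, 10)) * stds
X[:, 6] += np.where(labels == 0, -1.0, 1.0)  # small spread, clean class gap

# PCA by eigendecomposition of the covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xc @ eigvecs

def separation(column, labels):
    """Gap between class means in units of pooled within-class std."""
    a, b = column[labels == 0], column[labels == 1]
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return abs(a.mean() - b.mean()) / pooled

explained = eigvals / eigvals.sum()
sep = np.array([separation(scores[:, k], labels) for k in range(10)])

for k in range(10):
    print(f"PC{k + 1}: variance {explained[k]:6.1%}  separation {sep[k]:5.2f}")
```

In this construction the first three components hold most of the variance, yet the component that best separates the two classes lies outside them. A projection chosen by variance alone would show a single undifferentiated cloud.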

Color the points by land cover class. If the colors form tight, well-separated clusters, the projection is preserving the structure you care about. If corn and soybeans overlap in the same diffuse cloud, the distinction exists in dimensions the projection discarded. Open the parallel coordinates and the full 200-dimensional picture appears. Lasso a cluster in 3D and watch which lines light up: the pattern across all 200 bands that defines that group.

05 Neighborhoods vs. Distances

Switch to t-SNE and the global geometry warps. Points spread along principal component axes pull into tight, well-separated islands. t-SNE makes a fundamentally different bet: instead of preserving global distances, it preserves local neighborhoods. Points nearby in 200-dimensional space should be nearby in 3D.

Watch the morph. Clusters that were loosely arranged along variance axes snap into compact groups. But the relative distances between clusters, which were meaningful in PCA, are now arbitrary. Two clusters far apart in t-SNE space might be close in the original data. t-SNE tells you what belongs together. PCA tells you what is far apart. Neither view is complete.
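One way to quantify "what belongs together" is to check how many of each point's nearest neighbors survive a projection. The sketch below does not implement t-SNE, which is an iterative optimization; it only measures the property t-SNE bets on. It uses the standard library alone, and the function names and toy points are invented for illustration.

```python
from math import dist

def knn(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors

def neighborhood_overlap(high, low, k=2):
    """Mean fraction of each point's k nearest neighbors kept by the projection."""
    hi_nn, lo_nn = knn(high, k), knn(low, k)
    shared = [len(a & b) / k for a, b in zip(hi_nn, lo_nn)]
    return sum(shared) / len(shared)

# Toy data: two groups that differ only in the last coordinate.
high = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (0.0, 0.0, 5.0), (0.1, 0.0, 5.0), (0.0, 0.1, 5.0),
]
keep_all = high                           # identity "projection"
drop_last = [(x, y) for x, y, _ in high]  # discards the separating axis

print(neighborhood_overlap(high, keep_all))
print(neighborhood_overlap(high, drop_last))
```

A score of 1.0 means every local neighborhood is intact. Dropping the axis that separates the two groups lowers the score, because points from different groups become neighbors in the projection.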

06 The Signature of a Scientist

Load the knowledge graph and the dimensions change meaning entirely. Here, each dimension is not a wavelength but a relationship. "Connection to Einstein" is an axis. "Connection to Bohr" is another. A hundred and twenty scientists are embedded in 127 dimensions that encode who influenced whom, who collaborated, who shared an institution, a field, or an era.

PCA on this graph finds the axes of maximum relational variance. Color by field and physics, mathematics, biology, chemistry separate. Mostly. But watch for the exceptions. Von Neumann, a mathematician, may cluster closer to the physicists than to the pure mathematicians. If so, the projection has found something the labels missed: his relational fingerprint is more similar to Fermi and Oppenheimer than to Euler or Gauss.

The same question the cornfield raises, reframed.
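The relational encoding described in this section can be sketched as a feature matrix in which each column is a "connection to X" axis. The people and edges below are placeholders invented for illustration, and treating every relationship as a symmetric 0/1 entry is an assumption; the essay does not say how the tool weights influence, collaboration, or shared institutions.

```python
from math import dist

# Hypothetical people and relationships, invented for illustration only.
people = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E"), ("C", "D")]

index = {name: i for i, name in enumerate(people)}

# One row per person, one column per "connection to X" axis.
features = [[0.0] * len(people) for _ in people]
for a, b in edges:
    features[index[a]][index[b]] = 1.0
    features[index[b]][index[a]] = 1.0  # treat relationships as symmetric

for name in people:
    print(name, features[index[name]])

# Rows with similar connection patterns sit close together in this space.
print(dist(features[index["A"]], features[index["B"]]))
print(dist(features[index["A"]], features[index["E"]]))
```

Each row is a relational fingerprint. Any projection method that works on numeric rows, PCA included, can then be applied to this matrix exactly as it is applied to spectra.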

07 Reading the Projection

Some visual grammar for reading projected views:

A tight cluster means those points are similar across many dimensions simultaneously. The tighter the cluster, the more redundant their high-dimensional signatures. An outlier has a signature that no other point shares: an unusual mineral, a scientist with a unique relational position.

When two clusters merge during a morph, the distinction between them lived in dimensions the old projection showed and the new one does not. When a single cluster splits, the new projection has revealed structure that was hidden.

The parallel coordinates are the reality check. Select a cluster and examine the full-dimensional signatures. If the polylines are similar, the cluster is real. If they vary wildly, the cluster is a projection artifact: points that happen to land in the same place in 3D but differ in the other 197 dimensions. This grammar is transferable to any dimensionality reduction you encounter.
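The reality check can also be phrased numerically: compare how much the selected points vary in the dimensions you can see with how much they vary in the ones you cannot. A minimal sketch with the standard library; the rows, the choice of which dimensions count as shown, and the `mean_spread` helper are all hypothetical.

```python
from statistics import pstdev

def mean_spread(rows, dims=None):
    """Average per-dimension standard deviation over the chosen dimensions."""
    dims = list(range(len(rows[0]))) if dims is None else list(dims)
    spreads = [pstdev([row[d] for row in rows]) for d in dims]
    return sum(spreads) / len(spreads)

shown = (0, 1, 2)   # dimensions the projection displays
hidden = (3, 4, 5)  # dimensions it discards

# A selection whose members agree everywhere.
coherent = [
    (1.0, 2.0, 3.0, 0.50, 0.60, 0.70),
    (1.1, 2.1, 2.9, 0.52, 0.61, 0.69),
    (0.9, 1.9, 3.1, 0.49, 0.58, 0.71),
]
# A selection that only looks tight in the shown dimensions.
artifact = [
    (1.0, 2.0, 3.0, 0.1, 9.0, -4.0),
    (1.1, 2.1, 2.9, 7.5, 0.2, 6.0),
    (0.9, 1.9, 3.1, -3.0, 4.0, 0.5),
]

for name, rows in (("coherent", coherent), ("artifact", artifact)):
    print(name, mean_spread(rows, shown), mean_spread(rows, hidden))
```

Both selections look identical in the shown dimensions. Only the hidden spread tells them apart, which is the information the polylines make visible.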

08 What the Projection Destroys

Open the spatial minimap alongside the 3D view. Two points side by side in PCA space may be kilometers apart on the ground. Spatially adjacent points may land in different projection clusters.
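The mismatch between projection space and ground space is simple to compute once each point carries its image position. The sketch below uses the 20-meter pixel size quoted earlier in the essay; the two points, their pixel positions, and their projected coordinates are made up for illustration.

```python
from math import dist, hypot

PIXEL_SIZE_M = 20.0  # ground footprint per pixel, as quoted in the essay

def ground_distance_m(pixel_a, pixel_b, pixel_size=PIXEL_SIZE_M):
    """Distance on the ground between two (row, col) pixel positions."""
    d_row = pixel_a[0] - pixel_b[0]
    d_col = pixel_a[1] - pixel_b[1]
    return pixel_size * hypot(d_row, d_col)

# Hypothetical points: image position plus made-up 3D projection coordinates.
a = {"pixel": (10, 12), "projected": (0.50, -1.20, 0.30)}
b = {"pixel": (130, 102), "projected": (0.52, -1.18, 0.31)}

print(ground_distance_m(a["pixel"], b["pixel"]))  # metres on the ground
print(dist(a["projected"], b["projected"]))       # distance in projection space
```

These two invented points are nearly coincident in the projection while sitting kilometers apart on the ground, the situation the minimap is meant to expose.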

Every projection is a compression. Three dimensions from two hundred means 98.5% of the axes are invisible. The parallel coordinates show what is hidden. The minimap shows the other side of the gap: where the projected points actually sit on the ground. You can see the gap, measure it, and by swapping axes, partially recover it. But you can never see all of it at once. Three dimensions is all you get.

09 Compression All the Way Down

Toggle the projection method from PCA to t-SNE and watch the point cloud reorganize. Nothing about the data changed. Everything about what you can see in it did.

RGB is a three-dimensional projection of the visible electromagnetic spectrum. A photograph compresses a 3D scene into 2D. A word compresses an experience into a symbol. Every representation is a projection, a mapping from higher to lower dimensionality that preserves some structure and destroys the rest.

Load the word embeddings and that last compression becomes literal. Each word is a point in 300-dimensional space, its coordinates learned from billions of sentences. Words that appear in similar contexts land near each other. Color by category and watch semantic neighborhoods emerge: animals with animals, emotions with emotions, countries with countries. Filter down to just colors and rotate the projection: a tight cluster disperses and outliers appear. Select them individually and consider why certain color words aren't clustered with the rest. The similarity was an artifact of the viewing angle, not the data. Switch to t-SNE and the outlier distances become even more pronounced.
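A toy example shows how a viewing angle can manufacture similarity. The vectors below are invented four-dimensional stand-ins, not real word embeddings, and the word labels are placeholders. Cosine similarity is used because it is a common measure for embeddings; the essay does not say which measure the tool uses.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def project(v, axes):
    """Keep only the chosen axes, as a fixed viewing angle would."""
    return tuple(v[i] for i in axes)

# Made-up vectors; real word embeddings have hundreds of dimensions.
vectors = {
    "word_a": (0.9, 0.1, 0.8, 0.0),
    "word_b": (0.9, 0.1, 0.7, 0.1),
    "word_c": (0.9, 0.1, -0.8, 0.6),
}

view = (0, 1)  # a viewing angle that shows only the first two axes

for other in ("word_b", "word_c"):
    full = cosine(vectors["word_a"], vectors[other])
    seen = cosine(project(vectors["word_a"], view), project(vectors[other], view))
    print(f"word_a vs {other}: full {full:.2f}, in view {seen:.2f}")
```

In the chosen view all three vectors coincide. In the full space only two of them are actually similar, and rotating to include the third axis is what makes the outlier appear.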

The Color-a-Pixel essay asked what happens when you compress color. This tool asks the same question about everything else: spectra, economies, social networks, the geometry of language itself. Something survives. Something is lost. And the choice of what to preserve is never neutral.

Projection Explorer

Navigate high-dimensional data projected into 3D space. Each point represents a row in the dataset; each dimension is a measured variable. The projection compresses many dimensions into three axes you can see and rotate.

Navigation

Drag: Orbit the view (globe-like)
Scroll: Zoom in / out
Click: Inspect a single point
Shift + Drag: Rectangle selection (lasso)
Selection: Camera orbits and zooms to the centroid of selected points. Deselect to return to full extent.

Projection Methods

PCA: Principal Component Analysis, the axes of maximum variance. Choose which 3 components map to X, Y, Z. The variance metric shows how much information the current axes capture.
t-SNE: Preserves local neighborhoods. Points nearby in high-dimensional space stay nearby in 3D. Precomputed (nonlinear, non-invertible).
Manual: Map any 3 raw dimensions to X, Y, Z. For spectral data, these are literal wavelength bands.

Interface Controls

↻: Toggle auto-rotation
2D: Collapse Z axis (flatten projection)
◎: Toggle navigation trackball sphere
⫼: Toggle parallel coordinates panel, which shows all dimensions simultaneously. Selected points highlight.
Size: Point radius

Color Legend

Bottom-left. Click to choose which metadata field colors the points. Legend entries show the mapping. Multi-column layout for compact display.

Parallel Coordinates

When open, every dimension is a vertical axis. Each point traces a polyline across all axes. Click on an axis to select the nearest point. Drag horizontally to zoom into a band of dimensions. Scroll to zoom centered on cursor. Shift+scroll to pan. Double-click to reset zoom. Selected points from any view highlight their polylines here.
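The polyline construction can be sketched in a few lines. This assumes the usual parallel-coordinates convention of rescaling each dimension to a shared vertical range and spacing the axes evenly; the tool's actual scaling and layout are not documented here, and the function names are illustrative.

```python
def normalize_columns(rows):
    """Rescale every dimension to [0, 1] so all axes share one vertical range."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.5  # constant axis: center it
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]

def polylines(rows, width=1.0):
    """Turn each row into a list of (x, y) vertices, one vertex per axis."""
    n_axes = len(rows[0])
    step = width / (n_axes - 1) if n_axes > 1 else 0.0
    return [
        [(k * step, y) for k, y in enumerate(row)]
        for row in normalize_columns(rows)
    ]

# Toy rows with three dimensions on very different scales.
rows = [
    (0.10, 40.0, 3.0),
    (0.30, 10.0, 3.0),
    (0.20, 25.0, 3.0),
]
for line in polylines(rows):
    print(line)
```

Per-axis rescaling is a design choice: it lets dimensions with very different units share one panel, at the cost of hiding their absolute magnitudes.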

Spatial Mini-Map

Appears automatically for datasets with spatial coordinates (x/y metadata). Shows where each point is located on the ground. Lasso-selected points highlight in both the 3D projection and the mini-map — connecting abstract projection structure back to physical location.
