collection of notes on virtual reality (vr) and stereoscopic image recording and viewing.
stereoscopy involves using two distinct images, one for each eye, to replicate the separate perspectives that each eye would naturally perceive. this technique primarily enhances depth perception. animals employ various methods to perceive depth, such as analyzing light distribution, projection patterns, and parallax effects. a particularly significant method is binocular disparity: the relative offset of objects as projected onto two eyes that are slightly apart. it is important to understand that eyes do not capture three-dimensional information; they receive flat, two-dimensional images. this limitation is why stereoscopy is sometimes referred to as "2.5d" rather than true "3d", as it does not provide full three-dimensional perception.
photographs and videos using one image per eye capture only the position and orientation of the camera at the time of recording. consequently, when viewing the result, moving the head will not reveal more of the objects' sides. this limitation is another reason why the term "stereoscopic" is often preferred over "3d", which might be misleading. however, when images are dynamically rendered from three-dimensional data, such as in computer-generated graphics, and the display device supports head tracking, stereoscopy can simulate a viewing experience that realistically responds to head movements.
wikipedia
consumer stereoscopic recording devices are still quite rare in 2024.
box cameras
panasonic lumix dc-bgh1e or the z cam e2. see also: youtube: z cam e2 3d rig
neither can be mounted closer than 93 mm between the lens centers
designed for synchronization using a cable (genlock in)
beam splitter
a lens attachment or a dual lens
two horizontally offset openings that direct the light onto separate halves of the image sensor
side-by-side: two cameras mounted next to each other
half-mirror: two cameras, one facing forward and the other mounted at a vertical right angle (atop or below), horizontally offset from each other, with a half-mirror that redirects the frontal image to the second camera. a half-mirror (beam splitter) lets part of the light pass through and reflects the rest.
for cameras that are too large to be mounted side by side
mirrors necessarily decrease image quality, at least by a small amount, and require some extra care to keep clean
the interpupillary distance (ipd) and its ratio to the separation of the stereo images, both as recorded and as displayed, have a major influence on realistic depth perception.
if the separation of the stereo images does not align with the viewer's ipd, prominent objects may appear as if viewed with crossed eyes.
commonly, two images are taken side-by-side with a horizontal distance close to the viewer's ipd. 65 mm is a common compromise.
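a minimal sketch of this parallax geometry, assuming a parallel-camera model; the focal length, screen size, and depth values below are illustrative assumptions, not taken from these notes:

```python
# parallel-camera model: a point at depth_m produces a horizontal disparity
# between the two recorded images, which becomes a physical separation on
# the display. all numbers are assumptions for illustration.
def disparity_px(baseline_m: float, focal_px: float, depth_m: float) -> float:
    """horizontal disparity in pixels of a point at depth_m, for two
    parallel cameras separated by baseline_m with focal length focal_px."""
    return focal_px * baseline_m / depth_m

def screen_separation_mm(disp_px: float, screen_w_px: int, screen_w_mm: float) -> float:
    """convert a pixel disparity into physical separation on the display."""
    return disp_px * screen_w_mm / screen_w_px

ipd_mm = 65.0  # the common compromise mentioned above
d = disparity_px(baseline_m=0.065, focal_px=1400.0, depth_m=1.0)
s = screen_separation_mm(d, screen_w_px=3840, screen_w_mm=120.0)
# if the separation of corresponding points on the display diverges too far
# from what the viewer's ipd expects, the eyes have to cross or diverge.
print(f"disparity {d:.0f} px -> {s:.1f} mm separation on screen (ipd {ipd_mm} mm)")
```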
when overlaid for viewing, the combined image retains the frame dimensions of each individual image: a rectangular frame with added depth. images can also be taken with a greater field of view using so-called fisheye lenses. this can be used to simulate standing in front of, or inside, a space that wraps around, allowing the viewer to move the head to look around. this requires a resolution high enough that the image portion actually viewed at any given moment is still detailed enough. a common format for videos with 180 or 360 degrees of view is side-by-side with barrel projection.
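to give a feel for the resolution requirement, a back-of-the-envelope estimate; all numbers are assumed for illustration:

```python
# rough angular-resolution estimate for a 180-degree side-by-side video.
frame_w_px = 5760                   # assumed width of the full side-by-side frame
per_eye_w_px = frame_w_px // 2      # each eye gets half of the frame
coverage_deg = 180                  # horizontal field covered by the recording
ppd = per_eye_w_px / coverage_deg   # pixels available per degree
headset_fov_deg = 100               # assumed horizontal field of view of the headset
visible_px = ppd * headset_fov_deg  # pixels actually in view at any moment
print(f"{ppd:.0f} px/deg, about {visible_px:.0f} px across the visible field")
```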
the same image compression and file formats as for monoscopic videos are used. the difference from monoscopic videos is that the two images are encoded side-by-side as one larger frame.
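a minimal sketch of such side-by-side packing, assuming equally sized numpy frames; decoding and encoding are out of scope here:

```python
# pack a left/right pair into one side-by-side frame so that an ordinary
# (monoscopic) video encoder can handle it as a single wider frame.
import numpy as np

def pack_side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """concatenate two equally sized h x w x 3 frames into one h x 2w x 3 frame."""
    assert left.shape == right.shape, "both views must have identical dimensions"
    return np.concatenate([left, right], axis=1)

left = np.zeros((1080, 1920, 3), dtype=np.uint8)   # dummy left view
right = np.zeros((1080, 1920, 3), dtype=np.uint8)  # dummy right view
print(pack_side_by_side(left, right).shape)        # (1080, 3840, 3)
```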
common formats:
by image placement: side-by-side or top-and-bottom
by projection
barrel: 180 or 360 degrees of view projected onto a bent rectangular surface. see also: distortion
fisheye: 180, 190, or 360 degrees projected onto circles or (half-) spheres
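as an illustration of how such projections map viewing directions to pixels, a minimal sketch of an equidistant fisheye mapping (r = f * theta); the exact projection used by a given camera or player is an assumption here:

```python
# equidistant fisheye model: the distance from the image center grows
# linearly with the angle away from the optical axis.
import math

def fisheye_project(x: float, y: float, z: float, f: float, cx: float, cy: float):
    """project a 3d direction (camera looks along +z) to pixel coordinates."""
    theta = math.atan2(math.hypot(x, y), z)  # angle away from the optical axis
    phi = math.atan2(y, x)                   # direction around the axis
    r = f * theta                            # equidistant model
    return cx + r * math.cos(phi), cy + r * math.sin(phi)

# for a 180-degree image circle of radius 960 px, choose f so r(90 deg) = 960;
# a point 30 degrees off-axis then lands a third of the way to the edge.
f = 960 / (math.pi / 2)
print(fisheye_project(math.sin(math.radians(30)), 0.0, math.cos(math.radians(30)),
                      f=f, cx=960.0, cy=960.0))  # -> (1280.0, 960.0)
```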
vr headsets, more generally called head-mounted displays.
tracking in virtual reality refers to the continuous determination of the position and orientation of objects, most importantly the headset and controllers, in three-dimensional space. this information is used to simulate the visual changes perceived when moving through a three-dimensional virtual environment.
two primary tracking systems are commonly employed:
outside-in tracking: external base stations or cameras placed in the room observe the headset and controllers
inside-out tracking: cameras on the headset itself observe the surroundings
full-body tracking
full-body tracking refers to the simultaneous tracking of the head, arms, and legs, enabling more immersive and accurate interactions in virtual environments
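a minimal sketch of how a tracked head pose is turned into the two per-eye poses used for rendering, assuming 4x4 pose matrices and a 65 mm ipd (both assumptions for illustration):

```python
# each eye is offset by half the ipd along the head's local x axis; the
# tracked head transform then places both eyes in world space.
import numpy as np

def eye_poses(head_pose: np.ndarray, ipd_m: float = 0.065):
    """return (left, right) eye poses as 4x4 world-space matrices."""
    poses = []
    for dx in (-ipd_m / 2.0, ipd_m / 2.0):
        offset = np.eye(4)
        offset[0, 3] = dx                 # shift along the head's local x axis
        poses.append(head_pose @ offset)  # apply the tracked head transform
    return tuple(poses)

head = np.eye(4)                 # dummy identity pose from the tracker
left, right = eye_poses(head)
print(left[:3, 3], right[:3, 3]) # eye positions 65 mm apart
```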
pcvr
standalone and hybrid
very limited processing power, comparable to smartphones
it is important to know that most software assumes and requires two controllers and active tracking.
bigscreen beyond
effectively limited to apple users, because ordering one requires an iphone costing 200+ euro
steamvr support
4320x2160p
oled
extremely compact and lightweight. ~143x52x49mm, 127g
exact dimensions of the parts are public. 3d printed modifications are possible
after comparing the vive pro with vive controllers, the hp reverb g2, and the valve index, i chose the vive pro.
htc vive pro
alternatives
valve index
hp reverb g2
inside-out tracking which is really not on par with outside-in tracking
vive pro 2
quest 2
standalone
inside-out tracking
resolution
refresh rate
weight and size
glare
field of view
cables
since cables can usually be led to run behind the head, this is not the biggest issue. however, cables pose problems with rotational movement: the player's orientation in the real room can slowly drift over time while playing, so the cable gets in the way or is dragged around. the cable also adds some weight.
near- or farsightedness remains relevant even when viewing stereoscopic images. corrective lenses, such as glasses, must be used to ensure that the images do not appear blurry.
this occurs because, in vr headsets, the screens displaying the images are positioned at a fixed distance from the eyes. the lenses within the headset magnify these images, simulating a farther distance. however, the eye's ability to focus on these images depends on the individual's visual acuity. if someone is near- or farsighted, their eyes cannot properly focus on the magnified images without corrective lenses, resulting in blurred vision.
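a back-of-the-envelope thin-lens calculation illustrating this, with assumed values for focal length and screen distance:

```python
# the screen sits just inside the lens focal length, so the lens forms a
# distant virtual image that the eye must focus on. both values are assumed.
f_mm = 40.0        # assumed lens focal length
screen_mm = 38.0   # assumed screen-to-lens distance (inside the focal length)

# thin-lens equation: 1/f = 1/d_object + 1/d_image
d_image_mm = 1.0 / (1.0 / f_mm - 1.0 / screen_mm)
# a negative result means a virtual image; here roughly 0.76 m in front of
# the lens, which is the distance the eye has to accommodate to.
print(f"virtual image at about {abs(d_image_mm) / 1000:.2f} m")
```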
applications where added depth may work well: waves, spraying water, underwater environments, caves, interior design, dense urban areas, viewing products online, art installations and sculptures, anything where the volume of objects is of interest, 3d animated movies
applications where added depth may not work as well: fast-moving objects coming towards the viewer, unsightly views displayed too close, empty rooms and areas, foreground obstructions, excessive parallax and frequent changes in focus distance