collection of notes on virtual reality (vr) and stereoscopic image recording and viewing.
stereoscopy involves using two images, one for each eye, to replicate what two eyes would separately perceive. this technique primarily enhances depth perception. animals use various cues to perceive depth, such as the distribution of light, projection patterns, and parallax effects. one particularly significant cue is binocular disparity: the same object is projected at slightly different positions onto two eyes that are set slightly apart. it is important to note that eyes do not capture three-dimensional information. instead, they receive flat images. this limitation is why stereoscopy is sometimes referred to as 2.5d rather than full 3d, as it does not provide true three-dimensional perception.
photos and videos typically capture the scene only from the position and direction of the camera at the time of recording. consequently, when viewing an image, moving the head will not reveal more of the sides of an object. this limitation is why the term stereoscopic is often preferred over 3d, which might be misleading. however, if images are dynamically projected from three-dimensional data, as may be the case for computer generated graphics, and the display device supports head tracking, stereoscopy can simulate a viewing experience that responds to head movements realistically.
communities
consumer recording devices are still rare in 2023.
box cameras
panasonic lumix dc-bgh1e or the z cam e2. see also: youtube: z cam e2 3d rig
neither can be mounted with the lens centers closer than 93 mm apart
designed for synchronization using a cable (genlock in)
beam splitter
an attachment to a lens, or a dual lens
two horizontally offset openings that direct the light onto halves of the image sensor
side-by-side: two cameras mounted next to each other
option
half-mirror: two cameras mounted horizontally offset and at a vertical right angle to each other (one atop or below the other), with an added mirror that redirects the frontal image to the second camera. a half-mirror lets part of the light pass through to one camera and reflects the rest to the other.
for cameras that are too large to be mounted side by side
mirrors necessarily decrease image quality, at least by a small amount, and will require some extra care to be kept clean
the interpupillary distance, and its ratio to the distance between the stereo images as recorded and as displayed, have a major influence on realistic depth perception.
if the distance between the stereo images does not match the viewer's pupil distance, prominent objects may appear as they do when looking with crossed eyes.
near- or far-sightedness persists even with stereoscopic images. glasses or other types of correcting lenses have to be used when viewing so that images do not appear blurry.
basic human depth perception requires two images taken side by side with a horizontal distance close to the interpupillary distance of the viewer. 65 mm is a common compromise.
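the influence of the lens distance can be sketched with a little trigonometry. the following is only an illustration, assuming a point straight ahead of the viewer and ignoring lens and display geometry. the function names are made up for this example. it compares the angle between the two lines of sight for a 65 mm and a 93 mm distance, and shows at which distance a viewer with a given pupil distance would see the same angle.

```python
import math


def convergence_angle_deg(baseline_mm: float, distance_mm: float) -> float:
    """Angle between the two lines of sight that meet at a point straight ahead."""
    return math.degrees(2 * math.atan((baseline_mm / 2) / distance_mm))


def equivalent_distance_mm(camera_baseline_mm: float,
                           viewer_ipd_mm: float,
                           distance_mm: float) -> float:
    """Distance at which the viewer's own eyes would see the same angle.

    The angle depends only on baseline / distance, so scaling the baseline
    scales the apparent distance by the same ratio (simplified model).
    """
    return distance_mm * viewer_ipd_mm / camera_baseline_mm


if __name__ == "__main__":
    # object 2 m away, recorded with 65 mm and with 93 mm between lens centers
    print(convergence_angle_deg(65, 2000))
    print(convergence_angle_deg(93, 2000))
    # with 93 mm lenses and 65 mm eyes, the object seems closer than 2 m
    print(equivalent_distance_mm(93, 65, 2000))
```

under this simplified model, recording with a larger distance than the viewer's pupil distance makes objects appear closer, and therefore smaller, than they were.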
the combined image retains the frame dimensions of each individual image, as if the two were overlaid. the result is a rectangular frame with added depth. images can also be taken with a greater field of view using so-called fisheye lenses. this can be used to simulate standing in front of, or inside, a space that wraps around the viewer, and allows the viewer to move the head to look around. this requires a resolution high enough that the image portion actually viewed at any given moment is still sufficiently detailed. a common format for videos with 180 or 360 degrees of view is side-by-side with barrel projection.
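the resolution requirement can be estimated with simple arithmetic. this is a rough sketch that assumes pixels are spread evenly over the horizontal angle of view and that the two eye images sit side by side in one frame. the function names and example numbers are illustrative only.

```python
def per_eye_width(total_width_px: int) -> int:
    """Width of one eye's image in a side-by-side frame."""
    return total_width_px // 2


def visible_width_px(eye_width_px: float, video_fov_deg: float,
                     headset_fov_deg: float) -> float:
    """Horizontal pixels of the video that fall inside the headset's view."""
    shown_fov = min(headset_fov_deg, video_fov_deg)
    return eye_width_px * shown_fov / video_fov_deg


def pixels_per_degree(width_px: float, fov_deg: float) -> float:
    """Average horizontal pixel density of the video."""
    return width_px / fov_deg


if __name__ == "__main__":
    eye_width = per_eye_width(5760)               # 5760 px wide side-by-side frame
    print(visible_width_px(eye_width, 180, 100))  # pixels seen with a 100 degree headset
    print(pixels_per_degree(eye_width, 180))      # pixel density of the 180 degree video
```

only a fraction of the recorded pixels is visible at once, which is why wide-angle videos need much higher resolutions than flat videos to look equally sharp.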
the same image compression and file formats as for monoscopic videos are used. the only difference from monoscopic videos is that two images are encoded side-by-side in each frame.
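because both views share one frame, a player has to cut each decoded frame in half before showing it. a minimal sketch, assuming the frame is a list of pixel rows and that the left-eye image occupies the left half (some files may use the opposite order):

```python
def split_side_by_side(frame):
    """Split a side-by-side frame into (left_image, right_image).

    frame: list of rows, each row a list of pixel values.
    """
    if not frame:
        raise ValueError("empty frame")
    width = len(frame[0])
    if width % 2 != 0:
        raise ValueError("side-by-side frame must have an even width")
    half = width // 2
    left = [row[:half] for row in frame]
    right = [row[half:] for row in frame]
    return left, right


if __name__ == "__main__":
    # tiny 2x4 example frame: "L" pixels on the left, "R" pixels on the right
    frame = [["L", "L", "R", "R"],
             ["L", "L", "R", "R"]]
    left, right = split_side_by_side(frame)
    print(left)
    print(right)
```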
common formats:
by image placement
by projection
barrel: 180 or 360 degrees of view projected on bent rectangular surface. see distortion
fisheye: 180, 190, or 360 degrees projected onto circles or (half-) spheres
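the two projections differ in how a viewing direction maps to a pixel. the sketch below assumes that "barrel" corresponds to what is commonly called equirectangular projection, and that the fisheye image follows the equidistant model, where the distance from the image center grows linearly with the angle from the optical axis. real lenses and files may deviate from both assumptions. all function names are made up for this example.

```python
import math


def equirect_pixel(yaw_deg, pitch_deg, width, height, h_fov_deg=180.0):
    """Map a direction to (x, y) in one eye's equirectangular image.

    yaw: left/right angle, 0 = straight ahead. pitch: up/down, +90 = up.
    The image is assumed to cover h_fov_deg horizontally, 180 vertically.
    """
    x = (yaw_deg / h_fov_deg + 0.5) * width
    y = (0.5 - pitch_deg / 180.0) * height
    return x, y


def fisheye_pixel(yaw_deg, pitch_deg, center_x, center_y, radius, fov_deg=180.0):
    """Map a direction to (x, y) in a circular equidistant fisheye image."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    # unit direction vector: x right, y up, z along the optical axis
    dx = math.cos(pitch) * math.sin(yaw)
    dy = math.sin(pitch)
    dz = math.cos(pitch) * math.cos(yaw)
    theta = math.acos(max(-1.0, min(1.0, dz)))  # angle from the optical axis
    phi = math.atan2(dy, dx)                    # direction around the axis
    r = theta / math.radians(fov_deg / 2) * radius
    # image y grows downwards, so "up" subtracts from center_y
    return center_x + r * math.cos(phi), center_y - r * math.sin(phi)


if __name__ == "__main__":
    print(equirect_pixel(0, 0, 2880, 2880))          # image center
    print(fisheye_pixel(90, 0, 1000, 1000, 1000))    # right edge of the circle
```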
more generally called head-mounted displays.
linux support: the author failed to get any vr applications to function under linux.
tracking refers to the continual determination of the spatial position of three-dimensional objects. this information can be used to simulate the visual changes perceived when moving through a three-dimensional environment.
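as an illustration of how tracked data feeds into rendering, the sketch below derives the two eye positions from a tracked head position and rotation. it is a simplification that only considers rotation around the vertical axis, and it assumes a coordinate system with x to the right, y up, and the viewer looking along negative z at zero rotation. the function name and conventions are assumptions for this example, not taken from any particular vr runtime.

```python
import math


def eye_positions(head_pos, yaw_deg, ipd_m=0.065):
    """Return (left_eye, right_eye) positions in metres.

    head_pos: (x, y, z) of the point between the eyes.
    yaw_deg: head rotation around the vertical (y) axis, positive = turning left.
    """
    yaw = math.radians(yaw_deg)
    # the head's "right" direction after rotating (1, 0, 0) around the y axis
    right = (math.cos(yaw), 0.0, -math.sin(yaw))
    half = ipd_m / 2
    x, y, z = head_pos
    left_eye = (x - right[0] * half, y, z - right[2] * half)
    right_eye = (x + right[0] * half, y, z + right[2] * half)
    return left_eye, right_eye


if __name__ == "__main__":
    # head at 1.7 m height, first looking straight ahead, then turned 90 degrees left
    print(eye_positions((0.0, 1.7, 0.0), 0))
    print(eye_positions((0.0, 1.7, 0.0), 90))
```

a renderer would draw the scene once from each of the two positions on every frame, so the images change as the tracked head moves.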
two tracking systems have been established:
outside-in
inside-out
cameras or other types of sensors inside the headset receive the light or other signals from the controllers to calculate their position
usually requires a well lit room
controllers usually have to be in view of the headset sensors/cameras
the headset builds an internal model of the room to calculate the headset position in the room
full-body tracking refers to tracking the position of the head, arms, and legs at once.
pcvr
standalone and hybrid
very limited processing power comparable to smartphones
promising upcoming device: bigscreen beyond
directed at apple users, because an iphone (200+ euro) is required to be able to buy it
steamvr support
5120x2560 combined (2560x2560 per eye)
oled
extremely compact and lightweight. ~143x52x49mm, 127g
exact dimensions of the parts are public. 3d printed modifications are possible
note that most software assumes and requires two controllers and active tracking.
after comparing the vive pro with vive controllers, the hp reverb g2, and valve index:
htc vive pro 1
valve index
hp reverb g2
inside-out tracking which is really not on par with outside-in tracking
vive pro 2
quest 2
standalone
inside-out tracking
resolution
refresh rate
weight and size
glare
field of view
cables
since cables can usually be routed behind the head, this is not the biggest issue. however, rotational movements are significantly limited. the player's orientation in the real room can also drift slowly while playing, so that the cable gets in the way or is dragged around. the cable also adds weight
applications where added depth may work well: waves, caves, interior design, viewing products online, anything where the volume of objects is of interest, 3d animated movies
applications where added depth may not work well: objects moving quickly towards the viewer, unsightly views displayed too close, empty rooms