Hololens 2 Display Evaluation (Part 6: Microsoft’s FUD on Photographs)

Introduction (Marketing FUD)

Microsoft recently put up an FAQ page with the topic heading, “Why am I unable to take an accurate photograph of my HoloLens 2 display?” The purpose of this FAQ seems to be to discourage and discredit picture-taking of the Hololens 2 (HL2) through the optics rather than to provide useful information. The FAQ includes incorrect and misleading information. I would classify it as “Marketing FUD” used to cast doubt on any pictures that are taken.

This blog is read by many key people in the AR industry, including many at Microsoft, and reaches about 20,000 people a month. The prior articles in this series have publicly posted by far the most “through the lens” photographs of the HL2’s display, so the FAQ does seem like it could be an indirect attack.

I agree (and demonstrate) that holding a smartphone up to the HL2 will not produce representative photos. But as will be demonstrated, with just the smartphone mounted on a tripod and a neutral density filter, anyone can take representative pictures of the HL2. It does not require laboratory instruments and cameras as the FAQ states. While the FAQ calls for millimeter accuracy, there is no way that the HL2 is held on one’s head with such precision, so neither does the camera have to meet this requirement.

I have lots of pictures to share and even a short video clip that demonstrates the flicker associated with the HL2. The article leads with results that have pictures to show and later goes into the issues with shooting photographs. To make it clear when quoting the FAQ, cyan-colored italic text will be used. In testing out the various assertions in the FAQ, I will be going through them in a different order than they are presented in the FAQ.

After publishing this article, I’m going to put the HL2 issues to rest for a while and get back to work writing about MicroLEDs. Inevitably MicroLED topics will draw comparisons to laser beam scanning and the HL2.

Reflection Test

Before I get into the FAQ issues, I want to share a problem I notice every time I power on the HL2: I see the four squares of the Microsoft logo with their glowing reflections. I then generated a test pattern with the quad squares in nine locations around the visible image.

Microsoft Knows the HL2’s Uniformity is Poor

Microsoft knows that the color and brightness uniformity across the HL2’s display is poor. The FAQ’s recommendation to developers is to adjust their content as best they can to hide the problems. Quoting from the FAQ page:

Users will have the best experience when avoiding white backgrounds. Dark mode is a design principle used by apps to use black or dark colored backgrounds. The system settings default to dark mode and can be adjusted by going to Settings > System > Color.

Developers are advised to follow dark mode design guidance:

• Developer design guidelines for HoloLens displays

• Recommended font sizes

When a hologram requires a white background, keep the size of the hologram smaller than the display’s full field of view. This size allows users to put the hologram in the center of the display.

The dark mode is good practice for most AR applications, not just the HL2. But how are developers supposed to keep content small and in the center of the display when users are moving around in a 3-D environment?

Following the link to Microsoft’s “Recommended font sizes” leads to a section that recommends font sizes of 0.65° to 0.8° (14.47 to 17.8 points when viewed from 45cm) to be clearly legible, or about double the 8-point font size claimed in Microsoft’s 2019 announcement (link to my discussion of this claim here). Microsoft also says to “Avoid using light or semilight font weights for type sizes under 42 pt since thin vertical strokes will vibrate and degrade legibility.” This wiggling is a by-product of the 4-way interlacing discussed in Part 1 of this series.
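
For those who want to check the numbers, below is a minimal Python sketch of the angle-to-points conversion, assuming 1 point = 1/72 inch and the stated 45cm viewing distance (my arithmetic, not Microsoft’s):

```python
import math

def angle_to_points(angle_deg, distance_cm=45.0):
    """Convert an angular character height to a font size in points,
    assuming 1 point = 1/72 inch and the given viewing distance."""
    height_cm = 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)
    return (height_cm / 2.54) * 72  # cm -> inches -> points

for angle in (0.65, 0.8):
    print(f"{angle} deg at 45 cm ~= {angle_to_points(angle):.2f} pt")
# 0.65 deg at 45 cm ~= 14.47 pt
# 0.8 deg at 45 cm ~= 17.81 pt
```

The result matches Microsoft’s 14.47 to 17.8 point range, so their degrees-to-points conversion checks out; it is the size itself that is about double their earlier 8-point claim.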

Pictures from Microsoft of a White Screen

Microsoft has sent a separate document to customers about what they should expect when looking at a white display. The pictures below are taken from that document with the “not normal” display (left below) and “normal” display (right below). Their “normal” display appears to have similar color characteristics to the HL2 this blog is using.

FAQ’s Emperor’s New Clothes Argument

The FAQ makes what I call the emperor’s new clothes argument when they say, “The HoloLens 2 display is designed to be viewed by the human eye.” The paragraph finishes with “Compared to the human eye, cameras see environments differently, and below are some factors that may impact any inconsistency between what a camera captures and what a user sees.” But then this is true of any picture taken of any display device.

Yes, it is a bit trickier with the HL2 than most other displays, but it is possible to get representative pictures as this blog has done repeatedly. I have yet to see a developer, or anyone from Microsoft, say that the photos on this blog are vastly different than what the eye sees.

The “the eye works differently than a camera” argument is often used by charlatans to claim that they have some magic leap 😃 in technology but that they can’t show it to you to prove that it works. The “you have to see it yourself to believe it” argument lets them control the demo. When I see that argument given, I immediately assume there is some form of deception going on. It is like a used car dealer or a politician saying to trust them.

Taking representative pictures with a smartphone

The biggest problem with smartphone cameras is that shooting with a low enough shutter speed to capture all the interlaced fields will cause the picture to be horribly overexposed. A smartphone on automatic will set the shutter speed too high (to reduce the exposure) and capture one field or less.

To shoot a picture with the iPhone 11 Pro Max, an ND8 (3-stop) neutral density filter was taped over the iPhone camera lenses. The iPhone and HL2 were both on tripods with the iPhone positioned for the best possible image. The iPhone’s “2X” lens has a 6mm focal length and a fixed f2 lens for an entrance pupil (aperture) of 6mm/2 = 3mm (more on the entrance pupil diameter later). The ProCam iPhone application was used to give control over the shutter speed and ISO (the aperture/f-number is fixed on most if not all smartphone cameras).
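
For anyone wanting to reproduce this setup, here is a rough sketch of the exposure bookkeeping using the 1/20th and 1/120th shutter speeds discussed below (illustrative numbers only; actual metering will vary with the content displayed):

```python
# Rough exposure bookkeeping for shooting the HL2 with a phone camera.
# Illustrative numbers only; actual metering depends on the content shown.

nd_stops = 3                  # ND8 filter = 3 stops = 2**3 = 8x less light
auto_shutter = 1 / 120        # what the iPhone picked on automatic (one field or less)
target_shutter = 1 / 20       # slow enough to average out the interlaced fields

# Light reaching the sensor scales with the shutter time and is cut by the ND factor.
relative_exposure = (target_shutter / auto_shutter) / (2 ** nd_stops)
print(f"Exposure relative to the automatic shot: {relative_exposure:.2f}x")
# 0.75x -- close enough that a small ISO adjustment restores the brightness
```

In other words, the 3-stop ND filter roughly cancels out the 6x longer exposure, which is why the slow shutter speed no longer blows out the image.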

The iPhone picture compares favorably in terms of artifacts with the Olympus 25mm shot at f8, which has a similar 3.125mm aperture. The photo below was cropped to match the iPhone picture for comparison. The “shutter speed” on the iPhone was set to 1/20th of a second to average out the scanning process, but there were still two shutter-lines/roll-bars captured that are artifacts of the camera that don’t show up with a regular camera.

The photo below was taken with the iPhone’s 1X lens, which is 4.3mm at f1.8 for an entrance pupil of 2.39mm, with a 1/20th shutter speed. This image has a wider FOV and thus shows more of the problems in the corners, which the iPhone’s 2X lens cuts off. Overall, even with the smaller entrance pupil, the image quality is not degraded significantly (more on the entrance pupil later).

Below for comparison is a handheld “snapshot,” letting the iPhone control everything with no ND filter. I tried to get a good picture, but it is impossible hand-holding (this was the best of several I took). The iPhone picked a 1/120th shutter speed and thus captured only one field. To get a good representative picture, both the HL2 and the camera need to be on tripods and carefully aimed.

False Impression By the HL2’s Mixed Reality Capture

The HL2’s “mixed reality capture” was used to see what Microsoft wants you to think the HL2’s view looks like versus reality. The capture combines a pristine rendering of the content, made slightly translucent, with the view of the real world from a camera on the headset. The captured image is much better than the HL2 produces, and it shows a wider FOV than is actually displayed by the headset.

On the HL2’s captured image below, a dotted red rectangle outlines what can actually be seen on the HL2. Obviously, the capture is a much cleaner image of the test pattern than the HL2 can produce with its display. FYI, the picture shows the blackout cloth with masking tape markers to help SLAM locking.

Below is a second capture where the window was enlarged to show how much bigger the “simulated” view is than the virtual image on the HL2. The red rectangle shows the area visible in the HL2’s display versus the captured image.

The mixed reality capture images and videos are what Microsoft likes to show and encourages others to share. The pictures on this blog are vastly closer to how the HL2 display looks.

No evidence of “Active Color Correction” Claim

The introductory paragraph states, “The device has an active color correction system that adapts to a user’s eyes.” They then repeat this fiction in the bulleted point copied below:

Eye movement. The display adapts to the movement of a user’s eye to adjust colors. What is shown on the display may differ depending if the user is looking at the center, the edge, or the corner of the display. A single image capture could at best only show what the display looks like for the axis that matches an eye gaze direction.

I have found zero evidence, nor have I seen Microsoft or anyone else publish any proof or papers, that there is any active color correction, despite Microsoft’s declaration that it is being used. I have been looking for it from day one.

I went so far as to set up a camera to video the mirror image seen from the front of the HL2 while I moved my eyes around. The HL2 projects forward a mirror image of what it presents to the eye. Because of the HL2’s shield, the lens could not get close enough to see the whole image without significant vignetting (darkening around the corners). Still, no matter how much I moved my eyes, I could not see the colors change, either with my eyes or in the video of the display from the outside.

I would love to see some papers or some proof from Microsoft that the HL2 is doing some form of “Active Color Correction.”

Video From Hololens 2 Demonstrating Flicker

While the FAQ does not address it, the HL2 has obvious flicker, and I could see that flicker in the video I shot while looking for the (apparently) non-existent “active color correction.” So I decided to shoot a video from the “normal” side. This video was shot at 120Hz and played back at 60Hz and gives a fair representation of what the flicker looks like to the eye.

This blog back in February 2019 in Hololens 2 First Impressions: Good Ergonomics, But The LBS Resolution Math Fails!, stated that the HL2 was going to have flicker. It turned out that the HL2 uses 4-Way interlacing, as was proven with photographic evidence in Hololens 2 Display Evaluation (Part 1: LBS Visual Sausage Being Made), which makes the flicker problem worse.

For those that want a quick look, you can click on the YouTube video below (play at full size on YouTube and, while playing, use right-click -> loop to make it loop). I want to warn you that the YouTube compression adds noise to the video. If you want to see a cleaner video, you can download it HERE from Google Drive and then play it in almost any video player (NOTE: do not bother watching the automatically generated Google Drive preview of the file, as it is messed up by the compression; you need to download the video to watch a clean version). The clip is only 11 seconds long. I cut the footage between “roll-bars,” and it is best observed with loop/repeat when playing it back. While not perfect, I think this video gives a fair impression of the visible flicker inherent in the HL2’s laser beam scanning (LBS) display process.

Camera exposure time does not have to be an exact multiple of 1/120th as the HL2’s FAQ claims

The FAQ does some incorrect hand-waving about the exposure time copied below:

Camera exposure time. The exposure time of the camera needs to be an exact multiple of 1/120th of a second. The HoloLens display frame rate is 120 Hz. Due to the way the HoloLens 2 draws images, capturing a single frame is also not enough to match a human’s visual experience.

It is important to be aware that the HL2 is doing 4-way interlacing at 120Hz such that the whole image takes at least 1/30th of a second to build up. If you freeze the video image above, you will see a single one of the four fields.

When I wanted to capture a single field, I took a series of pictures at 1/125th of a second asynchronously to the HL2 and picked out the ones that didn’t have a roll bar in them. To capture all the interlaced fields, I usually shoot at 1/15th of a second or slower to capture two or more of each of the four types of interlace and average out any roll-bar effects.
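
For those who want the timing spelled out, here is a minimal Python sketch of the field arithmetic, assuming the 120Hz field rate and 4-way interlace established in Part 1:

```python
# Field/frame timing for the HL2's 4-way interlaced scan (per Part 1 of this series).
field_rate_hz = 120
fields_per_frame = 4

field_time = 1 / field_rate_hz                    # ~8.3 ms per field
full_image_time = fields_per_frame * field_time   # ~33 ms, i.e. 1/30th of a second

for shutter in (1 / 125, 1 / 20, 1 / 15):
    fields_captured = shutter / field_time
    print(f"1/{round(1 / shutter)}s shutter -> ~{fields_captured:.1f} fields")
# 1/125s -> ~1.0 field  (a single field, when no roll bar lands in the exposure)
# 1/20s  -> ~6.0 fields (at least one of each of the four interlace phases)
# 1/15s  -> ~8.0 fields (two of each phase, averaging out roll-bar effects)
```

The point is that any exposure long enough to span several fields averages out the interlace; it does not need to be an exact multiple of 1/120th of a second.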

Aperture size does make a difference, but the HL2’s FAQ exaggerates the issue

Quoting the FAQ below:

Camera aperture size. The camera’s aperture size must be at least 3 mm to capture an accurate image. Cell phone cameras with small apertures integrate light from a smaller area than the human eye does. The device applies color correction for patterns observed by larger apertures. With small apertures, uniformity patterns are sharper and remain visible despite color corrections applied by the system.

When they express the aperture in millimeters, it is reasonable to assume they are talking about the diameter of the entrance pupil. As will be demonstrated, there is a fall-off in image fidelity with entrance pupils (apertures) smaller than 3mm. One should note that the human eye’s pupil size varies from about 2mm to 8mm with brightness and from person to person.

By definition, the entrance pupil of a lens is given by the focal length divided by the f-number. For example, a 17mm lens at f4 has an aperture diameter of 17mm/4=4.25mm. An iPhone Pro “2X” lens has a 6mm focal length fixed at f2 for an aperture diameter of 3mm.
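
To make the arithmetic concrete, below is a minimal Python sketch (using the lenses discussed in this article) that computes entrance pupil diameters, including the f2 through f22 sweep used in the aperture study that follows:

```python
# Entrance pupil diameter = focal length / f-number.
def entrance_pupil_mm(focal_length_mm, f_number):
    return focal_length_mm / f_number

# Lenses discussed in this article.
lenses = [
    ("iPhone 11 Pro '2X' (6mm, f/2)",     6.0, 2.0),
    ("iPhone 11 Pro '1X' (4.3mm, f/1.8)", 4.3, 1.8),
    ("Olympus 17mm at f/4",              17.0, 4.0),
    ("Olympus 25mm at f/8",              25.0, 8.0),
]
for name, fl, fn in lenses:
    print(f"{name}: {entrance_pupil_mm(fl, fn):.2f}mm")

# The f/2 through f/22 sweep used in the aperture study below (17mm lens).
for fn in (2, 2.8, 4, 5.6, 8, 11, 16, 22):
    print(f"17mm at f/{fn}: {entrance_pupil_mm(17, fn):.2f}mm")
```

This is how the 8.5mm (f2) and 0.77mm (f22) figures in the study below were arrived at.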

Aperture size study

Below is a study of the effect of aperture (entrance pupil) size using a 17mm lens from f2 (8.5mm entrance pupil) through f22 (0.77mm entrance pupil). The first two images below show the extreme cases of a wide aperture and a tiny aperture. These pictures were taken using an ND16 (4-stop) filter so that a slow enough shutter speed could be used at f2, with a mix of shutter speeds and ISO levels to keep the exposure constant.

The gallery of images below shows the cases in between the two extremes, with entrance pupils between 6.1mm and 1.06mm. You can click on any image in the gallery to see it full size. The banding on the sides starts to become pronounced with diameters below about 2mm (f8 with a 17mm focal-length lens). The increase in banding is gradual and not the sharp cutoff at a 3mm entrance pupil that the FAQ seems to suggest.

Camera Relief Study

Near-eye displays have an eyebox and are designed to be viewed from relatively short distances. “Eye relief” is the distance from the last optical surface to the front of the eye’s cornea. In the case of the HL2, with the shield pulled all the way down, the eye relief is about 18mm to 23mm; there will be considerable variance based on the person’s head shape. The human eye’s “entrance pupil” is approximately 3mm behind the cornea (more on the entrance pupil and the eye later), so the entrance pupil of the eye is about 21mm to 26mm behind the shield.

Lenses are complex and so the distance from the front surface of the lens or the distance to the sensor in the camera does not tell how it behaves optically. The “entrance pupil” distance is a better measure and the one I will use below.

Perhaps more importantly, I judged the pictures against what I was seeing with my own eyes. I think any of the pictures from entrance pupil distances of 20mm or shorter are reasonably close to what my eye sees.

For the study below, the camera’s lens started as close as it could get without touching the HL2’s frame. A series of pictures was then taken, moving the camera back in 2mm increments with constant camera settings.

Vignetting (darkening in the corners) and banding gradually increase as the lens moves away from the shield beyond about the 20mm picture. These images agree with what is seen with the eye at similar relief distances. No matter how close one’s eye gets to the shield, there is vignetting, particularly in the upper left and right corners of the HL2’s display.

Camera Horizontal Positioning Study

Below is a sequence of shots taken with the lens in the optimum vertical position, moving the camera from as far left as it would go (Left-0mm picture) to 10mm to the right in 2mm increments. As the camera moves to the right, the left side of the image becomes brighter. But past about 6mm right, the right side of the image becomes darker, and the left side becomes more (falsely) colorful. The sweet spot seems to be 2mm to 4mm from the left-most position (and why a small camera lens is essential).

Vertical Positioning

Vertically, I found it best for the camera’s lens to be located as far up as possible without touching the headset when the lens is near the shield. This position also happens to look close to what my eye sees. If the lens is positioned too high, then the bottom seems dark and more (falsely) colorful, and if positioned too low, the top gets dark and with more color.

Binocular Viewing

Binocular viewing. The HoloLens 2 display is designed to be viewed with both eyes. The brain adapts to seeing two images and fuses them together. Images of only one display ignore the information from the other display.

This point is, at best, a half-truth (or less). Whether or not the HL2 is “designed to be viewed with both eyes,” viewing with both eyes does not fix the problem that neither display is very good. Additionally, humans with normal vision have one dominant eye that will tend to override the other when there is a difference.

As noted in Hololens Display Evaluation (Part 3: Color Uniformity), a person with normal vision sees the combined image from the left and right eyes (photos from that article below). Combining both eyes results in a slightly wider FOV and a slightly more uniform image, but it is not dramatically better. The most significant improvement is on the far left and right sides, where the left display has more vignetting on the right and the right display has more vignetting on the left.

From Part 3: Showing the left and right display and the combined image (click for a larger version)

Both the left and right displays vignette in the upper corners, and so does the combined view with both eyes. The errors in one display are not “canceled out” by the other.

Micromovement bunk in the FAQ

The FAQ has some mysticism about micromovements and requiring a laboratory setup:

At the same time, if the device moves at all—even micromovements—the system reprojects the image on the display to stabilize holograms. Capturing multiple frames while keeping the HoloLens from moving usually requires a laboratory setup.

You do want to have both the HL2 and the camera on tripods or other firm bases. I had the HL2 and the camera on separate tripods so that neither moved and camera movements would not affect the HL2. To capture all the fields for figuring out the 4-way interlacing in Hololens 2 Display Evaluation (Part 1: LBS Visual Sausage Being Made), about a dozen photos were taken at 1/125th of a second to capture the various fields. All the images aligned nearly perfectly on top of each other.

As has been shown, it is possible to get a representative image of the HL2 without needing a “laboratory setup.” This statement is just more FAQ FUD to scare people away from taking pictures or believing pictures that are posted.

Camera Versus the Human Visual System

Quoting the FAQ:

Image correction. Typical digital cameras and smartphone cameras apply a tone reproduction curve (TRC) which boosts contrast and color to provide a snappier outcome. When applied to a HoloLens 2 display, this tone curve amplifies non-uniformities.

The claim that the camera makes the image “snappier” is unfounded if the camera is set up and positioned correctly. There is an issue of the human eye seeing color and brightness relatively across the FOV, whereas the camera is “objective/fixed” across a given photo before any processing is applied.

It would be fair to argue that the human visual system and its relative nature might tend to hide problems with the HL2’s uniformity. But after looking at hundreds of pictures from the thousands taken, the images presented on this blog are representative of what the eye sees.

Human Brightness Subjectivity

The eye sees color and brightness relative to other parts of the image. Grey, on a light background, looks black, whereas the same shade of grey on a dark background looks white. Believe it or not, the rectangle in the middle (below) is a single shade of grey.

The human vision system will even impose brightness differences based on context. The famous Adelson Checker Shadow illusion (below left) and the corner shadow illusion (below right) show how the brain tries to make things “look right.” The A and B squares in the Adelson illusion and the top and bottom surfaces in the corner shadow illusion are identical shades of grey.

All the circles above are the exact same color as the circle on the right

David Novick of the University of Texas at El Paso has a Color Illusion Page, which includes the image on the right. All the circles are the same color, but the human vision system is influenced by the surrounding colors to see them differently. If you look at the full-size image (click on the thumbnail), the fact that the circles are the same color will be more apparent.

HL2 Versus a Camera

Since the color uniformity of the HL2 is abysmal, it is impossible to determine the “white balance point” of the HL2. It is certainly not on the blackbody curve used for setting the white point. So I had to estimate a white point, which included adjusting the red down.

The brightness fall-off from the upper center of the HL2’s display to the outsides is much worse when measured objectively than it appears to the eye. There is a definite “hot spot” in the center of the HL2’s display about 1/3rd of the way down. The human eye does a form of dynamic adaptation across the image. For the pictures used in this article, I shot in camera raw and then used Photoshop’s raw “auto” option, which has a similar effect to what the eye does, bringing up the brightness while maintaining contrast in the shadows. The net result was very close to what my eye sees.

There are many good articles on how the eye and camera differ, including at Cambridge in Colour.

Absurd Entrance Pupil Requirement in the FAQ

The FAQ repeats its point about the aperture size (entrance pupil) but then makes an absurd statement about the pupil having to be in front of the lens (highlighted).

Camera entrance pupil. The entrance pupil of the camera should be at least 3 mm in diameter to capture an accurate image. Otherwise, the camera captures some high frequency patterns not visible to the eye. The position of the entrance pupil both needs to be in front of the camera and positioned at the eye relief distance to avoid introducing aberrations and other variations to the captured image.

I can’t figure out why they use aperture size (in millimeters) in one place and entrance pupil later to talk about what should be the same thing. The position of the entrance pupil is often (incorrectly) called the nodal point or no-parallax point of the lens (definition here and how to find the distance here).

It makes sense to consider the entrance pupil of a “normal” lens for the relief distance. But “the position of the entrance pupil both needs to be in front of the camera” appears to be a pretense: it eliminates all “normal” lenses and would eliminate the human eye. The eye has its entrance pupil about 3mm behind the cornea (see the figure at right from telescopeѲptics). The cornea has about 70% of the refractive power of the eye. Just like most camera lenses, there are front lens(es), followed by a diaphragm (the eye’s iris), and rear lens(es).

For “simple” short-focal-length, non-macro prime (single focal length) lenses, the entrance pupil is a few millimeters behind the first lens surface. For a longer, zoom, or macro lens, the entrance pupil can be many millimeters behind the front of the lens, which can cause vignetting of the eyebox. For the Olympus 17mm f1.8 lens used in the distance study earlier, the entrance pupil is only about 12 millimeters behind the front filter mount (which limits how close the lens can get to the HL2’s shield).

Lenses with pupils in front of the lens or at infinity (telecentric lenses) are used in scientific instruments. The goal of taking pictures of the HL2 is to show what it looks like to the eye, not like a measurement instrument. So using a lens with the pupil in front of the lens would be at best counterproductive.

You might also note in the figure above that the focus distance of the eye is shown as 22.2mm, and 22mm is often given as the focal length of the eye. But since the eye is curved, has multiple elements, and is filled with fluid, most sources say the effective focal length of the eye is closer to 17mm. The lenses I have used for most of my pictures have 17mm and 25mm focal lengths, which are close to the focal length of the eye.

For a good video discussing the size of the entrance pupil, watch the YouTube video Aperture & f-stop Myths Debunked: The Importance of the Entrance Pupil. An article on locating the position of the entrance pupil can be found at Finding the Entrance Pupil.

Microsoft on Shooting a Picture of the HL2

The FAQ makes a false assertion, quoted below:

All said, it is still possible for specialized industrial cameras to capture representative images from the HoloLens 2 display. Unfortunately, smartphone, consumer, and professional cameras will not capture images that match what a user sees on HoloLens 2.

This statement is not valid; as was shown earlier, one can capture a representative picture with a cell phone camera. Indeed, larger “professional” interchangeable lenses (such as those by Canon, Nikon, and Sony) cannot get to the right position. But the smaller Micro Four Thirds (M4/3) lenses, such as those by Olympus, are (just) small enough to get to the right position. The advantage of the M4/3 system over a cell-phone-type camera is that it gives full control over aperture and focal length. I bought the M4/3 camera and lenses for the express purpose of photographing AR displays, even though I already own Canon DSLR lenses (only a few of which are shown) and bodies. It turns out my favorites for shooting the HL2 were the Olympus 17mm and 25mm f1.8 lenses, which are reasonably close to the human eye’s approximate focal length.

Eye position does need to be reasonably precise

Quoting the FAQ,

Eye position. The HoloLens 2 display is designed specifically for the eye position of the user. The HoloLens 2 employs eye tracking technology to adapt to the user’s eye position. A camera that that is mispositioned by a few millimeters can lead to image distortion. Accurate positioning with a camera is difficult and needs to match the exact location and eye relief for which the device is performing color correction.

Camera position. Cameras that meet the requirements to view the HoloLens 2 display are larger, and it is difficult to position the camera close enough to the HoloLens 2 display to observe the color corrected image. If the camera is in the wrong place, the color correction may negatively impact the capture of the HoloLens 2 display.

The statements above are generally accurate in that the position of the camera is important. If the lens is too large, it can’t get to the right place to capture a representative image, which is why I used the smaller Micro Four Thirds lenses and camera discussed earlier. For shooting pictures, I have both the HL2 and the camera on independent tripods. I then positioned the camera to give the best image, going back and forth comparing what the camera was capturing to what my eye saw when the camera was positioned optimally. I made every effort to only show problems that my eye was seeing.

In addition to a ball head to aim the camera, I used a “2-axis macro focusing rig” to give fine positioning of the camera in X and Y, and adjusted the vertical axis by raising the tripod’s center column. I also used some longer Arca-Swiss-type plates to get things “close.” It is possible to position the camera with just a tripod and a ball head using feedback from the live image on the camera, but the rig and the Arca-Swiss plates make it much faster.

Conclusion

While I agree that holding a smartphone up to the HL2 will not produce good results, representative pictures can be achieved. It is not the all-or-nothing that the FAQ suggests. While the FAQ calls for millimeter accuracy, the HL2 is not held on one’s head with such accuracy. If anything, the pictures this blog presents are better than what typically occurs, as the camera is positioned for the best possible image.

As stated repeatedly, my goal for a “good” picture is one that looks like what my eye sees. I make every effort to position the camera in the place that gives the best image and even bought lenses to improve the image quality. Any editing or processing of the images is done to make them look more like what the eye sees.

Do You Need a Good Image for AR?

Most of the issues shown in this article are primarily a result of the HL2’s optics. On top of that, as discussed in Hololens 2 Display Evaluation (Part 1: LBS Visual Sausage Being Made), the image quality of the laser beam scanning (LBS) display itself is horrible.

By any objective measure, the HL2’s image is terrible. It is nothing like the concept videos or the “Mixed Reality Capture” pictures and videos supported by the HL2. The resolution is poor, the image flickers, and you can’t depend on the colors, which change across the field of view. In short, you would not want to watch a movie on an HL2.

As bad as the HL2’s image quality is, the endorsements from the HL2’s customers and developers suggest it still has “enterprise” applications as a way to merge virtual images with the real world. In these applications, image quality may not matter much.

Why So Much on the Hololens 2?

It comes down to Microsoft being the most prominent player in AR today (at least with the fall of Magic Leap), and it was a technically interesting subject. There had not been a serious analysis of the HL2’s image quality and its many problems. For a huge established company, they also “fibbed” a lot about the capabilities of the HL2 in ways that could be tested and proven false. I’m more than willing to defend, and if necessary correct, anything written on this blog. I don’t think Microsoft could say the same about the HL2.

Appendix – Camera Options – Not Many These Days

As stated in the article, you can get some representative pictures of the HL2 with a smartphone camera. You will want an ND8 or, better yet, an ND16 filter and a software application that lets you control the ISO and shutter speed. But thanks to the success of cell phones as cameras, the rest of the camera market is dying off. In case you are not aware, the digital camera market has been falling rapidly since its peak around 2011, causing a shakeout. So the FAQ is almost right in that there are very few consumer cameras other than smartphones that will work.

The market seems to be consolidating around mirrorless interchangeable-lens full-frame (35mm) cameras, which may have an APS-C size sensor in the lower-cost models, and all-in-one cameras aimed at vlogging.

Olympus and the whole Micro Four Thirds system appear to be among the systems that will eventually go away. I’m using the Olympus body and lenses because it is the smallest, and the only, interchangeable-lens system that would fit (the Canon M-series and Sony E-mount are too big). Olympus also has some simple prime lenses with wide apertures. It takes good pictures, supports RAW, and gives full control. It was the first with “in-body stabilization” (and a good one), which lets me handhold pictures when on the go at conferences (remember the good old days of live conferences?). The “kit” 14-42mm lens by itself works well (although the primes are noticeably sharper).

The market for “prosumer” cameras with built-in lenses seems all but gone except for vlogging-oriented cameras, and there are very few that would work well. I have not tried, for example, the Sony ZV-1, but on paper, it has (just barely) a wide enough aperture and a small enough lens. As the ZV-1 targets video rather than still pictures, it looks like all the controls are buried in menus, and it would be a bear to use. Also, the ZV-1 costs about $250 more than an Olympus OM-D E-M10 III with its 14-42mm lens.

I was willing to buy another camera, including lenses, but the Olympus Micro Four Thirds OM-D line (either the OM-D E-M10 or OM-D E-M5 lines, as the other bodies are too big) still looks like the best camera for shooting AR headsets. It is somewhat uniquely positioned.

Karl Guttag

16 Comments

  1. So what you’re saying Karl (for my nutshell) is unless you blow $400k on a helmet system such as a F35 fighter pilot helmet, Rainbow’s End / sci-fi AR is not working in any meaningful way any decade soon for consumers.

    • I don’t think the consumer would be happy with the image in the $400K F35 helmet. Consumers have high expectations for image quality, set by today’s flat displays, along with very low cost expectations.

      There do appear to be applications that can cost-justify AR today. These are ones where AR can improve operator efficiency and/or improve safety.

  2. Are we by any chance done with Hololens/MVIS article series?
    Don’t get me wrong, it’s a good read, but I think the Kura “LED pixel strip scanning” and similar topics like LetinAR “pin-mirrors” are a bit more interesting since they are new ideas.
    LED pixel strip scanning was done in VirtualBoy but for AR and pin-mirrors I’d assume you would need much more luminance.
    Besides, if you want to scan an 8K image, that’s ~4000 scanlines vs. VirtualBoy’s 224. I think that means a ~18x brighter pixel strip is needed in the first place before we factor in AR optics light loss and frame duty cycle (the pixel strip scan at 60Hz doesn’t take the full 16.67ms). A lot of interesting things to discuss here I think.

    • I’m done with dealing with Hololens and LBS directly for now. But it is inevitable that it will come up in dealing with MicroLEDs and other displays and optics.

      With AR headsets there is much more to it than just generating an image. Some of the key issues (not all of them):

      1. You have to get that image to appear at the desired focus point, typically in the range of 2 meters (ignoring vergence accommodation issues which are real). In the case of waveguides, you have to collimate the image (make it focus at infinity) and then turn around and move the focus from infinity back to about 2 meters (if you want it there) between the waveguide and the eye (Magic Leap did this with the exit gratings, Hololens has a lens after the waveguide and a compensating lens for the real world before the waveguide, other just leave the image focused at infinity).
      2. The light has to be “coupled” into the combining optics. This is a HUGE issue for emissive displays like OLED (which fail miserably) and even MicroLEDs when using pupil-expanding waveguides (essentially all flat “waveguides”).
      3. You want an eye box that is big enough that the user can see it even if the glasses are not on “perfectly.” Bigger eye boxes and wider FOV are even more inefficient.
      4. You have to combine the virtual image with the real world. This is usually very inefficient particularly if the headset is going to support transparency.

      Various optical solutions do these operations in different orders. From what I am told, the efficiency of coupling a MicroLED into a diffractive waveguide reduces the nits by over 10,000 to 1 (give or take depending on the FOV and eyebox). If you start with 1 million nits from the MicroLED, you might get well less than 100 nits out. So you can forget about scanning a MicroLED into a waveguide. The pin-mirror approach has its drawbacks in image quality, effect on the real-world view, and efficiency.

      Let’s just say that a lot of companies are “very optimistic” and that this optimism might be driven by the need to raise money.

      It is going to take many articles to walk through all the above.

  3. Thanks Karl,

    I believe this info from JBD regarding their microLEDs is public now but I’m having a hard time finding their slides on Google right now.
    In case this is useful, JBD have a microlens in front of each pixel emitter. Due to the relatively small pixel emitter to pixel pitch size (about 40% fill), this allows reducing the pixel beam angle to ~60 degrees. It’s still not great but, from what I’ve been told, an order of magnitude more efficient than your standard Lambertian emitter. They have also claimed 5 million Nits is possible with their larger-pixel 720p green panels, although the current real quote is more modest at 1.5 million.
    What I understand is that if we assume 5 million Nits and we want to scan an LED strip into 4000 scanlines, that’s 5 million/4000 = 1250 Nits for the whole image before factoring in optical losses from the waveguide. Is this a fair assumption?

    I think due to the smaller surface area of a LED strip a larger heatsink may be used to overdrive the LEDs but I don’t know how much that will help, just an idea.

    I don’t know how well the diffractive waveguide issues such as efficiency or FOV with emissive displays translate to “pin mirror” “waveguides”. This is what Kura claims to be using. I’ve only had a short discussion about this with an optical engineer, and we concluded the individual pin mirrors probably act as a pupil duplicator (I may be butchering the correct term), and while it may work, how seamless the combined image will look, especially as your gaze changes, is questionable.

    Thanks.

    • First, the X-million nits already takes into account the microlens array; that is part of how they get the nits that high. By reducing the emission from 180 to 60 degrees, simplistically, this should increase the nits by about 9x (3 squared). You still have to consider the size of the whole display versus the size of the entrance of the waveguide; this area difference creates a large loss due to etendue. Then there is the pupil expansion: going from the entrance grating to the exit grating, the brightness is reduced by the ratio of these areas. Then to top it off, you have 3 sets of diffraction losses with most diffractive waveguides, which I think is another 10x or more.

      So if you started with 1250 nits into a waveguide, you might get about 1/10th of a nit out. With a scanning display, the etendue of the display is the size of the scanned display. Note that Kura is planning on having multiple rows of LEDs (they don’t say how many), so they would have N rows of LEDs contributing to the brightness. Each row of LEDs would contribute to all the pixels (MicroLEDs can switch very fast, at least in theory).

      When you look at pin-mirror displays, you have some coupling losses into the optics. Assuming the pin mirrors are each fully reflective, you then have the loss from the area of the pin mirrors that contribute to the image versus the exit area of the light. To support an eyebox, not all the pin mirrors will contribute light that is seen; let’s say, for the sake of discussion, that about 50% of the pin mirrors will get light into the eye at any time. Then you have to look at the area of the mirrors versus the gaps between mirrors. My understanding is that the gaps between pin mirrors should be less than the width of the human pupil or there will be gaps in the image as the eye moves. So the pupil size sets the gap between mirrors. Early LetinAR devices had big pin-mirrors, but later ones and the ones demoed by Kura had smaller mirrors (I think on the order of 1mm or less). I would guess this area loss is on the order of 5:1 with big mirrors, which will cause a person to see grey spots, to about 10-20 to 1 with smaller pin mirrors.

      You are correct that the pin mirrors do a form of pupil replication. Two mirrors next to each other have almost the same image (a subset of several mirrors has the whole FOV but there is a lot of overlap of the image from mirror to mirror). BTW, the Lumus waveguides “slates” also are doing replication.

      The net is that the pin-mirror is probably still several orders of magnitude more efficient than a diffractive waveguide, but it has the issues of the mirror dots being visible, causing grey spots and diffraction in your vision, particularly as the human eye’s pupil becomes smaller with brighter light. So Kura’s plan of using N rows of MicroLEDs, in theory, could be bright enough. The issue with Kura is whether they can get the N-row by 8,000 (or even 4,000, 2,000, or 1,000) MicroLED built, and then there are the complications of the scanning. Then you have the issues just mentioned of the pin-mirror image fill factor and grey dots in the real world.

      Hopefully the above makes some sense.

      • Hi Karl,
        Thanks. You should reuse these comments in your articles in the future.

        I don’t think there’s much complication in manufacturing a microLED strip as long as they don’t try doing it themselves and partner with someone like JBD for it.
        I also don’t see difficulty in scanning a strip. The mirror would be relatively slow compared to what you need for a laser dot scanner, and HoloLens 2 already uses one.

        That said, I made a rookie mistake in my old post; my initial 1250 Nits estimation was based on a 5 million Nit microLED panel. Of course, if we assume a strip of the same microLEDs, the thinner strip can’t itself be 5 million Nits like a full panel.

        So while a strip with more than one row, or several evenly spaced strips, would increase the brightness of the final image, I don’t think we can start off by assuming these strips can be 5 million Nits themselves or even combined.

        Due to price considerations and the production costs JBD has estimated, for a consumer headset the strips could have a total of a VGA amount of pixels per color, or ~300,000 per color. For enterprise headsets (a few thousand dollars), a 1920×1080 equivalent would be possible, or ~2,000,000 pixels.
        For an 8K scanned display, if you have pixel strips with a total of 2,000,000 pixels, that’s 2,000,000/4000 = 500 strips. I think this means losing 4000/500 = 8x the Nits to scan a complete 8K image, and again the strips can’t be 5 million Nits themselves. I think each strip would be 1250 Nits; 500 strips would be 625,000, but the scanned 8K image would get that down to ~78,000. Unlike the claimed numbers, the numbers quoted by JBD are a more modest 1.5 million Nit panels, but only per color.
        I don’t think these values are bad at all for pin-mirrors or birdbath optics, unless I’m missing something.
        There’s a lot of BS both with startups and tech giants, but at least good brightness with scanning a strip doesn’t seem impossible, unlike what I assumed initially with a single pixel row.

        Pin-mirrors are another story. I agree with noticeable seams in the image. But I think there’s as big of an issue here: I’m not sure the images formed by each pin-mirror would have the correct offset and perspective to form a correct, combined, undistorted image on the retina, even if we ignore the seams between those individual images. I tried a simple test with DIY 3D-printed pinhole array glasses. The images not only get distorted at the edges of the pinholes but also distort as your eye rotates. But this was just a quick test; the pin-mirror system may be more intricate than a simple pinhole. In one of their diagrams, the pin-mirrors even had curvature to them.

        Thanks.

  4. So do we have any other consumer AR gadgets to wait for before Apple’s 2023 plan? Digilens supports LCoS combined with waveguides, Vuzix’s new glasses look cool, Ostendo’s nanotechnology sounds interesting, but no one knows any details.

    Someone said neither LBS nor MicroLED would lead us into Neal Stephenson’s Snow Crash; we have to wait until MIT Nano and Kopin’s liquid-crystal-based integrated optical phased arrays (VIPER) come out in the future.

    Does anyone know anything about the VIPER here?

    • It’s not clear that even Apple will make AR glasses a consumer product by 2023. Digilens’ modular design seems like more of an impractical concept. There are issues with using MicroLEDs with diffractive waveguides. Ostendo has working full-color monolithic MicroLEDs (the first I know of to have them), but nobody knows if they can scale to production, and I don’t know if their optics that combine 3 devices for a wider FOV will work. I certainly don’t believe that LBS is a good display technology. I don’t know about VIPER specifically, but phase LCOS is unproven in terms of high resolution (Holoeye has some demos using lasers and phase LCOS, but the image quality is not that good https://holoeye.com/spatial-light-modulators/).

      If you get away from consumer products, then AR has a growing set of applications. It has proven to save time and thus the money to pay for itself, and there are also some life-or-death applications like medicine and the military. But consumers will want very high image quality at a very low cost, and at the same time it must be lightweight, small, and fashionable, with long battery life, and the list goes on.

  5. Binocular Viewing – you may find some devices are ‘bluer’ on one side and ‘redder’ on the other side; then when you combine the two together, you get a ‘whiter’ spot at the center. But of course, it causes dizziness… :D

    • Both sides vignette in the upper corners, so they don’t help each other. I also don’t see where one side has the opposite/complementary color issues of the other such that they could cancel out. The left display is a little brighter on its left side and the right display is a little brighter on its right side, so the combined effect helps a little bit. Looking with one eye, the other eye, and both eyes, the color uniformity does not improve tremendously. Yes, it is a little better with both eyes, but it does not solve the problem.

    • As a display, the HL2’s LBS is very poor by today’s standards. If you went to a TV store and saw it next to the worst TV in the store you would say it is horrible by comparison. The resolution is poor, the image flickers and blinks. The laser engine is pretty large compared to other display options.

      HL2 is succeeding in spite of having a terrible display. First of all, the applications it is going after don’t require a good display. By giving up on looks and concentrating more on function, HL2 is useful for industrial applications. The sales pitch is something like, “A person with salary, benefits, facilities, and equipment costs about $100K/year. If the HL2 can save just a few percent of that person’s time per year, it pays for itself.” The key is the recognition that the display can be bad and the form factor can be crap compared to Oakley sunglasses, and the product can still be useful. The market seems to be for jobs where the person needs to keep their hands free yet needs some basic information and/or some guidance.

  6. “… the system reprojects the image on the display to stabilize holograms.” — What does that even mean? It’s complete bunk. It’s something i’d expect to come out of Elon Musk’s mouth or a tweet, not an official Microsoft FAQ. Who even wrote that nonsense?

    • I think they were trying to say that if the headset senses any movement, it is going to reposition the image in response to it. At least that is how I read it.

      Most of the “FAQ” was marketing bunk to discourage picture taking.
