304 North Cardinal St.
Dorchester Center, MA 02124
In the last article, I showed what the real world looks like through the Magic Leap One (ML1). For this article, I am going to share some pictures I took through the ML1 optics displaying test patterns.
Above left is a crop of the original test pattern scaled by 200% compared to a picture of the same portion of the test pattern taken through the ML1 (for reference, the whole test pattern is linked here). This test pattern with the various features is a tough but fair way to check out different image quality aspects. The single and two-pixel wide features are meant to test the resolution of the display. A hole was left in the larger pattern to allow an iPhone 6s Plus displaying part of the test pattern to show through as a reference. There is additional information on how the picture was shot in the appendix at the end of this article.
Most of the Magic Leap demos have colorful but smaller objects which work both as “eye candy” and to hide the lack of color uniformity across the FOV. Faces with skin tones are in the test pattern because people are more sensitive to the color of skin. The test pattern has large solid white objects across the FOV to identify any color shifting.
I used the Helio web browser to display the images, and some of the image resolution issues could be due to the way the ML1’s Helio browser scales images in the 3-D space. I tried capturing the test patterns and displaying them in the ML1 gallery, and the results were considerably worse. I viewed the same test pattern with Hololens with its browser, and it is noticeably sharper than the ML1 although the Hololens is a bit “soft” as well. It would be good at some time to go back and separate the browser scaling issues from the optics issues, but then again, this is the way the ML1 as a whole normally displays 2-D images.
I have looked at detailed content on two different ML1s, and none of it is sharp, so I think that these images fairly represent the image quality of the ML1. Even if the scaling engine on the ML1 were poor, the degree of flare/glow and chroma aberrations, which are caused by the optics, would suggest the resolution of the ML1 optics is low.
I only tested the “far focus” (beyond ~36 inches) mode as it would have been very difficult to test the near depth plane focus mode. I could sense that the near focus plane was sharper than the far focus plane, as the diagrams from the Magic Leap patent applications suggest (see right). The far focus plane’s light goes through the near focus plane’s exit gratings to get to the eye, which might be part of the problem. I would have liked to have tested the near focus plane as well, but there was no way to scale the test pattern that would work, nor was there a way I knew of to keep the headset in near focus “mode.”
The pictures below were taken through the ML1’s right eye optics with my annotations in red, green, and orange. You may want to click on the images to see detail. To be fair, closeup camera images will show flaws that may not be noticed by the casual observer. Generally, projected images look worse than direct view displays because of imperfections in the optics, but in the case of the ML1, the diffractive waveguides appear to limit the resolution.
While there are differences between how the human eye and a camera “see” an image, the camera gives a reasonably good representation of what the eye sees. A camera is “objective/absolute,” whereas the human visual system is more subjective/adaptive and judges things like brightness and color relative to a local area. As a result, the background in the picture seems darker than it does “live.” The artifacts and issues shown in the photo are visible to the human eye.
Overall the color balance is good in the center of the image. You will notice a color shift in the skin tones of the two faces in the test pattern, but it is not terrible until you get to the outer 15% of the image where there is significant color shifting to blue and blue-green as can be seen in the photo.
Issues with the ML1 Image:
In order to see more detail, the picture on the left has the camera zoomed in by over 2X to give more than five camera samples per ML1 pixel (click in the picture to see it at full resolution). The part where the iPhone shows through has been copied and moved to line up with the text in the ML1’s image. The iPhone’s image shows what the text should look like if the ML1 could resolve the image.
Text on the ML1 is noticeably soft by any measure. The ML1 is less sharp even than Hololens and less sharp than Lumus’s waveguides by an even wider margin. The one pixel wide dots and 45-degree lines are barely visible.
I was expecting the color uniformity problems and image flare/glow based on my experiences with other diffractive waveguides. The color in the center of the FOV is reasonably good on the ML1.
But I just can’t get past the soft/blurry text. I first noticed this with the text in Dr. G’s Invaders teaser (on the right) which is why I set out to get my own test pattern on the ML1. I don’t know yet how much this softness is caused by the dual focus planes but I suspect it is a reason why the ML1 is blurrier than Hololens.
At some time in the future, I hope to be able to bypass the 3-D scaling to directly drive the display to better isolate the optical issues from any scaling issues. I would also be curious if I could lock the device into “close focus plane mode” and test that mode independently. As I was driving the ML1 for these tests, very soon after I took my eye away from the ML1, it switched back into far focus plane mode (which is why I did not run a test in the near focus plane mode). If someone wants to help with this effort, please leave a note in the comments or write to email@example.com.
I used an Olympus OM-D E-M10 Mark III mirrorless camera. I specifically chose this camera for taking pictures of headsets due to its size and functionality. On this camera, the distance from the center of the lens to the bottom of the camera is less than the distance from my eye’s pupil to the side of my head, so it will fit inside a rigid headset with the lens centered where my pupil would have been. In portrait mode, it has 3456 pixels wide by 4608 pixels tall, which is over two camera samples per pixel of the ML1’s spec’ed 1280 by 960-pixel LCOS device. The camera has 5-axis optical image stabilization, which greatly helps in taking hand-held shots, which I was required to do.
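As a rough check of the sampling figures, the short Python sketch below divides the camera's pixel count by the display's pixel count along one axis. It assumes the ML1's 1280-pixel image width spans the camera's full 3456-pixel frame width, which is an idealization of a hand-held shot, and the function name and `zoom` parameter are illustrative only.

```python
def samples_per_pixel(camera_px: int, display_px: int, zoom: float = 1.0) -> float:
    """Camera samples per display pixel along one axis.

    Assumes the display image exactly fills the camera frame at zoom 1.0
    (an idealization; real framing will differ somewhat).
    """
    return camera_px * zoom / display_px


# Wide shot: 3456 camera pixels across the ML1's spec'ed 1280 pixels.
wide_shot = samples_per_pixel(3456, 1280)

# Zoomed shot: the article says the camera was zoomed in by "over 2X".
zoomed_shot = samples_per_pixel(3456, 1280, zoom=2.0)

print(f"wide shot:   {wide_shot:.2f} samples per ML1 pixel")
print(f"zoomed shot: {zoomed_shot:.2f} samples per ML1 pixel")
```

Under these assumptions the wide shot works out to a bit under three samples per ML1 pixel, and a 2X zoom pushes it past five, in line with the figures quoted in the article.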
The “far focus” of the ML1 is set to ~5 feet (~1.5 meters). I put a test pattern on this website and used the ML1’s Helio browser to bring up the image. I then moved the ML1 headset back and forth until the test pattern filled the view, which occurred when the virtual image was about 4 feet away.
The picture on the right shows the setup of the iPhone when viewed from an angle. It gives you an idea of the location of the virtual image relative to the phone. This picture was taken by the camera through the ML1, and only the red annotations were added later.
From other experiments, I knew the “far focus” of the ML1 is about 5 feet. I set up an iPhone 6s Plus in a “hole” in the test pattern put there to view the phone. To have the phone in focus at the same time as the virtual image, I set the phone behind and adjusted the phone’s location until both the phone and the ML1 images were in focus as seen by the camera. I then scaled the iPhone’s display to have the text the same size as that displayed on the ML1 as seen by the camera. In this way, I could show what the text in the high-resolution test pattern should have looked like through the camera, and it verifies that the camera was capable of resolving single pixels in the test pattern.
The iPhone’s brightness was set to 450 cd/m² (daytime full-brightness) so that it could be seen after being reduced by 85% as seen through the ML1, so the net was only about 70 cd/m². I took the picture in camera RAW and then white balanced based on the white in the center of the ML1’s image, which makes the iPhone’s display look a bit shifted toward green. The picture was shot at 1/25th of a second to average out any field sequential effects.
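The brightness arithmetic can be checked in a couple of lines of Python. This is only a sketch of the calculation stated above; the function name is made up for illustration, and the 85% figure is the approximate light loss quoted in the text rather than a measured transmission curve.

```python
def luminance_after_loss(source_cd_m2: float, loss_fraction: float) -> float:
    """Luminance remaining after a fractional light loss (0.85 means 85% lost)."""
    return source_cd_m2 * (1.0 - loss_fraction)


# iPhone set to 450 cd/m², viewed through optics that block roughly 85% of the light.
net = luminance_after_loss(450.0, 0.85)
print(f"net luminance through the ML1: about {net:.1f} cd/m²")
```

The result is a little under 70 cd/m², consistent with the rounded figure in the text.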
For reference, the image on the left is a pass-through frame capture taken by the ML1 from about the same place. With pass-through, the ML1’s camera and the exposure of the test pattern can be set independently. In this image, the ML1’s camera appears to have focused on the far background which puts the iPhone out of focus, but you can get a feeling for how bright the iPhone was set.
Interestingly, I saw some different scaling artifacts in this pass-through image than I saw in the image in the camera; in particular, thin black lines on a white background tend to disappear.
The pass-through image is biased to favor white over black. Looking at the 1 pixel wide features under the “Arial 16 point,” the black 1-pixel dots and lines are all but lost, and even the two pixel wide ones to their left are almost gone.
I would like to thank Ron Padzensky for reviewing and making corrections to this article.
In case this is useful, the Magic Leap SDK for Unreal Engine is available, and I suspect you could use it to place various test images and 2D/3D artifacts in a 3D scene to explore the optics performance further, as well as examine the performance of the structured light sensor, eye tracking and the controller. https://docs.unrealengine.com/en-us/Platforms/AR/MagicLeap
I’ve got some experience with Unreal Engine and Hololens and am happy to help if I can.
There are standard metrics to evaluate eyepiece image quality. To begin with, the most relevant standard graph would be “through focus MTF” at the frequency that corresponds to the eye’s resolution (1 MOA).
I don’t know of a standard metric and I don’t think the manufacturers would want one :-).
You have 3 basic stages; the display device itself, the “projection optics” (usually lenses, mirrors, or a combination), and the combiner (waveguide, partial mirror, or some kind of birdbath with a curved mirror and a beam splitter). Each of these has different problematic areas. Some technologies have issues with resolution, some with color aberrations, some with in-focus reflections, and others with out-of-focus reflections (flare, glow, etc.). There is not one number such as you have with MTF that you can use that totally describes these different issues.
Personally, I like to understand which of the “stages”, the display device, projection optics, or combiner, is causing which issues. You can sometimes figure it out by understanding the technology used (mirrors for example don’t usually cause chroma aberration) and things like waveguides cause direction specific issues where projection optics are usually radial.
Another factor complicating things with AR mixed reality displays is that they tend to lock things in the real world and then scale everything in the virtual world. This then makes it tough to figure out whether resolution issues are optical or due to scaling. Even when you think you are putting up a 100% image (supposedly not scaled), I have found that the headset appears to be scaling it.
According to the standard model of optical design (the standard model is a consensus in the optical design community), the performance of optical systems depends on the optical elements’ nominal specifications and manufacturing tolerances. Ray tracing simulations are used in the design phase to predict the performance, given the optical elements. The through focus MTF is one such performance probe. The through focus is required to account for the eye’s accommodation (which is absent in the camera setup, as you mentioned). Brightness uniformity (aka vignetting), chromatic aberrations (indeed, as you mentioned, the lateral ones are the relevant ones for the eye, which is again more forgiving than a camera) and every other parameter has a corresponding standard probe (optical design is a very mature discipline). Your analysis is probably the best that can be done with a camera and common sense. But in order to really compare apples to apples on a standard scale, the standard parameters are to be measured.
I think I was answering a different question. I was talking about showing the end result and dealing with a product that I am trying to analyze. Manufacturers are not going to give computer models to work from and often I can’t control even the input the way I would want. Heck, most of the manufacturers don’t even give decent spec’s and even play “hide the spec’s that are not good for us.”
I’m not a reverse engineering firm and don’t have the resources and equipment to tear down several units so I can do a detailed analysis of the various components and reverse engineer computer models. Much of what I do is to show people how they can do basic analysis with just a camera as an “instrument.”
I’m writing for an audience of over 20,000 people that visit this blog in a month. I expect that less than 10% of my audience have detailed optical knowledge, although I know many optical experts read the blog. Much of my audience is made up of AR industry insiders and managers that are not optics experts, business analysts, technical reporters, and people simply interested in AR and VR and technology in general.
I think in the end the standard metrics are a means to an end.
If it looks great on paper but doesn’t connect with people then it is a failure. It could look really awful but if the content is there and people get into it then it is good enough. A single frame of NTSC video looks absolutely godawful but we watched it for 50 years.
The trouble is that Magic Leap isn’t doing so well on the marketing and content fronts.
The price is too high. If I have that much money burning a hole in my pocket I could get a new OLED TV or a home theater setup. I know that will improve my experience of content I have now and content that will come out in the future.
Magic Leap has short demos and interesting-looking trailers, but it doesn’t have a killer launch title (like “Super Mario Brothers” was for the NES.) Without that it is an especially tough sell.
Magic Leap is in a bad box.
The headset is very expensive to make. It is far too expensive for a toy. It blocks too much light to be useful in most of the common industrial applications for AR.
The image quality is far below what people can get from a TV today. The effective resolution is below 640 by 480 pixels because the optics/waveguides blur so much. The contrast is about 300:1 off-on and more like 50:1 ANSI (due to all the scatter in the waveguides).
It is going to be tough to make a compelling game. The whole use model of interacting with the real world is very tough as the real world is very complex and very different from user to user. Every room is unique, with different lighting, sizes, shapes, walls, doors, windows, furniture, etc. There is very little you can assume will be there and not cause a problem. You can make interesting demoware with things bouncing off walls and furniture, but it gets hard from there.
In short, Magic Leap is likely to change the future of displays the same way Segway changed the future of transportation.
Thanks for the analysis
Your blogs are always interesting and educating
Thank you very much for the analysis.
I have one question – I am not sure I understand what is the “ML1 pass through” capture shown in the last photo. Can you please explain it? Is it just the display source image overlaid on a photo taken by the ML1 camera without going through the optics?
The “ML1 pass through capture” was taken using the Magic Leap phone application. The application signals the ML1 to take a picture of the real world with its camera, and the ML1 overlays it with what it is supposed to be displaying. You can then download the image that the ML1 makes. So yes, it is just a simple overlay, all done by the ML1. It is a “solid” overlay where pixels brighter than about 15/255 are treated as opaque and pixels less than about 15/255 are treated as transparent.
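The “solid” overlay rule described above can be sketched in a few lines of Python. This is not Magic Leap’s code; it is a minimal illustration that assumes 8-bit RGB pixels, assumes the brightest channel decides whether a virtual pixel counts as opaque, and uses the approximate cutoff of 15 from the description. All names are hypothetical.

```python
THRESHOLD = 15  # approximate cutoff from the description, on a 0-255 scale


def composite_pixel(camera_px, virtual_px, threshold=THRESHOLD):
    """Pick the virtual pixel if it is bright enough to count as opaque.

    Assumption: "brightness" is the brightest of the R, G, B channels.
    The real capture pipeline may use a different measure.
    """
    if max(virtual_px) > threshold:
        return virtual_px  # opaque: virtual content replaces the camera view
    return camera_px       # transparent: the real world shows through


def composite(camera_img, virtual_img, threshold=THRESHOLD):
    """Composite two equally sized images given as rows of (R, G, B) tuples."""
    return [
        [composite_pixel(c, v, threshold) for c, v in zip(cam_row, virt_row)]
        for cam_row, virt_row in zip(camera_img, virtual_img)
    ]


# One row: white virtual pixel, black virtual pixel, dim virtual pixel.
camera_row = [(90, 80, 70), (90, 80, 70), (90, 80, 70)]
virtual_row = [(255, 255, 255), (0, 0, 0), (10, 10, 10)]
result = composite([camera_row], [virtual_row])
print(result[0])
```

With this rule a black virtual pixel is never drawn as black; the camera pixel behind it is shown instead, while anything brighter than the cutoff fully covers the real world.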
Karl, thank you for an interesting (as always) insight on the display technology.
With all the pains of using combiner optics (bulky mirrors, waveguides full of artifacts), no matter how many billions are thrown at them 🙂, isn’t it better now to resort to a purely electronic way of combining the AR image? Two cameras -> image processor -> microdisplays.
From owning a Glyph I know it can render very lifelike image when fed with high-quality high-dynamic range footage, like shot with DSLR camera. But FOV is limited for sure.
Let’s make so called “Glyph+” engine upscaling current Glyph ~2times. So, 2 times DLP chip diagonal, 2 times x/y resolution, two times FOV about 74 degrees, two times exit pupil, a bit better eye relief. Then pair it with couple of good 4K fast cmos cameras with matching FOV. All real/AR image augmentation have to be done in image processor. Signal processing path have to be tuned to a minimum latency, 30ms glass-to-glass or less, but I don’t see it much a problem, especially with billion in a pocket 🙂
Some limitations, for sure there will be limited 3D clues for the viewer from the outside world, and of course it will be hard to do multiple focal planes for AR. Possibly some tricks like oscillating lens in optical path and sequential scanning can do? On the other side, such device can have very high (true cinematographic) quality for its own image including AR.
With such image quality, and if price can be made manageable, “Glyph+”can be made uniquely useful for a consumer as high-end portable entertainment, with added AR tricks if necessary. Unlike all current offerings which can only generate an ugly picture, at best…
Also it will be uniquely able to distort/modify real world image for the user with unlimited possibilities, something that optical combiners cannot do in principle..
What do you think?
What you are talking about is known as “pass-through AR” (PT-AR). PT-AR has its good and bad aspects as well.
It is good for combining image and for controlling the amount of transparency. It does not have the problem of only being able to “add light”, it can “subtract” light in the real world. You can also get much better registration between the real and virtual worlds.
The bad aspects of PT-AR include:
1. Lag by the camera capture time and any processing time in seeing the real world. This can be a serious problem if you are moving.
2. Focus does not work right – You are stuck with the camera’s focus and potential lag — your eye is constantly moving and sampling
3. Field of view in the real world is limited by what you can see on the display
4. Lack of dynamic range of any display versus the human eye.
5. Angular resolution — may eventually be solvable with Foveated display but current technology either can give a wide FOV or angular resolution but not both.
I often say, “what is easy to do with Pass-Through AR is hard with optical AR and vice versa.”
u r aware of this http://lightfield-forum.com/wordpress/wp-content/uploads/2013/07/nvidia-near-eye-light-field-displays.jpg https://youtu.be/deI1IzbveEQ https://youtu.be/8hLzESOf8SE
it’s also PT-AR but it has a bit different set of problems
When measuring the resolution of any device meant to be seen by a human, the aperture size of the camera becomes very important. If the camera has a larger aperture, you’re exaggerating aberrations and optical defects vs what’s seen by a human eye.
This Olympus camera has a relatively large MFT sensor, and to match the 2-3mm pupil expected from a human in 200 nit illumination, it needs to be stopped down significantly.
What was the aperture size used when making test photos?
Don’t want to say that ML1 is not terrible though 🙂
Some very good technical points about the numerical aperture of the camera. I kind of like the Olympus for how close it can mimic the geometry of the eye (of course it has a flat versus curved sensing surface).
In terms of aperture, most of the wide shots were taken with a focal length of 14mm and an f-number of 5.6 or above. That would put the effective aperture around 2.5mm. For the zoomed-in shots, I mostly used 42mm, also at f/5.6, so the effective aperture was more like 7.5mm. There is a bit of a trade-off in that I didn’t want to go so slow that I got motion blur. The ML1 will turn off the display if you take your head away for too long, which kept me from using a tripod. Thankfully, the Olympus has good optical image stabilization. But I was afraid to stop the camera down much more for fear of blurring the image.
I thought about special rigs and the like, including having a “fake set of eyes” to try and trick the sensors so I could use a tripod, but that could have turned into a big project.
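For anyone checking the numbers, the effective aperture (entrance pupil) diameter is the focal length divided by the f-number. The Python sketch below applies that standard relation to the two settings quoted above and compares them with the 2-3mm eye pupil range raised in the question; the function name is illustrative.

```python
def effective_aperture_mm(focal_length_mm: float, f_number: float) -> float:
    """Entrance pupil diameter in mm: focal length divided by f-number."""
    return focal_length_mm / f_number


wide_mm = effective_aperture_mm(14.0, 5.6)    # wide shots
zoomed_mm = effective_aperture_mm(42.0, 5.6)  # zoomed-in shots

# Eye pupil range suggested in the question above (2-3 mm at about 200 nits).
EYE_PUPIL_RANGE_MM = (2.0, 3.0)

for label, diameter in (("wide, 14mm", wide_mm), ("zoomed, 42mm", zoomed_mm)):
    in_range = EYE_PUPIL_RANGE_MM[0] <= diameter <= EYE_PUPIL_RANGE_MM[1]
    print(f"{label}: {diameter:.1f} mm, within eye pupil range: {in_range}")
```

By this estimate the wide shots fall inside the eye’s pupil range, while the zoomed-in shots use an aperture about three times larger, so they may show somewhat more aberration than an eye would see.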
Just a comment… I really like your blog and the thorough detail of your analysis, but you do seem to have a bias against Magic Leap over other similar diffractive waveguide displays. Just an observation…
One note to consider regarding the military application, they did go with a diffractive waveguide approach by selecting Microsoft for that contract.
I have looked at a lot of diffractive waveguide displays and I am calling it like I see it. A lot of things get studied by the military that don’t work out. We will have to see if diffractive waveguides ever get deployed successfully with the troops; I doubt it.