The Next Big Thing
I visited the Apple store today to experience the Apple Vision Pro demo, which could very well be the next iPhone moment.
Here are my thoughts:
In summary, Apple Vision Pro represents a significant leap in XR technology, combining high-fidelity visuals with an innovative interaction system. It's not just a new device but a whole new platform that could redefine how we interact with digital content, showcasing Apple's ability to innovate and lead in new technology domains.
The Pro's Micro-OLED display is absolutely top-notch. Combined with a 90 Hz refresh rate, the picture is realistic enough to almost fool the brain. Watching videos and 3D movies feels genuinely different from previous devices, which suggests Apple believes only ultra-realistic visuals can make XR applications viable. Probably only Apple would spend this much on hardware and still convince customers to pay for it.
The passthrough experience, powered by the R1 chip, significantly outperforms competitors. There is still some visible noise in the image, but it's leagues ahead of the Quest 3 and good enough to overlook. No doubt hardware performance will continue to improve, and that is Apple's moat: developing a dedicated chip in-house is truly remarkable.
Since the visuals can stand comparison with reality, the demo naturally focuses on photos and video. Spatial Video truly pulls you into the scene, stopping just short of touch and smell. It's hard to explain without firsthand experience, like going back 100 years and explaining how the internet works. One demo was a soccer game watched as if you were sitting above the goal while the goalkeeper defended. As the ball went in, the narrator remarked that some things money can't buy (but buying the headset can). Thinking about it afterwards, if a front-row seat at the World Cup costs a fortune, then $3500 for multiple games from the best seat seems reasonable. This has huge potential in entertainment and film, and could change how tickets to sports events are sold. It might also explain why Apple insisted on having its own streaming platform, Apple TV+.
Apple introduced a whole new interaction system for Vision Pro, arguably the most systematic one to date and the best suited to extended use. Simply put, Apple has split the mouse's scrolling and clicking between eye tracking and hand gestures. Since the interaction space is 3D, interactions designed for 2D screens don't apply. To handle the extra degree of freedom on the Z-axis, Apple ingeniously replaced the mouse pointer with eye focus: your eyes pick out the target at any depth, so you don't have to keep lifting your hands to reach it (though you can if you want). However, since the eyes jitter rapidly even when fixed on something, hand gestures are used to confirm actions and control inputs, such as clicking, sliding, and pinching, with more gestures likely to be developed. [1]
Based on this, you can picture users resting their hands on a desk, lap, or chair while operating the system. For precise input, keyboards, mice, drawing tablets, and custom controls are all future possibilities. This is a significant shift from the existing "raised hands" mode, since nobody can hold their arms up for 8 hours. Users can operate a 3D system without lifting their hands, which frees them for other tasks like typing on a computer, eating, cooking, or exercising. You could even operate a separate computer at the same time, one that has nothing to do with the Vision Pro.
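To make the "eyes aim, hands confirm" split more concrete, here is a minimal sketch in Python. It is not Apple's implementation and uses no Vision Pro API: the `Target` and `GazePinchSelector` names, the 2D coordinates, and the exponential moving average used to damp eye jitter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Target:
    """A hypothetical on-screen element with a circular hit area."""
    name: str
    x: float
    y: float
    radius: float


class GazePinchSelector:
    """Illustrative model: smoothed gaze picks the target, a pinch confirms it."""

    def __init__(self, targets: List[Target], alpha: float = 0.3):
        self.targets = targets
        self.alpha = alpha  # assumed smoothing factor for the moving average
        self.gaze: Optional[Tuple[float, float]] = None  # smoothed gaze point

    def update_gaze(self, x: float, y: float) -> None:
        """Blend a raw gaze sample into the smoothed point to damp jitter."""
        if self.gaze is None:
            self.gaze = (x, y)
        else:
            gx, gy = self.gaze
            self.gaze = (gx + self.alpha * (x - gx), gy + self.alpha * (y - gy))

    def focused(self) -> Optional[Target]:
        """Return the target under the smoothed gaze point, if any."""
        if self.gaze is None:
            return None
        gx, gy = self.gaze
        for t in self.targets:
            if (gx - t.x) ** 2 + (gy - t.y) ** 2 <= t.radius ** 2:
                return t
        return None

    def pinch(self) -> Optional[str]:
        """A pinch plays the role of the click: it activates the focused target."""
        t = self.focused()
        return t.name if t else None


# Usage: jittery gaze samples hover around "Photos", then a pinch confirms.
selector = GazePinchSelector([
    Target("Safari", 0.0, 0.0, 1.0),
    Target("Photos", 5.0, 0.0, 1.0),
])
for sample in [(5.2, 0.1), (4.8, -0.2), (5.1, 0.3)]:
    selector.update_gaze(*sample)
selected = selector.pinch()
print(selected)
```

The point of the sketch is the division of labor: looking never triggers anything on its own, so noisy eye movement is harmless, and the hand only has to make one small, deliberate gesture from wherever it is resting.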
From a development perspective, Apple has focused on building the underlying system and establishing fundamental interaction principles, using systematic standards to create a new XR ecosystem similar to the App Store. A good example: for an elderly person who has never used VR, learning to operate a controller involves a fairly steep learning curve, whereas gesture interactions based on macOS and iOS are relatively easy to pick up. This is why all the demos showcased basic functionality (gesture operations, web browsing, videos, and apps) rather than flashy 3D games. The Oculus ecosystem, by contrast, is currently disjointed, with Meta, Unity, and content creators each doing their own thing, which leads to incompatible SDKs and poor documentation.
In terms of weight, it's slightly heavier than the Quest 3, perhaps because I added lenses. But honestly, the Quest 3 isn't much lighter. The Vision Pro also gets warm after prolonged use, making the face cushion feel hot and stuffy, but these are solvable issues. For example, you could lie down to watch movies or work, or swap the headband or cushion.
Price-wise, many people initially thought $3500 was too steep for anyone to buy, and rumors suggested manufacturing yields were low, so stock would be limited. But at the store, resellers were buying in bulk, and staff mentioned a wide age range of buyers, from teenagers to people in their 60s. Several employees had even bought one themselves, saying they were skeptical at first but came away thoroughly impressed. When I pointed out that one Vision Pro costs as much as seven Quest 3s, an employee replied that seven Toyotas can't compare to one Aston Martin. Apple's brand premium is indeed high.
Given all of the above, most criticisms of Vision Pro come down to its price. Looked at the other way, even if Meta spent the same amount and matched the hardware, its software integration might still not deliver the experience Apple does. Apple has carried its experience in phones, computers, operating systems, apps, headphones, and AI over to Vision Pro. I anticipate a surge in XR designer and developer jobs, marking a new era of opportunities.
So the question arises: why, after a decade of Oculus and heavy investment by Meta, has Meta not opted for high-end pricing or made a significant breakthrough in interaction, and instead seems to chase hardware upgrades, while Apple has forged its own path? I believe that even after Facebook rebranded as Meta, its product vision remained unclear and its software and hardware were never effectively integrated. Apple's vision, in contrast, has shifted the discussion from a VR metaverse to a reality-integrated XR system. The question now is: what does the future computer system look like in a 3D space? And how do we build this platform and enable developers to enrich its ecosystem?
今天去 Apple store 體驗了 Apple Vision Pro 的 demo, 可以算是下一個 iPhone moment 了。
記錄下自己的觀點:
總而言之,Apple 在基於超高保真的視覺效果下,把MacOS的系統挪到新的XR平台上。為了達到這個願景,蘋果把所有黑科技堆到頭盔上。為了對應新平台的特性,使用場景,蘋果重做了一整套新的交互方案。或許這就是蘋果破解手機創新困境的解法,一個全新的XR平台,Spatial computing。
Pro 的 Micro-OLED display 絕對是最頂級的了,配上 90 Hz 的刷新率,畫面逼真度幾乎可以欺騙大腦了。看視頻和 3D 電影在感受上確實和以往其他的裝置不同,Apple 似乎在表達只有夠逼真的畫面才能讓 XR 應用合理。估計也只有蘋果會這樣砸錢讓客戶買單。
搭配 R1 chip 的 Passthrough 體驗確實吊打市面上其他家的性能,雖然畫面還是有點雜訊感,但遠勝 Quest3 好幾倍。雖然還是肉眼可感受,但足夠好到可以忽略。毫無疑問未來硬體性能肯定會持續提升,這也是蘋果的護城河,要自己幹出一個專用的晶片還是 Apple 牛逼。
既然視覺上和現實能比擬,展示重點自然就是照片和視頻。Spatial Video 確實讓你深入奇境,只差沒有觸覺和嗅覺的體驗。真的能做到大腦被欺騙的感覺,這點在沒有親身體驗的情況下很難用言語解釋。如同你回到 100 年前和古人解釋 Internet 怎麼運作一樣。其中一個 DEMO 是觀看足球比賽,你坐在球門上方臨場觀看守門員如何防守對方進攻。當球破門而入的時候,旁白表示有些東西錢都買不到(但買頭盔可以)。事後想想,如果 World Cup 坐第一排得花多少錢買門票,如果 $3500 讓你連看多場球賽並坐在最好的位置,似乎也挺划算。這塊在娛樂、影視行業肯定潛力巨大,說不定以後球賽改賣 Vision Pro 的 ticket。這可能也解釋了為何 Apple 當初硬要搞自己的串流平台 Apple TV+。
Apple 替 Vision Pro 搞了一套全新的交互,也算是目前最有系統、適合長時間使用的方案。簡單說 Apple 把鼠標的 scroll 和左右鍵的動作拆分在眼部定位和手勢操作上。因為操作空間是 3D 的,所以傳統設計給 2D 屏幕的交互不適用在此。為了克服在 Z 軸的自由度,Apple 巧妙地用眼睛的焦點來替代鼠標,也就是說用眼睛來投射空間深度,你就不用一直舉手控制遠近(你想要可以,畢竟手有極限)。但因為人眼在盯著看東西的時候會快速來回顫抖,所以當眼睛鎖定目標後,用的是手來確認動作和控制 input 的數值,如點擊、滑動、pinch,未來肯定還會有更多手勢交互會被開發出來。[1]
基於上一點,你可以預想使用者把雙手放在桌上、腿上、椅子上,操作系統。如果需要精確的輸入,鍵盤、鼠標、手繪板、定制化 control 都是未來的想象空間。這和現有的 “伸手” 模式有很大的不同,因為你不可能舉手 8 小時。使用者可以做到不用抬手的情況下操作 3D 系統,甚至解放雙手做其他任務,如在電腦上打字、吃東西、煮飯、健身。甚至你可以同時操作另一套電腦系統, 不需要一定得和 Vision Pro 關聯。
從開發的角度,Apple 算是把重心放在開發底層系統,建立底層交互原則,用系統性的規範來建立新的 XR 生態,提供一個類似 App store 的生態。一個很好的例子是,讓一個沒玩過 VR 的老年人學習如何控制 controller,Learning curve 還是挺陡峭的。然而讓一個老年人控制基於 MacOS 和 iOS 的手勢交互,其實對他們來說就挺容易上手的。這也是為何所有 demo 展示的都是基礎功能展示、手勢操作、瀏覽網頁、影片、App,而不是酷炫的 3D 遊戲。目前 Oculus 生態都還是各做各的形式,Meta、Unity、Content creator 各搞各的,SDK 之間互相不兼容,Documentation 也做的跟坨屎一樣。
重量上確實比 Quest3 重一點,也可能是我加了鏡片。但老實說 Quest3 也沒輕到哪裡去。另外 Vision Pro 待久了機器發熱,靠著臉的 cushion 確實還挺悶熱的,但這些都是好解決的問題。比如,你可以躺著看電影、工作,或者換個頭戴、墊片。
價格方面來說,很多人一開始覺得 $3500 肯定沒人買,另外傳聞工廠製造良率不足,所以庫存應該不夠,必須得等一段時間。但人到現場發現,一堆代購人手 10 個,店員表示購買人群從 60 到十幾歲都有。甚至好幾個店員自己都掏錢買了,他們表示一開始也都是半信半疑,但體驗完都表示非常震撼。我問店員有人覺得一台 Vision Pro 可以買 7 台 Quest3,但他說七台 Toyota 不能和一台 Aston Martin比。看來蘋果的品牌溢價確實高。
基於以上,目前大部分人對 Vision Pro 挑的毛病都是價格。反過來說,如果 Meta 用同樣的錢,砸出一樣的硬件,但軟體整合或許也未必能做到和蘋果一樣的體驗。 蘋果算是把之前做手機、電腦、系統、App、耳機、AI 的經驗都移植到了 Vision Pro 上了。我預計未來會有大量的崗位招聘 XR designer、XR developer,確實是個新時代的機遇。
那麼問題來了,為何做了十年的 Oculus 和投入大量資源的 META 沒有選擇走高端定價的產品,以及交互方案上沒有重大突破,甚至變成了只是在追求硬件的升級。然而蘋果卻走出了自己的道路?我認為即使 Facebook 把公司名稱換成 META 之後,產品的路徑依然沒有清晰的遠景,並且在軟體及硬體上缺乏有效的整合。而在 Apple 的遠景中,現在的討論已經從 VR 的 Metaverse 變成和現實結合的 XR 系統。即,在一個 3D 空間中,未來的電腦系統是什麼樣子?如何打造這個平台,並讓開發者豐富這個生態。