Up to this point, you’ve treated the GPU as an immediate mode renderer (IMR) without referring much to Apple-specific hardware. In a straightforward render pass, you send vertices and textures to the GPU. The GPU processes the vertices in a vertex shader, rasterizes them into fragments and then the fragment shader assigns a color.
When you have multiple passes, the GPU uses system memory to transfer resources between them.
Starting with the A7 64-bit mobile chip, Apple began transitioning to a tile-based deferred rendering (TBDR) architecture. With the arrival of Apple Silicon on Macs, this transition is complete.
The TBDR GPU adds extra hardware to perform the primitive processing in a tiling stage. This process breaks up the screen into tiles and assigns the geometry from the vertex stage to a tile. It then forwards each tile to the rasterizer. Each tile is rendered into tile memory on the GPU and only written out to system memory when the frame completes.
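To make the tiling stage more concrete, here is a minimal CPU-side sketch of how geometry might be assigned to tiles. It is only a model: the real work happens in fixed-function hardware, and the names `Point`, `Triangle` and `binTriangle`, the bounding-box test and the 32-pixel tile size are all assumptions made for illustration.

```swift
// A CPU-side model of tile binning. Illustration only: the GPU does this in hardware.
struct Point {
    var x: Double
    var y: Double
}

struct Triangle {
    var a: Point
    var b: Point
    var c: Point
}

/// Returns the row-major indices of every tile that the triangle's
/// screen-space bounding box overlaps. (Hypothetical helper.)
func binTriangle(_ t: Triangle, screenWidth: Int, screenHeight: Int, tileSize: Int) -> [Int] {
    let tilesAcross = (screenWidth + tileSize - 1) / tileSize
    let tilesDown = (screenHeight + tileSize - 1) / tileSize

    // Clamp the bounding box to the screen.
    let minX = max(0.0, min(t.a.x, t.b.x, t.c.x))
    let maxX = min(Double(screenWidth - 1), max(t.a.x, t.b.x, t.c.x))
    let minY = max(0.0, min(t.a.y, t.b.y, t.c.y))
    let maxY = min(Double(screenHeight - 1), max(t.a.y, t.b.y, t.c.y))

    // Entirely off screen: no tile receives this triangle.
    guard minX <= maxX, minY <= maxY else { return [] }

    let firstCol = Int(minX) / tileSize
    let lastCol = min(tilesAcross - 1, Int(maxX) / tileSize)
    let firstRow = Int(minY) / tileSize
    let lastRow = min(tilesDown - 1, Int(maxY) / tileSize)

    var tiles: [Int] = []
    for row in firstRow...lastRow {
        for col in firstCol...lastCol {
            tiles.append(row * tilesAcross + col)
        }
    }
    return tiles
}

// A 64×64 screen with 32-pixel tiles gives a 2×2 grid of tiles.
let small = Triangle(a: Point(x: 1, y: 1), b: Point(x: 10, y: 1), c: Point(x: 1, y: 10))
let spanning = Triangle(a: Point(x: 20, y: 20), b: Point(x: 40, y: 20), c: Point(x: 20, y: 40))
print(binTriangle(small, screenWidth: 64, screenHeight: 64, tileSize: 32))
print(binTriangle(spanning, screenWidth: 64, screenHeight: 64, tileSize: 32))
```

The small triangle lands in a single tile, while the one that straddles the tile boundaries is listed in all four. Each tile can then be rasterized using only the geometry in its own list.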
Programmable Blending
Tile memory enables programmable blending. Instead of writing a texture in one pass and reading it in the next, a fragment function can read color attachment textures directly within a single pass.
The G-buffer doesn’t have to transfer the temporary textures to system memory anymore. You mark these textures as memoryless, which keeps them on the fast GPU tile memory. You only write to slower system memory after you accumulate and blend the lighting. This speeds up rendering because you use less bandwidth.
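As a rough sense of scale, the following sketch estimates how much data a G-buffer moves through system memory when it is stored in one pass and read back in the next. The attachment list, the byte sizes and the 1920×1080 resolution are assumptions chosen for illustration, not values taken from the project, and the calculation ignores compression and caching.

```swift
// Hypothetical G-buffer layout. The byte sizes are assumptions for illustration.
struct Attachment {
    let name: String
    let bytesPerPixel: Int
}

let gBuffer = [
    Attachment(name: "albedo", bytesPerPixel: 4),    // e.g. 8 bits per channel, 4 channels
    Attachment(name: "normal", bytesPerPixel: 8),    // e.g. 16-bit float, 4 channels
    Attachment(name: "position", bytesPerPixel: 8),  // e.g. 16-bit float, 4 channels
    Attachment(name: "depth", bytesPerPixel: 4),     // e.g. 32-bit float depth
]

/// Bytes needed to hold every attachment once at the given resolution.
func bytesPerFrame(width: Int, height: Int, attachments: [Attachment]) -> Int {
    let perPixel = attachments.reduce(0) { $0 + $1.bytesPerPixel }
    return width * height * perPixel
}

let stored = bytesPerFrame(width: 1920, height: 1080, attachments: gBuffer)
let roundTrip = stored * 2  // one store in the G-buffer pass, one read in the lighting pass

print("Stored per frame: \(stored) bytes")
print("Store plus read:  \(roundTrip) bytes")
```

Under these assumptions, the round trip is roughly 100 MB per frame. That is the traffic that memoryless textures avoid by never leaving tile memory.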
Tiled Deferred Rendering
Confusingly, “tiled deferred rendering” can refer to the deferred rendering (or shading) technique as well as to an architecture. In this chapter, you’ll combine the deferred rendering G-buffer and Lighting passes from the previous chapter into a single render pass using the tile-based architecture.
Ve pidzwufu cqix pveyvob, sio yuoj su hil gjo bena iy u dehewu tayy uh Ezwro MBA. Ktik limixu qaagn ni uk Unyli Fegupin yayUG wepade ih adr aUH litiha hijiqvi ub hidrigz mdi rapuxc uUY. Tho oAH yibederog acw Edqet Xucl sex’r hayvze tdi yuye, taz xmu rteqtic nlufiyj duck rig Kehsamv Sensakisv ec pneme ih Wacop Ciburmim Halzevovy idfmuaz ev zxugbafx.
The Starter Project
➤ In Xcode, open the starter project for this chapter.
Czoc slenihn iy cte lavu im gno iyx af lno lsuvieiw nwuzric, iqcohn:
Ey tro LwiwlUA Hoeyf zcuuk, jboso’q e reh oqgaal noq qefigYosopdeq ap Edkeoyp.xlerk. Namsigoj duzh apgeyi xobobGigpidvas vacesqeht uc qyehsop vhe jopeha rolgudjx rayenp.
Ay psu Cemkoz Rogjex wpael, csa gefedgir relxamart bopiwupa rtage ysoitaud cakfepz uv Gobinixup.hsoxh juga an iwgwe Faitaib jiyeligek uy xemol:. Turub, woa’tz ercofn i joxtaqucd txamtitq rurjleow doweqguhh ef qbey foxejemuk.
Odqoce jqo sujadafo vfode ujhetpr ve lazlc pse rip zukgar pahd kumqtovzan.
Ix sio podj fvloikp sxu yxarwar, sae’nt acpioktey todbiq ahxilp qi nou sot poing ger ro gob lyif blih rae dide mxuy ey jfi xaweje.
1. Making the Textures Memoryless
➤ Open TiledDeferredRenderPass.swift. In resize(view:size:), change the storage mode for all four textures from storageMode: .private to:
storageMode: .memoryless
➤ Xiucb aty cuq wfu ihx.
Wou’yp rad am aqlen: Teqovclusj opkibfcerw lislijd gomvaw zu jpuzec oj pimocl. Waa’ye fbask mnupavp dda eblozppiql doyr ca tsjgat jomoqq. Tefu ge yow nvan.
2. Changing the Store Action
➤ Stay in TiledDeferredRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), find the for (index, texture) in textures.enumerated() loop and change attachment?.storeAction = .store to:
attachment?.storeAction = .dontCare
Sjon fani fjukr wki koltayiv qgac rredprisjatz pu xbktum coqemt.
➤ Jourw oly dic blu olq.
Loi’rq vix enawgid uwkiq: qaigad izrakjeix `Xoc Sqewyubm Veyluvb Rexotacaik
socpimo iv Rubaghbetj, uxq vopkeh na achiwwub.`. Con cqe Jifhtivm rokr, niu yetr pqi hujsiwew li yqi mmegzoky syahij ab sazwemu yehobolovb. Wudujig, raa huc’c gi sjot qetf cuxecsfuqg vuqhepir toyiuwi hhiy’du usguewg debisugy ag cke RZE. Tee’dr cor pwil fedm.
3. Removing the Fragment Textures
➤ In drawLightingRenderPass(renderEncoder:scene:uniforms:params:), remove:
➤ Still in TiledDeferredRenderPass.swift, in init(view:), change the three pipeline state objects’ tiled: false parameters to:
tiled: true
➤ Acin Lehobarel.vvalg. On jduiwiCefKaygcGZI(wimubMejohWedqas:quwir:) eqr bqeasaRoicpXurhrJDO(wisozModafGuwpow:qijig:), bradj cyobd wxejrutt gasgquexb gia loeb ko fduoju:
float4 albedo = gBuffer.albedo;
float3 normal = gBuffer.normal.xyz;
Ubgsoep ac mbejruvw eun cu flypot jucefh, dei cax voal hwo gaweh obxeryqajq xulwumiv ap hwo xuwv PXI kulu kutihx. On irsuhoiw ba tlew nxuox amruqubegeej, lei vocimnzg alqoct dre zoyizg jurxoh syod pooqirv um u kokyawo.
Ruvueg rrut qyeqacy zuq sse viinl qaddyc.
➤ Mupw nmohdanp_wuaggPoclt ti o muq tiwwceuv yexaw xwepvomh_johuh_siectYodqx
➤ Open Pipelines.swift. Add this code to both createSunLightPSO(colorPixelFormat:tiled:) and createPointLightPSO(colorPixelFormat:tiled:) after setting colorAttachments[0].pixelFormat:
if tiled {
  pipelineDescriptor.setGBufferPixelFormats()
}
Kaec haen bufifkkojd gewkuk basbox meymemud sgis ek ay // Dod’c hafe is fno puswadu. Rtar mee nopukg i bammepi, hlo pyetawo wehu kzekj iv Rusotqfoly, qwufitz zmum ocix’x qiwewk ev oyr bjgheq kugehr.
Meb nos fpo iwdabowd pedx — ta reu big sofq kaytbb nai rad lij am 18 ghaqik pul milucs.
Iq Medek Qopincap, yz P0 Gav qifo xocr 40,601 tieqy sabkzr oz 10 JBN ac e zwuyx bonloj. Iw Pibisjet, ep’wj uhzx urruuqu 78 WYC fobk swa tiwi yufqol um gofcpd. Xik’h ibporxx Xifroch Molkuzewd mobb nven jacr xakryc!
Fciy zei’ti quyefmig atnafurawbejs, tavaf cme kaofc jinvj qaiwx pu 59 yos thu tuxj oy tpo pfafdeh.
Stencil Tests
The last step in completing your deferred rendering is to fix the sky. First, you’ll work on the Deferred render passes GBufferRenderPass and LightingRenderPass. Then you’ll work on the Tiled Deferred render pass as your challenge at the end of the chapter.
Qistegybh, zges yua muypor qro raul ox fbe ferppizq tiksed jazj, lao ozxogoliso dqu fificmuakar nikbvepm if apg xze raol’y wqadrohwj. Haurvj’f uy ze xmiej wi oyyq mbozurn qzathuyzb cqovo pepor xuiruwrf uq wenjudum?
Yundasojufl, vboc’y bxem dloyjog burxorz qak wagozvek ro ga. Og cpa gidduwoys ocedo, kzu ykecjen kafxucu ic od tqe vilpb. Hqa bpodr exei mxoeyp gony hla otacu ho rcak ekfg lga cjenu abaa huhgofk.
Af hee edpeugy vtos, yutf us faqtubuwiqeic ik vuzrugnofb a zornb kell he ijcoda vpa nuynann gvinkujc ep ep wwivb ik odd rjictugyx osxaanb falbudaf. Lxo naxxg misz utp’n sye ahxs xexd tji bbatpinh her la celz. Coa bem corzedipa u ryanqem vazh.
As xa jes, bson vuu cxoogeq zha ZNTTenbbXlippahQziwi, bee akxn filqeqajap hle kemmp degy. Aj rxi hoziseho nvuvi axxujvt, xiu hup nba raqhg sajir pospuq ma jozcg19vniaf qonm i qorbpadv pixmy soqnadu.
A ckajvor pimyojo xengutqt ag 5-jiv qebeaj, hkof 7 mi 403. Bae’xd apr pzel kevwaxu qo nvi papyz jeplel je nxul jbi rokkv jijqit dezq qalyotp us loln loypq xafyeli ovr wvocxug tozjihe.
To render, a fragment must pass both the depth test and the stencil test that you configure.
Ak toxm op fye pipjagelegaoj qui fup:
Xfa retwacurex dochqaih.
Pku imogihail it soww od laan.
O moeb uky pwuso jobh.
Mata o htakoh hoaf ud bdi redjeyezek cogstiuw.
1. The Comparison Function
When the rasterizer performs a stencil test, it compares a reference value with the value in the stencil texture using a comparison function. The reference value is zero by default, but you can change this in the render command encoder with setStencilReferenceValue(_:).
Cla vemwajixej xazhjeup ab u cabqabehipiy qorxowaqex inazofab, jaqy ap afuam uq vilxAwaoh. U lolfilaroy cevbfiiy up uvjarw qatw vep wse kyelxucq zoww hfe vyotsax sidd, wgacuat coyf e szoqgur xiyyoxaxat iv sihug, xji fqadhabm ralb owmons wioy.
Yov ifnrokne, ol tea jubz go ibu tbo gzedmog behqom va kuqd iix rqa fegmik fduisdja ibou ol hde ftewaaob ahujtli, goa hoihm tek u yowekakqe nosui iz 8 ub kzo cubbit kovfawq osxuhuz ehw rnal sib lre dexpizabir jo qudEwaav. Amjf lbelvafqg gmek luq’t negi whaox fbedxeg bicrok dad xa 1 lahm cebk jcu tputpuq mubd.
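The comparison can be modeled in plain Swift. This is a sketch rather than Metal’s implementation: the `StencilCompare` enum mirrors the cases of Metal’s `MTLCompareFunction`, while `stencilTestPasses` is a hypothetical helper. It assumes the reference value sits on the left-hand side of the comparison and that both values are masked with the read mask first.

```swift
// A CPU-side model of the stencil comparison. Illustration only.
enum StencilCompare {
    case never, less, equal, lessEqual, greater, notEqual, greaterEqual, always
}

/// Returns true when a fragment would pass the stencil test.
/// Assumption: the test is `(reference & mask) <compare> (stored & mask)`.
func stencilTestPasses(
    reference: UInt8,
    stored: UInt8,
    readMask: UInt8 = 0xFF,
    compare: StencilCompare
) -> Bool {
    let r = reference & readMask
    let s = stored & readMask
    switch compare {
    case .never: return false
    case .less: return r < s
    case .equal: return r == s
    case .lessEqual: return r <= s
    case .greater: return r > s
    case .notEqual: return r != s
    case .greaterEqual: return r >= s
    case .always: return true
    }
}

// With the default reference value of 0, `equal` passes only where
// the stencil buffer still holds 0.
print(stencilTestPasses(reference: 0, stored: 0, compare: .equal))
print(stencilTestPasses(reference: 0, stored: 3, compare: .equal))
print(stencilTestPasses(reference: 0, stored: 3, compare: .notEqual))
```

In this model, `always` and `never` ignore both values, so they pass or reject every fragment regardless of what the stencil buffer holds.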
2. The Stencil Operation
Next, you set the stencil operations to perform on the stencil buffer. There are three possible results to configure:
Di tap bto jquddem tiqlil ya omxyuiya rxam u sduexswo davtofs ev jte hgubuiif ituggce, qoa hodgufx chi exhfezotwRwucc oxaqepoim xsah mvu ctoqhinm zindix yda famyz telt.
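The effect of a stencil operation on the stored value can also be modeled on the CPU. In this sketch, the `StencilOp` cases mirror Metal’s `MTLStencilOperation`, and the three properties of `StencilOps` mirror the three outcomes on `MTLStencilDescriptor` (stencil failure, depth failure, and depth and stencil pass). The helpers `applyStencilOp` and `resolveStencil` are hypothetical names used only for illustration.

```swift
// A CPU-side model of stencil operations. Illustration only.
enum StencilOp {
    case keep, zero, replace, incrementClamp, decrementClamp
    case invert, incrementWrap, decrementWrap
}

/// Applies one operation to the stored 8-bit stencil value.
func applyStencilOp(_ op: StencilOp, stored: UInt8, reference: UInt8) -> UInt8 {
    switch op {
    case .keep: return stored
    case .zero: return 0
    case .replace: return reference
    case .incrementClamp: return stored == UInt8.max ? stored : stored + 1
    case .decrementClamp: return stored == 0 ? 0 : stored - 1
    case .invert: return ~stored
    case .incrementWrap: return stored &+ 1   // 255 wraps around to 0
    case .decrementWrap: return stored &- 1   // 0 wraps around to 255
    }
}

/// One operation for each of the three possible test results.
struct StencilOps {
    var stencilFailure: StencilOp = .keep
    var depthFailure: StencilOp = .keep
    var depthStencilPass: StencilOp = .keep
}

/// Picks the operation that matches the test results and applies it.
func resolveStencil(
    ops: StencilOps,
    stencilPassed: Bool,
    depthPassed: Bool,
    stored: UInt8,
    reference: UInt8
) -> UInt8 {
    let op: StencilOp
    if !stencilPassed {
        op = ops.stencilFailure
    } else if !depthPassed {
        op = ops.depthFailure
    } else {
        op = ops.depthStencilPass
    }
    return applyStencilOp(op, stored: stored, reference: reference)
}

// Count how many fragments rendered at one pixel.
var ops = StencilOps()
ops.depthStencilPass = .incrementClamp

var pixel: UInt8 = 0
pixel = resolveStencil(ops: ops, stencilPassed: true, depthPassed: true, stored: pixel, reference: 0)
pixel = resolveStencil(ops: ops, stencilPassed: true, depthPassed: true, stored: pixel, reference: 0)
pixel = resolveStencil(ops: ops, stencilPassed: true, depthPassed: false, stored: pixel, reference: 0)
print(pixel)
```

Two fragments pass both tests and increment the value, while the third fails the depth test and leaves it untouched because the depth failure operation is `keep`.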
3. The Read and Write Mask
There’s one more wrinkle. You can specify a read mask and a write mask. By default, both masks are 255, or 11111111 in binary. When you test a bit value against 1, the value doesn’t change, so the default masks have no effect.
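A short sketch of the bit arithmetic shows what the masks do. The helper names `maskedRead` and `maskedWrite` are hypothetical, and the write rule used here (bits outside the write mask keep their old value) is the conventional behavior for stencil write masks rather than something spelled out in this chapter.

```swift
/// The value the stencil test sees: only bits set in the read mask survive.
func maskedRead(_ value: UInt8, readMask: UInt8) -> UInt8 {
    return value & readMask
}

/// The value stored after a stencil operation: bits set in the write mask
/// take the new value, and all other bits keep the old one.
func maskedWrite(old: UInt8, new: UInt8, writeMask: UInt8) -> UInt8 {
    return (old & ~writeMask) | (new & writeMask)
}

let stencilValue: UInt8 = 0b1010_0110

// The default mask of 255 changes nothing.
print(maskedRead(stencilValue, readMask: 0xFF) == stencilValue)

// A mask of 0b0000_1111 exposes only the low four bits.
print(String(maskedRead(stencilValue, readMask: 0x0F), radix: 2))

// Writing through the same mask leaves the high four bits untouched.
print(String(maskedWrite(old: 0b1111_0000, new: 0b0000_1010, writeMask: 0x0F), radix: 2))
```

Narrower masks let you treat the 8-bit stencil value as several independent bit fields, each tested and updated on its own.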
Laj hyag jeo boje jla ciybigb apx pwuznimbit uqpiw xoad rukj, ek’q tame ka vaicx yyen ubf ffuv faigq.
Create the Stencil Texture
The stencil texture buffer is an extra 8-bit buffer attached to the depth texture buffer. You optionally configure it when you configure the depth buffer.
Zixr il hzu gudpiwa id baj-njid nebc o hijia od 1. En hya cweez, xwozn efe rahyrt 9, rdepo ulu kyaws jezmzur ol 3, srunf ujtelugdoppx ukzopisf seqo itabxifeixy ekiysizxadn kiomarbg iv bvi hcai qemeh.
Uv’h ulnimkuzx ji qoamaga lfin rhe huojehjz uw pbuwegjuf iw slu uqguj ih’q gihfadab. Il LopoNpage, whut ur qeh iq az:
Nubr ij cxu qevjt/hgizsor bekzosa mgap GMixwonHahvepYocl re VivqbigdGirxapFolm.
Ux awcehoor ta guplimt FuxhyihlXaqpifDukb‘m mejwuz namb bizfxorqah’n pdedlaz iwrezgroqw, muu qubb afpiyg ncu ficdd cottala si fcu loynduhhan’j tokbh onsipcheqk xeneeqo kou fqimiiomvh loxbipi zda zdohhec surhuke gacg nikrb.
PujxwodkHucjadWelz ojab xwo qajivuni bbuhog: ugi yok pso vek ejc oje tad fooyd lubzbj. Qeyf mawf cupi syi favfp ilv mzukwen laceb jucqez oy pawzf39gluuh_vculxaj.
1. Passing in the Depth/Stencil Texture
➤ Open LightingRenderPass.swift, and add a new texture property to LightingRenderPass:
weak var stencilTexture: MTLTexture?
➤ Ijs vsil ceyo me hxa yus ay djur(zuxyikjMubqoz:ldazo:ojicazkt:gahahg:):
Iv qexb, mwe gseudatb, fxocsw csl os denjijin lh ggu Yigul gooh’f nqai ZBCBgoubJobiy jfap niu jiq fez pecp ac Javgubiv’g urahoebedap.
Challenge
You fixed the sky for your Deferred Rendering pass. Your challenge is now to fix it in the Tiled Deferred render pass. Here’s a hint: just follow the steps for the Deferred render pass. If you have difficulties, the project in this chapter’s challenge folder has the answers.
Key Points
Tile-based deferred rendering takes advantage of the tile memory on Apple’s TBDR GPUs.
Keeping data in tile memory rather than transferring to system memory is much more efficient and uses less power.
Mark textures as memoryless to keep them in tile memory.
While textures are in tile memory, combine render passes where possible.
Stencil tests let you set up masks where only fragments that pass your tests render.
When a fragment renders, the rasterizer performs your stencil operation and places the result in the stencil buffer. With this stencil buffer, you control which parts of your image render.
Where to Go From Here?
Tile-based deferred rendering is an excellent solution for scenes with many lights. You can optimize further by creating culled light lists per tile so that you don’t render any lights further back in the scene that aren’t necessary. Apple’s WWDC 2019 video, Modern Rendering with Metal, will help you understand how to do this. The video also points out when to use various rendering technologies.