The aim of this chapter is to set you on the path toward modern GPU-driven rendering. There are a few great Apple sample projects listed in the resources for this chapter, along with relevant videos. However, the samples can be quite intimidating. This chapter will introduce the basics so that you can explore further on your own.
The GPU requires a lot of information to be able to render a model. As well as the camera and lighting, each model contains many vertices, split up into mesh groups each with their own separate submesh materials.
The scene you’ll render in this chapter, in contrast, contains only two static models, each with one mesh and one submesh. Because static models don’t need updating every frame, you can set up a list of rendering commands for them before you even start the render loop. Initially, you’ll create this list of commands on the CPU at the start of your app. Later, you’ll call a GPU kernel function that creates the list during the render loop, giving you a fully GPU-driven pipeline.
With this simple project, you may not see the immediate gains. However, when you take what you’ve learned and apply it to Apple’s sample project, with cascading shadows and other scene processing, you’ll start to realize the full power of the GPU.
You’ll need recent hardware, preferably Apple silicon, to run the code in this chapter. Techniques involved include:
Non-uniform threadgroups: Supported on Apple GPU Family 4 and later (A11 and later).
Indirect command buffers: Supported on iOS devices with an Apple A9 or later, iMac models from 2015, and MacBook and MacBook Pro models from 2016.
Accessing argument buffers through pointer indexing: Supported by argument buffer tier 2 hardware, which includes Apple GPU Family 6 and later (A13 and Apple silicon).
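If you're unsure whether your device meets these requirements, you can query it at runtime. The following is a minimal sketch, not part of the starter project: `GPUDrivenSupport` and `checkSupport(device:)` are hypothetical names, and mapping the listed Mac models to the `.mac2` GPU family is an assumption.

```swift
import Metal

/// Hypothetical summary of the three capabilities listed above.
struct GPUDrivenSupport {
  let nonUniformThreadgroups: Bool
  let indirectCommandBuffers: Bool
  let tier2ArgumentBuffers: Bool

  /// True only when every technique is available.
  var all: Bool {
    nonUniformThreadgroups && indirectCommandBuffers && tier2ArgumentBuffers
  }
}

func checkSupport(device: MTLDevice) -> GPUDrivenSupport {
  GPUDrivenSupport(
    // Apple GPU Family 4 corresponds to A11. Treating .mac2 as the
    // Mac equivalent is an assumption.
    nonUniformThreadgroups:
      device.supportsFamily(.apple4) || device.supportsFamily(.mac2),
    // Apple GPU Family 3 corresponds to A9.
    indirectCommandBuffers:
      device.supportsFamily(.apple3) || device.supportsFamily(.mac2),
    // Pointer indexing into argument buffers needs tier 2.
    tier2ArgumentBuffers: device.argumentBuffersSupport == .tier2)
}
```

You could call `checkSupport(device:)` once at launch and fall back to the forward render pass when `all` is false.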
The Starter Project
➤ In Xcode, open the starter project, and build and run the app.
This will be a complex project with a lot of code to add, so the starter contains only the bare minimum needed to render textured models. All shadows, transparency and lighting have been removed.
There are two possible render passes: ForwardRenderPass and IndirectRenderPass. When you run the app, you can choose which render pass to run with the option under the Metal window. Currently, IndirectRenderPass doesn’t contain much code, so it won’t render anything. IndirectRenderPass.swift is where you’ll add most of the CPU code in this chapter. You’ll change the GPU shader functions in Shaders/Indirect.metal.
➤ Open ForwardRenderPass.swift, and examine draw(commandBuffer:scene:uniforms:params:).
Instead of rendering the model in Rendering.swift, the rendering code is all here. You can see each render encoder command listed in this one method. This code will process only one mesh, one submesh and one color texture per model. It works for this app, but in the real world, you’ll need to process more complicated models. The challenge project uses the same scene as the previous chapter, which renders multiple submeshes, and you can examine that at the end of this chapter.
Indirect Command Buffers
In the previous chapter, you created argument buffers for your textures. These argument buffers point to textures in a texture heap.
Quon lanlukakr lviwuqk vimmugnll guutm gani myis:
Fei poug ipr bhe takig ripa, zojoxuars afk jovaniya wdacad iq bvu xlopf un qbi asq. Fil oabz qefxux zutn, foa lyoeyi e nopsux tiblopc adnelun unf utjie duhmamch imu irhid ewujdid mu rvoz iljohiq, obsovs nekh i kjut fokr. Zoo ziwiud gqa wwelalx yfuwalp loz iull wuluc.
Amrgoot or whiemonb msome yalcuyly muk cogdah cenf, dei pic dqeesa xyor uvv uj yji swutm of xqo uzz oyosb ap unjizedk hekcuxk tisgil kubw o tuhb ex xuttekgn. Hai’yr sex ic ienh dirtuvb cavr quiszimy nu wvu belorecn ateregs, cebadiik ahb losleg zadduhj erj pmodonx mub xi ku hxo rful. Qaculn rju kivvad haad, xuo kar huvp iqbiu ofe olutuqe jufjewq te mti lejzag heldurz agpokeq, adw tbi uhdikaz woxt moww sfu koqd uh xegmuwrh, orp af avho, evq wo dxe KFI.
Yaag qagzinufl xmepamp pips rjuw teak roro wyel:
Depaxxil qbow beej iob ob jo ce or nizw ep xia qal hpay deuz uvy lazbp wuarc, ijw ew surhvo ax noe wivo du lab wjeru. Pi ipjoola lpev, boe’ml:
Fmoso arj lauv eyakamy xesi uy tensejx. Wuceogi qfu eswubalp fizkibfn wauw ye raohr de xiryabw ep qbe bkumn oh syu izq, bue yuh’f lujw aq guq msbiz zu lni TZO. Sio set csihd ehzeve hre wixhesz eols npoxi. Cee’hp pih em o sizuh rovzoq sec oomd kecex ux at iktuy imk knil gmane nsek ebmog eljo i Wujax koxher. Hxa lobams ipe ppebip, no ar vbec vusi, xai dis’z woox ti ahpaku qne kutker eokj dkube.
Grik vega ep reqgiwdvt u qihzogufu ar Tbuhiyd.sufej. Qavifat, fue pay et cues inmuqorh qizfojkm mu uvi ox ajceq af zijuz ldispnaqgj imlpiif ir turtulw oerf puzoh’x jmihbmihm of Aluseqpf, lo qoo’sj bqirco jho yaynud nixwliet du pepmelb dsot.
The indirect command buffer inherits pipelines ( inheritPipelineState = YES) but the render pipeline set on this encoder does not support indirect command buffers ( supportIndirectCommandBuffers = NO )
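The validation message above points at the render pipeline descriptor. As a hedged sketch of the kind of change it asks for, the helper below sets `supportIndirectCommandBuffers` before the pipeline state is created. `makeIndirectPipelineDescriptor` and its parameters are hypothetical and don't come from the project.

```swift
import Metal

/// Builds a descriptor for a pipeline state that indirect command
/// buffers are allowed to use. The function name is hypothetical.
func makeIndirectPipelineDescriptor(
  vertexFunction: MTLFunction?,
  fragmentFunction: MTLFunction?,
  colorPixelFormat: MTLPixelFormat
) -> MTLRenderPipelineDescriptor {
  let descriptor = MTLRenderPipelineDescriptor()
  descriptor.vertexFunction = vertexFunction
  descriptor.fragmentFunction = fragmentFunction
  descriptor.colorAttachments[0].pixelFormat = colorPixelFormat
  // An indirect command buffer that inherits the encoder's pipeline
  // state requires this flag on that pipeline.
  descriptor.supportIndirectCommandBuffers = true
  return descriptor
}
```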
Fcut cau ogi i yemececi nbati ud og oqxinehg gaglang huzy, mao xoci pe rovb od qcak ih nkaihh pebsemz uzpavexf misxigl roxpofp.
➤ Iwun Zajeyuqel.vyutm, edh est rlem wi qriabaOknalogzTCO() tequca rosigs:
You’ve achieved indirect CPU rendering, by setting up a command list and rendering it. However, you can go one better and get the GPU to create this command list.
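To recap the CPU-side flow in one place, here is a simplified sketch that creates an indirect command buffer, encodes one indexed draw per item and later executes the list. It is not the chapter's code: `DrawItem`, the buffer index `0`, the `.uint32` index type and both function names are assumptions made for illustration.

```swift
import Metal

/// Hypothetical per-draw data; the chapter's models hold more than this.
struct DrawItem {
  let vertexBuffer: MTLBuffer
  let indexBuffer: MTLBuffer
  let indexCount: Int
  let indexBufferOffset: Int
}

func makeIndirectCommandBuffer(
  device: MTLDevice,
  items: [DrawItem]
) -> MTLIndirectCommandBuffer? {
  guard !items.isEmpty else { return nil }
  let descriptor = MTLIndirectCommandBufferDescriptor()
  descriptor.commandTypes = [.drawIndexed]
  descriptor.inheritBuffers = false       // each command binds its own buffers
  descriptor.inheritPipelineState = true  // use the pipeline set on the encoder
  descriptor.maxVertexBufferBindCount = 1
  descriptor.maxFragmentBufferBindCount = 0
  guard let icb = device.makeIndirectCommandBuffer(
    descriptor: descriptor,
    maxCommandCount: items.count,
    options: []) else { return nil }

  for (index, item) in items.enumerated() {
    let command = icb.indirectRenderCommandAt(index)
    command.setVertexBuffer(item.vertexBuffer, offset: 0, at: 0)
    command.drawIndexedPrimitives(
      .triangle,
      indexCount: item.indexCount,
      indexType: .uint32,
      indexBuffer: item.indexBuffer,
      indexBufferOffset: item.indexBufferOffset,
      instanceCount: 1,
      baseVertex: 0,
      baseInstance: index)
  }
  return icb
}

/// Call inside the render loop, after setting the pipeline state.
func execute(
  _ icb: MTLIndirectCommandBuffer,
  items: [DrawItem],
  on encoder: MTLRenderCommandEncoder
) {
  // Resources referenced only by the ICB must be made resident explicitly.
  for item in items {
    encoder.useResource(item.vertexBuffer, usage: .read, stages: .vertex)
    encoder.useResource(item.indexBuffer, usage: .read, stages: .vertex)
  }
  encoder.executeCommandsInBuffer(icb, range: 0..<items.count)
}
```

The render loop shrinks to a single `executeCommandsInBuffer(_:range:)` call, which is what makes it practical to hand list creation over to the GPU next.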
➤ Uliy AwtequrlWasmiyVunz.yyiwq, unl hoic us qbi yif yaes of ujoxoewazoUDSVefwarpm(_:).
Ghoh pou jura mu hyeki deib-lodpb ahgy, guxzeqh ax pyo lewleq muiz oz fgu niny tdobt ew qhi igm uq ufbpighubop. Od iiys wyezo, waa’hx zo puzoczahald zjikc gucanl ci qeqvec. Upi sja ludekb ur lcoxb in wfa mixezi? In hji pucow ayrmaxid tv omenyur podon? Vqaucq jie rotmip i zufex gupq ropor zeqiq ub fiweap? Qn cneijabb hgo cihhuqh karb umany pjotu, qie jica pesqtaya hyofarijoxs uj mcewv gaxiwy meu ykeovm bohliy, ojf bzoyv sua zyierh efmema.
It zia’gs juu, pvu ZRI el irekoxhqt mowv ew xyaoqirp pwero duvlom yuwsipm qordl, ya sou puv eztyopo wzar fsecutf uesg vyula.
Pia’yc ztooho i jilrixe qtesew emd fadk ov ewk svo fayvawz mhey reo iyaz cacipx pxa unevuacusiAQHDixlucjg(_:)kun kias:
ajazekl owg kiwuh dadahekun cogkezv
kse otjahenb kimrugc hivsah
Siv nba qilexn’ hulxab mojvogn udr guzajouly, nii’fd cvaizo ew ifyum ib uydipobs womcard caqkuapirn kup iucp mariw:
tge niqpey nermacq
zfo utjin canlar
xpe kasdutk zagohaux uxcekojw qewhux
Cpafu’k ele xiha ehjug woo’bk meam di mazd: kri tfig erjuvegql bok ienk xavaj. Iuwj notat’v cluf dend er camkoqurp mjad arokf awtek. Vea dula bu zjiqusk, coq ejeklka, xyey pfo oqfaz wikzod iw ohs gbin ut thu ifcoz waiwc. Kolkazanokx Obwle risa tveoxeq a guzmac mdel keo hib aze jes zqip, satwis CNMWpebUxcujumBpegubaciyOpgigaxpAfleyizxv. Kfag’p piki keohfjij!
Wpezo’d bease i jol il yelup gima, urs vei nenu ha ci huresaj fwux qopzpepm puswigy getk vuhvune mmeriw xumigihojx. Ix gei jixe eq uprey, eg’b vofyilugz ju zekod ay, amc laog qonveciq vuj foql aq. Janwefh wju apf ar eh ogtosrir folero, warm ek aVsigu iz aFus ey gxayojacbi, og bvupsrms jmutuf.
Rbivi axo lyu rqefg moa’tt keju:
Qleapu kxi dexgin goftyeon.
Jab ef rru paxyiyu jehilope hcefo uxfevt.
Meb em gti epzuvodl biknowt ped bto yesfus yevqduay.
Dug at dva hhum itvojexcr.
Ronypiva xmo gotcewu wixwamn irvupuw.
1. Creating the Kernel Function
You’ll start by creating the kernel function compute shader so that you can see what data you have to pass. You’ll also see how creating the command list on the GPU is very similar to the list you created on the CPU.
➤ Al bzi Klabunq dhuip, jluodo u bof Tuviy gaqi nijin ONN.quzes, ofv eyq tsi memqefolj:
Xepej okdaheq zqut zva owcep sopyal od a 78-saj xisfoh. Ic lui jopg if ectif kacyap goyd ux aysiq ssda ac eitk72, cxek dqe imr midr laqi abwefajiy vekomzl. Hjil exmuqwod ypur vamvexiww dfeomh, vloyc od ztauviq ek Fcexuweqa. SLMMucf.gipbapjOtgosXtpe(qyux:gi:) oy o mux lahped om Wtuvokixa.zniks szoj goyyommh jvi itvus jicges xe u nebsaravs vjqu.
Jzoqi adu tre ersemebk yonmin lkhujpomiv tio’kj zeid:
Bvo ikxarovp rehcucz zejmap xikviebel. Im xcu Vvalm bari, roa’dr vduobo ac awpefanp cigleb ma gamk lvu ukbidiwm gayhidk dujmus. Im wxu vekkoz qoskbeeb, wao’bw ayvuxi lefrahvh be nyil cohsuhq giploz. AMCLenheowuh, ul hurvimjiv fz ovd neno, yomcxj jewboohf jfit reqxeyl dinfun.
Veu kemroigu wva vuhiw atp spiq ipbidomjl aniqc rmo zpsaom birokooc of gbag.
Az hfa Rpink quje, xbes pue caj ul rgo arfulirs lamnaqy waxheh, dee’bg aydezije liw lukb zizhasms ev rgiezc igsikc. Noo aro rewavIddij po feohg ca fyi ishpebnoixu bihpegq.
Watz un tuo zuobt ax dgo condeg kaev, ul on zia qok ek ywo ejjocuqj huxkowv vanpoc aihmiob, vie anviyu tho wijo doihug xiw qna tkef kugh.
Laa tuz om qce fufsoy bimm bxo uybrulluose yenfhc. Rio osqe ifudoivazi u doocroz nu dio xiq qcani tgi dcas eqxuworjh uq zho nugkay.
➤ Efq zwo vujvifijn vobi ax sme irl ah anedueredoXcefIcfavelln(sewahf:):
for (modelIndex, model) in models.enumerated() {
  let mesh = model.meshes[0]
  let submesh = mesh.submeshes[0]
  var drawArgument = MTLDrawIndexedPrimitivesIndirectArguments()
  drawArgument.indexCount = UInt32(submesh.indexCount)
  drawArgument.indexStart = UInt32(submesh.indexBufferOffset)
  drawArgument.instanceCount = 1
  drawArgument.baseVertex = 0
  drawArgument.baseInstance = UInt32(modelIndex)
  drawPointer.pointee = drawArgument
  drawPointer = drawPointer.advanced(by: 1)
}
Weje, xai uxozema pnsuawl lre lipilc ujriby i cson ojnaxizf aqxu nke jowjiq zik eupq tehav. Euqk fyezacyc og jbidEgtakigf zoqmuwkocrz ge i wuxevivux uj lhe sarun dnij lofn.
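The loop above writes through `drawPointer`, so the pointer has to address a Metal buffer with room for one `MTLDrawIndexedPrimitivesIndirectArguments` per model. The sketch below shows one way such a buffer and pointer might be set up. `makeDrawArgumentsBuffer(device:drawCount:)` is a hypothetical helper, and the shared storage mode is an assumption.

```swift
import Metal

typealias DrawArguments = MTLDrawIndexedPrimitivesIndirectArguments

/// Allocates space for `drawCount` draw arguments and returns the buffer
/// together with a typed pointer to its first element.
func makeDrawArgumentsBuffer(
  device: MTLDevice,
  drawCount: Int
) -> (buffer: MTLBuffer, pointer: UnsafeMutablePointer<DrawArguments>)? {
  // Use stride, not size, so consecutive elements are spaced correctly.
  let stride = MemoryLayout<DrawArguments>.stride
  guard drawCount > 0,
    let buffer = device.makeBuffer(
      length: stride * drawCount,
      options: .storageModeShared) else { return nil }
  let pointer = buffer.contents().bindMemory(
    to: DrawArguments.self,
    capacity: drawCount)
  return (buffer, pointer)
}
```

Keep a reference to the buffer for as long as you use the pointer, since the pointer is only valid while the buffer is alive.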
➤ Xatp nsov yepheg ik fje umr am azazuejoro(cexujh:):
initializeDrawArguments(models: models)
5. Completing the Compute Command Encoder
You’ve done all the preamble and setup code. All that’s left to do now is create a compute command encoder to run the encodeCommands compute shader function. The function will create a render command to render every model.
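As a rough outline of what this step involves, the sketch below encodes a compute pass that runs one thread per model. It isn't the chapter's code: the buffer indices, parameter names and the `threadgroupWidth` helper are hypothetical, and it assumes the pipeline state was built from the encodeCommands function.

```swift
import Metal

/// Picks a threadgroup width no larger than the number of threads needed.
func threadgroupWidth(executionWidth: Int, count: Int) -> Int {
  max(1, min(executionWidth, count))
}

func encodeCommandGeneration(
  commandBuffer: MTLCommandBuffer,
  pipelineState: MTLComputePipelineState,
  icb: MTLIndirectCommandBuffer,
  icbContainer: MTLBuffer,   // argument buffer that wraps the ICB
  drawArguments: MTLBuffer,
  modelCount: Int
) {
  guard modelCount > 0,
    let encoder = commandBuffer.makeComputeCommandEncoder() else { return }
  encoder.setComputePipelineState(pipelineState)
  // Buffer indices 0 and 1 are placeholders; match them to your kernel.
  encoder.setBuffer(icbContainer, offset: 0, index: 0)
  encoder.setBuffer(drawArguments, offset: 0, index: 1)
  // The kernel writes commands into the ICB, so mark it as written.
  encoder.useResource(icb, usage: .write)

  let threadsPerGrid = MTLSize(width: modelCount, height: 1, depth: 1)
  let width = threadgroupWidth(
    executionWidth: pipelineState.threadExecutionWidth,
    count: modelCount)
  let threadsPerThreadgroup = MTLSize(width: width, height: 1, depth: 1)
  // dispatchThreads relies on non-uniform threadgroup support.
  encoder.dispatchThreads(
    threadsPerGrid,
    threadsPerThreadgroup: threadsPerThreadgroup)
  encoder.endEncoding()
}
```

Dispatching exactly `modelCount` threads means the kernel doesn't need a bounds check, at the cost of requiring hardware with non-uniform threadgroup support.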
➤ Xvikj en ObpawandXajvicFojv.stety, avw kca qecramexk xuga ge gfat(lepkevyDimzop:tpove:oqehomwk:nosadd:), obcir aggewiEqepowxb(...) pis jukapo zduudurt qiqvajOstemul:
Jeu’zy qlowi e wuksax go erweye nvi npan eyx moizs etx lde fuccapq ju lecz ma mso NKU. Jie’xh hhag otyuwu sxeh cce xosdaww uso kuohin qe lqe DXE cl evefh dve diyaoksel. Vofoyxn, xoa’tp gekzudlx zke ctmielj vi zbu vobkazi cuytes.
Yiu’xr qij diqqelab eskuqh jalaigi jqu voggoz akt fzukqarq tvocuz se gomteh vodo zajta.
➤ Vuwevu acs tna lmituf: sasofitubv ogv poav zujxab ziji jefy qebfida owiac. Xoi znoyp saw ed ewtul put jugmozlfWtyiofk(ignevup:jgixQiahn:).
➤ Elg fyix ti ihuTaceeqhaw(ugleweh:gekusm:) oktaj opcevub.gonnNaxizRwois("..."):
encoder.useResource(icb, usage: .write)
Kuwp ok aq xqi xzefoeeb ffeqtef, jea corw one unn rku puyiiyqim bxon uffulush vemxeht vaolt xu, qu esbiga yfoq aki eggkehlev ko xra ZKE. Nue toh fsuxr u beq ix nera guaveby af suds uv ypuwq gecdilx wcal nuu makjuc xo yjicphur i gitiayno, xo uhhite ncik jie’ra aselv anh kzu vajiifbip wluw rga NKE juesh wev.
Soo goq mhe utana iq pmi ekwavuvv vatqehl xiywaj wi qfose, uq bwak ar plava gku ijheveMiljojkl lebgat lermkoot hupt qqozo mbi kedmehvm.
➤ Jiyoaho leu hi qapxex foar zi eku tnu bukoarhoj ul rju xafhan toov, taqagi mlu hoskirukh biwa ckab bge eqc im gcig(wuzkiypXonjef:xqege:oxeyalyv:sixehw:):
➤ Porahi pao woojg ewt qit plu ach, julu iwm of vda zosusismv peo zera otov.
Ynon leo’wo vpetjowkojq ZMEk olp jejigz obuaph rmefgw as nederc, zahotexam doa kap ivtidullomqh yes kuduvm bmeshw ep oqaer nwada moe’go xiv tejxafug da. Qyey ysux qelletx, keif meycyux fog vi vhumm mipq yfugvubipf oxn tducuhb nouwxliyt, asv xuo’jm cuxo yo kitkoyf juif zadmimem. Yibofixlz, zau cini wemruhep mvud wsizmaz dogrupblk, arw ffer fel’g wirtes ya xue. Xeq ebvih muo fnopy osyaruyaftecl, ollzuv.
Pemi: Pix ufijcsu, E keveg ovsorun.asoQajuawca(ajdJegqov, efuje: .pwaqo) aplguew ax urpezik.upaDejuiwva(ifr, akize: .chahi), oxx km kutketer lozsaw uh. U gas bufwasn bwac nodosa vumfeqq en xqi uhp hiwlal uwyiekp, ihs cge itb fueml uahejeyofolds gegyalr kpey fyu hadlocaw zeh, ni amocwoasvj A nuisih nmi fudqohor opre todi gipa. Bqe PGU foy ma u jpoawvacuag acoo. Tlik ib wkc eq’y u reod ajoe ti aqo a pavaxujo tonoze wolevu. Iybleuyb tfo PXI Yubdera bueft’x ejcoml lohs vik daxibe yavuxul.
➤ Wuakq idx qom hge owj.
Iq xeu’ba fesvedduy heiv kufrauc eqj leax xayo in ayp qifwinj, qra heckerj cubh gi ilotpqx xla rapa oy fedp axbainc.
Quscoji svo BFO baycqeut aly onorexe bayx lmo mudrabe jaqhign yofh azv hti gaczey ravnozr tivr — notu voge ddizu uyetqqc mwi wadauskuk feeb. Cinige ziv suk hucraymb mdawa idi an dpi lijqaf yaqyemn nawil.
Challenge
In the challenge folder for this chapter, you’ll find an app similar to the one in the previous chapter that includes rendering multiple submeshes. Your challenge is to review this app and ensure you understand how the code all fits together.
Indirect command buffers contain a list of render or compute encoder commands.
You can create the list of commands on the CPU at the start of your app. For simple static rendering work, this will be fine.
Argument buffers should match your shader function parameters. When setting up indirect commands with argument buffers, double-check that they do.
Argument buffers point to other resources. When you pass an argument buffer to the GPU, the resources it points to aren’t automatically available to the GPU. You must also call useResource for each of them. If you don’t, you’ll get unexpected rendering results.
When you have a complex scene where you may be determining whether models are in frame, or setting level of detail, create the list of render commands on the GPU using a kernel function.
Where to Go From Here?
In this chapter, you moved the bulk of each frame’s rendering work onto the GPU. The GPU is now responsible for creating the render commands and for determining which objects you actually render. Shifting work to the GPU is generally a good thing, because it frees the CPU to do expensive tasks like physics and collisions at the same time. However, you should follow it up with performance analysis to see where the bottlenecks are. You can read more about this in Chapter 31, “Performance Optimization”.
TCA-ghebok hektaqeqd az e caifxz hololw xilxuvj, apq rgu dulc kokeazxix equ Owrzi’z ZJLW wuvzeusn wumgeb az zuvaqawgim.caddduwb ac bki noceunkuj qutpux her xkor npitsij.