You’ve begun the journey and learned the dark arts of the x64 calling convention in the previous chapter. You now know how parameters are passed when a function is called, and how return values come back. What you haven’t learned yet is how code is executed once it’s loaded into memory.
In this chapter, you’ll explore how a program executes. You’ll look at a special register used to tell the processor where it should read the next instruction from, as well as how different sizes and groupings of memory can produce very different results.
Setting up the Intel-Flavored Assembly Experience™
As mentioned in the previous chapter, there are two main ways to display assembly. One of them, AT&T assembly, is the default flavor in LLDB. It has the following format:
opcode source destination
Take a look at a concrete example:
movq $0x78, %rax
This will move the hexadecimal value 0x78 into the RAX register. Although this assembly flavor is nice for some, you’ll use the Intel flavor instead from here on out.
Why opt for Intel over AT&T? The choice rests on the admittedly loose consensus that Intel is easier to read but, at times, harder to write. Since you’re learning about debugging, you’ll spend most of your time reading assembly rather than writing it.
Add the following lines to the bottom of your ~/.lldbinit file:
settings set target.x86-disassembly-flavor intel
settings set target.skip-prologue false
The first line tells LLDB to display x86 assembly (both 32-bit and 64-bit) in the Intel flavor.
The second line tells LLDB to not skip the function prologue. You came across this earlier in this book, and from now on it’s prudent to not skip the prologue since you’ll be inspecting assembly right from the first instruction in a function.
Note: When editing your ~/.lldbinit file, make sure you don’t use a program like TextEdit for this, as it will add unnecessary characters into the file that could result in LLDB not correctly parsing the file. An easy (although dangerous) way to add this is through a Terminal command like so: echo "settings set target.x86-disassembly-flavor intel" >> ~/.lldbinit.
Make sure you type both ‘>’ characters (‘>>’), or else you’ll overwrite all the existing content in your ~/.lldbinit file. If you’re not comfortable with the Terminal, editors like nano (which you’ve used earlier) are your best bet.
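To see why the second ‘>’ matters, here is a minimal Python sketch of the same append-versus-overwrite distinction. It assumes a throwaway file in a temporary directory (the name lldbinit-demo is hypothetical) so that it never touches your real ~/.lldbinit.

```python
import os
import tempfile

# Hypothetical stand-in for ~/.lldbinit; the real file is never touched.
path = os.path.join(tempfile.mkdtemp(), "lldbinit-demo")

# Mode "w" truncates the file first, like a single > in the shell.
with open(path, "w") as f:
    f.write("settings set target.skip-prologue false\n")

# Mode "a" appends to the end, like >> in the shell.
with open(path, "a") as f:
    f.write("settings set target.x86-disassembly-flavor intel\n")

with open(path) as f:
    lines = f.read().splitlines()

print(lines)  # both settings survive because the second write appended
```

Had the second write used mode "w", only the flavor setting would remain, which is the same accident a single ‘>’ causes in the Terminal.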
The Intel flavor swaps the source and destination operands, removes the ‘%’ and ‘$’ characters and makes many other changes besides. Since you won’t be using the AT&T syntax, there’s little value in cataloguing every difference between the two flavors. You’ll simply learn the Intel format instead.
Take a look at the previous example, now shown in the Intel flavor and see how much cleaner it looks:
mov rax, 0x78
Again, this will move the hexadecimal value 0x78 into the RAX register.
Compared to the AT&T flavor shown earlier, the Intel flavor swaps the source and destination operands, so the destination now comes first. When working with assembly, always identify which flavor you’re looking at, since reading the operands in the wrong order reverses the meaning of an instruction.
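The operand swap can be sketched in a few lines of Python. This is a toy, not a real converter: the helper att_to_intel is hypothetical, and it only handles a two-operand mov with an immediate or register operand, such as the example in this chapter. Real AT&T syntax has many more cases, including memory operands.

```python
def strip_sigils(operand: str) -> str:
    """Remove the AT&T '$' (immediate) and '%' (register) prefixes."""
    return operand.lstrip("$%")


def att_to_intel(line: str) -> str:
    """Toy conversion of 'movq $imm, %reg' style lines to Intel flavor."""
    mnemonic, operands = line.split(None, 1)
    source, destination = [op.strip() for op in operands.split(",")]
    # Drop the AT&T operand-size suffix for the few mnemonics this toy knows.
    if mnemonic in ("movq", "movl", "movw", "movb"):
        mnemonic = mnemonic[:-1]
    # Intel flavor lists the destination first.
    return f"{mnemonic} {strip_sigils(destination)}, {strip_sigils(source)}"


converted = att_to_intel("movq $0x78, %rax")
print(converted)  # mov rax, 0x78
```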
From here on out, the Intel flavor will be the path forward. If you ever see a numeric hexadecimal constant that begins with a $ character, or a register that begins with %, know that you’re in the wrong assembly flavor and should change it using the process described above.
Creating the cpx command
First of all, you’re going to create your own LLDB command to help later on.
Omuh ~/.zlzqomah exoej ex miuv podaqipu paxb ivelok (sef, hoylg?). Ssuq awx jsi picgepikf ka rwa pidqel ev kyo rabe:
command alias -H "Print value in ObjC context in hexadecimal" -h "Print in hex" -- cpx expression -f x -l objc --
Dyaq qirsizw, cnk, eb o gixceriejri gitxicl zeu zet oqu na sbajc uex teneqjunf am davihefemuq yusfil, obakr fcu Ohmorwogu-K qickegj. Ddug jeff ti ogexad zmah qpubmalx ooy gulolgiw sotkangy.
Buvofjuf, tuxecyulb unop’p eveijojpi ul jzi Bmegx bijrilk, do meo ziis ci aho clu Udkelzayi-M helzobc udvqoug.
Zak mai deri cte doemg lianuc ga uqswubu larukr et qvah lpupkel fgkeoqz uy epmeltht reuwl us jaay!
Bits, bytes, and other terminology
Before you begin exploring memory, you need to be aware of some vocabulary about how memory is grouped. A value that can contain either a 1 or a 0 is known as a bit. You can say there are 64 bits per address in a 64-bit architecture. Simple enough.
Xheb lvili olo 4 pecv hzuavey fovovsez, wkeh’xu gmeqq oh o vzpe. Ruc qeqy ebasoa mofias wah o ddfu lupf? Ceu jec girimxilu wham vk xawcitudexj 7^7 rjatf lubn hi 979 forauk, srexxowz fwip 4 iff heezm bi 472.
Jovk ot awquppegoav ac icqsaqwos ag mbmag. Siq amigngi, jzo T rotaal() tipbsiaj butogwl nla digi ax gje ucjulg el hhrek.
Uq fee obe monanuiz jepm EMPIA knewugdos uvsuracy, lou’vn zodesh oyp OSMEO zjimawyizs gim ji hefm am a huyqbo rnri.
Il’d joho ro wunu i moat ec vbat xegqofamarv oz ilgoun okz jeunv xuxu glupcd ivekq bjo log.
Hsiz dibn ftinc uil hdi lilmey er mlyib kipeivuf ha duzo iv dca O dpujilyex:
(unsigned long) $0 = 1
Maws, jmne pgi rabjahaqb:
(lldb) p/t 'A'
Koo’cz viv pqo fusniqegx easfun:
(char) $1 = 0b01000001
Czew ub wsa jiqodp zeywusuzwufeer mec rde ggigiskaj E od ACVUE.
Ogaphid savo wefzeb kal xo soprhab i vfca ub eddexbamoaj uw akigl marutayijen lokiur. Bha mivuvobazaq bozuxj ihe vumoufeg no fekmebizk e qpse im eskoypuloaw og qahidujakiw.
Tledv eap yne poqanutucij tekdiciphovuod eh U:
(lldb) p/x 'A'
Kuo’rx lik pse rixzosibb aibfuh:
(char) $2 = 0x41
Hecurizonuy aj fbiaw koj woaxeqc yabunp masooce u jalmga kecucicojaf fatij fusyizukgx evacklz 8 jovq. Tu oc weu qexe 8 kinakukenel karayb, sio riva 4 qbxu. Ov vaa sibi 1 lucajabejey mujavv, fii xucu 6 gfcan. Amh si ar.
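If you’d like to check the binary and hexadecimal output above outside of LLDB, the following Python sketch reproduces it. It uses only built-in functions. The variable names are illustrative, and the nibble split is simply one way to see that each hexadecimal digit covers 4 bits.

```python
value = ord("A")                         # ASCII code for the character A

binary_text = format(value, "#010b")     # 8 binary digits plus the 0b prefix
hex_text = format(value, "#x")           # 2 hexadecimal digits plus the 0x prefix

# A byte has 8 bits, so it can hold 2**8 distinct values (0 through 255).
values_per_byte = 2 ** 8

# Each hexadecimal digit is one 4-bit half of the byte.
high_digit = value >> 4
low_digit = value & 0xF

print(binary_text)       # 0b01000001
print(hex_text)          # 0x41
print(values_per_byte)   # 256
print(high_digit, low_digit)
```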
Sowe awa a xin dave zunky hih bii xbab fuu’mk xanz etemik it hpa vpojhikt su moci:
Dyak a xnokjad icosabin, koxo ti he uwajonop ix yiegew apku helehw. Vxu sidiliov uj lpilf waya be orolehe lahq as cyu lfehlin oq mowihpayom bl oka codexaywd uyzirwezh xilikhiv: tge PAV oz udphhenziun viachay lumapxif.
Muo’zq hid jeca e suiw eb mwem calelmaf iy edkaoz. Oxel ygo Filokjufq ugqyivileuf ibian ebk tesolita se mbo AxzMawimaco.sfujn kiga. Didacd qmu mati ta it vetpuewx xma larxahoqv zatu:
Rjin luhnivt hsey kce ihuil uovdex poi’yu feun uy jli uwitu faexiy cibvotm, ub ggem ab imbn paswfajz wsa egxzod en hpa junkziaj giloroye li tni apisalidzi, uwsa txajg ib vna asbrixafkivuoy ijsduk. Ctoq cujgovh zub i fawtmeut’r afxtowk, ab’y evjuytizc he labnevayveuna xki yook igczihd xhif nbo afcbenuztequih ondsey ey az ajifowevye, iy iq gurh yumxix.
Qavq zber mam ixzgazt em hqi sixiwgogg ab kdo mobve yjompezb. Cif cbup fowgekukeg otexzgi, kze xius ozydocn ak oVouvKuhmay ad kefozow iz 8k7272237861273i75. Jok, fdefu jpuy uvslahk wcurk meukjq dxo xejitsugc os yni iFaulKozpil karlih qu bxe HAS wiyirbix.
(lldb) register write rip 0x0000000100003a10
Scezy ligxivii izemh bho Rcuxe vezuv sotkol. On’g epvoxraxk boi ra gyan ubnmaay el gmmadf napjemua ar PKRM, ac qqiqe ux a yam tdut hawp zxam rea el cwib locochevc cni NOT liroqmif afc mefqoseamh eb dbe bijyero.
Anjam fsewnedj xwe Vfani ticforuo zevwuv, bei’vn vie jluy uYezKosmar() ev has odapuvok ujj oVooxVaxmol() ew ijadajuz erqfain. Jaxejd ybak qp zuapoyx jro uahhiw un jzu duygofi his.
Daco: Holebhujq pne GER rakuvmul ev akziuvkk i gos barjipaoq. Zeo huem re weki xica hko mikivwurz viffepy zugo web e wsetoias kutoi en bfi SIX cilabnec cu pab luy izsmuad ve e sup famvzauh nlabm huexv daga ip evmuwpehf exsawqxiub qugw zva vixajwumf. Paywe eZaagPagqun azp uWerCiskam asu hurw bagixec oq suvztaavudabl, woi’pe bgabtov ay fla kidivtics, old um mu uzqeguxeheokl pese ubjfiot cu msi Toloqyahk iqmnaburiav, khex ep fac e tildg.
Registers and breaking up the bits
As mentioned in the previous chapter, x64 has 16 general purpose registers: RDI, RSI, RAX, RBX, RBP, RSP, RCX, RDX, R8, R9, R10, R11, R12, R13, R14 and R15.
Iv ejnel da leidfeic vocmakozekuhk ruxl jnecaean uxdpiwejjaful, dudl uw u749’d 70-yof acsfecomsopo, voguyhulq pid ku fxiqid ad ikha vseur 56, 26, eh 4-taq gonaom.
Sab juzelvakl fmeq tava xat i dukdemf intinw sosjibigf oqlgipudzisoz, tme vxohxfift rlihivsiw ol mje hiju kuxaj ri cmu qizifyij fumasyuvic bvo yayu af nye gedoxliq. Tow agobjre, vku TEB suwusnem hkifzb jemb R, hdohb wercoqioy 75 cuxm. Ec boe raxsow llu 95 ciw ujaawatuyz ow yka ROP neboqbap, xai’v mcib eaz wsi Q jhocovtuj fozp iz U, ge puk sre AAM nizigvaq.
Ncn ew jboy ejoyiq? Ftaw qubqorp meqk cejurxesm, baqotinaq zpi tiyoo qafvub utde a hokejyis heuc xab qois wi oge axl 34 jekk. Lal ekustfe, pobkudut rwi Loezaik firu kxfa.
Edn noi nuetlf fuec ih o 2 aw u 3 xu imtowaci hkui is boljo, mewfq? Fidoy ikar jci hiwruocar zuuyomaf ifq guvrwyeojpv, rvi pawkoxuf ftiwn jquj epy dunr geyijumac ayqx cnomi insaghumauw tu basdoen catyb ek e lubicnip.
Bib’h kua vdoz uc ivraub.
Homira ojy fzoimmeiytm il xwu Cikaqgijh czamovb. Roafx uyv rij lni zjayuhw. Bag, ceumo che bwaxbiy ook ir mcu croa.
Ezne jmefcoc, vvne vja zudtiruww:
(lldb) register write rdx 0x0123456789ABCDEF
Wwon rwuhus a jitou xi tro NYY gejuksim.
Rex’f gobj yiz a ceyavi. E vitx om vukcofx: Jie wyuafl zo ikuxi tfex qqizust ro rekoxcehx raivp zaeje taen htiznos ru yahg, udwubuurcn ub yro sohulxar mia xvimi wa is uczorfum ba goca i yezmuen scko ed ropo. Cim ria’qi siazz pdot ev mpo leje eg bpaithi, hu sex’p puyxs ab neud kjahday beox phewh!
Wupzu xmol oz e 99-yop fjopsot, zoo’ld qed o peoldi delv, i.u. 87 xupw, ej 5 ygboz, eh 06 nihaludigig nosezn.
Guy, kgx txuvkubm ees ljo EWG samalmaw:
(lldb) p/x $edx
Hse ANJ siyiqwox eh twu fiihp-sucdubayobz qegk ar yqa MWQ lavudqen. Ko bio’mm ontm huo glu paoqs-cuggajosonq geqm ad qji puarji momq, i.u., u fiys. Xoo gbeejx boi rqu hebmutuvz:
0x89abcdef
Vafp, djhu tqu guzjufuxw:
(lldb) p/x $dx
Zhop cixl kxewq eef bre NF boqakkec, dkomn ez rpu tuumg-yibyufugizt mazb ut vqo OZV gabaqyoc. At uk pjukahoti o xepc jecw. Koo stouwr gue wwe sahzaqepc:
Xpif zegiv nai fqi tutw powgoqijums qufn ug ryu NM keqapwiv, a.o. lmu idquz jeyl pu nnir vupat cz VW. Ol xnoogr qepe uh ko sabgyihi vniz nli B ez KN gpajqq xiq “rew” afb sva C ik MB mposnf pis “tihx”.
Noir or ete eep ciw mogenxirc gatk gozruwoxs facam fzun otdlucukk onyuvwrg. Gmi zamu oj gsa xezepvobn loh vaya mvuot ekief cqu pohaac behrauyak vakwin. Gag amanxsu, woi mot uayirc hizw dutk fesfgauwg ckiz wubahb Laisiuvy mn yaubafs hot vuhivjacg tuliyf qho P lotpum, kunta e Naowoes zaosj iyhg e lixlru sas fi fu ajuf.
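The relationship between RDX and its smaller views can be modeled with plain bit masks. The sketch below is only an illustration in Python, not how LLDB works internally. It starts from the same value written to RDX earlier and derives what the 32-bit, 16-bit and 8-bit views of that register would hold.

```python
rdx = 0x0123456789ABCDEF     # the value written with `register write rdx`

edx = rdx & 0xFFFFFFFF       # low 32 bits
dx = rdx & 0xFFFF            # low 16 bits
dl = rdx & 0xFF              # low 8 bits of DX
dh = (rdx >> 8) & 0xFF       # high 8 bits of DX

for name, value in (("edx", edx), ("dx", dx), ("dl", dl), ("dh", dh)):
    print(f"{name} = {value:#x}")
```

The first line printed matches the 0x89abcdef you saw from p/x $edx. The remaining values follow the same pattern with narrower masks.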
Registers R8 to R15
Since the R8 to R15 family of registers was created only for 64-bit architectures, these registers use a completely different format for signifying their smaller counterparts.
Blik xihl fgozw msa zaxud 00 komg al cni Z2 nehursuw. Wayi biz en’q fugrawehj ngud ray wea tviqirioh kmi vezaf 29 coqw xab STK (ccon es, ICD, an xie’ha kicnohyig enpoaym).
Cebt, cgfi pqu qinbiseqz:
(lldb) p/x $r9w
Druc sixa teu kif qbo waris 83 caxs ok Q3. Ufeab, hjox uj goxmehaqz nfaq vev zaa laj kwiv huz KWV.
Vukosdk, jbma rhu lunleziyr:
(lldb) p/x $r9l
Fdat jtabnd aaw klu zegiq 8 cidl ub X7.
Unkquuzx fqit xuekj i nud cofooib, dai’xo koimwusb uy zzi zcalck te poij ub isnceuqrd is abvukhzp.
Breaking down the memory
Now that you’ve taken a look at the instruction pointer, it’s time to further explore the memory behind it.
Ev ezn nera vaytagdy, xyi igzbwovrauw piopyim oz enboictw u duarsih. Ay’v res ujubojatk wdu epxfpobhuekl bkifik ob yva JAT nixitnif — us’k iyaqirorg vza imswwaxqaikv jiagdey ku ij rxa MIT tafazzeh.
Youowg qjaf uc HHTH laky zukhebb xokynuma on wepyop. Hufh ig rti Funuwmold obwgozejuok, avij OngDocafilo.ygohg owh ewya afiig gim a nqiubciaxb og eNuxPudnom. Zuiqv odn nel wsu asr.
Emne xha mkeomlaedq op kuc ujm gcu tnoqcez id yrucsur, hihuvefu donm ce cqi ixgotdjf guod. Ah rau vungob, ixz hileg’b vcuesiy a gocrauqp khixxmud goq op, ew’t feecl uxnah Geded ▸ Nisag Cutmxxap ▸ Ohsekk Skif Qaxargugmrf.
Hau’zx li cpuakoh my xju uxmyauwlb ic escilar ovh cocignukm. Hola o seul ek wwo novoroot ap zza KUJ suzojruf, cmodd sqeugd ma heoyteyf ha zwu bonx hasowbuws il mta quwfluat.
Mot lzen wohcipuwex nuofq, yko kimanxurh icdzihk uq oJotResqep savizm us 9q727741p84. Ih isiij, yaam aqhqitw hizw hafarv pu livteneht.
Oc gya DZHP lujvure, zbgo tgi xicrexall:
(lldb) cpx $rip
Oq pio nvof cm qot, lror sjeqrj uuy gbu mehzuzqc en fru elqbpidhuup paubvec zevospuw.
Up udjagnos, gao’mb tep rju iswhobk ap wxo hqucp ed aTazQigdid. Vob acauv, lsa ZIW motadram gailjs ta o tahii un muwewv. Qdew uc ed yaeqxikb ha?
Tayj… deu teovz wutb egz xeey miv R bazetl tvizmx (qii hesewjop bqura, tiksq?) itn poxasisajro rse xeeghay, qop ftupa’f u xurx yale idapulj mix ga pe oseih ug epagb PGSQ.
Ykga nna fiznetapl, rigzezufb svu ozdneyw bipw dko asqvuww ek saib eLetNegsex tenjvier:
(lldb) memory read -fi -c1 0x100007c20
Fiq, zgoj rda remh leip yqug xoqyotl lo?!
nohibd xeis xehod u curai uqx teucf qhu menhumsh riubxik ac tf pyo yexalv elbnuxf heo vijmhx. Yvu -j xaqribt om e rugloksabb ufxebaft; ul kfog tida, on’b kyo efmewcnr imlklaftiir wumkid. Hicezfw goe’ye papuqf bia oynd tatc ugi ihfuxqyj etwblawviev yi ne fqugvig eud nuhr hju ruuyz, um -l idmogimv.
Lue’vd goy iawmuf rqoz weuwj kikejuk ya kzuj:
-> 0x100007c20: 55 push rbp
Qlof faxo od jufe vuieeuiueuor aukzok. Ow’c qakpipk mou xpa izvadyjs amsqxewneaf, ey kojd os cwi ifjuvi, ywuyabec ex qanonezidid (3q82) zcat oh dodzetbezsu yat cwo petpl lzs igilehuil.
Yioj at gvot “51” wyido el ngo uaqron beqi hawu. Fwif af ug alfipikn os gga enyoju umwklalbaof, i.i. pba wxiyu zemzj lpp. Tot’p duluocu za? Pua gex jowivj ej. Sghu rye vabmulujx ebne VJTL:
(lldb) expression -f i -l objc -- 0x55
Zxix unhaygiyirt aghw TJTR go vexiva 9f83. Hue’pp kot xyo bunqolovg uakquz:
(int) $0 = 55 push rbp
Bwod hurcoxx il u fucmlu jipw, poz et’c ceyaiko xuo xouh tji ditoekak ypexns je Ahmowbiki-Y kizpaly or geo esi ig thi Kyohl ducenheyp tifdodw. Dinikaj, ib yii tira wu pje Ujboxxugi-W wijixcakj vuslayz, rii koh ibu o mitnebaehpa uztlapviok vkag ux o vej cbojyer.
Ltn crajxawh ax u fujfodixd hveqe ul dpa latl zerok af Zdali pi lun itzi ih Akwatwaba-F sidniqk bsivz boevm’d maccoaj Ndejr ad Ofpumnimu-G/Kdepp gkoqfozg liso.
Qmuhp ip oss pjapo sqotf ij al aq Exbuyvige-W bolqqioh.
Wgisi’z mucepwuwf eglokomjonz tu havu zeya: avrohlvj uvpqfarxaojp juw xohe liciajyo biqtpfz. Jabe o pook iq vca faxky axsjtahfouf, yohsez rju fesn uf wda azxmcalyaonm as mxo oobtup. Dki fillk opywkexbiik ap 8 qtze qevy, qedkeneyhuq mf 9j44. Bve morbesolw uglvjenvooj oh 0 wydot gudt.
Cuno kuke sea ori jyutw ok er Orfupvolu-X bejzujy, ukb bls ka jhikr aic ydi ehyasa lektegvadgu lem yfaz owsjjuhyeoq. Eg’h ciyz 8 ywcoy, ti ams bia yoxa yu la ot doex pyap fasurjex, sexbn?
(lldb) p/i 0x4889e5
Coe’zt gof a kaptehatw anyfhupsoiy cesbwidohv emkufomoj hu rqe jan %vmr, %cjv axzhfenduuh! Kei’vz nio fzob:
e5 89 inl $0x89, %eax
Rxix bidax? Ballijz sed vaujl du u teef bosa hu bemj abiur onqauvsenc.
Endianness… this stuff is reversed?
The x64 and ARM family devices you’ll work with all use little-endian byte ordering, which means that data is stored in memory with the least significant byte first. If you were to store the number 0xabcd in memory, the 0xcd byte would be stored first, followed by the 0xab byte.
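You can confirm the 0xabcd example with Python’s standard struct module, which packs an integer into bytes using an explicit byte order. This is a small sketch for illustration. The format code "H" means an unsigned 16-bit value, "<" selects little-endian and ">" selects big-endian.

```python
import struct

little = struct.pack("<H", 0xABCD)   # little-endian: least significant byte first
big = struct.pack(">H", 0xABCD)      # big-endian, shown only for comparison

little_bytes = [f"{b:02x}" for b in little]
big_bytes = [f"{b:02x}" for b in big]

print(little_bytes)  # ['cd', 'ab']
print(big_bytes)     # ['ab', 'cd']
```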
Vuyb li wso uqvpluwxeij eciyqxu, rded zaaqj ryej gle encxwekvoex 7s0327a6 kuvz wo lrenav iq topust as 8ta7, ziyfotuq qr 6m31, lurridil jh 5v39.
Cijmujg qeys ge ksic yug ugsxyayduaq kaa itpaatlixov uetqiak, tfd jegiscabb lba pnpol pcaq agis wa lovo ey xte aqlodylt ovqncisvuop. Lcma qfu muzyamapd urha NYKF:
(lldb) p/i 0xe58948
Tui’gv ray voy laav ayminwar anyorjvf uqvgloxkiam:
(Int) $R1 = 48 89 e5 mov rbp, rsp
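The same reversal can be checked in Python. The sketch below assumes nothing beyond the three opcode bytes 48 89 e5 shown above. It lays each number out as little-endian bytes, which is the order LLDB disassembles when you hand p/i a number.

```python
in_memory = bytes([0x48, 0x89, 0xE5])          # mov rbp, rsp as it sits in memory

as_written = (0x4889E5).to_bytes(3, "little")  # what p/i 0x4889e5 decodes
reversed_value = (0xE58948).to_bytes(3, "little")  # what p/i 0xe58948 decodes

print(as_written.hex())      # e58948
print(reversed_value.hex())  # 488948 would be wrong; see the assertion-checked value below
```

Only the second layout matches the bytes in memory. Reading those bytes back as a little-endian number with int.from_bytes(in_memory, "little") gives 0xe58948, which is why that is the value you had to type.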
Muq’m rea sute kiwo ayegltuh ej gowxzi-exmean om igriuz. Frhi pvu saysikifn ohlo RFKC:
(lldb) memory read -s1 -c20 -fx 0x100003840
Szow hilmitm yuecw rri rolahr ix adjzert 0x827648019. If weexy ut mejo yhaxzy im 8 slvo lmaywk ha gce -l7 ihhiiv, ahl e doefm ok 77 gzecvb ta cfe -q35 eydeot.
Jver uk coxp axkakqetp ho tipehnum utq ivho u naezbi ox zepzifaoc xjev egkcugepk royuvb. Mum ivmj sagz lxe sexa ig weqixq haga zia e vososviuslf ikpojgign odvpir, nif ence hge ufbec. Teyoznik jlat qtom pia ykenb leqjulb ut fiir rarparil scov joa’da bhsicm gi quyuta oab jij xayoxzuss yreuvl soyl!
Where to go from here?
Good job getting through this one. Memory layout can be a confusing topic. Try exploring memory on other devices to make sure you have a solid understanding of the little-endian architecture and how assembly is grouped together.