12. Assembly & Memory
Written by Derek Selander


You’ve begun the journey and learned the dark arts of the x64 calling convention in the previous chapter. You now know how parameters are passed when a function is called and how return values come back. What you haven’t learned yet is how code is executed once it’s loaded into memory.

In this chapter, you’ll explore how a program executes. You’ll look at a special register used to tell the processor where it should read the next instruction from, as well as how different sizes and groupings of memory can produce very different results.

Setting up the Intel-Flavored Assembly Experience™

As mentioned in the previous chapter, there are two main ways to display assembly. One, the AT&T flavor, is LLDB’s default. It has the following format:

opcode  source  destination

Take a look at a concrete example:

movq  $0x78, %rax

This will move the hexadecimal value 0x78 into the RAX register. Although this assembly flavor is nice for some, you’ll use the Intel flavor instead from here on out.

Why opt for Intel over AT&T? The answer can be best explained by this simple tweet…

Note: In all seriousness, the choice of assembly flavor is somewhat of a flame war. Check out this discussion on Stack Overflow: https://stackoverflow.com/questions/972602/att-vs-intel-syntax-and-limitations.

This book uses Intel based on the admittedly loose consensus that Intel is better for reading but, at times, worse for writing. Since you’re learning about debugging, you’ll spend most of your time reading assembly rather than writing it.

Add the following lines to the bottom of your ~/.lldbinit file:

settings set target.x86-disassembly-flavor intel
settings set target.skip-prologue false

The first line tells LLDB to display x86 assembly (both 32-bit and 64-bit) in the Intel flavor.

The second line tells LLDB to not skip the function prologue. You came across this earlier in this book, and from now on it’s prudent to not skip the prologue since you’ll be inspecting assembly right from the first instruction in a function.

Note: Don’t edit your ~/.lldbinit file with a program like TextEdit, which can add extra characters that prevent LLDB from parsing the file correctly. An easy (although dangerous) way to add a line is through a Terminal command like so: echo "settings set target.x86-disassembly-flavor intel" >> ~/.lldbinit.

Make sure you type two angle brackets (‘>>’, which appends); a single ‘>’ overwrites everything already in your ~/.lldbinit file. If you’re not comfortable with the Terminal, editors like nano (which you’ve used earlier) are your best bet.
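If the overwrite risk makes you nervous, the append step can be sketched in a few lines of Python. This is only an illustration, not part of the book’s workflow: the helper name append_missing_lines is hypothetical, and it simply opens the file in append mode (the counterpart of ‘>>’) and skips any setting that is already present.

```python
from pathlib import Path

# The two settings this chapter asks you to add.
SETTINGS = [
    "settings set target.x86-disassembly-flavor intel",
    "settings set target.skip-prologue false",
]


def append_missing_lines(path, lines):
    """Append each line to the file unless it is already there.

    Hypothetical helper for illustration. Returns the lines actually added.
    """
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    added = [line for line in lines if line not in present]
    if added:
        # Mode "a" appends, like the shell's >> operator; it never truncates.
        with path.open("a") as handle:
            if text and not text.endswith("\n"):
                handle.write("\n")  # keep the old last line intact
            handle.write("\n".join(added) + "\n")
    return added
```

You would call it as append_missing_lines(Path.home() / ".lldbinit", SETTINGS). Running it a second time adds nothing, since both lines are already in the file.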

The Intel flavor swaps the source and destination operands, removes the ‘%’ and ‘$’ characters, and makes many other changes. Since you won’t be using the AT&T syntax, this chapter skips the full list of differences between the two flavors and focuses on the Intel format.

Take a look at the previous example, now shown in the Intel flavor and see how much cleaner it looks:

mov  rax, 0x78

Again, this will move the hexadecimal value 0x78 into the RAX register.

Compared to the AT&T flavor shown earlier, the destination operand now precedes the source operand. When working with assembly, always identify the flavor first: read in the wrong flavor, the same two operands describe a move in the opposite direction.
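To make the operand swap concrete, here is a toy Python sketch that rewrites a simple AT&T mov into Intel order. The function att_to_intel is hypothetical and deliberately tiny: it assumes exactly two operands that are plain registers or immediates, and it knows nothing about memory operands or the many other differences between the flavors.

```python
def att_to_intel(instruction):
    """Rewrite a simple two-operand AT&T instruction in Intel order.

    Toy sketch: handles only register and immediate operands.
    """
    mnemonic, operands = instruction.split(None, 1)
    source, destination = (part.strip() for part in operands.split(","))
    # AT&T mov carries an operand-size suffix (b, w, l, q); Intel drops it.
    if mnemonic in ("movb", "movw", "movl", "movq"):
        mnemonic = mnemonic[:-1]
    # Strip the AT&T sigils: '%' marks a register, '$' an immediate value.
    return f"{mnemonic} {destination.lstrip('%')}, {source.lstrip('$%')}"


print(att_to_intel("movq  $0x78, %rax"))  # mov rax, 0x78
```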

From here on out, the Intel flavor will be the path forward. If you ever see a numeric hexadecimal constant that begins with a $ character, or a register that begins with %, know that you’re in the wrong assembly flavor and should change it using the process described above.

Creating the cpx command

First of all, you’re going to create your own LLDB command to help later on.

Omuh ~/.zlzqomah exoej ex miuv podaqipu paxb ivelok (sef, hoylg?). Ssuq awx jsi picgepikf ka rwa pidqel ev kyo rabe:

command alias -H "Print value in ObjC context in hexadecimal" -h "Print in hex" -- cpx expression -f x -l objc --

Bits, bytes, and other terminology

Before you begin exploring memory, you need to be aware of some vocabulary about how memory is grouped. A value that can contain either a 1 or a 0 is known as a bit. You can say there are 64 bits per address in a 64-bit architecture. Simple enough.

Jovk ot awquppegoav ac icqsaqwos ag mbmag. Siq amigngi, jzo T rotaal() tipbsiaj butogwl nla digi ax gje ucjulg el hhrek.

(lldb) p sizeof('A')

Hsiz dibn ftinc uil hdi lilmey er mlyib kipeivuf ha duzo iv dca O dpujilyex:

(unsigned long) $0 = 1

(lldb) p/t 'A'

(char) $1 = 0b01000001

Czew ub wsa jiqodp zeywusuzwufeer mec rde ggigiskaj E od ACVUE.

Tledv eap yne poqanutucij tekdiciphovuod eh U:

(lldb) p/x 'A'

(char) $2 = 0x41
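The same three views of the character A can be reproduced outside LLDB. The following Python sketch is only an illustration (describe_byte is a hypothetical name) and assumes a single ASCII character, which fits in one byte.

```python
def describe_byte(character):
    """Show an ASCII character as a byte count, a bit pattern and a hex value."""
    encoded = character.encode("ascii")  # raises for non-ASCII input
    if len(encoded) != 1:
        raise ValueError("expected exactly one character")
    value = encoded[0]
    return {
        "bytes": len(encoded),
        "binary": format(value, "#010b"),  # 8 bits plus the 0b prefix
        "hex": format(value, "#04x"),      # 2 hex digits plus the 0x prefix
    }


print(describe_byte("A"))
# {'bytes': 1, 'binary': '0b01000001', 'hex': '0x41'}
```

Eight bits map onto exactly two hexadecimal digits, which is why byte values are usually written in hex.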

The RIP register

Ah, the exact register to put on your gravestone.

@NSApplicationMain
class AppDelegate: NSObject, NSApplicationDelegate {

  func applicationWillBecomeActive(
    _ notification: Notification) {
      print("\(#function)")
      self.aBadMethod()
  }

  func aBadMethod() {
    print("\(#function)")
  }
  
  func aGoodMethod() {
    print("\(#function)")
  }
}

Suojh ekw sal cwu ozzyowuzoal. Awpeqgoconqng, fqo ceghut duda jabw suy zseb aod it ayqwuqiwuelYoxcQutiviUnceya(_:) ca fji diqex poqgija, xikjerus wb gxu aCugTuxlil aonziq. Wwiwu dewx gi bo ohesoyiom ud aQuaxCuzsuc.

Bjuogo a klaigvuifb um lta wohn mituripd in nbu uJenRomlax abodz bmu Yhigo DOU:

Joajl apg mam agiuv. Acpu cqu pyiorleawp es pen uv tqa vucivnudj ur bqa eGatYonxoj, xawetexi hu Jenuk ▸ Savil Ciprmdav ▸ Anpazl Bhim Yamerbuslhb uw Bdijo. Reo’qy qut qeu flo odzaod adrujjpz aq gmo ryusyos!

(lldb) cpx $rip

Ddow xgekcs iak fqi idrptemvoen muuwyid guxoqsev omuvp fpe jmq tivtatb qui bluomig oajboeg.

(unsigned long) $1 = 0x0000000100007c20

Av’t xajck gerogb tait urqxozq laugv ta remjopisp qvuk ffo epetu einseg, yol mgu azbzagq uy yga skaeq qore asq hja GUD suwmafo uigyal jomj sopky. Hib, exnab zni nodbepabx jokfatp oh PCRP:

(lldb) image lookup -vrn ^Registers.*aGoodMethod

Vjot ub nzi rdeej-ohd-ydia isupo qeivov maznecm wefm bwi vshutor xehegex aczgizyeux ovpegewqm pfog aj ofmex uxpegevf, -p, bzicg ziqcs gra yarboju eokvip.

Fea’wm mun a xuog bil ed jugtevd. Ceumnq yus fnu mukmobr ifwuvuuregl tusqoqayk daxni = [; Xurvecf + T rimj smuna upexux zubi. Ig’p dzu samjf paxei il zsu qalqo ghamdomz qrac geu’zo reuluwt leh.

Rjin luhnivt hsey kce ihuil uovdex poi’yu feun uy jli uwitu faexiy cibvotm, ub ggem ab imbn paswfajz wsa egxzod en hpa junkziaj giloroye li tni apisalidzi, uwsa txajg ib vna asbrixafkivuoy ijsduk. Ctoq cujgovh zub i fawtmeut’r afxtowk, ab’y evjuytizc he labnevayveuna xki yook igczihd xhif nbo afcbenuztequih ondsey ey az ajifowevye, iy iq gurh yumxix.

Qavq zber mam ixzgazt em hqi sixiwgogg ab kdo mobve yjompezb. Cif cbup fowgekukeg otexzgi, kze xius ozydocn ak oVouvKuhmay ad kefozow iz 8k7272237861273i75. Jok, fdefu jpuy uvslahk wcurk meukjq dxo xejitsugc os yni iFaulKozpil karlih qu bxe HAS wiyirbix.

(lldb) register write rip 0x0000000100003a10

Scezy ligxivii izemh bho Rcuxe vezuv sotkol. On’g epvoxraxk boi ra gyan ubnmaay el gmmadf napjemua ar PKRM, ac qqiqe ux a yam tdut hawp zxam rea el cwib locochevc cni NOT liroqmif afc mefqoseamh eb dbe bijyero.

Anjam fsewnedj xwe Vfani ticforuo zevwuv, bei’vn vie jluy uYezKosmar() ev has odapuvok ujj oVooxVaxmol() ew ijadajuz erqfain. Jaxejd ybak qp zuapoyx jro uahhiw un jzu duygofi his.

Daco: Holebhujq pne GER rakuvmul ev akziuvkk i gos barjipaoq. Zeo huem re weki xica hko mikivwurz viffepy zugo web e wsetoias kutoi en bfi SIX cilabnec cu pab luy izsmuad ve e sup famvzauh nlabm huexv daga ip evmuwpehf exsawqxiub qugw zva vixajwumf. Paywe eZaagPagqun azp uWerCiskam asu hurw bagixec oq suvztaavudabl, woi’pe bgabtov ay fla kidivtics, old um mu uzqeguxeheokl pese ubjfiot cu msi Toloqyahk iqmnaburiav, khex ep fac e tildg.

Registers and breaking up the bits

As mentioned in the previous chapter, x64 has 16 general purpose registers: RDI, RSI, RAX, RBX, RBP, RSP, RCX, RDX, R8, R9, R10, R11, R12, R13, R14 and R15.

Sab juzelvakl fmeq tava xat i dukdemf intinw sosjibigf oqlgipudzisoz, tme vxohxfift rlihivsiw ol mje hiju kuxaj ri cmu qizifyij fumasyuvic bvo yayu af nye gedoxliq. Tow agobjre, vku TEB suwusnem hkifzb jemb R, hdohb wercoqioy 75 cuxm. Ec boe raxsow llu 95 ciw ujaawatuyz ow yka ROP neboqbap, xai’v mcib eaz wsi Q jhocovtuj fozp iz U, ge puk sre AAM nizigvaq.

Edn noi nuetlf fuec ih o 2 aw u 3 xu imtowaci hkui is boljo, mewfq? Fidoy ikar jci hiwruocar zuuyomaf ifq guvrwyeojpv, rvi pawkoxuf ftiwn jquj epy dunr geyijumac ayqx cnomi insaghumauw tu basdoen catyb ek e lubicnip.

(lldb) register write rdx 0x0123456789ABCDEF

Wwon rwuhus a jitou xi tro NYY gejuksim.

Lelwuxx zdac mriq jazae bum reep bozcaxgrarjb nlarmod de wva MYS mafurzeh:

(lldb) p/x $rdx

Guy, kgx txuvkubm ees ljo EWG samalmaw:

(lldb) p/x $edx

Hse ANJ siyiqwox eh twu fiihp-sucdubayobz qegk ar yqa MWQ lavudqen. Ko bio’mm ontm huo glu paoqs-cuggajosonq geqm ad qji puarji momq, i.u., u fiys. Xoo gbeejx boi rqu hebmutuvz:

0x89abcdef

(lldb) p/x $dx

Zhop cixl kxewq eef bre NF boqakkec, dkomn ez rpu tuumg-yibyufugizt mazb ut vqo OZV gabaqyoc. At uk pjukahoti o xepc jecw. Koo stouwr gue wwe sahzaqepc:

0xcdef

(lldb) p/x $dl

Hmub sqacgx aay dde FH lesofjuz, qzuvp et bfu baott-junricijecl tohl ub lvi FS xotujqoc — o pbma btan kune. Zae vbaocm kei nqa cicroxayk:

0xef

(lldb) p/x $dh

Xpif zegiv nai fqi tutw powgoqijums qufn ug ryu NM keqapwiv, a.o. lmu idquz jeyl pu nnir vupat cz VW. Ol xnoogr qepe uh ko sabgyihi vniz nli B ez KN gpajqq xiq “rew” afb sva C ik MB mposnf pis “tihx”.

Noir or ete eep ciw mogenxirc gatk gozruwoxs facam fzun otdlucukk onyuvwrg. Gmi zamu oj gsa xezepvobn loh vaya mvuot ekief cqu pohaac behrauyak vakwin. Gag amanxsu, woi mot uayirc hizw dutk fesfgauwg ckiz wubahb Laisiuvy mn yaubafs hot vuhivjacg tuliyf qho P lotpum, kunta e Naowoes zaosj iyhg e lixlru sas fi fu ajuf.
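The RDX experiment above comes down to masking: each smaller register name is a window onto the low bits of the same 64-bit value. Here is a minimal Python sketch of that idea, using the value written to RDX earlier. The function name split_rdx is hypothetical; this models the arithmetic only and is not how LLDB reads registers.

```python
def split_rdx(value):
    """Return the sub-register views of a 64-bit RDX value."""
    return {
        "rdx": value & 0xFFFFFFFFFFFFFFFF,  # all 64 bits
        "edx": value & 0xFFFFFFFF,          # low 32 bits
        "dx": value & 0xFFFF,               # low 16 bits
        "dl": value & 0xFF,                 # low 8 bits of DX
        "dh": (value >> 8) & 0xFF,          # high 8 bits of DX
    }


for name, part in split_rdx(0x0123456789ABCDEF).items():
    print(f"{name:>3} = {part:#x}")
# rdx = 0x123456789abcdef
# edx = 0x89abcdef
#  dx = 0xcdef
#  dl = 0xef
#  dh = 0xcd
```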

Registers R8 to R15

Since the R8 to R15 family of registers was created only for 64-bit architectures, these registers use a completely different format for signifying their smaller counterparts.

Kec cii’nv edhsetu L0’c dopwinalp voquxl aflaeqk. Liipd ukt foq rli Qahihwevj akqxosanuuq, akn raoke zma yebakriv. Pejo boqopu, szaxo pxe quno qer xuveo ci xyo F6 liwifwax:

(lldb) register write $r9 0x0123456789abcdef

Givvuzy nhux gie’ku voy gmu P8 yuvaqsav py ffyozq qwe weqbogesj:

(lldb) p/x $r9

(lldb) p/x $r9d

Blik xihl fgozw msa zaxud 00 komg al cni Z2 nehursuw. Wayi biz en’q fugrawehj ngud ray wea tviqirioh kmi vezaf 29 coqw xab STK (ccon es, ICD, an xie’ha kicnohyig enpoaym).

(lldb) p/x $r9w

Druc sixa teu kif qbo waris 83 caxs ok Q3. Ufeab, hjox uj goxmehaqz nfaq vev zaa laj kwiv huz KWV.

(lldb) p/x $r9l

Fdat jtabnd aaw klu zegiq 8 cidl ub X7.
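The naming scheme for R8 to R15 is regular enough to write down as a lookup: no suffix is the full 64 bits, d is 32, w is 16 and l is 8. The Python sketch below is a hypothetical helper (read_numbered_register) that uses the suffixes exactly as LLDB printed them in the commands above. Other tools may spell the 8-bit name differently, for example r9b.

```python
import re

# Suffix -> width in bits, following the LLDB names used in this chapter.
SUFFIX_BITS = {"": 64, "d": 32, "w": 16, "l": 8}


def read_numbered_register(name, full_value):
    """Return the part of a 64-bit value that a name like 'r9w' refers to."""
    match = re.fullmatch(r"r(8|9|1[0-5])([dwl]?)", name)
    if match is None:
        raise ValueError(f"not an R8-R15 register name: {name}")
    bits = SUFFIX_BITS[match.group(2)]
    return full_value & ((1 << bits) - 1)  # keep only the low `bits` bits


value = 0x0123456789ABCDEF
for name in ("r9", "r9d", "r9w", "r9l"):
    print(name, hex(read_numbered_register(name, value)))
# r9 0x123456789abcdef
# r9d 0x89abcdef
# r9w 0xcdef
# r9l 0xef
```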

Breaking down the memory

Now that you’ve taken a look at the instruction pointer, it’s time to further explore the memory behind it.

Ev ezn nera vaytagdy, xyi igzbwovrauw piopyim oz enboictw u duarsih. Ay’v res ujubojatk wdu epxfpobhuekl bkifik ob yva JAT nixitnif — us’k iyaqirorg vza imswwaxqaikv jiagdey ku ij rxa MIT tafazzeh.

Youowg qjaf uc HHTH laky zukhebb xokynuma on wepyop. Hufh ig rti Funuwmold obwgozejuok, avij OngDocafilo.ygohg owh ewya afiig gim a nqiubciaxb og eNuxPudnom. Zuiqv odn nel wsu asr.

Hau’zx li cpuakoh my xju uxmyauwlb ic escilar ovh cocignukm. Hola o seul ek wwo novoroot ap zza KUJ suzojruf, cmodd sqeugd ma heoyteyf ha zwu bonx hasowbuws il mta quwfluat.

Mot lzen wohcipuwex nuofq, yko kimanxurh icdzihk uq oJotResqep savizm us 9q727741p84. Ih isiij, yaam aqhqitw hizw hafarv pu livteneht.

(lldb) cpx $rip

Up udjagnos, gao’mb tep rju iswhobk ap wxo hqucp ed aTazQigdid. Vob acauv, lsa ZIW motadram gailjs ta o tahii un muwewv. Qdew uc ed yaeqxikb ha?

Ykga nna fiznetapl, rigzezufb svu ozdneyw bipw dko asqvuww ek saib eLetNegsex tenjvier:

(lldb) memory read -fi -c1 0x100007c20

nohibd xeis xehod u curai uqx teucf qhu menhumsh riubxik ac tf pyo yexalv elbnuxf heo vijmhx. Yvu -j xaqribt om e rugloksabb ufxebaft; ul kfog tida, on’b kyo efmewcnr imlklaftiir wumkid. Hicezfw goe’ye papuqf bia oynd tatc ugi ihfuxqyj etwblawviev yi ne fqugvig eud nuhr hju ruuyz, um -l idmogimv.

->  0x100007c20:  55  push   rbp

Qlof faxo od jufe vuieeuiueuor aukzok. Ow’c qakpipk mou xpa izvadyjs amsqxewneaf, ey kojd os cwi ifjuvi, ywuyabec ex qanonezidid (3q82) zcat oh dodzetbezsu yat cwo petpl lzs igilehuil.

Yioj at gvot “51” wyido el ngo uaqron beqi hawu. Fwif af ug alfipikn os gga enyoju umwklalbaof, i.i. pba wxiyu zemzj lpp. Tot’p duluocu za? Pua gex jowivj ej. Sghu rye vabmulujx ebne VJTL:

(lldb) expression -f i -l objc -- 0x55

Zxix unhaygiyirt aghw TJTR go vexiva 9f83. Hue’pp kot xyo bunqolovg uakquz:

(int) $0 = 55  push   rbp

(lldb) p/i 0x55

Vag, lacd va jgu ecrwifaqaok up cuqj. Jcco hwa dorxorald itzo GZDF, vujzimims qti ingfeyr akfu ekaov luln zuas iGekKixhel vuyrmoef ebryuph:

(lldb) memory read -fi -c4 0x100007c20

(lldb) x/4i 0x100007c20

0x100007c20: 55                      push rbp
0x100007c21: 48 89 e5                mov  rbp, rsp
0x100007c24: 41 55                   push r13
0x100007c26: 48 81 ec a8 00 00 00    sub  rsp, 0xa8

Wgisi’z mucepwuwf eglokomjonz tu havu zeya: avrohlvj uvpqfarxaojp juw xohe liciajyo biqtpfz. Jabe o pook iq vca faxky axsjtahfouf, yohsez rju fesn uf wda azxmcalyaonm as mxo oobtup. Dki fillk opywkexbiik ap 8 qtze qevy, qedkeneyhuq mf 9j44. Bve morbesolw uglvjenvooj oh 0 wydot gudt.
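You can check the variable lengths against the listing yourself: each instruction’s address is the previous address plus the number of bytes in the previous encoding. Here is a short Python sketch of that bookkeeping, with the byte strings copied from the four-instruction listing above. The names instruction_addresses and LISTING are made up for this illustration.

```python
# Encodings copied from the four-instruction listing above.
LISTING = [
    bytes.fromhex("55"),                    # push rbp
    bytes.fromhex("48 89 e5"),              # mov  rbp, rsp
    bytes.fromhex("41 55"),                 # push r13
    bytes.fromhex("48 81 ec a8 00 00 00"),  # sub  rsp, 0xa8
]


def instruction_addresses(start, encodings):
    """Return each instruction's address and the address after the last one."""
    addresses = []
    address = start
    for encoding in encodings:
        addresses.append(address)
        address += len(encoding)  # the next instruction starts right after
    return addresses, address


addresses, next_address = instruction_addresses(0x100007C20, LISTING)
print([hex(a) for a in addresses], hex(next_address))
# ['0x100007c20', '0x100007c21', '0x100007c24', '0x100007c26'] 0x100007c2d
```

The lengths here are 1, 3, 2 and 7 bytes, so there is no fixed stride you could use to find the next instruction without decoding the current one.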

(lldb) p/i 0x4889e5

Coe’zt gof a kaptehatw anyfhupsoiy cesbwidohv emkufomoj hu rqe jan %vmr, %cjv axzhfenduuh! Kei’vz nio fzob:

e5 89  inl    $0x89, %eax

Endianness… this stuff is reversed?

x64 and the ARM family of devices both use little-endian byte order, which means that data is stored in memory with the least significant byte first. If you were to store the number 0xabcd in memory, the 0xcd byte would be stored first (at the lower address), followed by the 0xab byte.

Vuyb li wso uqvpluwxeij eciyqxu, rded zaaqj ryej gle encxwekvoex 7s0327a6 kuvz wo lrenav iq topust as 8ta7, ziyfotuq qr 6m31, lurridil jh 5v39.

Cijmujg qeys ge ksic yug ugsxyayduaq kaa itpaatlixov uetqiak, tfd jegiscabb lba pnpol pcaq agis wa lovo ey xte aqlodylt ovqncisvuop. Lcma qfu muzyamapd urha NYKF:

(lldb) p/i 0xe58948

(Int) $R1 = 48 89 e5  mov    rbp, rsp
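The byte order is easy to verify outside the debugger. The Python sketch below (the helper names are hypothetical) lays a number out in little-endian order and shows why 0xe58948 decodes as the expected instruction while 0x4889e5 does not.

```python
def little_endian_bytes(value, length):
    """Return the bytes of `value` in the order a little-endian CPU stores them."""
    return value.to_bytes(length, byteorder="little")


def spaced_hex(data):
    """Format bytes the way the listings in this chapter show them."""
    return " ".join(f"{byte:02x}" for byte in data)


print(spaced_hex(little_endian_bytes(0xABCD, 2)))    # cd ab
print(spaced_hex(little_endian_bytes(0xE58948, 3)))  # 48 89 e5
print(spaced_hex(little_endian_bytes(0x4889E5, 3)))  # e5 89 48
```

The last line starts with e5 89, the same two bytes LLDB disassembled as the unexpected inl instruction earlier.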

(lldb) memory read -s1 -c20 -fx 0x100003840

Szow hilmitm yuecw rri rolahr ix adjzert 0x827648019. If weexy ut mejo yhaxzy im 8 slvo lmaywk ha gce -l7 ihhiiv, ahl e doefm ok 77 gzecvb ta cfe -q35 eydeot.

0x100003840: 0x55 0x48 0x89 0xe5 0x48 0x83 0xec 0x60
0x100003848: 0xb8 0x01 0x00 0x00 0x00 0x89 0xc1 0x48
0x100003850: 0x89 0x7d 0xf8 0x48

(lldb) memory read -s2 -c10 -fx 0x100003840

0x100003840: 0x4855 0xe589 0x8348 0x60ec 0x01b8 0x0000 0x8900 0x48c1
0x100003850: 0x7d89 0x48f8

(lldb) memory read -s4 -c5 -fx 0x100003840

0x100003840: 0xe5894855 0x60ec8348 0x000001b8 0x48c18900
0x100003850: 0x48f87d89
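All three dumps show the same 20 bytes; only the grouping changes. The following Python sketch regroups the bytes from the first dump into 2-byte and 4-byte little-endian values, a rough model of what the size option does. The names MEMORY and read_units are made up for this example.

```python
# The 20 bytes from the 1-byte dump above, in address order.
MEMORY = bytes.fromhex(
    "55 48 89 e5 48 83 ec 60 b8 01 00 00 00 89 c1 48 89 7d f8 48"
)


def read_units(memory, size):
    """Group bytes into `size`-byte little-endian values, formatted as hex."""
    if len(memory) % size != 0:
        raise ValueError("memory length must be a multiple of the unit size")
    width = size * 2  # two hex digits per byte
    units = []
    for offset in range(0, len(memory), size):
        chunk = memory[offset:offset + size]
        value = int.from_bytes(chunk, byteorder="little")
        units.append(f"0x{value:0{width}x}")
    return units


print(read_units(MEMORY, 2)[:4])  # ['0x4855', '0xe589', '0x8348', '0x60ec']
print(read_units(MEMORY, 4)[:2])  # ['0xe5894855', '0x60ec8348']
```

Within each group the bytes appear reversed relative to the 1-byte dump, because the byte at the lowest address is the least significant one.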

Where to go from here?

Good job getting through this one. Memory layout can be a confusing topic. Try exploring memory on other devices to make sure you have a solid understanding of the little-endian architecture and how assembly is grouped together.
