While working with the models in this lesson, you’ve likely noticed that they can be quite large. Still, these are tiny compared to some of the largest models, such as Stable Diffusion, which can run as large as 8 GB, and the recent Llama models, which can reach tens of gigabytes.
These large sizes can be a poor fit for mobile devices where storage and RAM are at a premium. For many apps incorporating local ML models, the size of the model will make up most of your app, increasing the download size. Putting off the download until later only pushes the problem into the future without solving it.
Shrinking the model provides advantages beyond reducing the size of your app download. A smaller model can also run faster because less data needs to move between the device’s memory and CPU.
The first approach to addressing this problem is to reduce the model size during training. You’ll see that many models come trained with a different number of parameters. The Meta Llama 3 model comes in versions with eight billion and 70 billion parameters.
The ResNet101 model you worked with earlier in the lesson is about 117 MB at full size, with each weight stored as a Float16, which takes two bytes. Effectively reducing the model size requires balancing the smaller size against the model’s performance and quality of results.
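To get a feel for these numbers, a quick back-of-the-envelope calculation multiplies the parameter count by the bytes per weight. The function below is a made-up helper for illustration, not part of any library:

```python
# Rough model size: parameter count x bytes per weight.
def model_size_gb(num_params: float, bytes_per_weight: float) -> float:
    return num_params * bytes_per_weight / 1e9

# Llama-3-class models stored as Float16 (2 bytes per weight):
print(model_size_gb(8e9, 2))   # 16.0 -> about 16 GB
print(model_size_gb(70e9, 2))  # 140.0 -> about 140 GB
```

The same arithmetic explains why halving the bytes per weight halves the download.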
Reduction Techniques
There are three primary techniques used in Core ML Tools to reduce model size. First, weight pruning takes advantage of the fact that most models contain many weights that are zero or near enough to zero to be treated as zero. If you store only the non-zero values, you save two bytes for every weight you drop. For the ResNet101 model, that can save about half the size. You can tune the amount of compression by adjusting the magnitude threshold below which a weight is set to zero.
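A minimal NumPy sketch of the pruning idea, using made-up weights and a made-up threshold (Core ML Tools applies this to the real weights inside your model):

```python
import numpy as np

# Made-up weights standing in for a real layer's Float16 parameters.
rng = np.random.default_rng(42)
weights = rng.normal(0.0, 0.02, size=1000).astype(np.float16)

# Treat any weight whose magnitude falls below the threshold as zero.
threshold = 0.01
pruned = np.where(np.abs(weights) < threshold, 0, weights)

# Store only the non-zero values and their positions.
nonzero_idx = np.nonzero(pruned)[0]
nonzero_vals = pruned[nonzero_idx]

# Fraction of weights that no longer need two bytes each.
sparsity = 1 - nonzero_vals.size / weights.size
print(f"sparsity: {sparsity:.0%}")
```

Raising the threshold prunes more weights and shrinks the file further, at the cost of accuracy.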
The second technique is quantization. This technique reduces the precision from a Float16 to a smaller data type, usually Int8. An Int8 stores values between -128 and 127. This will save half the size of the original model.
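Here’s a rough NumPy sketch of symmetric linear quantization on made-up weights; Core ML Tools handles scale selection and storage for you, so treat this only as an illustration of the idea:

```python
import numpy as np

# Made-up Float16 weights; real values come from a trained model.
rng = np.random.default_rng(1)
weights = rng.normal(0.0, 0.05, size=16).astype(np.float16)

# Map the float range symmetrically onto the Int8 range [-127, 127].
scale = float(np.abs(weights).max()) / 127.0
quantized = (
    np.round(weights.astype(np.float32) / scale)
    .clip(-127, 127)
    .astype(np.int8)
)

# Dequantizing shows how much precision the coarser grid loses.
restored = quantized * scale
max_error = float(np.abs(weights.astype(np.float32) - restored).max())
print(quantized.itemsize)  # 1 byte per weight instead of 2
```

Each weight now occupies one byte instead of two, and the round-trip error stays within one quantization step.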
The third technique groups similar weights and replaces each weight value with an index into a lookup table. This is known as palettization, which works by replacing weights with similar values with a single value and storing that value in the lookup table. You then replace each weight with its index value. The amount of compression depends on the number of values in the lookup table. For some models, you may need as few as four index values, resulting in a compression of 8x. Many newer models also support using different lookup tables for different model parts.
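The steps above can be sketched in NumPy with made-up weights; a real palettizer typically clusters values with k-means rather than the simple rounding used here:

```python
import numpy as np

# Made-up weights with clearly similar groups of values.
weights = np.array(
    [0.11, 0.12, 0.50, 0.49, -0.30, -0.31, 0.12, 0.50],
    dtype=np.float16,
)

# Cluster similar values (here by rounding), keep one value per cluster
# in a lookup table, and store each weight as an index into that table.
palette, indices = np.unique(
    np.round(weights.astype(np.float64), 1), return_inverse=True
)

print(palette)  # the lookup table holds just 3 distinct values
print(indices)  # per-weight indices; 2 bits each would be enough
```

With a four-entry table, each Float16 weight (16 bits) shrinks to a 2-bit index, which is where the 8x figure comes from.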
Each method works best for different distributions of model weights. However, all of them lose information found in the original model. That means you must balance the amount of compression against the reduction in model accuracy and find the best compromise for your use case.
This compression can be done either after training, as you’ll do in this lesson, or during training. Doing compression during training usually lets you get the same accuracy at a higher compression rate, at the cost of adding complexity and time to the training process.
Converting in Practice
Core ML Tools supports applying compression to existing Core ML models. Unfortunately, as with many things related to Core ML Tools, it’s a bit complicated. A separate set of packages works on the older .mlmodel type files compared to the newer .mlpackage files. In this section, you’ll work a bit with the latter.
This will save your model to disk with a different name. If you load the two files, you’ll notice the new one is half the size of the previous one. You can see that converting from a 16-bit value to an eight-bit value should reduce the size by half.
Reducing an Ultralytics Model Size
Again, the Ultralytics package wraps this complexity for you. Enter the following code:
from ultralytics import YOLO

# Load the YOLOv8x model trained on Open Images V7.
model = YOLO("yolov8x-oiv7.pt")
# Export to Core ML with non-maximum suppression and Int8 quantization.
model.export(format="coreml", nms=True, int8=True)
This differs from your earlier export by adding the int8=True parameter, which activates Int8 quantization. This will take a few minutes to run, but when it completes, you’ll have a file that’s roughly half the size of the original one.
How does this compression and optimization affect the speed and accuracy of the results? You’ll explore that in the next lesson as you integrate these models into an iOS app.
This content was released on Sep 19 2024. The official support period is 6 months from this date.
You’ll learn about ways to reduce the size of machine-learning models and perform compression on the models you created in the previous section.