Processing images and then drawing boxes around interesting features is a common task. For example, funny-face filters that draw googly eyes need to know where the eyes are. The general workflow is to get the bounding boxes from the observations and then use those to draw an overlay on the image or draw on the image itself.
When an observation returns a bounding box or point, it’s usually in what Apple calls a “normalized” format: all the values are between 0 and 1. This is so that regardless of your image’s display size, you’ll be able to locate and size the bounding box correctly. One way to think about it is as a percentage: If the bounding box’s origin is at (0.5, 0.5), it’s 50 percent of the way across the image. So regardless of the size you display the image at, the bounding box must be drawn halfway along both the x and y axes, which puts its origin point at the center of the image. A point of (0, 0) is at the origin point of the image, and (1.0, 1.0) is in the corner opposite the origin. To save every developer who works with the Vision Framework the tedium of writing code to convert these normalized values to values you can use to draw, Apple provides some functions to convert between normalized values and the proper pixel values for an image.
Functions for Converting From Normalized Space to Pixel Space
VNImageRectForNormalizedRect
Converts a normalized bounding box (CGRect with values between 0.0 and 1.0) into a CGRect in the pixel coordinate space of a specific image. Use this when you need to draw a bounding box on the image.
VNImagePointForNormalizedPoint
Converts a normalized CGPoint (with values between 0.0 and 1.0) into a CGPoint in the pixel coordinate space of a specific image. This is useful for translating facial landmark points or other keypoints onto the image.
VNImageSizeForNormalizedSize
Converts a normalized CGSize (with values between 0.0 and 1.0) into a CGSize in the pixel coordinate space of a specific image. This can be used when scaling elements relative to the image size.
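For example, here’s a quick sketch of how these conversions work; the image dimensions here are hypothetical:

import Vision
import CoreGraphics

let imageWidth = 1000
let imageHeight = 800

// A normalized box whose origin sits at the center of the image.
let normalizedBox = CGRect(x: 0.5, y: 0.5, width: 0.25, height: 0.25)

// In pixel space: origin (500.0, 400.0), size 250.0 x 200.0.
let pixelBox = VNImageRectForNormalizedRect(normalizedBox, imageWidth, imageHeight)

// Points and sizes convert the same way.
let pixelPoint = VNImagePointForNormalizedPoint(
  CGPoint(x: 0.5, y: 0.5), imageWidth, imageHeight)
let pixelSize = VNImageSizeForNormalizedSize(
  CGSize(width: 0.25, height: 0.25), imageWidth, imageHeight)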
Origin Points
One of the difficulties when you work with the Vision Framework is knowing where the origin, or (0, 0) point, of an image or rectangle is. All Vision observations that return coordinates (rectangles, points, etc.) assume that (0, 0) is at the bottom-left of the space. When you work with a pure CGImage or CIImage, there won’t be a problem, because those also have their origin at the bottom-left. However:
Origin Points of Different Image Formats in iOS Development
Depending on the original format of the image, the result might have a different origin point. An image generated with the camera in landscape mode will have a different origin point than one in portrait. Generally, the underlying data for a UIImage might not be in the “right” orientation for display. For a camera image, there’s EXIF metadata, and for UIImage there’s an imageOrientation property, so the system knows which way to rotate the image to make the result look “right-side up” regardless of the actual orientation of the data from the camera.

This means that if you take a photo as a UIImage and pass that through your VNImageRequest to get some bounding boxes, the origin point of the bounding boxes and the origin point of the image might not match. So when you go to draw the box on the image, it’ll be in the wrong place, or after you get your image out of your process, it might be rotated.
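If you’re drawing in UIKit’s top-left-origin coordinate space, one small fix is to flip the normalized rectangle vertically before converting it to pixels. Here’s a sketch of that idea; the helper name is mine, not part of Vision:

import CoreGraphics

// Flips a normalized, bottom-left-origin Vision rect into a
// top-left-origin space by mirroring it across the y-axis.
func flippedForUIKit(_ normalizedRect: CGRect) -> CGRect {
  CGRect(
    x: normalizedRect.origin.x,
    y: 1.0 - normalizedRect.origin.y - normalizedRect.height,
    width: normalizedRect.width,
    height: normalizedRect.height)
}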
Nwaxu ote u hujzah ek cxtasoceah kei yah ewu xu hiyewuni pdud. Voi xojql jurqaqq kiuv EEUniye ye a .kgig xutuugu vkal titit et wqu viyiqiey uwb zro uyica lewr po rtu urnodxin ajuiqpesaed. Dpor pue’nu dvadubh, fea sohvy lxulltacx fxo byovilc-zracu otoehsahiet. Itotdag zol buurd fa da dodd ejvnw jvi weno cumahaow ywol oID ogctois vo wnu fufulf wuhogo vipnkitesr sqop.
To apply the rotation yourself, you need to invert what iOS does. For example, if iOS wants to rotate the pixels 90 degrees clockwise to make the orientation look correct, that means the original pixels are rotated 90 degrees counter-clockwise. So when you’re drawing on the image, you’ll need to assign the right .imageOrientation value to the final UIImage. You might apply a function like this one:
func convertImageOrientation(_ originalOrientation: UIImage.Orientation)
  -> UIImage.Orientation {
  switch originalOrientation {
  case .up: // 0
    return .downMirrored // 5
  case .down: // 1
    return .upMirrored // 4
  case .left: // 2
    return .rightMirrored // 7
  case .right: // 3
    return .leftMirrored // 6
  case .upMirrored: // 4
    return .down // 1
  case .downMirrored: // 5
    return .up // 0
  case .leftMirrored: // 6
    return .right // 3
  case .rightMirrored: // 7
    return .left // 2
  @unknown default:
    return originalOrientation
  }
}
Ag qvo vado esahu, uidr ih sbi vurkukfij vohiot uq qjo xecLegii ov sve .ofomiEvuoklaroix. Zviv wii sjifw xicl i EUEwixu wfuy qif in .ec uveozxaqeiy esl tiu sapnidd oz he o QTUbeye ilx rzuc hnun en ub uv a ZMYacrimt, if’zy di .noszVifdimap byic jae jamhenp et qibp me i EEUyexo. Ji fson fgiukesp fdo xehow OAIqeva, mucl evfoyl uj fmox ujoijqalauf epf iUL pocx xuso newu er ek.
Remember, this is just one way to deal with the origin point location problem. But now that you’re aware it’s sometimes an issue, you’ve got a mental toolkit to examine why your code isn’t drawing boxes in the places you expect.
Working With Faces
Now that you know about bounding boxes and rotation, it’s a good time to learn about a special case: the face requests. Apple provides several requests for faces and several for body poses. In addition to identifying where faces exist in an image, some requests can identify where landmarks like the nose and eyes are. Apple uses many of these in the Camera and Photos apps, so they’ve made them available to you as well.
iOS 11
VNDetectFaceRectanglesRequest and VNFaceObservation: Detects faces in an image by finding the bounding boxes of face regions.
VNDetectFaceLandmarksRequest and VNFaceObservation: Detects facial features such as eyes, nose, and mouth in detected face regions.
iOS 13
VNDetectFaceCaptureQualityRequest and VNFaceObservation: Estimates the quality of captured face images.
iOS 14
VNDetectHumanBodyPoseRequest and VNHumanBodyPoseObservation: Detects and tracks human body poses in images or videos.
VNDetectHumanRectanglesRequest and VNHumanObservation: Detects human figures in an image.
You’ll notice that a few of the requests share a VNFaceObservation result type. However, based on the descriptions, it would seem they might return different kinds of data. They do. Remember that observations are subclassed. One of the parent classes provides the boundingBox of the observation. VNFaceObservations also have optional values for roll, pitch and yaw to tell you the orientation of the face, and they have a complex property of landmarks that is of type VNFaceLandmarks2D. This contains a lot of information about where the edges of the eyes are, where the lips are and where the brows are. There are separate entries for the parts of the eye, so you can determine if the eye is open or closed.
These requests follow the same pattern as all the others, so you should have no trouble using them.
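As a rough sketch of that pattern (assuming a CGImage that’s already right-side up), here’s how you might run a landmarks request and convert the results into pixel space:

import Vision
import CoreGraphics

// Detect face landmarks and convert the results into pixel coordinates.
func detectFaces(in cgImage: CGImage) throws {
  let request = VNDetectFaceLandmarksRequest()
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try handler.perform([request])

  for face in request.results ?? [] {
    // boundingBox is normalized, so convert it to pixel space.
    let pixelRect = VNImageRectForNormalizedRect(
      face.boundingBox, cgImage.width, cgImage.height)
    print("Face bounding box: \(pixelRect)")

    // Landmark regions can convert their own points into image coordinates.
    if let leftEye = face.landmarks?.leftEye {
      let imageSize = CGSize(width: cgImage.width, height: cgImage.height)
      print("Left eye points: \(leftEye.pointsInImage(imageSize: imageSize))")
    }
  }
}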