Adding Augmented Reality to an ML Kit-Powered Android App

Machine Learning (ML) is quickly becoming a common feature in everyday computer interactions, and we often don’t even realize it. One example is ML-based image recognition for camera apps, such as the app introduced in the article “Face Detection on Android With Google ML Kit.” As demonstrated in that article, ML is now accessible even to individual developers thanks to projects like ML Kit, Google’s recently introduced Firebase SDK.

In the previous article we created an example app that takes a picture, hands it off to ML Kit to do face detection, and uses data returned by ML Kit to outline the features of the face.

In this article, we’re going to explore how we can add Augmented Reality (AR) features with the ML capabilities already in the app to place a red lip image on top of a person’s lips when we take a picture.

To create this app, we’re going to continue using the sample app created in the previous article. If you haven’t had the chance to read the previous article, make sure to read it and create the example app to follow along here.

Getting Started

To recap what was accomplished in the previous article, we created an app that could take a picture of a user. Then, using Google’s ML Kit SDK, it could detect whether there was a face in the picture and provide some extra information, such as the location of facial features. Using this information, we drew an overlay of the detected facial features on the image.

In this article, we are going to build off what we learned and see how we can add AR capabilities based on the information we got from ML Kit.

To accomplish this, we’re going to create another button that uses a new Graphic class, like the one made for the ML Kit article, to draw an image overlay on top of the user’s face. We will look into displaying a normal picture of a lip, and we’ll also look into setting color filters to get lips in different colors.

We’ll use the Android Studio project and Firebase registration information already created for the first part of this tutorial.

Getting the AR Overlay Asset

Before we start diving into the code, we first need to start looking for assets that we can use. For a commercial app, we would most likely hire designers to make extravagant and interesting images to use. In our case, we are making a simple app to demonstrate how to overlay an image on top of our own image. I used two lipstick assets from pixabay.com. First we’ll use Red lips with teeth:

We’ll also use red lipstick mark:

For the first image, we’re going to apply the asset directly over the lips. This asset will be referred to as lip in the app.

For the second image, we’re going to do something more interesting. We’re going to apply color filters over the image to re-use it with different colors. This asset will be referred to as filter_lip in the app, which matches the R.drawable.filter_lip resource used in the code.

Understanding the ML Kit API to Use

To add our lip overlay on top of the user’s lip, we need to use ML Kit’s Facial Landmark recognition to locate where the user’s lips are.

To accomplish this, we need to look for the lips from the FirebaseVisionFace object we receive from ML Kit. This object stores all the facial information of the user’s face. The landmark that we are going to be working with is MOUTH_BOTTOM.

According to the API documentation, MOUTH_BOTTOM gives us: “The center of the subject’s bottom lip.”

Unlike the previous example, this time we have a location that is close to where we want to place our image on the canvas, but it’s not the exact location we are looking for. We will need to do some work to ensure that we place our asset in the correct location.

Writing the Code

This time, when we finish, our result will look something like this:

Or with our lip color filter on, like this:

To let the user add these lips, we are going to change the layout of our existing app. We’re going to:

Create a new button, disabled for now, that will let us draw lipstick on the detected face.
Create a spinner that lets the user choose the color of the lipstick.

To create our layout, here’s what our new activity_main.xml looks like:

				
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout
        xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        tools:context=".MainActivity">

    <TextView
            android:layout_width="188dp"
            android:layout_height="wrap_content"
            android:id="@+id/happiness"
            android:layout_alignParentBottom="true"
            android:layout_marginBottom="82dp"
            android:layout_alignParentStart="true"/>
    <ImageView
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            tools:layout_editor_absoluteY="27dp"
            tools:layout_editor_absoluteX="78dp"
            android:id="@+id/imageView"/>
    <com.example.mlkittutorial.GraphicOverlay
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:id="@+id/graphicOverlay"
            android:layout_alignParentStart="true"
            android:layout_alignParentTop="true"
            android:layout_marginStart="0dp"
            android:layout_marginTop="0dp"/>
    <LinearLayout android:layout_width="match_parent" android:layout_height="wrap_content"
                  android:layout_alignParentStart="true" android:layout_alignParentBottom="true">
        <Button
                android:text="Take Picture"
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:layout_weight="1"
                android:onClick="takePicture"
                android:id="@+id/takePicture"
                android:visibility="visible"
                android:enabled="true"/>
        <Button
                android:text="Detect Face"
                android:layout_weight="1"
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:id="@+id/detectFace"
                android:onClick="detectFace"
                android:visibility="visible"
                android:enabled="false"/>
        <Button
                android:text="Draw Lip"
                android:layout_weight="1"
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:id="@+id/drawLip"
                android:onClick="drawLip"
                android:visibility="visible"
                android:enabled="false"/>
    </LinearLayout>
    <Spinner
            android:layout_width="145dp"
            android:layout_height="wrap_content"
            android:layout_alignParentEnd="true"
            android:layout_marginEnd="0dp"
            android:layout_alignParentBottom="true"
            android:id="@+id/colorSpinner"
            android:layout_marginBottom="78dp"/>
</RelativeLayout>

Here’s what it looks like on Android Studio:

Here’s MainActivity.kt, updated to use our new options:

				
package com.example.mlkittutorial

import android.content.Intent
import android.content.res.Resources
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.graphics.Color
import android.support.v7.app.AppCompatActivity
import android.os.Bundle
import android.provider.MediaStore
import android.view.View
import android.widget.ArrayAdapter
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.common.FirebaseVisionImage
import com.google.firebase.ml.vision.face.FirebaseVisionFace
import com.google.firebase.ml.vision.face.FirebaseVisionFaceDetectorOptions
import kotlinx.android.synthetic.main.activity_main.*

class MainActivity : AppCompatActivity() {
    private val requestImageCapture = 1
    private var cameraImage: Bitmap? = null
    private var faces: List<FirebaseVisionFace>? = null
    private var color = arrayOf("None", "Red", "Blue", "Green", "Yellow")

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        // setup the list of colors to show in our spinner
        val adapter = ArrayAdapter(this, android.R.layout.simple_spinner_item, color)
        adapter.setDropDownViewResource(android.R.layout.simple_selectable_list_item)
        colorSpinner.adapter = adapter
    }

    /** Receive the result from the camera app */
    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        if (requestCode == requestImageCapture && resultCode == RESULT_OK && data != null && data.extras != null) {
            val imageBitmap = data.extras.get("data") as Bitmap

            // Instead of creating a new file in the user's device to get a full scale image
            // resize our smaller imageBitmap to fit the screen. Multiply before dividing so
            // that integer division does not truncate the scale factor and distort the image.
            val width = Resources.getSystem().displayMetrics.widthPixels
            val height = width * imageBitmap.height / imageBitmap.width
            cameraImage = Bitmap.createScaledBitmap(imageBitmap, width, height, false)

            // Display the image and enable our ML facial detection button
            imageView.setImageBitmap(cameraImage)
            detectFace.isEnabled = true
        }
    }

    /** Callback for the take picture button */
    fun takePicture(view: View) {
        // Take an image using an existing camera app
        Intent(MediaStore.ACTION_IMAGE_CAPTURE).also { takePictureIntent ->
            takePictureIntent.resolveActivity(packageManager)?.also {
                startActivityForResult(takePictureIntent, requestImageCapture)
                happiness.text = ""
                graphicOverlay.clear()
            }
        }
    }

    /** Callback for the detect face button */
    fun detectFace(view: View) {
        // Build the options for face detector SDK
        if (cameraImage != null) {
            val image = FirebaseVisionImage.fromBitmap(cameraImage as Bitmap)
            val builder = FirebaseVisionFaceDetectorOptions.Builder()
            builder.setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS)
            builder.setClassificationMode(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
            builder.setLandmarkMode(FirebaseVisionFaceDetectorOptions.ALL_LANDMARKS) // new: lets us query MOUTH_BOTTOM

            val options = builder.build()

            // Send our image to be detected by the SDK
            val detector = FirebaseVision.getInstance().getVisionFaceDetector(options)
            detector.detectInImage(image).addOnSuccessListener { faces ->
                displayImage(faces)
            }
        }
    }

    /** Draw a graphic overlay on top of our image */
    private fun displayImage(faces: List<FirebaseVisionFace>) {
        graphicOverlay.clear()
        if (faces.isNotEmpty()) {
            // We will only draw an overlay on the first face
            val firstFace = faces[0]
            val smilingChance = firstFace.smilingProbability * 100
            val faceGraphic = FaceContourGraphic(graphicOverlay, firstFace)
            graphicOverlay.add(faceGraphic)
            happiness.text = "Smile Probability: " + (smilingChance) + "%"

            // Save the face and enable the lip drawing button
            drawLip.isEnabled = true
            this.faces = faces
        } else {
            happiness.text = "No face detected"
        }
    }

    /** Draw a graphical lip on top of the user's lips */
    fun drawLip(view: View) {
        graphicOverlay.clear()
        if (faces != null) {
            val position = colorSpinner.selectedItemPosition

            // based off of the position of our item, we pick the matching color
            val color = when (position) {
                1 -> Color.RED
                2 -> Color.BLUE
                3 -> Color.GREEN
                4 -> Color.YELLOW
                else -> null
            }
            val bmp = when (position) {
                0 -> BitmapFactory.decodeResource(resources, R.drawable.lip)
                else -> BitmapFactory.decodeResource(resources, R.drawable.filter_lip)
            }

            // Iterate through all of detected faces to create a graphic overlay
            faces?.forEach {
                val faceGraphic = FaceLipGraphic(graphicOverlay, it, bmp, color)
                graphicOverlay.add(faceGraphic)
            }
        }
    }
}

If you compare the code to the code from the previous project, you can see that we’ve made some changes.

In onCreate, I created a list adapter that displays the available lip color options in the spinner.
In detectFace, I enabled landmarkMode so that we can query landmark information about the face.
In displayImage, we now save the list of FirebaseVisionFace objects returned by detection, so we can draw lipstick on the user’s picture later without having to run detection again. Once detection has run, we also enable the draw lip button.
In drawLip, which is connected to our drawLip button, we check which lip option the user selected in the spinner. Based on that option, we load the matching image resource into a Bitmap, then iterate through our list of faces and pass the Bitmap and color into our new FaceLipGraphic class to draw the lipstick.
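As a possible alternative to the two parallel when expressions in drawLip, the spinner labels, image names, and tints could live in one list so they cannot drift out of sync. The sketch below is plain Kotlin with no Android dependencies. LipOption, lipOptions, and optionAt are hypothetical names that are not part of the app, and the color constants are written out as the ARGB values of android.graphics.Color.RED, BLUE, GREEN, and YELLOW.

```kotlin
// ARGB values equal to android.graphics.Color.RED, BLUE, GREEN and YELLOW.
val RED = 0xFFFF0000.toInt()
val BLUE = 0xFF0000FF.toInt()
val GREEN = 0xFF00FF00.toInt()
val YELLOW = 0xFFFFFF00.toInt()

/** Hypothetical holder: spinner label, drawable name, and optional tint color. */
data class LipOption(val label: String, val drawableName: String, val tint: Int?)

// The index in this list is the position of the item in the spinner.
val lipOptions = listOf(
    LipOption("None", "lip", null),
    LipOption("Red", "filter_lip", RED),
    LipOption("Blue", "filter_lip", BLUE),
    LipOption("Green", "filter_lip", GREEN),
    LipOption("Yellow", "filter_lip", YELLOW)
)

/** Returns the option at the given position, or the untinted lip if it is out of range. */
fun optionAt(position: Int): LipOption =
    lipOptions.getOrElse(position) { lipOptions[0] }

fun main() {
    println(lipOptions.map { it.label }) // labels that would feed the spinner adapter
    println(optionAt(2))                 // the "Blue" entry
    println(optionAt(-1))                // out of range, falls back to "None"
}
```

In the app, the labels would feed the ArrayAdapter and the selected entry would decide which drawable to decode and which color to pass to FaceLipGraphic.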

We haven’t seen our new FaceLipGraphic class yet, but at a high level, it’s similar to the FaceContourGraphic class provided by Firebase, in that we draw something on top of the picture we’ve taken. Here’s what the FaceLipGraphic class looks like:

				
package com.example.mlkittutorial

import android.graphics.*
import com.google.firebase.ml.vision.face.FirebaseVisionFace
import com.google.firebase.ml.vision.face.FirebaseVisionFaceLandmark

/** Graphic instance for rendering a lip image over a detected face in the overlay view. */
class FaceLipGraphic(overlay: GraphicOverlay, private val firebaseVisionFace: FirebaseVisionFace?, private val lips: Bitmap, color: Int? = null)
    : GraphicOverlay.Graphic(overlay) {

    private val lipPaint: Paint = Paint()

    init {
        lipPaint.alpha = 70
        if (color != null) {
            lipPaint.colorFilter = PorterDuffColorFilter(color, PorterDuff.Mode.SRC_IN)
        }
    }

    /** Draws the lips on the position on the supplied canvas. */
    override fun draw(canvas: Canvas) {
        val face = firebaseVisionFace ?: return
        val mouth = face.getLandmark(FirebaseVisionFaceLandmark.MOUTH_BOTTOM) ?: return

        // Get the center position of the bottom lip
        val mouthX = mouth.position.x
        val mouthY = mouth.position.y

        // Calculate the ratio of the size of a mouth to a face to re-size the lip image we have to an ideal size. I found this number through trial and error.
        val idealWidth = (face.boundingBox.width() / 2.5).toInt()
        val idealHeight = (face.boundingBox.height() / 4)
        val lipImage = Bitmap.createScaledBitmap(lips, idealWidth, idealHeight, false)

        // Our lip image will start at the center of the bottom lip. To allow our image to be in the center
        // we need to move our image half of its width to the left to move the center of our lip to be on the
        // center of the user's lips.
        val lipPositionX = mouthX - idealWidth / 2

        // We need to create an offset to move our lips to the center. Currently we're at the center of the bottom of our lip.
        // The center of the bottom lip is 1/4 of the whole lip. If we want to find the real center of the mouth, we need
        // to move our height another 1/4 up.
        val bottomLipOffset = idealHeight / 4

        // Just like our width, we need to move our image's height position half of its height up to be at the center
        // of the bottom lip. We also move our image up by another 1/4 of the bottom lip size so that the image will be
        // at the center of the user's mouth
        val lipPositionY = mouthY - idealHeight / 2 - bottomLipOffset

        // Draw the lipstick into our canvas
        canvas.drawBitmap(lipImage, lipPositionX, lipPositionY, lipPaint)
    }
}

Inside FaceLipGraphic, we do calculations to figure out where to place the lipstick image, then we draw it on to the picture we have taken.

In init, we set up our newly initialized Paint object that we will use to draw our lipstick onto the Canvas. The first thing we do is set the alpha to 70. Alpha ranges from 0 to 255, so this makes our image about 27% opaque, which lets us see what is underneath it.

Next, we check whether a non-null color was provided. If so, we set a color filter on our Paint object to tint our image. I won’t get into the specifics, but I chose PorterDuffColorFilter, which the Android documentation describes as a filter that can “tint the source pixels using a single color and a specific Porter-Duff composite mode.” I used SRC_IN as my PorterDuff.Mode because it replaces the color of every visible pixel in our image with the new color while keeping the image’s shape.
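To make the SRC_IN behavior more concrete, here is a rough per-pixel model in plain Kotlin. It is only an approximation for illustration: the function name tintSrcIn is hypothetical, it works on non-premultiplied ARGB integers, and the real filter runs inside Android’s rendering pipeline rather than in code like this.

```kotlin
// Same ARGB value as android.graphics.Color.RED (packed as 0xAARRGGBB).
val RED = 0xFFFF0000.toInt()

/**
 * Approximates a SRC_IN color filter for one ARGB pixel: the tint supplies
 * the color, and the original pixel's alpha decides how much of it shows.
 */
fun tintSrcIn(pixel: Int, tint: Int): Int {
    val pixelAlpha = pixel ushr 24
    val tintAlpha = tint ushr 24
    // Multiply the two alphas on a 0..255 scale, rounding to nearest.
    val outAlpha = (pixelAlpha * tintAlpha + 127) / 255
    // Keep the tint's RGB channels and apply the combined alpha.
    return (outAlpha shl 24) or (tint and 0x00FFFFFF)
}

fun main() {
    val opaqueBlack = 0xFF000000.toInt()
    val transparent = 0x00000000
    println(tintSrcIn(opaqueBlack, RED) == RED)   // an opaque pixel takes the tint color
    println((tintSrcIn(transparent, RED) ushr 24)) // a transparent pixel stays transparent
}
```

This is why a single lipstick image can be re-used for every color: only the shape of the image survives, and the filter supplies the color.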

Next, in draw(), we do the necessary calculations to place our lipstick image in the right position. I’ve added extensive comments on the calculations I did to derive the proper location in the code.

To those unfamiliar with Android’s canvas coordinates, my math might appear wrong, but I assure you it is correct. Here’s some more explanation:

A normal graph takes place in the first quadrant, so (0,0) would be located in the bottom-left corner. Android’s Canvas, on the other hand, takes place in the fourth quadrant, so (0, 0) is located at the top-left corner.
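As a small worked example of that difference, the following sketch converts a Y value measured from the bottom of a drawing area into the equivalent canvas Y value measured from the top. The function name toCanvasY and the 1000-pixel canvas height are assumptions made up for illustration.

```kotlin
/** Converts a Y value measured from the bottom edge into a canvas Y value measured from the top edge. */
fun toCanvasY(cartesianY: Float, canvasHeight: Float): Float = canvasHeight - cartesianY

fun main() {
    val canvasHeight = 1000f
    println(toCanvasY(0f, canvasHeight))    // 1000.0, the bottom edge of the canvas
    println(toCanvasY(1000f, canvasHeight)) // 0.0, the top edge of the canvas

    // On the canvas itself, moving a point up means making its Y value smaller.
    val y = 400f
    println(y - 50f)                        // 350.0, which is 50 pixels higher on screen
}
```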

Here’s an image to represent the coordinate system:

Specifically, unlike what most people usually assume, subtraction on our Y-axis actually moves our image up instead of down. The X-axis remains the same. With this knowledge, you should be able to understand the calculations I did to determine the coordinates at which the lipstick image needs to be drawn.
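The placement math from draw() can also be tried outside of Android. The sketch below repeats the same arithmetic in plain Kotlin. LipPlacement and placeLip are hypothetical names, and the face size and mouth position are made-up sample values rather than real ML Kit output.

```kotlin
/** Top-left corner and size of the scaled lip image, in canvas pixels. */
data class LipPlacement(val left: Float, val top: Float, val width: Int, val height: Int)

fun placeLip(mouthX: Float, mouthY: Float, faceWidth: Int, faceHeight: Int): LipPlacement {
    // Same trial-and-error ratios as FaceLipGraphic.draw().
    val width = (faceWidth / 2.5).toInt()
    val height = faceHeight / 4

    // Shift left by half the image width so the image is centered horizontally on the mouth.
    val left = mouthX - width / 2

    // Shift up by half the image height, plus a quarter more, because MOUTH_BOTTOM
    // is the center of the bottom lip rather than the center of the whole mouth.
    val bottomLipOffset = height / 4
    val top = mouthY - height / 2 - bottomLipOffset

    return LipPlacement(left, top, width, height)
}

fun main() {
    // A 500 x 640 face with the bottom lip centered at (300, 450).
    println(placeLip(300f, 450f, 500, 640))
    // prints LipPlacement(left=200.0, top=330.0, width=200, height=160)
}
```

Both subtractions on the Y value move the image up the screen, which is the direction we need to go from the bottom lip to the center of the mouth.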

Conclusion

When you first started reading this article, you might have thought that using ML and AR in your app was outside of your capabilities. Hopefully, after going through both this article and the previous one, you have learned how to use Google’s ML Kit to detect faces and facial features, and how to use that information to add AR functionality to your app. You should now be more confident in using this technology to create more innovative apps.

Another benefit from using Google’s ML Kit is that, thanks to the power of the Arm processor that is built into our phone, we are able to do all this on-device, with no need to send any data to the cloud for processing.

You’ve already seen how we were able to apply lipstick images on a user’s image, but that’s only the tip of the iceberg in terms of what can be accomplished with ML and AR! If you want to keep building on this app, see how you might be able to add googly eyes over an eye, or place a funny nose on top of someone’s nose. Good luck, and have fun!

ContentLab

ContentLab provides high-quality written articles, tutorials, courses, and other technical marketing materials to industry leaders. We create no-nonsense tech content that’s purpose-built to attract, educate, and engage your technical audience.
