How to build an edge cloud part 1: Building a simple facial recognition system

If you look around the internet, there's a lot of talk about edge clouds and what they are, at least at a conceptual level.  But not too many people are telling you how to actually build one. Today we're going to start to change that.

In the next few articles, we’re going to build a simple edge cloud that demonstrates the basic concepts behind why you’d want to use them and how they work.  We’ll be emulating an edge-cloud based surveillance camera system that analyses a live video stream looking for faces, then notifies the user when a stranger appears on their doorstep.

The target edge cloud architecture

The idea behind an edge cloud is that it moves processing closer to where the data actually is, while still escalating work to more powerful hardware where necessary.  Consider this architecture:

From a general standpoint, we have the end user client, which interacts with an edge cloud, which then feeds into a regional or central cloud. But let’s make that more concrete.

In our surveillance camera example, the client is the camera itself.  This might be a doorbell camera, or a webcam, or even someone’s cell phone.  In another context, it might be a gaming device, a cash register, an industrial sensor, or any other Internet of Things (IoT) object.

The client feeds to the edge cloud.  So in this case, the camera is sending a video stream to an application on an edge cloud.  In the real world, this might be a cloud that’s located directly in a cell tower handling local transactions, or it might be a small cloud inside a retail establishment.  For the purpose of our example, the edge cloud will host an application that analyses the video and looks for frames that include faces.

The edge cloud can feed to a regional cloud, or directly to a central cloud.  For example, in a real environment, the edge cloud might report transactions and inventory levels back to a cloud in a nearby corporate data center. In our example, the edge cloud will feed into a regional cloud, which will take any frames that include faces and identify the people in those frames by comparing them against “known” occupants and visitors.  If a “stranger” is found, the regional cloud reports the incident to the central cloud.

The central cloud is the nerve center for the entire operation; in a real world application this might be the main corporate database or other centralized systems. In our example, the central cloud is responsible for making decisions on what to do if a stranger is seen in the surveillance camera. It might alert law enforcement or take other actions. For our purposes, we’ll send an email with the stranger’s photo so the user can make a judgement of what to do.

Over the course of the next several articles, we’ll go through the process of building up these clouds so you can see how they can work together.  We’ll create containers managed by Kubernetes, look at how we can run VM-based resources in those clouds, and even look at options for storage that will help us move data around without having to do it explicitly.

For the moment, however, we need to start with the basics: creating the pieces of functionality our cloud is going to need.

Getting started: detecting faces in the video

The first piece of functionality we’re going to build is the ability to detect when faces appear in a video.  This may seem like a complicated process, but it’s actually pretty straightforward, especially using a library such as OpenCV, which is made specifically for computer vision tasks.

You can find a simple explanation of how face detection works in this article, or more details of the math behind it here, but the short version is that OpenCV has pre-trained classifiers that know how to recognize facial features such as eyes, noses, ears, and so on.  We’ll make use of those classifiers to detect whether and where OpenCV sees a face.

Start by making sure you have Python 3 installed, then install OpenCV using pip:

pip3 install opencv-python

Now we're ready to get started.  In our example we're assuming that we have a connected video camera streaming into the system, but if you don't actually have a camera you want to hack into at the moment (and I don't), we can use the local webcam through OpenCV's capabilities.

Create a new file called camera.py and add the following code:

import cv2

# For webcam
cap = cv2.VideoCapture(0)

# For predefined video file
# cap = cv2.VideoCapture('filename.mp4')

while True:
    # Get a single frame; found_frame is False if the read failed
    found_frame, frame = cap.read()
    if not found_frame:
        break

    # Display the captured frame
    cv2.imshow('Video Feed', frame)

    # Stop when escape key is pressed
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

After importing the OpenCV library, we’re creating an object that represents the video we’re going to analyze.  In my case I’m going to use the webcam, but you also have the option to use a predefined video file instead.

Once we’ve done that, we’re setting up an infinite loop that goes through and reads the current frame from the camera, returning a boolean value representing whether an image was successfully acquired, as well as the actual image frame itself.  From there we’re displaying the frame in a window called “Video Feed”.

Now, although what we’re seeing is going to LOOK like video, it’s actually just a series of frames, each displayed for 30 ms.  To make that happen, we’re using the cv2.waitKey() function; if we didn’t, the image wouldn’t appear at all — or would disappear so fast it would be like it never happened.  If you set the delay to 0, the image will remain static until the user presses a key.

Since we’re looking for the key anyway we can use this function as a way to exit the loop, using the 0xff mask to help get the ASCII value of the actual key pressed.
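To see what the 0xff mask is doing, here's the check in isolation: on some platforms waitKey() returns the key code with extra high bits set, and masking keeps only the low byte (the raw value below is hypothetical):

```python
# Hypothetical raw value cv2.waitKey() might return for the Esc key
# on a platform that sets extra high bits; 0x1B (27) is Esc in ASCII.
raw = 0x10001B
key = raw & 0xff  # keep only the low byte
print(key)        # → 27
```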

Finally, we’re releasing the camera object to clean things up. 

Now if you go ahead and run this file with Python 3, you’ll see a window that shows what your webcam sees.

Now aside from the questionable taste of my bedroom furniture, you can see that we’ve got a pretty large window, and that it’s titled with the name we gave it.  If you try to close this window, you’ll find the next frame just pops right up. To close it you need to either hit the escape key or go back to the terminal and press CTRL-C.

Now let's start actually looking for faces in these frames.  The first thing we need to do is get the training data for the classifier.  There are a number of different files available for the CascadeClassifier depending on what we're looking for, but we want the whole face, so download the haarcascade_frontalface_default.xml file.  (In the OpenCV GitHub repo you can see some of your other options, such as configurations for just eyes, or for smiles, or even for Russian license plates.  The important thing is that the classifier has to load the proper training data to find specific objects.)

import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# To capture video from webcam.
...

Now we can go ahead and use the classifier to detect the faces.  To make things easier for the classifier, we’ll first convert the frame to a grayscale image:

...
while True:
    # Get a single frame
    found_frame, frame = cap.read()

    # Convert to grayscale and detect faces
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the captured frame
    cv2.imshow('Video Feed', frame)
...

After we convert the color, we can go ahead and do the detection.  In this case we're using the detectMultiScale() function, which scans the image for faces at a range of progressively larger window sizes.  (For a more detailed explanation of how it works and the parameters involved, see the documentation.)
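As a rough sketch of what that second argument (1.1) controls: between passes, the detection window scales up by 10 percent, so the sizes the detector tries grow geometrically (the 100-pixel starting size here is just an illustration):

```python
# First few search-window sizes with scaleFactor 1.1, starting from
# a hypothetical 100-pixel base; each pass is 10% larger than the last.
sizes = [round(100 * 1.1 ** k) for k in range(4)]
print(sizes)  # → [100, 110, 121, 133]
```

The third argument (4) is the minimum number of overlapping detections required before a region counts as a face.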

The detection method gives us a list of values that represent the area of the image in which it thinks there’s a face, and we can go ahead and add a rectangle to the image to show where it thinks those faces are.  We’ll specify the image, the upper left and lower right corners, the color (note that this is in blue-green-red, not red-green-blue) and the thickness of the line. (You can use the cv2.circle() function as well.)

Now if we run it we get a look at where OpenCV thinks the face is in the image:

Notice that I said where it THINKS the face is in the image; I wasn’t able to capture it here but you’re virtually guaranteed to get false positives in these images.  For example, for some reason it sees faces in my venetian blinds. (Which is kind of creepy, actually.)

Later, we’ll be vetting these faces so that we don’t get a ton of false positives sent to the user.

Now we need to deal with the frames that have faces in them.  In the real world we may send them on immediately, but for now let’s just save them to disk:

import cv2

i = 0
def send_image(face_img):
    global i
    i = i+1
    if i % 10 == 0:
        filename = "saved"+str(i)+".jpg"
        cv2.imwrite(filename, face_img)

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
...
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)

    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    if len(faces) > 0:
        send_image(frame)

    # Display the captured frame
    cv2.imshow('Video Feed', frame)
...

First we’re creating the new function send_image(), and telling it to save every 10th image, so we don’t overwhelm the server (or the user).  Then just before we display the image, we’re checking to see if it has any faces, and if so, we’re sending it to the send_image() function, where it’ll get saved to disk.  Note that because we’re doing this AFTER the cv2.rectangle() call, the saved image does have the rectangles in it.
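The modulo check is easy to sanity-check on its own; out of thirty calls to send_image(), only three frames actually get written:

```python
# Call counts 1..30; only every 10th call triggers a save.
saved_calls = [i for i in range(1, 31) if i % 10 == 0]
print(saved_calls)  # → [10, 20, 30]
```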

In the next section we’ll send the image on to the next step, but first we need to build that script.

Sending faces to the regional cloud for recognition

In later parts of this series, we’ll look at different ways to share data around a cloud architecture, but for the moment, we’re simply going to send images over HTTP.  To do that, we need to create a simple web server. Rather than building an entire WSGI architecture, we’re going to use Flask, which lets us easily create a development HTTP server.  Installation is straightforward with pip:

pip3 install flask

Now let’s go ahead and create the web server.  Later, we’ll move all of these scripts to their own clouds, but for now you can create it on the same machine on which you created camera.py.  

Create a new file called receiveframe.py and add the following code:

from flask import Flask

app = Flask(__name__)

@app.route('/')
def docroot():
    return "This is the regional cloud."

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

This is perhaps the most basic web server you can build.  We're creating the Flask object, in this case called app, and using a decorator to specify that for the route "/", or the main document root, we want to execute the docroot() function.  (The function name is arbitrary; you can call it anything.) The function itself returns a simple text string.

Finally, we’re just specifying that we want the app to listen on all IPs, and on port 5000.  (The default is to listen on port 5000, but only from localhost.) Now if we run this script, we’ll see that it stays alive:

$ python receiveframe.py
 * Serving Flask app "receiveframe" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

And then if we point a browser to it, we’ll see the text returned.

So that’s the simple version.  Now let’s create a “page” that takes a file upload.  You are probably familiar with uploading files from the browser; on the backend, we need to receive that file with a POST request:

from flask import Flask
from flask import request
from werkzeug.utils import secure_filename

app = Flask(__name__)

@app.route('/')
def docroot():
    return "This is the regional cloud."

@app.route('/check_image', methods = ['GET', 'POST'])
def check_image():

    if request.method == 'POST':
        frame = request.files['face_frame']
        filename = secure_filename("saved_"+frame.filename)
        frame.save(filename)
        return "Saved "+filename
    else:
        return "Not post."

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

First we're importing the request object from the flask package, and then the secure_filename function.  This function sanitizes filenames to prevent someone from hacking your server by uploading a file named, say, "../../../init.d".  That's not so important now, but it will be later, when we're taking a filename that's submitted by an external routine.
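To make the danger concrete, here's a simplified pure-Python sketch of the kind of sanitizing secure_filename() does (the real werkzeug implementation handles more cases; this helper name is made up for illustration):

```python
import re

def sketch_secure_filename(name: str) -> str:
    # Replace path separators with spaces so "../" sequences break apart,
    # keep only safe characters, then strip leading/trailing dots and
    # underscores left over from the traversal attempt.
    name = name.replace("/", " ").replace("\\", " ")
    name = "_".join(name.split())
    name = re.sub(r"[^A-Za-z0-9_.-]", "", name)
    return name.strip("._")

print(sketch_secure_filename("../../../init.d"))  # → init.d
```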

Next we're creating a second URL to be served, /check_image, and specifying that it can accept both GET and POST requests.  If the request is a POST request, we're looking in the request.files dictionary and pulling out the entry named face_frame. Note that this name is arbitrary; as long as it matches what's being submitted, this will work.

Now let’s send a file to see this in action.  We could build an HTML form, but we’ll do it directly from the script just to make things simpler (and because we’re going to need to know how to do this anyway).  We’ll use the Python Requests library:

pip3 install requests

Now put an image file in the same directory and call it target.jpg, just so we have something to send, and add the following to the script:

from flask import Flask
from flask import request
from werkzeug.utils import secure_filename
import requests

app = Flask(__name__)

@app.route('/')
def docroot():
    files = {'face_frame': open('target.jpg', 'rb')}
    result = requests.post('http://localhost:5000/check_image', files=files)
    return result.text

@app.route('/check_image', methods = ['GET', 'POST'])
def check_image():
...

As you can see, we're simply creating a dictionary that maps the parameter name face_frame to the binary contents of the file we want to send.  We're then sending that as the payload of a multipart POST request.

Now if we go ahead and call the web server, it will make a request to the check_image URL, send the file and return the result text, which is what was returned by check_image():

If we check the filesystem, we can see the saved_target.jpg file:

$ ls
camera.py                           saved_target.jpg
haarcascade_frontalface_default.xml target.jpg
receiveframe.py

Now we've got the file, so we can look at whether it's a "known" person.  The first thing we need to do is get the face_recognition library and the dlib library it depends on.  Install dlib first, then install face_recognition with:

pip3 install face_recognition

What face_recognition does is create a set of encodings of known faces, then compare an “unknown” face to those encodings and determine whether it matches any of the known people.  We won’t go into what a “match” is here in detail — you can check the documentation for more information — but keep in mind that this is not perfect. For example, these images match:

But with default settings, these are considered two different people, even though we as humans know they’re not:
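Conceptually, each face is boiled down to a numeric encoding (face_recognition uses 128-dimensional vectors), and two faces "match" when the distance between their encodings falls below a tolerance, 0.6 by default. A toy sketch with made-up 3-dimensional "encodings":

```python
import math

def distance(a, b):
    # Euclidean distance between two face encodings
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

known = [0.1, 0.4, 0.3]         # hypothetical known-face encoding
candidate = [0.12, 0.41, 0.29]  # hypothetical unknown-face encoding
print(distance(known, candidate) < 0.6)  # → True (a match)
```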

I’ve created a knownusers directory of photos of people.  In this case they’re characters from Star Wars, because … Star Wars, but in the real application the user would submit people who are expected to be at the house, and the system would segregate these “known” individuals by user.

We’ll start by reading that directory and creating encodings for those images:

...
import requests
import face_recognition
import os

images = []
known_faces = []
directory = 'knownusers'
for filename in os.listdir(directory):
    this_image = face_recognition.load_image_file(directory+"/"+filename)
    images.append(this_image)
    for face in face_recognition.face_encodings(this_image):
        known_faces.append(face)

app = Flask(__name__)
...

In this case we’re simply looping through all of the files in our knownusers directory and using those files to create an array of images; for each image, we’re also creating an array of face encodings.  Note that the code assumes that there can be more than one face in these “known” photos.

Now we’re ready to compare our target image:

...
@app.route('/check_image', methods = ['GET', 'POST'])
def check_image():

    if request.method == 'POST':
        frame = request.files['face_frame']
        filename = secure_filename("saved_"+frame.filename)
        frame.save(filename)
        try:
            unknown_image = face_recognition.load_image_file(filename)
            unknown_face_encoding = face_recognition.face_encodings(unknown_image)[0]
            results = face_recognition.compare_faces(known_faces, unknown_face_encoding)
            return "{}".format(not True in results)
        except Exception as inst:
            print("No face in "+filename)
            print(inst)
            return "False"
    else:
        return "Not POST."

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

We've already saved the target image to the filesystem, so now we can load it using the same routines we used to load the "known" images.  A couple of things to note here regarding facial recognition. First, remember that the original images had some false positives; this routine eliminates those, but that means an image may turn out to contain zero faces, and since we're just looking at the "first" face in the image, that would cause an error.  To solve this problem, we're enclosing the routine in a try/except block.

Second, when we run the comparison, it returns a list of boolean values stating whether the target image matches each known face.  If it matches any of them, the corresponding value in the results list will be True, so if all values are False, the person is a stranger, and we want to return True. (Got that?) And of course, if there are no faces, there are no strangers, so we're returning False.
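The boolean gymnastics are easier to see in isolation (the results lists here are hypothetical compare_faces() outputs):

```python
# One boolean per known face: True means the unknown face matched it.
results = [False, False, False]
is_stranger = not any(results)   # same as (True not in results)
print(is_stranger)  # → True

results = [False, True, False]   # matched the second known face
print(not any(results))          # → False
```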

Now if we run the script we’ll see True or False depending on whether the image is of a person in our knownusers directory.

So in this case, the picture — which was a picture of me — is of a stranger.  Now we just want to go ahead and report the stranger:

...
import os

def report_stranger(filename):
    print("Stranger!!!")

images = []
...
            results = face_recognition.compare_faces(known_faces, unknown_face_encoding)
            is_stranger = (not True in results)
            if is_stranger:
                report_stranger(filename)
            return "{}".format(is_stranger)
        except Exception as inst:
            print("No face in "+filename)
...

We’ll take care of the actual reporting process in the next step, but there’s one more thing we have to do:  connect the camera to this routine. In camera.py, add the following, just as we did before:

import cv2
import requests

i = 0
def send_image(face_img):
    global i
    i = i+1
    if i % 10 == 0:
        filename = "saved"+str(i)+".jpg"
        cv2.imwrite(filename, face_img)
        files = {'face_frame': open(filename, 'rb')}
        r = requests.post('http://localhost:5000/check_image', files=files)

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
...

We're simply opening the saved file and sending it to the check_image routine.  In this case, we don't care about the result, so we're not even checking it.

Make sure that the web server is running in one window and run camera.py in another.  In the webserver window, you should see the output telling you there is a “stranger”:

$ python3 receiveframe.py
 * Serving Flask app "receiveframe" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
Stranger!!!
127.0.0.1 - - [04/Nov/2019 10:59:14] "POST /check_image HTTP/1.1" 200 -
Stranger!!!
127.0.0.1 - - [04/Nov/2019 10:59:16] "POST /check_image HTTP/1.1" 200 -
Stranger!!!
127.0.0.1 - - [04/Nov/2019 10:59:35] "POST /check_image HTTP/1.1" 200 -
No face in saved_saved80.jpg
list index out of range
127.0.0.1 - - [04/Nov/2019 10:59:37] "POST /check_image HTTP/1.1" 200 -
Stranger!!!

If you turn the camera so there are no faces, the output will stop, because nothing is being sent to the web server — that is unless there’s a false positive, such as the one shown here.  We can pull that up and see what the original routine thought was a face:

As you can see, there really IS a face there, but it’s too dark to recognize.  You can also see the non-faces marked in the blinds.

You can easily change this routine to send the false positives to another routine to help tune the machine learning models.

Now that we’ve got these two routines hooked up, the last step is to do the actual reporting.

Sending notifications to the central cloud

We’re in the final stretch!  All we need now is a routine that will take care of the reporting. There are all kinds of options here, from tracking to sending an email; we’ll start with an email for now.

You’ll need an email address you can send from; if you use a Gmail account, make sure to turn on “less secure app access”.

From here, it’s a matter of using the email package:

import email, smtplib, ssl
from flask import request
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from flask import Flask

app = Flask(__name__)

@app.route('/sendmail', methods = ['GET', 'POST'])
def sendmail():

    subject = "Stranger alert!"
    body = "Your camera has spotted an unidentified individual."
    sender_email = "edgeemaildemo@gmail.com"
    receiver_email = "user@example.com"
    password = input("Type your password and press enter: ")

    # Create a multipart message and set headers
    message = MIMEMultipart()
    message["From"] = sender_email
    message["To"] = receiver_email
    message["Subject"] = subject
    message.attach(MIMEText(body, "plain"))

    text = message.as_string()

    # Log in to server using secure context and send email
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as server:
        server.login(sender_email, password)
        server.sendmail(sender_email, receiver_email, text)
    return "Sent."

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5050)


Here we’re once again creating a simple webserver (since we’ll ultimately be sending the photo of the stranger to the user).  We’re simply creating the message, then creating an SSL connection, logging into the server, and sending the mail.  

Note also that because we’re running this on the same computer as the receiveframe.py script, we’ve moved it to port 5050.

If we run this webserver, then access it by calling the URL http://localhost:5050/sendmail, we'll get an email in our inbox:

Now we need to go ahead and add the image to it:

import email, smtplib, ssl
from flask import request
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from werkzeug.utils import secure_filename
import requests

from flask import Flask
app = Flask(__name__)

@app.route('/')
def docroot():
    files = {'stranger': open('target.jpg', 'rb')}
    r = requests.post('http://localhost:5050/sendmail', files=files)
    return str(r.text)

@app.route('/sendmail', methods = ['GET', 'POST'])
def sendmail():

    subject = "Stranger alert!"
    body = "Your camera has spotted an unidentified individual."
    sender_email = "edgeemaildemo@gmail.com"
    receiver_email = "user@example.com"
    password = input("Type your password and press enter: ")

    if request.method == 'POST':

        # Create a multipart message and set headers
        message = MIMEMultipart()
        message["From"] = sender_email
        message["To"] = receiver_email
        message["Subject"] = subject

        message.attach(MIMEText(body, "plain"))

        f = request.files['stranger']
        filename = secure_filename(f.filename)
        f.save(filename)

        with open(filename, "rb") as attachment:
            part = MIMEBase("application", "octet-stream")
            part.set_payload(attachment.read())

        # Email has to be ASCII characters
        encoders.encode_base64(part)
        part.add_header(
            "Content-Disposition",
            f"attachment; filename= {filename}",
        )

        message.attach(part)

        text = message.as_string()
        # Log in to server using secure context and send email
        context = ssl.create_default_context()
        with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as server:
            server.login(sender_email, password)
            server.sendmail(sender_email, receiver_email, text)
        return "Sent."
    else:
        return "Not POST."

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5050)

Starting at the beginning, we’re not doing anything new, just sending the target image to the sendmail routine. From there, we’re saving the file just as we did before, but then we’re creating a new part for the email and adding the binary data to it.  We then convert it to text and attach it to the message.
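The encoding step can be seen on its own with the standard library; here we base64-encode a few hypothetical payload bytes the same way encode_base64() transforms the attached image:

```python
from email import encoders
from email.mime.base import MIMEBase

# Build a part with three raw bytes as its payload, then base64-encode
# it in place; the payload becomes ASCII-safe text.
part = MIMEBase("application", "octet-stream")
part.set_payload(b"\x00\x01\x02")
encoders.encode_base64(part)
print(part.get_payload().strip())  # → AAEC
```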

Now if we call the main URL for the web server — http://localhost:5050 — we’ll still get a simple message of “Sent.” but if we check our email, the image is attached.

Now we just have to get the receiveframe.py script to send any stranger images to this routine using the report_stranger() function:

import face_recognition
import os

def report_stranger(filename):
    files = {'stranger': open(filename, 'rb')}
    r = requests.post('http://localhost:5050/sendmail', files=files)

images = []
known_faces = []
...

Now if we run the camera, we’ll get a bunch of emails showing frames from the camera:

If you take any of those photos and add it to the knownusers folder, you’ll see that you no longer get those emails.  (You’ll have to stop the receiveframe.py script and start it up again so that it picks up the new images.)

Ok!  So that’s all our base functionality.  

Next up:  “cloudifying” the application

Now that we have all of our base functionality we can focus on turning this into an edge cloud solution.  In part 2, we’ll look at deploying Kubernetes clusters and creating containerized versions of these routines to run on them.
