Facial recognition is an increasingly accessible technology for consumer applications. Powered by machine learning, the capability centers on training a model to detect specific entities, such as faces, in images or live video streams. I previously wrote about TensorFlow, which supports image detection. Image detection evaluates what is in frame but stops short of image recognition, which evaluates who.

In this article, I’ll share the steps of a recent project I’ve added to my smart home stack, which expands local imaging capabilities to include facial recognition, accomplished with Facebox and Home Assistant.

About the imaging layer 

Before covering implementation details, it’s helpful to understand how facial recognition can add value to a connected home. I think of imaging capabilities as three distinct layers, as shown in the image below.

Imaging Layers

First, there’s the imaging layer, which enables foundational capabilities to capture images, view live streams, and integrate with other services and platforms such as Home Assistant. Consumer imaging has matured over the years to become widely accessible with a low barrier to entry. Products like Amazon’s Ring and Google Nest offer plug-and-play cameras that don’t require an advanced home network, knowledge of networking protocols, or expensive installation. This simplicity comes at the cost of privacy: with these popular solutions, your data is owned by the service provider, whose history of privacy protection is spotty at best. That trade-off may be acceptable for some, but if privacy is a concern, these options aren’t ideal.

The second layer involves image detection. Detection can help reduce false positives and screen for specific object types, such as people, animals, and vehicles. Consider the following use case: you have a motion-activated camera covering a front-door entrance. While the camera works fine, its motion sensitivity picks up false positives, such as animals or vehicles on a nearby street within range of the motion sensor. An image detection layer can evaluate what is in frame and route alerts accordingly, for example, pushing a notification if a human face is detected, but not if a vehicle or animal was the motion trigger. Image detection is quickly becoming a standard feature in many of the consumer-level products mentioned above. Amazon’s Ring doorbell now offers similar capabilities to cut down on false positives triggered by animals or vehicles.

The final layer involves image recognition. With this most advanced layer, an imaging system can now identify who is detected. This feature is just starting to emerge in products such as the latest generation of Nest cameras. The trade-off is that you can’t train the model yourself and must rely on third-party cloud processing.

Local facial recognition with Facebox 

Enter Facebox. 

At its core, Facebox detects and identifies faces in photos using machine learning, and it runs in a container. This was perfect for my use case, as I already leverage Docker for all my connected home services. After exploring other options such as Dlib and OpenCV, I settled on Facebox for the following reasons: 

  • Containerized and able to be set up in less than 2 minutes by integrating with my existing Docker compose file 
  • Totally local. Works with or without an internet connection and doesn’t send the images used to train the model anywhere 
  • Simple integration with Home Assistant 
  • Free (up to 100 images) 

Getting started with Facebox

First, spin up a new container. For me, this involved adding a new entry to my Docker compose file, which looks like this:

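A minimal sketch of such a Compose entry, assuming the stock machinebox/facebox image on its default port; the service name and the MB_KEY placeholder are examples (Machinebox issues the free key when you sign up):

```yaml
facebox:
  image: machinebox/facebox
  container_name: facebox
  restart: unless-stopped
  ports:
    - "8080:8080"
  environment:
    # Placeholder: substitute the free key from your Machinebox account
    - MB_KEY=your_machinebox_key
```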

That’s it. 

Training the model 

With the service up and accessible at http://localhost:8080, it’s time to train the model. This is where you’ll feed images and tell Facebox who they correspond with. The free tier allows 100 images for training the model. While not ideal, I’ve been able to consistently achieve roughly 90% accuracy with 10 faces, using 10 images per person. This is one drawback of Facebox compared to other options, especially if your use case involves a larger number of faces you need to recognize. 

There are a few ways to train the model, but for my use case, the easiest way is to use this handy Python script developed by robmarkcole. Using the script simply involves creating a folder for each person, named after them, containing their training images. When you execute the script, it associates each image in each directory with the corresponding name. 
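As a rough sketch of that layout (the folder names and paths here are hypothetical examples; check the script’s README for the exact locations and arguments it expects):

```shell
# Hypothetical layout: one folder per person, named after that person.
# The script teaches Facebox every image inside each folder under
# that folder's name.
mkdir -p faces/josiah faces/jane
# Drop ~10 training photos of each person into their folder, then
# point the script at your Facebox instance:
# python3 teach_facebox.py
ls faces
```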

The script is a must, in my opinion, as Facebox doesn’t currently offer a way to persist the trained model in a volume or mount. Anytime the container is bounced, the model will be wiped. Once the initial model is trained, Facebox has a feature to upload the state file on demand. This can be further automated by creating a cron job or script that fires anytime the container is bounced. 

Integrating Facebox with Home Assistant 

Refer to the official Home Assistant docs for the latest, but as of 0.99.3, the integration in configuration.yaml looks something like this: 

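A sketch of that entry, assuming Facebox runs on the same host on port 8080; camera.front_door is a hypothetical camera entity, so substitute your own:

```yaml
image_processing:
  - platform: facebox
    ip_address: localhost
    port: 8080
    source:
      - entity_id: camera.front_door
```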

After configuring, you can validate the configuration by using the Services tab on the Developer Tools page and calling a Facebox-enabled entity:

Facebox Service

Then, check the output of that entity in the State tab: 

Facebox State

Automatically train the model when the container is bounced

It’s a pain to re-train the model whenever the Facebox container is bounced, but fortunately, there’s an easy solution to automate it.

First, using something like Dockermon, set up a sensor to detect the state of Facebox’s container. If you’re using Dockermon, it’ll look like this:

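A hedged sketch of such a sensor, assuming HA-Dockermon on its default port (8126) and a container named facebox; the sensor name is an example, and the value_template maps the container state to on/off so it’s easy to trigger on:

```yaml
sensor:
  - platform: rest
    resource: http://localhost:8126/container/facebox
    name: facebox_container
    value_template: "{{ 'on' if value_json.state == 'running' else 'off' }}"
    scan_interval: 60
```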

Next, create a shell command (or script if you prefer) to execute the teach_facebox.py script mentioned above. In the example below, I’ve placed the teach_facebox.py script in the main HA config directory under the python_scripts folder.

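Something like the following; the exact invocation depends on the script’s arguments, and the path mirrors the python_scripts location described above:

```yaml
shell_command:
  teach_facebox: "python3 /config/python_scripts/teach_facebox.py"
```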

Finally, create an automation to fire the script anytime the container status changes from off to on.

You can do this in Node-RED or Home Assistant. If you’re using Home Assistant’s native automation editor, it’ll look like this:
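A sketch of that automation in YAML, assuming a container-state sensor (here the hypothetical sensor.facebox_container) and a shell_command named teach_facebox:

```yaml
automation:
  - alias: "Retrain Facebox when its container restarts"
    trigger:
      - platform: state
        entity_id: sensor.facebox_container
        from: "off"
        to: "on"
    action:
      # Give Facebox's API a moment to come up before teaching (assumption)
      - delay: "00:00:30"
      - service: shell_command.teach_facebox
```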


Automating with Node-RED 

Finally, with the set-up of Facebox and integration with Home Assistant complete, it’s time to begin automating the new service.

The sky’s the limit when it comes to incorporating image recognition into a connected home. So far, here are some of the ways I’ve found Facebox helpful: 

  • Security-related automation which announces who’s at the front door or who just entered the home  
  • Convenience/courtesy related automation that announces specific names based on context. For example, I’ve baked in the Facebox service to an existing Entertainment automation. The entertainment automation does several things when I want to watch some TV such as adjusting the lights based on the sun’s position, suspending nearby motion-based lighting automation, and so on. I use the Facebox service to announce a courtesy notice if I’m recognized, such as “Enjoy the show, Josiah.” Useless, yeah, but a pretty fun trick when entertaining. 
  • Notifications using Telegram based on room or general occupancy of certain people. From these notifications, I can use Telegram’s existing menu features to choose additional actions, such as showing me a live video stream, or broadcasting an audio message back to the house if I’m not there. 

Integration with Node-RED is pretty straightforward if you’re using Home Assistant. The image_processing.facebox_camera entity will automatically show up (in this case, facebox_camera will be the name of your actual camera). 

The main point to be aware of is how to extract face matches using Node-RED’s concept of msg.payload.

It took me a few hours to figure this out. By default, Node-RED uses the standard msg.payload, which for the Facebox service represents the number of faces detected. This wasn’t helpful, as I already have flows using TensorFlow for use cases that only need image detection. To parse which faces are recognized, use a Change node to set the payload from the default state to the specific msg.data.attributes.matched_faces property. 
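The same transform can be sketched as a Function node body instead of a Change node. This assumes the Home Assistant nodes put the entity’s full state on msg.data and that matched_faces maps each recognized name to a confidence value; the example message and values below are made up:

```javascript
// Copy the matched faces onto msg.payload so downstream switch
// nodes can branch on who was recognized.
function extractMatchedFaces(msg) {
  const attrs = (msg.data && msg.data.attributes) || {};
  return attrs.matched_faces || {};
}

// Example message shaped like a Facebox state update (values are made up):
const msg = {
  payload: 2, // default payload: the number of faces detected
  data: { attributes: { matched_faces: { Josiah: 85.2 } } },
};

msg.payload = extractMatchedFaces(msg);
// msg.payload now holds the matched faces instead of the face count
```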

The flow does the following: 

  1. A trigger occurs. In this example, it’s based on a motion sensor 
  2. Call the image processing scan service for the applicable camera(s) 
  3. Check the status of the image processing Facebox entity 
  4. Transform the payload (as shown in the screenshot below) 
  5. Use a switch node to take action based on what face or faces are detected 

Facebox Node-RED


That’s it. 

100% locally controlled, free facial recognition that easily integrates with popular smart home platforms like Home Assistant, all in under 20 minutes. 

Facial recognition adds an interesting new layer to any existing imaging capability and also continues improving context, which is a key pillar to a truly “smart” connected home experience.

You can learn more about Facebox here or Home Assistant here.