Real-Time Face Recognition Using CCTV Images: A Guide to Understanding Face Recognition Technology

SourceFuse
May 22, 2024

By Madhura Kansara, Junior Engineer, SourceFuse

The system recognizes multiple people in a CCTV image

Face recognition has become a key technology in sectors ranging from security to personalized experiences, with applications such as access control, attendance systems, and lost child detection. However, recognizing faces accurately under varying conditions poses a unique set of challenges. In this guide, we walk through a face recognition pipeline, exploring techniques for robust identification and verification using Python and machine learning tools.

Understanding Face Recognition

Face recognition technology utilizes advanced algorithms to analyze and compare facial features extracted from images or video footage, enabling identification and verification of individuals. The provided Python code showcases the implementation of face recognition. Let’s explore the main components of the code to gain insights into the process.

1. Importing Libraries:

In this section, necessary libraries are imported to perform various tasks such as face detection, image processing, data augmentation, and face recognition.

import face_recognition
import cv2
import numpy as np
from deepface import DeepFace
from utils import apply_blur, generate_unique_random_numbers, find_cosine_distance_helper
from utils import apply_resize
from utils import augment_data, face_distance
import os
from PIL import Image
from mtcnn.mtcnn import MTCNN
import random

2. Dataset Preparation:

The dataset preparation phase iterates through a directory containing images of known individuals (known_people_dir). For each person, the code creates an output directory inside the training directory (train_dataset, referred to later as known_people_train_dir). It then loads each image, detects faces using the MTCNN (Multi-Task Cascaded Convolutional Networks) model, crops the detected face region, and saves it in the corresponding output directory. Data augmentation (blurring, resizing, and random transformations) is then applied to increase dataset variability and make the face recognition system more robust.

# Create the MTCNN detector once and reuse it for every image
mtcnn = MTCNN()

for person_name in os.listdir(known_people_dir):
    person_dir = os.path.join(known_people_dir, person_name)
    if os.path.isdir(person_dir):
        output_person_dir = os.path.join("train_dataset", person_name)
        os.makedirs(output_person_dir, exist_ok=True)
        for filename in os.listdir(person_dir):
            image_path = os.path.join(person_dir, filename)
            image = cv2.imread(image_path)
            if image is None:
                # Skip files that OpenCV cannot read as images
                continue
            faces = mtcnn.detect_faces(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
            if faces:
                for face in faces:
                    x, y, w, h = face['box']
                    # Clamp the bounding box to the image borders
                    left = max(x, 0)
                    top = max(y, 0)
                    right = min(x + w, image.shape[1])
                    bottom = min(y + h, image.shape[0])
                    if right > left and bottom > top:
                        # Crop the face region and save it
                        face_image = image[top:bottom, left:right]
                        output_face_path = os.path.join(output_person_dir, f"{filename}.jpg")
                        cv2.imwrite(output_face_path, face_image)
                        # Apply data augmentation
                        apply_blur(output_face_path, output_folder=output_person_dir)
                        apply_resize(output_face_path, output_folder=output_person_dir)
                        augment_data(output_face_path, output_folder=output_person_dir, face_coordinates=(left, top, right, bottom), prefix=filename)
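One thing to watch in the loop above: every face found in a given image is written to the same output path, so a photo containing several faces keeps only the last one. A minimal sketch of per-face file naming follows. The helper name face_output_path and the "_face" suffix are assumptions for illustration and are not part of the original code.

```python
import os


def face_output_path(output_dir, filename, face_index):
    """Build a distinct output path for each face cropped from one image.

    Hypothetical helper (not in the original code): it drops the source
    extension and appends the face index so crops do not overwrite each other.
    """
    stem, _ext = os.path.splitext(filename)
    return os.path.join(output_dir, f"{stem}_face{face_index}.jpg")


# Two faces detected in the same source photo get two different files
paths = [face_output_path("train_dataset/alice", "group.png", i) for i in range(2)]
print(paths)
```

Inside the detection loop, the index could come from enumerate(faces).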

Train Dataset

3. Detection of Face:

Face detection is performed using the MTCNN (Multi-Task Cascaded Convolutional Networks) model, which is capable of detecting faces in images. Detected faces are then used for further processing.

# Detect faces in the image using MTCNN
faces = mtcnn.detect_faces(rgb_image)

4. Extracting Bounding Box Coordinates:

# Get the bounding box coordinates of the face
x, y, w, h = face['box']
# Ensure that the bounding box coordinates are valid
left = max(x, 0)
top = max(y, 0)
right = min(x + w, image.shape[1])
bottom = min(y + h, image.shape[0])
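The clamping matters because a detector can report a box that extends past the image border, for example a negative x for a face at the left edge. As a rough sketch, the same logic can be wrapped in a small function that also rejects boxes with no area left after clipping. The name clamp_box is hypothetical and not part of the original code.

```python
def clamp_box(x, y, w, h, image_width, image_height):
    """Clip an (x, y, w, h) box to the image and return (left, top, right, bottom).

    Returns None when nothing of the box lies inside the image.
    """
    left = max(x, 0)
    top = max(y, 0)
    right = min(x + w, image_width)
    bottom = min(y + h, image_height)
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom


# A box that starts left of the image and runs past the bottom edge
print(clamp_box(-5, 10, 50, 50, image_width=100, image_height=40))
# A box that lies entirely to the right of the image
print(clamp_box(120, 10, 30, 30, image_width=100, image_height=40))
```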

Rectangular bounding boxes are drawn around the detected faces

5. Augmentation of Images:

Data augmentation techniques such as blur, resize, and random transformations are applied to the extracted face images to enhance the dataset’s diversity.

i. Random Transformations

from torchvision import transforms

def augment_data(original_image_path, output_folder, face_coordinates, num_augmented_images=3, should_add_jitter=True, prefix=""):
    # Load the original image
    original_image = Image.open(original_image_path)
    # Convert face image to grayscale
    face_image_gray = original_image.convert('L')
    # Define torchvision transforms for data augmentation
    data_transforms = transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.RandomRotation(degrees=15),
        transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
        transforms.ToTensor(),
    ])
    # Apply data augmentation and save augmented images
    for i in range(num_augmented_images):
        # The random transforms are re-sampled on every call, so each image differs
        transformed_image = data_transforms(face_image_gray)
        augmented_image_path = os.path.join(output_folder, f"{prefix}augmented_{i + 1}.jpg")
        transforms.ToPILImage()(transformed_image).save(augmented_image_path)
        print(f"Augmented image {i + 1} saved to {augmented_image_path}")

Original-Random Transformations

ii. Resize and Blur

def apply_blur(image_path, output_folder, kernel_size=(7, 7)):
    # Load the image
    image = cv2.imread(image_path)
    # Apply Gaussian blur
    blurred_image = cv2.GaussianBlur(image, kernel_size, 0)
    # Save the blurred image
    filename = os.path.basename(image_path)
    output_path = os.path.join(output_folder, f"blurred_{filename}")
    cv2.imwrite(output_path, blurred_image)
    print(f"Blurred image saved to {output_path}")

def apply_resize(image_path, output_folder, target_size=(256, 256)):
    # Load the image
    image = cv2.imread(image_path)
    # Resize the image
    resized_image = cv2.resize(image, target_size)
    # Save the resized image
    filename = os.path.basename(image_path)
    output_path = os.path.join(output_folder, f"resized_{filename}")
    cv2.imwrite(output_path, resized_image)
    print(f"Resized image saved to {output_path}")

Original-Resized-Blurred

6. Storing Augmented Images in Train Directory:

Processed images, including cropped faces, blurred faces and augmented images, are stored in a train directory (train_dataset). This directory structure facilitates easy access to training data for building the face recognition model.

# Crop the face region and save it
face_image = image[top:bottom, left:right]
cv2.imwrite(output_face_path, face_image)
# Apply data augmentation on the face image
apply_blur(output_face_path, output_folder=output_person_dir)
apply_resize(output_face_path, output_folder=output_person_dir)
augment_data(output_face_path, output_folder=output_person_dir,
             face_coordinates=(left, top, right, bottom), prefix=filename)

7. Encoding Known Faces from the Training Dataset:

The code iterates through directories within our designated training dataset directory, known as known_people_train_dir. Within each directory, representing a specific individual, it processes each image file. The code verifies the validity of each image file, loads it, and extracts facial features using advanced techniques. These features are encoded into numerical vectors, known as face encodings, using the DeepFace.represent function. These encodings, along with the corresponding person's name, are then appended to lists for further processing.

By incorporating augmented data alongside original images, our model’s training dataset becomes richer and more diverse, leading to improved accuracy and robustness in face recognition across varied conditions and environments.

# Parallel lists: known_face_encodings[i] belongs to known_face_names[i]
known_face_encodings = []
known_face_names = []

for person_name in os.listdir(known_people_train_dir):
    person_dir = os.path.join(known_people_train_dir, person_name)
    # Check if it's a directory
    if os.path.isdir(person_dir):
        # Iterate over each file in the person's directory
        for filename in os.listdir(person_dir):
            image_path = os.path.join(person_dir, filename)
            print(image_path)
            # Check if the file is a valid image file
            try:
                with Image.open(image_path) as img:
                    img.verify()  # Raises if the file is not a valid image
                # Load the image file
                person_image = face_recognition.load_image_file(image_path)
                # Encode the face in the image
                face_encoding = DeepFace.represent(person_image, model_name="Dlib", detector_backend="mtcnn", enforce_detection=False)
                # Append the face encoding and name to the respective lists
                known_face_encodings.append(np.array(face_encoding[0]['embedding']))
                known_face_names.append(person_name)
            except (IOError, SyntaxError, IndexError):
                # Ignore any files that are not valid image files
                continue

8. Face Recognition Loop:

In the face recognition loop, the program continuously captures frames from the video source (a webcam in this example) for real-time recognition. To speed up processing, each frame is downscaled to a quarter of its width and height, which reduces computational load, though very small or distant faces may become harder to detect. DeepFace, using MTCNN as its detector backend, then locates the faces in the frame and encodes their features for comparison.

# Continuous capture of frames from the webcam
while True:
    ret, frame = video_capture.read()
    if not ret:
        # Stop when no frame could be read
        break
    # Resize each frame for optimized processing speed
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
    # Reverse the channel order of the OpenCV (BGR) frame
    rgb_small_frame = small_frame[:, :, ::-1]
    # Detect faces with MTCNN and compute an embedding for each one
    result1 = DeepFace.represent(rgb_small_frame, model_name="Dlib", detector_backend="mtcnn", enforce_detection=False)
    # Convert each facial_area (x, y, w, h) to (top, right, bottom, left)
    face_locations = [(res['facial_area']['y'], res['facial_area']['x'] + res['facial_area']['w'], res['facial_area']['y'] + res['facial_area']['h'], res['facial_area']['x']) for res in result1]
    # Collect the embeddings of the detected faces for comparison
    face_encodings = [res['embedding'] for res in result1]
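The two list comprehensions above reshape each result into a (top, right, bottom, left) tuple plus its embedding. The sketch below shows the same conversion as a standalone function, run on a hand-written stand-in for the results so that it works without a camera or DeepFace installed. The function name and the sample values are made up for illustration.

```python
def to_locations_and_encodings(results):
    """Split DeepFace.represent-style results into locations and embeddings.

    Each location is (top, right, bottom, left), built from the
    facial_area dict with keys x, y, w, h.
    """
    locations = []
    encodings = []
    for res in results:
        area = res["facial_area"]
        top, left = area["y"], area["x"]
        locations.append((top, left + area["w"], top + area["h"], left))
        encodings.append(res["embedding"])
    return locations, encodings


# Hand-written stand-in for one detected face (values are made up)
sample = [{"embedding": [0.1, 0.2, 0.3],
           "facial_area": {"x": 20, "y": 10, "w": 30, "h": 40}}]
locations, encodings = to_locations_and_encodings(sample)
print(locations)
```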

By calculating cosine distances between the detected faces and known faces stored in the training dataset, the program determines potential matches.

# Calculating cosine distances between detected faces and known faces
face_names = []
for f_encoding in face_encodings:
    face_distances = find_cosine_distance_helper(known_face_encodings, f_encoding)
    # The known face with the smallest distance is the best candidate
    best_match_index = np.argmin(face_distances)
    # Accept the match only if it is within the distance threshold
    if face_distances[best_match_index] <= 0.07:
        name = known_face_names[best_match_index]
    else:
        name = "Unknown"
    face_names.append(name)
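find_cosine_distance_helper comes from the utils module, which is not shown here. As a rough, dependency-free sketch of what such a comparison could look like, the code below computes cosine distance as 1 minus cosine similarity and applies the same 0.07 threshold. The names cosine_distance and best_match are hypothetical, and the sketch assumes non-zero vectors of equal length.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def best_match(known_encodings, known_names, encoding, threshold=0.07):
    """Name of the closest known face, or "Unknown" if none is within threshold."""
    if not known_encodings:
        return "Unknown"
    distances = [cosine_distance(k, encoding) for k in known_encodings]
    index = min(range(len(distances)), key=distances.__getitem__)
    return known_names[index] if distances[index] <= threshold else "Unknown"


# Toy 2-D "embeddings"; real face embeddings have many more dimensions
known = [[1.0, 0.0], [0.0, 1.0]]
names = ["alice", "bob"]
print(best_match(known, names, [0.99, 0.05]))  # close to alice's vector
print(best_match(known, names, [1.0, 1.0]))    # equally far from both
```

A lower threshold makes the system stricter (fewer false matches, more faces labeled "Unknown"), while a higher one does the opposite.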

9. Displaying Results:

Detected faces are displayed on the video feed along with their corresponding names (if recognized, else “Unknown”). A rectangular box is drawn around each face, and a label is added at the bottom of the box for easy identification.

# Draw a bounding box around the face
cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
# Draw a label with a name below the face
cv2.putText(frame, text, (left + 6, bottom - 6), font, 1.0, (255, 255, 255), 1)
# Display the resulting image
cv2.imshow('Video', frame)
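The drawing snippet does not show where left, top, right, and bottom come from. If they are taken from face_locations, which were computed on the quarter-size frame, they would need to be scaled back up before being drawn on the full-size frame. A minimal sketch of that step, using a hypothetical helper named scale_location:

```python
def scale_location(location, scale):
    """Map (top, right, bottom, left) from a resized frame back to the original.

    `scale` is the factor the frame was resized by (0.25 in the loop above).
    """
    factor = 1.0 / scale
    return tuple(int(round(value * factor)) for value in location)


# A face found at (top=10, right=50, bottom=50, left=20) on the small frame
print(scale_location((10, 50, 50, 20), scale=0.25))
```

The resulting values can then be unpacked as top, right, bottom, left and passed to cv2.rectangle and cv2.putText.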

Output

In the runs shown here, the system detects and recognizes faces in real time. It identifies known individuals by name and labels faces that are not in the training dataset as “Unknown.” The two examples below illustrate both cases.

The model trained on Salman’s images accurately identified his face in the CCTV footage.

Detected known face labeled correctly

When Amitabh’s image, not in the dataset, was encountered, it was appropriately labeled as ‘unknown’.

Detected Unknown face labeled correctly

Conclusion

The implemented system efficiently performs real-time face recognition using CCTV cameras. By combining face detection, data augmentation, and face recognition techniques, it achieves accurate identification of known individuals from live video streams. The modular approach allows for easy extension and customization, making it adaptable to various surveillance and security applications.

However, it’s crucial to acknowledge the variability in accuracy influenced by factors such as lighting conditions, facial expressions, occlusions, and the quality of training data. These variables pose challenges to consistent performance but can be mitigated through continuous refinement and adaptation of the system.


Disclaimer:

Images used are under Fair Use: Copyright Disclaimer Under Section 107 of the Copyright Act in 1976; Allowance is made for “Fair Use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research.
Fair use is a use permitted by copyright statute that might otherwise be infringing. All rights and credit go directly to its rightful owners. No copyright infringement intended.
