Edge AI Optimization
The Challenge
Running complex deep learning models for eye-tracking on consumer-grade edge devices (such as laptops with basic webcams) resulted in high latency. The standard models were too heavy, causing frame drops and a poor user experience in real-time applications.
The Solution
I led the optimization effort by porting PyTorch models to Intel OpenVINO. By applying aggressive **model quantization (FP32 to INT8)** and **pruning techniques**, I significantly reduced the model size without compromising accuracy. The inference engine was rewritten in highly optimized Python for maximum performance on CPU.
- Converted models to OpenVINO Intermediate Representation (IR).
- Implemented post-training quantization to reduce memory footprint.
- Profiled and optimized the Python inference pipeline to remove bottlenecks.
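The core idea behind the FP32-to-INT8 step can be sketched in NumPy. This is a minimal illustration of post-training affine quantization, not the actual OpenVINO/NNCF implementation; the `quantize`/`dequantize` helpers and the per-tensor scheme are assumptions for clarity (production tooling typically quantizes per-channel with calibration data):

```python
import numpy as np

def quantize(weights: np.ndarray):
    """Map FP32 weights onto the signed INT8 range [-128, 127] (per-tensor, illustrative)."""
    scale = np.ptp(weights) / 255.0                      # (max - min) spread over the INT8 range
    zero_point = np.round(-128.0 - weights.min() / scale)
    q = np.clip(np.round(weights / scale + zero_point), -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: float) -> np.ndarray:
    """Recover approximate FP32 values from the INT8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)        # stand-in for a weight tensor

q, scale, zp = quantize(w)
w_hat = dequantize(q, scale, zp)

print(w.nbytes // q.nbytes)                 # INT8 storage is 4x smaller than FP32
print(float(np.abs(w - w_hat).max()))       # reconstruction error bounded by ~scale/2
```

The 4x storage reduction per tensor is what drives the smaller memory footprint; the trade is the bounded rounding error visible in the reconstruction.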
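For the profiling step, a lightweight per-stage timer is often enough to find where frame time goes before reaching for a full profiler. The sketch below is a hypothetical illustration using only the standard library; the stage names and the toy workloads stand in for real preprocess/inference/postprocess functions:

```python
import time
from collections import defaultdict

class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def measure(self, name, fn, *args):
        start = time.perf_counter()
        result = fn(*args)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        # Average milliseconds per call for each stage.
        return {name: 1000.0 * total / self.counts[name]
                for name, total in self.totals.items()}

# Toy stand-ins for the real frame pipeline stages.
timer = StageTimer()
for _ in range(100):
    frame = timer.measure("preprocess", lambda: sum(range(1000)))
    out = timer.measure("infer", lambda x: x * 2, frame)
    timer.measure("postprocess", lambda x: x + 1, out)

print(timer.report())
```

Sorting the report by average time immediately shows which stage dominates the frame budget, which is typically where optimization effort pays off first.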
Key Outcomes
- **4x** faster inference speed
- **3.5x** model size reduction