Manufacturing Part Recognition by Training on 2D Views

This project tackles the problem of recognising highly similar manufacturing parts using only their 3D CAD models, without requiring real annotated images. By leveraging synthetic data generated in a Unity-based simulation environment, it trains a deep network to identify parts through a two-stage classification approach. The method applies domain randomisation to bridge the gap between simulation and real-world conditions.

Code Link

Python

PyTorch

Multi-View CNN

Domain Randomisation

Sim2Real

ResNet

Linux

Project Overview

Problem Definition: Recognising similar-looking manufacturing parts using only 3D CAD models, without real annotated image datasets.
Simulation Environment: Developed a Unity-based setup replicating a real-world multi-camera rig to render synthetic images from 3D models.
Data Augmentation: Applied domain randomisation (randomised background textures, lighting variation, geometric transformations, and motion blur) so that networks trained on synthetic renders generalise better to real-world images.
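One way to picture domain randomisation is as drawing an independent random configuration for every rendered image. The sketch below is illustrative only: the `RenderParams` fields, the value ranges, and the texture file names are all assumptions, not the project's actual Unity settings.

```python
import random
from dataclasses import dataclass


@dataclass
class RenderParams:
    """Randomised settings for one synthetic render (all fields hypothetical)."""
    background_texture: str
    light_intensity: float  # relative brightness multiplier
    rotation_deg: float     # in-plane rotation applied to the view
    scale: float            # zoom factor
    blur_kernel: int        # motion-blur kernel length in pixels (1 = no blur)


def sample_render_params(textures, rng):
    """Draw one independent random configuration for a rendered image."""
    return RenderParams(
        background_texture=rng.choice(textures),
        light_intensity=rng.uniform(0.5, 1.5),   # assumed range
        rotation_deg=rng.uniform(-15.0, 15.0),   # assumed range
        scale=rng.uniform(0.9, 1.1),             # assumed range
        blur_kernel=rng.choice([1, 3, 5, 7]),    # assumed kernel sizes
    )


rng = random.Random(0)  # fixed seed so the same dataset can be regenerated
textures = ["metal.png", "wood.png", "concrete.png"]  # placeholder names
params = [sample_render_params(textures, rng) for _ in range(1000)]
```

Sampling every factor independently means the network rarely sees the same nuisance combination twice, which is the usual argument for why the part geometry becomes the only reliable cue.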
One-Stage Approach (Baseline): Adapted the MVCNN architecture, using three ResNet18 branches for multi-view feature extraction, to classify parts directly from synthetic data.
Two-Stage Architecture: Introduced a scalable two-stage network:
- Stage 1: Predicts a cluster ID, where each cluster is a pseudo-labelled group of parts (e.g. 20 clusters).
- Stage 2: A specialised classifier per cluster refines the part classification.
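The routing between the two stages can be sketched as below. For brevity both stages are linear heads over precomputed feature vectors, which is an assumption; in the project they are presumably full multi-view networks. The class name, `cluster_to_parts` mapping, and the example cluster assignment are hypothetical.

```python
import torch
import torch.nn as nn


class TwoStageClassifier(nn.Module):
    """Stage 1 picks a cluster; stage 2 picks a part within it (illustrative)."""

    def __init__(self, feat_dim, cluster_to_parts):
        super().__init__()
        # cluster_to_parts[c] lists the global part IDs assigned to cluster c.
        self.cluster_to_parts = cluster_to_parts
        self.stage1 = nn.Linear(feat_dim, len(cluster_to_parts))
        self.stage2 = nn.ModuleList(
            [nn.Linear(feat_dim, len(parts)) for parts in cluster_to_parts]
        )

    @torch.no_grad()
    def predict(self, feats):
        clusters = self.stage1(feats).argmax(dim=1).tolist()
        part_ids = []
        for feat, c in zip(feats, clusters):
            # Route each sample to the classifier of its predicted cluster.
            local = self.stage2[c](feat.unsqueeze(0)).argmax(dim=1).item()
            part_ids.append(self.cluster_to_parts[c][local])
        return clusters, part_ids


cluster_to_parts = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]  # made-up assignment
model = TwoStageClassifier(feat_dim=16, cluster_to_parts=cluster_to_parts)
clusters, parts = model.predict(torch.randn(5, 16))
```

Each stage-2 classifier only has to separate the parts inside its own cluster, so adding parts mostly means adding or retraining one small classifier. The trade-off is that a wrong stage-1 prediction cannot be recovered in stage 2.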
Training Details: All networks were trained with the Adam optimiser for 40 epochs, using weight decay and learning-rate scheduling. Pretrained ResNet18 backbones were used in all branches.
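The training setup described above could be wired up roughly as follows. Only the optimiser type, the 40 epochs, and the use of weight decay and a learning-rate schedule come from the description; the learning rate, decay value, `StepLR` schedule, and the tiny stand-in model and data are assumptions made so the sketch runs on its own.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 4)  # stand-in for any of the networks
optimizer = torch.optim.Adam(
    model.parameters(), lr=1e-3, weight_decay=1e-4  # assumed values
)
# Assumed schedule: divide the learning rate by 10 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(32, 8)            # dummy batch in place of rendered views
targets = torch.randint(0, 4, (32,))   # dummy labels

lr_history = []
for epoch in range(40):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    lr_history.append(optimizer.param_groups[0]["lr"])
    scheduler.step()  # advance the schedule once per epoch
```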
Qualitative Results: Visual tests show the two-stage model is more robust to occlusion, lighting variation, and minor part differences.