3D Ear Identification Using Block-wise Statistics based Features and LC-KSVD

Lin Zhang, Lida Li, Hongyu Li, and Meng Yang


Introduction

This is the website for our paper "3D ear identification using block-wise statistics based features and LC-KSVD", IEEE Trans. Multimedia, vol. 18, no. 8, pp. 1531-1541, 2016.

Biometric authentication has proven to be an effective method for automatically recognizing a person's identity with high confidence. In this field, the use of 3D ear shape as a biometric trait is a recent trend. As a biometric identifier, the ear has several inherent merits. However, although a great deal of effort has been devoted to the problem, there is still considerable room for improvement in developing a highly effective and efficient 3D ear identification approach. In this paper, we attempt to fill this gap to some extent by proposing a novel 3D ear classification scheme that makes use of the label consistent K-SVD (LC-KSVD) framework. As an effective supervised dictionary learning algorithm, LC-KSVD simultaneously learns a single compact discriminative dictionary for sparse coding and a multi-class linear classifier. To use the LC-KSVD framework, one key issue is how to extract feature vectors from 3D ear scans. To this end, we propose a block-wise statistics based feature extraction scheme. Specifically, we divide a 3D ear ROI into uniform blocks and extract a histogram of surface types from each block; the histograms from all blocks are then concatenated to form the desired feature vector. Feature vectors extracted in this way are highly discriminative and are robust to slight misalignment between samples. Experimental results demonstrate that the proposed approach achieves higher recognition accuracy than the other state-of-the-art methods we compared against. More importantly, its computational complexity at the classification stage is extremely low, making it well suited to large-scale identification applications.
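The block-wise feature extraction described above can be illustrated with a minimal sketch. It assumes the ear ROI has already been converted into a 2D map of integer surface-type labels; how those labels are computed is not covered here. The function name, the block size, the number of surface types, and the per-block normalization are all illustrative assumptions, not the settings used in the paper.

```python
from typing import List, Sequence


def blockwise_histogram(labels: Sequence[Sequence[int]],
                        block_rows: int, block_cols: int,
                        num_types: int) -> List[float]:
    """Concatenate per-block histograms of surface-type labels.

    `labels` is a 2D map of integer surface types in [0, num_types).
    Each histogram is normalized by the block area (an assumption).
    """
    height, width = len(labels), len(labels[0])
    if height % block_rows or width % block_cols:
        raise ValueError("ROI size must be a multiple of the block size")
    block_area = block_rows * block_cols
    feature: List[float] = []
    # Visit the uniform blocks row by row, left to right.
    for top in range(0, height, block_rows):
        for left in range(0, width, block_cols):
            counts = [0] * num_types
            for r in range(top, top + block_rows):
                for c in range(left, left + block_cols):
                    counts[labels[r][c]] += 1
            feature.extend(n / block_area for n in counts)
    return feature


# Toy 4x4 ROI with 3 hypothetical surface types, split into 2x2 blocks.
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [2, 2, 1, 1],
    [2, 0, 1, 1],
]
feature = blockwise_histogram(roi, block_rows=2, block_cols=2, num_types=3)
print(len(feature))  # 4 blocks x 3 types = 12
```

Because each histogram only counts labels inside its block, shifting the ROI by a pixel or two changes few counts, which is one plausible reading of why such features tolerate slight misalignment.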


Source Code

The source code can be downloaded here: LCKSVD_LHST.zip (Extract code: 77n2).


Evaluation Results

In our experiments, we used the UND Collection J2 dataset. This dataset contains 2346 3D side face scans captured from 415 different persons, making it the largest 3D ear scan dataset so far. These range images were collected using a Minolta Vivid 910 range scanner in high resolution mode. The scans vary in pose, and some are occluded by hair or earrings.

To evaluate the performance of our method, however, we cannot simply conduct experiments on the whole dataset, since some classes in UND-J2 have only 2 samples. As pointed out in [1], classification schemes based on sparse coding need sufficient samples for each class in the gallery. Consequently, we created four subsets from UND-J2 for our experiments, requiring that each class have more than 6, 8, 10, and 12 samples, respectively. For subset 1, we randomly selected 6 samples from each class to form the gallery set, and the remaining samples formed the test (probe) set. For subset 2, we randomly selected 8 samples from each class to form the gallery set, and the remaining samples formed the test set. For subset 3 and subset 4, the same strategy was used with 10 and 12 gallery samples per class, respectively. Major information about the four subsets is summarized in Table I.
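The subset construction above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the (subject, scan id) input format, and the fixed random seed are hypothetical and are not taken from the released code. The final loop checks that the gallery sizes in Table I equal the number of classes times the number of gallery samples per class.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def split_gallery_probe(samples: Sequence[Tuple[str, str]],
                        per_class: int,
                        seed: int = 0) -> Tuple[Dict[str, List[str]],
                                                Dict[str, List[str]]]:
    """Keep classes with more than `per_class` samples; pick `per_class`
    at random for the gallery and use the remaining ones as probes."""
    by_class: Dict[str, List[str]] = defaultdict(list)
    for subject, scan_id in samples:
        by_class[subject].append(scan_id)

    rng = random.Random(seed)
    gallery: Dict[str, List[str]] = {}
    probe: Dict[str, List[str]] = {}
    for subject, scan_ids in by_class.items():
        if len(scan_ids) <= per_class:
            continue  # "more than" guarantees at least one probe sample
        shuffled = scan_ids[:]
        rng.shuffle(shuffled)
        gallery[subject] = shuffled[:per_class]
        probe[subject] = shuffled[per_class:]
    return gallery, probe


# Toy data: subject "a" has 8 scans, "b" has 6, "c" has 7.
toy = ([("a", f"a{i}") for i in range(8)]
       + [("b", f"b{i}") for i in range(6)]
       + [("c", f"c{i}") for i in range(7)])
gallery, probe = split_gallery_probe(toy, per_class=6)
print(sorted(gallery))  # "b" is dropped: it has exactly 6 scans

# Consistency check against Table I: (classes, per-class gallery, gallery size).
for classes, k, gallery_size in [(127, 6, 762), (85, 8, 680),
                                 (62, 10, 620), (39, 12, 468)]:
    assert classes * k == gallery_size
```

Requiring strictly more samples than the gallery quota means every retained class contributes at least one probe sample.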

We used the recognition rate as the performance measure. In addition, the running speed of each competing method was evaluated. Experiments were performed on a standard HP Z620 workstation with a 3.2 GHz Intel Xeon E5-1650 CPU and 8 GB of RAM. The software platform was Matlab R2013b.
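As a small illustration of the performance measure, the sketch below assumes that the recognition rate is the percentage of probe samples whose predicted identity matches the true identity (rank-1 identification). That reading, the function name, and the toy labels are assumptions made for illustration.

```python
from typing import Sequence


def recognition_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of probes whose predicted identity equals the true one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Toy example: 3 of 4 probes are identified correctly.
rate = recognition_rate(["s1", "s2", "s3", "s1"], ["s1", "s2", "s2", "s1"])
print(rate)  # 75.0
```

Under this definition, 166 correct identifications out of the 168 probes of subset 4 would give 98.81%, which is consistent with the granularity of the values reported in Table II.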

TABLE I
SUBSETS USED IN OUR EXPERIMENT

subset index   no. of classes   gallery size   probe size   total samples
1              127              762            715          1477
2              85               680            461          1141
3              62               620            291          911
4              39               468            168          636

Recognition rates and time costs are listed in the following two tables.

TABLE II
RECOGNITION RATES BY USING DIFFERENT IDENTIFICATION METHODS (%)

method             subset 1   subset 2   subset 3   subset 4
ICP                83.22      90.02      94.09      95.83
Zhang et al. [2]   83.78      90.67      94.50      96.43
SRC_LHST           92.17      94.36      96.56      98.81
LCKSVD_LHST        92.86      95.88      98.63      100

TABLE III
TIME COST FOR ONE IDENTIFICATION OPERATION (SECONDS)

method             subset 1     subset 2     subset 3     subset 4
ICP                5.356*10^5   3.763*10^5   1.876*10^5   1.287*10^5
Zhang et al. [2]   2.425        2.424        2.423        2.420
SRC_LHST           0.074        0.070        0.066        0.056
LCKSVD_LHST        0.058        0.034        0.033        0.018


References

[1] J. Wright et al., "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210-227, 2009.

[2] L. Zhang et al., "3D ear identification based on sparse representation," PLoS ONE, vol. 9, no. 4, pp. e95506.1-9, 2014.


Created on: Oct. 03, 2014

Last update: Jul. 16, 2016