A Study On the Use of 8-Directional Features For Online Handwritten Chinese Character Recognition

Zhen-Long BAI and Qiang HUO
Department of Computer Science, The University of Hong Kong, Hong Kong, China
(Email: zlbai@cs.hku.hk, qhuo@cs.hku.hk)

Abstract

This paper presents a study of using 8-directional features for online handwritten Chinese character recognition. Given an online handwritten character sample, a series of processing steps, including linear size normalization, adding imaginary strokes, nonlinear shape normalization, equidistant resampling, and smoothing, is performed to derive a 64×64 normalized online character sample. Then, 8-directional features are extracted from each online trajectory point, and 8 directional pattern images are generated accordingly, from which blurred directional features are extracted at 8×8 uniformly sampled locations using a filter derived from the Gaussian envelope of a Gabor filter. Finally, a 512-dimensional vector of raw features is formed. Extensive experiments on the task of recognizing the 3755 level-1 Chinese characters in the GB2312-80 standard are performed to compare and discern the best setting for several algorithmic choices and control parameters. The effectiveness of the studied approach is confirmed.

1. Introduction

In this paper, we study the problem of how to extract so-called directional features from an online handwritten character sample to form a vector of raw features that can be used to construct a classifier for online Chinese character recognition (OLCCR) using any promising statistical pattern recognition approach. Previous works on this topic, as reported in e.g. [1, 4, 5, 6], have successfully demonstrated the effectiveness of 4-directional features, where the 4 directions are defined naturally as vertical ( | ), horizontal ( - ), left-up ( / ) and right-down ( \ ), as shown in Fig. 1(a). The primary motivations of this study are to extend the previous works to extracting 8-directional features with the 8 directions shown in Fig. 1(b), and to study their effectiveness for OLCCR.

(This work was supported by a grant from the RGC of the Hong Kong SAR (Project No. HKU7145/03E).)

Figure 1. A notion of 4 vs. 8 directions.

It is noted that different ways of extracting directional features were used in the previous works. For example, in [4, 5], 4-directional features were extracted directly from the nonlinear shape normalized (NSN) online trajectory,
while in [1, 6], 4-directional features were extracted from a bitmap using an "offline" approach. Furthermore, different ways of projecting a direction vector onto the relevant directional axes, and different strategies for grid-based feature blurring, were used in [4] and [5], respectively. In addition to the above, there are actually other algorithmic choices in deriving directional features from an online handwritten character sample. We have conducted extensive experiments to study the behavior and performance implications of different choices, with the hope of identifying the most promising scheme and discerning the best setting for the relevant control parameters. In this paper, we report the most important findings and recommend a promising approach to extracting 8-directional features for OLCCR.

The rest of the paper is organized as follows. Details of the recommended approach are described in Section 2. Experimental results of comparative studies are reported in Section 3. Finally, our findings are summarized in Section 4.

Proceedings of the 2005 Eighth International Conference on Document Analysis and Recognition (ICDAR'05), 1520-5263/05 $20.00 © 2005 IEEE

2. Our Approach

The overall flowchart of our recommended approach is shown in Fig. 2: an online character sample is first preprocessed; 8-directional features are then extracted from it; 8 directional pattern images are generated accordingly; and blurred directional features are finally extracted at located spatial sampling points to form the raw feature vector. In the following subsections, we explain in detail how each module works.

Figure 2. Overall flowchart.

2.1. Preprocessing

The main objective of the series of preprocessing steps is to remove certain variations among character samples of the same class that would otherwise reduce recognition accuracy. This module includes the following steps:
(1) Linear size normalization: Given a character sample, it is normalized to a fixed size of 64×64 using an aspect-ratio-preserving linear mapping.

(2) Adding imaginary strokes: Imaginary strokes are the pen-moving trajectories in pen-up states that are not recorded in the original character sample. We define an imaginary stroke as a straight line from the end point of a pen-down stroke to the start point of the next pen-down stroke. All such constructed imaginary strokes are added to the stroke set of the character sample.

(3) Nonlinear shape normalization (NSN): NSN is used to normalize shape variability. The online character sample after the above two steps is first transformed into a bitmap, which is then normalized by the dot-density equalization approach originally reported in [7]. Using the derived NSN warping functions, the online character sample after step (2) is transformed into a new sample such that the temporal order of the original points is maintained.

(4) Re-sampling: Re-sampling is intended to reduce the distance variation between two adjacent online points and the variance of the number of points in a stroke. The sequence of online points in each stroke (including all imaginary strokes) of a character is re-sampled by a sequence of equidistant points (a distance of 1 unit length is used in our approach).

Figure 3. Different ways of projecting a direction vector onto directional axes and the corresponding directional feature values: (a) adapted from [4]; (b) adapted from [5]; (c) our proposal.

(5) Smoothing: Smoothing can reduce stroke shape variation in a small local region. In a stroke, besides the start point and the end point, we replace the position of every other point by the mean of the positions of its two neighbors and itself.
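As an illustration, steps (2), (4) and (5) above can be sketched as follows. This is a minimal Python sketch with our own function names, not the authors' implementation; step (3) is omitted because it depends on the dot-density equalization method of [7].

```python
import math

def add_imaginary_strokes(strokes):
    """Step (2): insert a straight-line 'imaginary stroke' from the end
    point of each pen-down stroke to the start point of the next one."""
    out = []
    for i, s in enumerate(strokes):
        out.append(s)
        if i + 1 < len(strokes):
            out.append([s[-1], strokes[i + 1][0]])
    return out

def resample_stroke(points, step=1.0):
    """Step (4): re-sample a stroke at equidistant points (unit distance)."""
    out = [points[0]]
    carry = 0.0  # arc length already covered toward the next sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        d = step - carry
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += step
        carry = seg - (d - step)
    return out

def smooth_stroke(points):
    """Step (5): replace each interior point by the mean of itself and its
    two neighbors; the start and end points are kept unchanged."""
    if len(points) < 3:
        return list(points)
    mid = [((x0 + x1 + x2) / 3.0, (y0 + y1 + y2) / 3.0)
           for (x0, y0), (x1, y1), (x2, y2)
           in zip(points, points[1:], points[2:])]
    return [points[0]] + mid + [points[-1]]
```

Applying the three helpers in sequence (after size normalization and NSN) yields the cleaned-up online sample from which directional features are extracted.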
2.2. Extracting 8-Directional Features

Given a stroke point P_j, its direction vector V_j is defined as follows:

V_j = P_j P_{j+1}      if P_j is a start point;
V_j = P_{j-1} P_{j+1}  if P_j is a non-end point;      (1)
V_j = P_{j-1} P_j      if P_j is an end point.

For a non-end point P_j, if its two neighbors P_{j-1} and P_{j+1} are in the same position, the point P_j is ignored and no directional features are extracted at this point.

Given V_j, its normalized version, V_j/|V_j|, can be projected onto two of the 8 directional axes, as shown in Fig. 3: one is from the direction set {D1, D3, D5, D7} and denoted as d1_j, and the other is from the set {D2, D4, D6, D8} and denoted as d2_j. If we define the coordinate system for the original online point P_j = (x_j, y_j) such that the x-axis runs from left to right and the y-axis from top down, then d1_j and d2_j for a non-end point P_j can be identified as follows:

d1_j = D7 if x_{j-1} > x_{j+1} and |y_{j+1} - y_{j-1}| <= |x_{j+1} - x_{j-1}|;
d1_j = D3 if x_{j-1} < x_{j+1} and |y_{j+1} - y_{j-1}| <= |x_{j+1} - x_{j-1}|;
d1_j = D5 if y_{j-1} < y_{j+1} and |y_{j+1} - y_{j-1}| > |x_{j+1} - x_{j-1}|;
d1_j = D1 if y_{j-1} > y_{j+1} and |y_{j+1} - y_{j-1}| > |x_{j+1} - x_{j-1}|;

d2_j = D6 if x_{j-1} > x_{j+1} and y_{j-1} < y_{j+1};
d2_j = D8 if x_{j-1} > x_{j+1} and y_{j-1} > y_{j+1};
d2_j = D2 if x_{j-1} < x_{j+1} and y_{j-1} > y_{j+1};
d2_j = D4 if x_{j-1} < x_{j+1} and y_{j-1} < y_{j+1}.

For the example shown in Fig. 3, these two directions are d1_j = D1 and d2_j = D8 for the highlighted direction vector.
Given the above identified directions, an 8-dimensional feature vector can be formed with non-zero directional feature values a1_j and a2_j corresponding to d1_j and d2_j, respectively; the feature values corresponding to the other 6 directions are set to 0. Using the same example, such a feature vector is (a1_j, 0, 0, 0, 0, 0, 0, a2_j)^t. Apparently, there are different ways to calculate a1_j and a2_j. We have studied the three methods described in the following.

The first method ("Method-1") is adapted from [4] and shown in Fig. 3(a). For a non-end online point P_j, a1_j and a2_j are calculated as follows:

a1_j = |dx - dy| / s      (2)
a2_j = √2 · min(dx, dy) / s      (3)

where dx = |x_{j+1} - x_{j-1}|, dy = |y_{j+1} - y_{j-1}|, and s = √(dx² + dy²).

The second method ("Method-2") is adapted from [5] and shown in Fig. 3(b). For a non-end online point P_j, a1_j and a2_j are calculated as follows:

a1_j = max(dx, dy) / s      (4)
a2_j = (√2/2)(dx + dy) / s.      (5)

The third method ("Method-3") is proposed here and shown in Fig. 3(c): we simply set a1_j = 1 and a2_j = 1.

If P_j is an end point of a stroke, we replace (x_{j-1}, y_{j-1}) with (x_j, y_j) for the start point, and replace (x_{j+1}, y_{j+1}) with (x_j, y_j) for the end point, in the above discussions.
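The direction selection and Method-1 feature values can be sketched as below. Note one assumption on our part: following the worked example (d1_j = D1, d2_j = D8 for an up-and-left vector), we take D1 as up, D3 as right, D5 as down and D7 as left, with the even-numbered diagonals in between; the exact mapping is only given in Fig. 1(b).

```python
import math

def directional_features(p_prev, p_next):
    """Return the 8-dimensional feature vector for one non-end point,
    using the Method-1 projection of Eqs. (2)-(3).
    Index 0..7 corresponds to D1..D8; x grows rightward, y downward."""
    dx = abs(p_next[0] - p_prev[0])
    dy = abs(p_next[1] - p_prev[1])
    s = math.hypot(dx, dy)
    feat = [0.0] * 8
    if s == 0:
        return feat  # degenerate point: ignored, no features extracted
    # axis direction d1, chosen among D1 (up), D3 (right), D5 (down), D7 (left)
    if dy <= dx:
        d1 = 6 if p_prev[0] > p_next[0] else 2   # D7 or D3
    else:
        d1 = 0 if p_prev[1] > p_next[1] else 4   # D1 or D5
    # diagonal direction d2, chosen among D2, D4, D6, D8
    left = p_prev[0] > p_next[0]
    up = p_prev[1] > p_next[1]
    d2 = {(True, True): 7, (True, False): 5,     # D8 (up-left), D6 (down-left)
          (False, True): 1, (False, False): 3}[(left, up)]  # D2, D4
    feat[d1] = abs(dx - dy) / s                  # Eq. (2)
    feat[d2] = math.sqrt(2) * min(dx, dy) / s    # Eq. (3)
    return feat
```

A purely vertical vector puts all of its weight on the axis direction, while a 45-degree vector puts all of its weight on the diagonal, as Eqs. (2)-(3) intend.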
2.3. Generating 8 Directional Pattern Images

After extracting 8-directional features from all online points of a character, 8 directional pattern images B_d = {f_d(x, y); x, y = 1, 2, ..., 64}, d = D1, D2, ..., D8, can be generated as follows: set f_{d1_j}(x_j, y_j) = a1_j and f_{d2_j}(x_j, y_j) = a2_j, and set the values of all the other f_d(x, y)'s to 0. For each of the above directional pattern images, the following thickening processing is further performed: for each non-zero pixel f_d(x, y) = a, the value of each of its 8 neighboring pixels is set as f_d(x + m, y + n) = max{f_d(x + m, y + n), a}, where m = -1, 0, 1 and n = -1, 0, 1.
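The image generation and thickening steps amount to a scatter followed by a 3×3 grey-scale max filter, as in this sketch (names are ours; `point_features` pairs each point's coordinates with its 8-dimensional feature vector from Section 2.2):

```python
def pattern_images(point_features, size=64):
    """Scatter per-point directional feature values into 8 directional
    pattern images f_d(x, y), then thicken each image by spreading every
    non-zero pixel's value to its 8 neighbors via a max operation."""
    imgs = [[[0.0] * size for _ in range(size)] for _ in range(8)]
    for (x, y), feat in point_features:
        for d, a in enumerate(feat):
            if a > 0:
                imgs[d][y][x] = max(imgs[d][y][x], a)
    out = []
    for img in imgs:
        thick = [row[:] for row in img]
        for y in range(size):
            for x in range(size):
                a = img[y][x]
                if a == 0:
                    continue
                for m in (-1, 0, 1):
                    for n in (-1, 0, 1):
                        xx, yy = x + m, y + n
                        if 0 <= xx < size and 0 <= yy < size:
                            thick[yy][xx] = max(thick[yy][xx], a)
        out.append(thick)
    return out
```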
2.4. Locating Spatial Sampling Points and Extracting Blurred Directional Features

Each directional pattern image is divided uniformly into 8×8 grids, whose centers are treated as the locations of 8×8 spatial sampling points. At each sampling point (x_i, y_j), a blurred directional feature is extracted as follows:

F_d(x_i, y_j) = Σ_{x=-N}^{N} Σ_{y=-N}^{N} f_d(x_i + x, y_j + y) G(x, y)      (6)

where G(x, y) is a Gaussian filter whose size is determined by a parameter N. Based on our experience in using Gabor features for Chinese OCR and offline Chinese character recognition (e.g. [2, 3]), we decided to use the following Gaussian envelope derived from the Gabor filter to serve as the Gaussian filter G(x, y):

G(x, y) = (2π/σ²) exp{-π²(x² + y²)/(2σ²)} = (4π/λ²) exp{-π²(x² + y²)/λ²}      (7)

where σ = κλ, κ = √2/2, and λ is the wavelength of the plane wave of the original Gabor filter. According to our past experience (e.g. [2, 3]), for an image with a size of 64×64, a spatial sampling resolution of 8×8 and a wavelength λ = 8 should offer a "good" setting for these two control parameters. Interestingly, this is also confirmed to be a "good" setting for OLCCR in our experiments in this project. Given the wavelength, the control parameter N is set as N = 2. It is noted that the optimal setting of the above control parameters will be different for images with different sizes.
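The blurring and sampling of Eq. (6) can be sketched as follows. Two assumptions are ours, not the paper's: the Gaussian window is renormalized to sum to 1 (the leading constant in Eq. (7) only rescales all features), and the square-root transform from the end of this section is folded in directly.

```python
import math

def gaussian_filter(N, lam=8.0):
    """(2N+1)x(2N+1) Gaussian window with sigma tied to the Gabor
    wavelength lam, renormalized to sum to 1 (our simplification)."""
    sigma = lam * math.sqrt(2.0) / 2.0
    g = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
          for dx in range(-N, N + 1)] for dy in range(-N, N + 1)]
    total = sum(sum(row) for row in g)
    return [[v / total for v in row] for row in g]

def blurred_features(images, grid=8, N=2, lam=8.0):
    """Evaluate each of the 8 directional pattern images at grid x grid
    sampling points through the Gaussian window (Eq. (6)), then apply a
    square-root transform to form the 8*8*8 = 512-dimensional vector."""
    size = len(images[0])
    cell = size // grid
    G = gaussian_filter(N, lam)
    vec = []
    for img in images:
        for i in range(grid):
            for j in range(grid):
                xc, yc = i * cell + cell // 2, j * cell + cell // 2
                acc = 0.0
                for dy in range(-N, N + 1):
                    for dx in range(-N, N + 1):
                        x, y = xc + dx, yc + dy
                        if 0 <= x < size and 0 <= y < size:
                            acc += img[y][x] * G[dy + N][dx + N]
                vec.append(math.sqrt(acc))
    return vec
```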
Because there are in total 8 directional pattern images, each with 8×8 sampling points, an 8×8×8 = 512-dimensional vector of raw features can finally be formed by using the nonlinearly transformed features {√F_d(x_i, y_j); d = D1, D2, ..., D8; i, j = 1, 2, ..., 8}.

3. Experiments and Results

3.1. Experimental Setup

In order to evaluate the efficacy of the above techniques, a series of experiments is conducted on the task of recognizing isolated online handwritten Chinese characters. A subset of the corpus of Chinese handwriting developed in our lab was used. The data set contains 300 writers' samples of the 3755 categories of level-1 (GB1) Chinese characters in the GB2312-80 standard. Each writer contributed one sample for each character category. To collect the handwriting samples, each writer was asked to write naturally, using a stylus pen, a set of Chinese characters on the touch screen of a PDA or a Pocket PC. No other restriction is imposed on the writing styles.

Table 1. A comparison of character recognition accuracies (in %) of using 4 vs. 8 directional features under the condition of using vs. not using imaginary strokes (no thickening operation, no nonlinear feature transformation).

          |            Single-Prototype Classifier          |                 1-NN Classifier
"Top-N"   | No imaginary strokes | With imaginary strokes   | No imaginary strokes | With imaginary strokes
          | 4-dir   | 8-dir      | 4-dir   | 8-dir          | 4-dir   | 8-dir      | 4-dir   | 8-dir
N=1       | 71.48   | 80.43      | 74.84   | 84.57          | 74.04   | 79.19      | 77.26   | 83.86
N=5       | 87.84   | 93.20      | 89.40   | 94.92          | 93.86   | 95.70      | 94.54   | 96.82
N=10      | 91.56   | 95.60      | 92.50   | 96.62          | 97.04   | 97.95      | 97.26   | 98.44
N=50      | 96.72   | 98.44      | 96.85   | 98.69          | 99.72   | 99.80      | 99.69   | 99.81

Table 2. A comparison of character recognition accuracies (in %) of using three different direction vector projection methods (with thickening operation, no nonlinear feature transformation).

          |   Single-Prototype Classifier    |          1-NN Classifier
"Top-N"   | Method-1 | Method-2 | Method-3   | Method-1 | Method-2 | Method-3
N=1       | 85.55    | 83.54    | 83.26      | 85.02    | 84.59    | 84.33
N=5       | 95.43    | 94.62    | 94.49      | 97.17    | 97.07    | 96.99
N=10      | 97.00    | 96.48    | 96.41      | 98.62    | 98.57    | 98.51
N=50      | 98.89    | 98.77    | 98.73      | 99.83    | 99.82    | 99.82
For each character class, we use 200 samples randomly selected from the above data set for training and the remaining 100 samples for testing. The following two simple character classifiers are used for performance evaluation on the testing set:

- A maximum discriminant function based classifier with a single prototype. The prototype is the mean of the training feature vectors, and the discriminant function is the negative Euclidean distance between a testing feature vector and the prototype feature vector;

- A 1-NN classifier with all the training feature vectors as prototypes and the Euclidean distance as a dissimilarity measure.
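The two evaluation classifiers above can be sketched as follows (a minimal sketch with our own names; for nearest-neighbor search, squared Euclidean distance gives the same ranking as Euclidean distance):

```python
def euclid2(u, v):
    """Squared Euclidean distance (monotone in the Euclidean distance)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def train_single_prototype(samples_by_class):
    """One prototype per class: the mean of its training feature vectors."""
    protos = {}
    for label, vecs in samples_by_class.items():
        n, dim = len(vecs), len(vecs[0])
        protos[label] = [sum(v[k] for v in vecs) / n for k in range(dim)]
    return protos

def classify_single_prototype(x, protos):
    """Maximum discriminant = negative Euclidean distance to the prototype."""
    return max(protos, key=lambda c: -euclid2(x, protos[c]))

def classify_1nn(x, train):
    """1-NN over all (label, vector) training pairs."""
    return min(train, key=lambda lv: euclid2(x, lv[1]))[0]
```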
3.2. A Comparison of 4 vs. 8 Directional Features

The first set of experiments is designed to compare the performance of using 4 vs. 8 directional features, under the two choices of adding or not adding imaginary strokes in the preprocessing stage. The directional feature values are extracted using Eqs. (2) and (3). No thickening processing is performed in generating the directional pattern images. No nonlinear transformation is applied when the 512-dimensional feature vector is formed. The "Top-N" character recognition accuracies (in %) are summarized in Table 1. From the results, two conclusions can be drawn: 1) using 8-directional features achieves a much better performance than using 4-directional features; 2) adding imaginary strokes gives a better performance than not adding them. Therefore, in the later experiments, we always use 8-directional features and add imaginary strokes in the preprocessing steps.

3.3. A Comparison of Different Projection Methods

The second set of experiments is designed to compare the performance of the three different direction vector projection methods for extracting directional features at each online trajectory point. We refer to them as Method-1, Method-2, and Method-3, respectively, according to the descriptions in subsection 2.2. Thickening processing is performed in generating the directional pattern images. No nonlinear transformation is applied when the 512-dimensional feature vector is formed. The "Top-N" character recognition accuracies (in %) are summarized in Table 2.

By comparing the results of the two columns labeled "8-direction" and "With imaginary strokes" in Table 1 with those of the corresponding columns labeled "Method-1" in Table 2, it is observed that the thickening operation offers

Table 3. A comparison of character recognition accuracies (in %) of using vs. not using the nonlinear feature transformation.

"Top-N" Recognition   | Single-Prototype Classifier | 1-NN Classifier