8-Directional Feature Vectors.pdf
A Study On the Use of 8-Directional Features For Online Handwritten Chinese Character Recognition

Zhen-Long BAI and Qiang HUO
Department of Computer Science, The University of Hong Kong, Hong Kong, China
(Email: zlbai@cs.hku.hk, qhuo@cs.hku.hk)

Abstract

This paper presents a study of using 8-directional features for online handwritten Chinese character recognition. Given an online handwritten character sample, a series of processing steps, including linear size normalization, adding imaginary strokes, nonlinear shape normalization, equidistance resampling, and smoothing, are performed to derive a 64 × 64 normalized online character sample. Then, 8-directional features are extracted from each online trajectory point, and 8 directional pattern images are generated accordingly, from which blurred directional features are extracted at 8 × 8 uniformly sampled locations using a filter derived from the Gaussian envelope of a Gabor filter. Finally, a 512-dimensional vector of raw features is formed. Extensive experiments on the task of recognizing 3755 level-1 Chinese characters in the GB2312-80 standard are performed to compare and discern the best setting for several algorithmic choices and control parameters. The effectiveness of the studied approach is confirmed.

1. Introduction

In this paper, we study the problem of how to extract so-called directional features from an online handwritten character sample to form a vector of raw features that can be used to construct a classifier for online Chinese character recognition (OLCCR) using any promising statistical pattern recognition approach. Previous works on this topic as reported in e.g., [4, 1, 5, 6], have demonstrated successfully the effectiveness of the 4-directional features, where 4 directions are defined naturally as vertical ( | ), horizontal ( - ), left-up ( / ) and right-down ( \ ), as shown in Fig. 1(a). The primary motivations of this study are to extend the previous works to extracting 8-directional features with 8 directions as shown in Fig. 1(b), and to study their effectiveness for OLCCR.

Figure 1. A notion of 4 vs. 8 directions.

This work was supported by a grant from the RGC of the Hong Kong SAR (Project No. HKU7145/03E).

It is noted that different ways of extracting directional features were used in the previous works. For example, in [4, 5], 4-directional features were extracted directly from the nonlinear shape normalized (NSN) online trajectory;
while in [1, 6], 4-directional features were extracted from a bitmap using an "offline" approach. Furthermore, different ways of projecting a direction vector to relevant directional axes, and different strategies for grid-based feature blurring were used in [4] and [5], respectively. In addition to the above, there are actually other algorithmic choices in deriving directional features from an online handwritten character sample. We have conducted extensive experiments to study the behavior and performance implications of different choices with a hope of identifying the most promising scheme and discerning the best setting for the relevant control parameters. In this paper, we report those most important findings and recommend a promising approach of extracting 8-directional features for OLCCR.

The rest of the paper is organized as follows. Details of the recommended approach are described in Section 2. Experimental results of comparative studies are reported in Section 3. Finally, our findings are summarized in Section 4.

2. Our Approach

The overall flowchart of our recommended approach is shown in Fig. 2. In the following subsections, we explain in detail how each module works.

Proceedings of the 2005 Eighth International Conference on Document Analysis and Recognition (ICDAR'05) 1520-5263/05 $20.00 © 2005 IEEE

Figure 2. Overall flowchart: Online Character Sample → Preprocessing → Extracting 8-Directional Features → Generating 8 Directional Pattern Images → Locating Spatial Sampling Points and Extracting Blurred Directional Features → Raw Feature Vector.

2.1. Preprocessing

The main objective of a series of preprocessing steps is to remove certain variations among character samples of the same class that would otherwise reduce recognition accuracy. This module includes the following steps:

(1) Linear size normalization: Given a
character sample, it is normalized to a fixed size of 64 × 64 using an aspect-ratio preserving linear mapping.
(2) Adding imaginary strokes: Imaginary strokes are those pen moving trajectories in pen-up states that are not recorded in the original character sample. We define an imaginary stroke as a straight line from the end point of a pen-down stroke to the start point of its next pen-down stroke. All such constructed imaginary strokes are added into the stroke set of a character sample.
(3) Nonlinear shape normalization (NSN): NSN is used to normalize shape variability. The online character sample after the above two steps is first transformed into a bitmap that is then normalized by a dot density equalization approach reported originally in [7]. Using the derived NSN warping functions, the online character sample after the above step (2) is transformed into a new sample such that the temporal order of the
original points is maintained.
(4) Re-sampling: Re-sampling is intended to reduce the distance variation between two adjacent online points and the variance of the number of points in a stroke. The sequence of online points in each stroke (including all imaginary strokes) of a character is re-sampled by a sequence of equidistance points (a distance of 1 unit length is used in our approach).
(5) Smoothing: Smoothing can reduce stroke shape variation in a small local region. In a stroke, besides the start point and end point, we replace the position of every other point by the mean value of that of its 2 neighbors and itself.

Figure 3. Different ways of projecting a direction vector onto directional axes and the corresponding directional feature values: (a) adapted from [4]; (b) adapted from [5]; (c) our proposal.

2.2. Extracting 8-Directional Features

Given a stroke point $P_j$, its direction vector $V_j$ is defined as follows:

\[
V_j =
\begin{cases}
\overrightarrow{P_j P_{j+1}} & \text{if } P_j \text{ is a start point} \\
\overrightarrow{P_{j-1} P_{j+1}} & \text{if } P_j \text{ is a non-end point} \\
\overrightarrow{P_{j-1} P_j} & \text{if } P_j \text{ is an end point}
\end{cases}
\tag{1}
\]

For a non-end point $P_j$, if its two neighbors $P_{j-1}$ and $P_{j+1}$ are in the same position, the point $P_j$ is ignored and no directional features are extracted at this point.

Given $V_j$, its normalized version, $V_j / \|V_j\|$, can be projected onto two of the 8 directional axes, as shown in Fig. 3: one is from the direction set $\{D_1, D_3, D_5, D_7\}$ and denoted as $d_j^1$, and the other is from the set $\{D_2, D_4, D_6, D_8\}$ and denoted as $d_j^2$. If we define the coordinate system for the original online point $P_j = (x_j, y_j)$ as follows: the x-axis is from left to right, and the y-axis is from top down; then $d_j^1$ and $d_j^2$ for a non-end point $P_j$ can be identified as follows (each $\square$ marks a comparison operator that did not survive in the source text):

\[
d_j^1 =
\begin{cases}
D_7 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; |y_{j+1} - y_{j-1}| \,\square\, |x_{j+1} - x_{j-1}| \\
D_3 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; |y_{j+1} - y_{j-1}| \,\square\, |x_{j+1} - x_{j-1}| \\
D_5 & \text{if } y_{j-1} \,\square\, y_{j+1} \;\&\; |y_{j+1} - y_{j-1}| \,\square\, |x_{j+1} - x_{j-1}| \\
D_1 & \text{if } y_{j-1} \,\square\, y_{j+1} \;\&\; |y_{j+1} - y_{j-1}| \,\square\, |x_{j+1} - x_{j-1}|
\end{cases}
\]

\[
d_j^2 =
\begin{cases}
D_6 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; y_{j-1} \,\square\, y_{j+1} \\
D_8 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; y_{j-1} \,\square\, y_{j+1} \\
D_2 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; y_{j-1} \,\square\, y_{j+1} \\
D_4 & \text{if } x_{j-1} \,\square\, x_{j+1} \;\&\; y_{j-1} \,\square\, y_{j+1}
\end{cases}
\]

For the example shown in Fig. 3, these two directions are $d_j^1 = D_1$, $d_j^2 = D_8$ for the highlighted direction vector. Given the above identified directions, an 8-dimensional feature vector can be formed with non-zero directional feature values $a_j^1$ and $a_j^2$ corresponding to $d_j^1$ and $d_j^2$ respectively. Feature values corresponding to the other 6 directions are set to 0. Using the same example, such a feature v
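
Preprocessing steps (2), (4) and (5) and the direction vector of Eq. (1) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, a stroke is assumed to be a list of (x, y) tuples, re-sampling is assumed to drop any trailing remainder shorter than one step, smoothing is assumed to average the unsmoothed positions, and every zero-length direction vector is skipped (which covers the rule stated for non-end points). Linear size normalization, NSN, and the assignment of a vector to specific directions D1 to D8 are left out.

```python
import math

def add_imaginary_strokes(strokes):
    """Step (2): insert a straight pen-up segment between consecutive pen-down strokes."""
    out = []
    for k, stroke in enumerate(strokes):
        out.append(list(stroke))
        if k + 1 < len(strokes):
            # Imaginary stroke: end point of this stroke -> start point of the next one.
            out.append([stroke[-1], strokes[k + 1][0]])
    return out

def resample(stroke, step=1.0):
    """Step (4): re-sample a polyline at points `step` apart along its arc length.
    Assumption: a trailing remainder shorter than `step` is dropped."""
    pts = [stroke[0]]
    carried = 0.0  # arc length travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue
        d = step - carried  # position of the next sample along this segment
        while d <= seg:
            t = d / seg
            pts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += step
        carried = seg - (d - step)
    return pts

def smooth(stroke):
    """Step (5): replace each interior point by the mean of itself and its 2 neighbors.
    Assumption: the means are taken over the original (unsmoothed) positions."""
    if len(stroke) < 3:
        return list(stroke)
    interior = [((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
                for a, b, c in zip(stroke, stroke[1:], stroke[2:])]
    return [stroke[0]] + interior + [stroke[-1]]

def direction_vectors(stroke):
    """Eq. (1): unit direction vector per point, as (index, (vx, vy)) pairs.
    Points whose direction vector has zero length are skipped."""
    n = len(stroke)
    out = []
    for j in range(n):
        a = stroke[j] if j == 0 else stroke[j - 1]      # P_j for a start point, else P_{j-1}
        b = stroke[j] if j == n - 1 else stroke[j + 1]  # P_j for an end point, else P_{j+1}
        vx, vy = b[0] - a[0], b[1] - a[1]
        norm = math.hypot(vx, vy)
        if norm == 0.0:
            continue
        out.append((j, (vx / norm, vy / norm)))
    return out
```

A typical call order under these assumptions would be `add_imaginary_strokes` on the whole character, then `resample` and `smooth` on each stroke, then `direction_vectors` on each stroke; in the paper, NSN is applied between steps (2) and (4).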