Vehicle Surveillance with a Generic, Adaptive, 3D Vehicle Model

Matthew J. Leotta, Member, IEEE, and Joseph L. Mundy

Abstract: In automated surveillance, one is often interested in tracking road vehicles, measuring their shape in 3D world space, and determining vehicle classification. To address these tasks simultaneously, an effective approach is the constrained alignment of a prior model of 3D vehicle shape to images. Previous 3D vehicle models are either generic but overly simple or rigid and overly complex. Rigid models represent exactly one vehicle design, so a large collection is needed. A single generic model
can deform to a wide variety of shapes, but those shapes have been far too primitive. This paper uses a generic 3D vehicle model that deforms to match a wide variety of passenger vehicles. It is adjustable in complexity between the two extremes. The model is aligned to images by predicting and matching image intensity edges. Novel algorithms are presented for fitting models to multiple still images and simultaneous tracking while estimating shape in video. Experiments compare the proposed model to simple generic models in accuracy and reliability of 3D shape recovery from images and tracking in video. Standard techniques for classification are also used to compare the models. The proposed model outperforms the existing simple models at each task.

Index Terms: Machine vision, road vehicle location monitoring, image shape analysis, image recognition, video signal processing.

1 INTRODUCTION

Road vehicles are arguably the second most important subjects in machine vision applications, second only to human subjects. Certainly this is true in automated surveillance, if not also more generally. In recent years, there have been many advances in tracking, recognition, and shape estimation of human subjects by employing detailed, generic, 3D mesh models that deform to the shape of the subject. Yet, similar techniques have not been applied to road vehicles. Three-dimensional models of road vehicles used in machine vision studies have instead remained segregated into detailed or generic models, but not both.

The late 1980s and early 1990s produced a flurry of research activity on using 3D vehicle models for visual surveillance. This activity was in large part due to funding from some sizable United Kingdom and European programs, most notably SERC Alvey (MMI-007) and Esprit VIEWS (P2152). Vehicle models at the time tended to be quite simple (e.g., Fig. 1a), owing to limited computational resources and image resolution. In most cases, the vehicle shape and dimensions were also assumed to be known a priori. However, a few studies [1], [2] considered generic vehicle models that could deform to a wide range of road vehicle shapes. Simple mesh models like that
in Fig. 1a are still used in vehicle surveillance research. The resolution and accuracy of deformable vehicle models have not kept pace with the increases in image resolution and computational resources. Conversely, deformable 3D models of human faces and bodies used in tracking, recognition, and shape estimation have continued to advance in complexity and accuracy [3], [4]. Advancement has been driven in large part by computer graphics and the entertainment industry. Three-dimensional model-based vision research frequently borrows from the products of graphics research. Deformable generic human models are useful
for rendering the continuous variation found in human shapes. Yet, there is little need for deformable vehicles in graphics applications. Vehicles are made to precise specifications. One can simply build a detailed model by computer aided design (CAD) for each vehicle of interest. An example of one such vehicle CAD model is shown in Fig. 1b. The vision community has benefited from highly detailed and generic graphics models for humans but has been limited to collections of specific CAD models for vehicles. Vision researchers have indeed made use of these CAD models in their work [5], [6].

The problem with using CAD models for machine vision is that they are too specific and often too detailed. It is usually not sufficient in vision applications to assume that all vehicles are known a priori. Furthermore, a CAD model is not likely to be available for every vehicle. One solution is to use a small collection of CAD models,
apply an algorithm with each, and choose the one that best agrees with the data [5], [6]. In this case, the model agreement is not exact, so extreme detail in the CAD model is useless and a waste of computational resources.

This paper evaluates a highly detailed, yet generic, vehicle model for use in the tasks of vehicle shape estimation, tracking, and classification. The proposed model combines the flexibility of a single generic vehicle model and a level of detail approaching that found in CAD models. The proposed model was first introduced in earlier work by the authors in [7], where it was used in some preliminary experiments to reconstruct 3D vehicle shape from multiview still images. This paper improves upon the fitting algorithm in [7] and extends the applications to include simultaneous tracking and 3D reconstruction from monocular video and also vehicle classification based on recovered shape. A far more extensive set of experiments has been performed in both the new and old application areas. This paper covers the key ideas, experimental results, and conclusions of the work. Yet, many details have been suppressed to meet space constraints. Complete details can be found in the first author's PhD thesis [8].

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 33, NO. 7, JULY 2011, p. 1457
M.J. Leotta is with Kitware, Inc., 28 Corporate Drive, Clifton Park, NY 12065.
J.L. Mundy is with the School of Engineering, Brown University, Box D, Providence, RI 02912. E-mail: mundy@lems.brown.edu.
Manuscript received 7 Mar. 2010; revised 17 Aug. 2010; accepted 22 Oct. 2010; published online 29 Nov. 2010.
Recommended for acceptance by S. Sclaroff.
For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number TPAMI-2010-03-0150.
Digital Object Identifier no. 10.1109/TPAMI.2010.217.
0162-8828/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society.

2 RELATED WORK

Many different types of models have been used for road vehicles in machine vision applications. A common thread is that all models have some geometric component to them. The types of vehicle models in the literature can
be categorized based on their use of geometry. Some vehicle models live in the image space and are inherently 2D. Others live in a 3D world and are related to the 2D image by projection. Different models are also made of geometric elements of different intrinsic dimension. In the image space, this geometry must have an intrinsic dimension of either zero, one, or two corresponding, respectively, to points, curves, or regions. Examples of each are shown in Fig. 2. Finally, vehicle models can be classified as either agglomerative or prior. Agglomerative models are built up from detected, primitive image features (e.g., points, edges, or regions). The primitive features are clustered into groups representing vehicles using little or no prior knowledge about vehicle shape. Prior geometric models are constructed in advance and are fit to image data by matching to primitive features.

An example of an agglomerative, 2D vehicle model using points is Beymer et al. [9]. In this work, Harris corners [10] are tracked in video and clustered into vehicles based on proximity and similar 2D motion. Similarly, Kanhere and Birchfield [11] track points. However, in [11], the points are back projected into 3D before clustering into vehicles. Point features have the advantage of being well localized and easy to track. Unfortunately, very few stable points are typically found on vehicles, resulting in a very sparse representation.

Edge features, on the other hand, offer a much richer description. See Fig. 2 for example. Jain et al. [12] track vehicle edges in video using 2D elastic matching. In earlier work by the authors, Leotta and Mundy [13] track vehicle edges by matching curves that best agree with a reconstructed 3D curve. In both cases, the primitives are subpixel Canny [14] edges linked into curves, and tracked curves are clustered into vehicles based on consistent motion. Exact pointwise
correspondence between curves is difficult since matching is locally ambiguous in the tangent direction. Furthermore, edge linking is very sensitive to image noise, and the resulting curve topology is rarely stable over many frames.

Regions are also used as agglomerative vehicle representations. In video
processing, the most common regions are moving object detections (MODs) derived from frame differencing or a background modeling approach like that of Stauffer and Grimson [15]. Gupte et al. [16] track such MODs in video. They identify splitting and merging regions over time to group regions into vehicles. Atev et al. [17] expand upon these ideas and use multiple camera views to reduce ambiguities and to create a 3D bounding box around each vehicle. Like edges, regions suffer from instabilities and ambiguities in correspondence. MOD regions are well suited to describing vehicle location and size but do not provide as much information as edges.

Various other agglomerative models use combinations of detected point, curve, and region features. Some examples are [18], [19], [20], [21], [22]. The advantage of agglomerative models is that they require little prior knowledge, and therefore easily generalize to represent other classes of objects. This extreme generality is also a disadvantage. The resulting representations are just collections of low-level features with little structure. It is difficult to use such simplistic representations for more complex tasks like identifying the vehicle class, measuring the wheelbase, or reconstructing a dense 3D model for rendering.

An alternative to the agglomerative approach is to build a prior model of vehicles and fit that model to the images. A prior model takes into account knowledge about vehicles imparted by the designer or learned by training. Complete generality is sacrificed to gain a more powerful
model for the objects of interest. One example of a prior model is the code book approach of Leibe et al. [23]. This method detects 2D points with associated descriptors of local appearance. Descriptors are matched to a previously learned code book of vehicle descriptors and used to vote for the 2D location
of the vehicle center. Liebelt et al. [24] extend this method to detect the presence and 3D pose of vehicles from 2D points and descriptors. Mundy and Chang [25] use a 2D region-based prior model of vehicles. The vehicle pixels segmented from one frame of video are matched to later frames using mutual information.

Most relevant to the work presented in this paper are the prior models that predict image edges. Leuven et al. [26] represent vehicles as seen from the front or back by a collection of line segments. The line segments are matched to detected edges using a distance transform. Dubuisson Jolly et al. [27] go one step further and connect line segments into a polygon representing a vehicle silhouette observed from the side. The polygon deforms according to a prior shape model to match both line segments to detected edges and the polygon region to a MOD region. These 2D models are severely limited by requiring observations from very specific viewing directions.

Fig. 1. Meshes commonly used to model vehicles: (a) simple polyhedral with about 12 faces, (b) CAD model with about 80,000 faces.

Fig. 2. Detected geometric features of varying intrinsic dimension. Clockwise from upper left: the original image, points (Harris corners), curves (subpixel Canny edges), and regions (Stauffer-Grimson foreground regions).

Koller [1] uses a 3D, polyhedral analog of the 2D, polygonal vehicle model. This model, shown in Fig. 1a, has 12 faces and 12 manually defined parameters for shape deformation. Koller estimates the shape parameters and 3D pose to align the projected model line segments with line segments fit to image edge detections. Ferryman et al. [2] increase the complexity of this model slightly by adding line segments representing wheel wells. They also learn the deformation parameters by aligning the model to training images (with user guidance). Furthermore, they reduce the dimension of the shape space to six, using principal components analysis.

Simple 3D vehicle models, similar to [1] and [2], are abundant in the literature. Often these models are used with fixed shape parameters, and shape estimation is ignored completely. Dahlkamp et al. [28] provide a thorough literature review on fixed shape, 3D model-based, vehicle tracking. They also show that adding wheel wells (as in [2]) improves tracking results, but only when image resolution is sufficiently high. Likewise, as resolution continues to increase, these simple models are no longer sufficient. Detailed CAD models are sometimes used instead. Song and Nevatia [6] use CAD models to create detailed vehicle silhouettes from arbitrary viewing directions. These 2D silhouettes are used in vehicle detection and tracking. Guo et al. [5] use CAD models to predict the appearance of vehicles from different viewpoints in high resolution images. Unfortunately, most CAD models are more complex than necessary for most vision applications. Excessive model complexity can result in unnecessary computational complexity. To make matters worse, lack of generality necessitates the use of a collection of overly complex models.

3 A DEFORMABLE VEHICLE MODEL

The vehicle model used in this paper combines the generality of deformable vehicle models with a level of detail approaching that found in CAD models. This model was originally proposed in [7] but is reviewed here for completeness. Fig. 3 gives an overview of the vehicle model. The model comprises two components: a 3D polygonal mesh describing the shape of the vehicle surface and a collection of 2D polygons describing vehicle parts. Both the 3D shape and 2D parts are deformable to represent a large variety of vehicles. The 2D polygons are mapped onto the vehicle surface and represent parts of the vehicle that tend to have different material properties. The occluding contours of 3D shape, combined with the boundaries of parts, predict the appearance of intensity edges in images. Later, in Section 4, the model is aligned to images by adjusting pose and shape parameters to align predicted and detected edges.

The approach resembles both Active Appearance Models (AAMs) [29] and Active Shape Models (ASMs) [30]. The model has appearance dependent on shape like an AAM, but the appearance is a set of curves which are fit to image
edges like an ASM. Unlike ASMs, the set of curves is variable and depends on projection of the 3D model into 2D.

The vehicle model exhibits a few other special properties. First, recursive subdivision of 3D shape and 2D parts allows for a variable level of detail to suit the needs of a given problem without requiring excessive complexity. Second, reduction to a low-dimensional deformation parameter subspace makes fitting tractable. Third, the parts are closed contours that partition the vehicle surface into regions of different materials. This last property has not yet been fully exploited in the fitting algo