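Application and Research of OLAP and Data Mining Technology in QAD Product Auditing

Abstract

Most enterprises have now entered the era of the "paperless" office. Manual entry and analysis of information can no longer keep up with the ever-growing volume of data. Enterprises face large amounts of business information every day, and making good use of these data to guide the enterprise's development is especially important.

QAD is a software vendor that specializes in enterprise solutions for the manufacturing industry, with more than six thousand customers in over ninety countries worldwide. Every year it audits each customer's use of the company's software products. Each customer audit naturally produces a large amount of data, and the audit data for all customers together are on the order of millions. Faced with data on this scale, it is especially important to extract the information the company needs, analyze the audit results, and draw guiding conclusions.

To this end, this thesis proposes a design for audit information analysis based on online analytical processing (OLAP) and data mining technology. OLAP and data mining have been research hot spots in the database and artificial intelligence fields in recent years. By analyzing and processing large amounts of data, they uncover the useful information and knowledge hidden behind the data. This project implements OLAP multidimensional data analysis and MDX multidimensional queries over an audit information cube using SQL Server 2008 Analysis Services (SSAS), and applies data mining algorithms such as decision trees and neural networks to the audit data to derive useful knowledge.

To achieve these goals, we first had to decide the version and type of database that would store the analyzed audit information, and the reporting tool that would generate the final audit result reports. For the database, four feasible options were proposed: Progress, MySQL, Access, and SQL Server. After comparing the advantages and disadvantages of the four against actual requirements, SQL Server was chosen as the relational database server for this project. Likewise, three feasible options were proposed for the reporting tool: QAD's own report generation framework, Microsoft Access, and Microsoft Excel. After weighing ease of use and cost, Excel, which is widely used and relatively lightweight, was chosen as the final reporting tool.

With the database and reporting tool selected, the historical audit data had to be organized, analyzed, extracted, and stored. The audit data are generated automatically by running the function menus that the company's product provides; customers, when organizing these many kinds of reports, …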





Keywords: data mining, OLAP, multidimensional data analysis, SQL Server 2008 Analysis Services, product audit

ABSTRACT

With the development of society, science, and technology, most enterprises have now entered the era of the paperless office. Manual input and analysis can no longer cope with the growing volume of information and data. Every day, enterprises face large amounts of business information, and knowing how to use and analyze these data to guide the enterprise's development is especially important.

QAD is a software supplier that provides solutions to manufacturing enterprises and has more than 6,000 clients in 90 countries worldwide. Every year, the company audits how each client uses its software products. Each customer audit produces a large amount of data, and the audit data for all customers together number in the millions. Faced with data on this scale, it is especially important to know how to extract the information the company needs, analyze the audit results, and draw guiding conclusions from this huge and messy data.

This thesis therefore proposes a design for the multidimensional analysis of QAD product audits based on OLAP and data mining technology. OLAP and data mining have become a research hot spot in the database and artificial intelligence fields in recent years. By analyzing and processing large volumes of data, they reveal the useful information and knowledge hidden behind the data. This project implements OLAP multidimensional data analysis and MDX multidimensional data queries on an audit information cube using SQL Server 2008 Analysis Services, and mines the audit information with the decision tree and neural network algorithms to derive knowledge that is useful to the company.

To achieve this goal, we first needed to decide which database version and type to use for storing the analyzed audit information, and which reporting tool to use for generating the final audit result reports. Several feasible database options were put forward based on demand: Progress, MySQL, Access, and SQL Server. After comparing the advantages and disadvantages of these four database servers against actual needs, SQL Server was chosen as the relational database server. Similarly, the following feasible options were put forward for the reporting tool: the reporting framework developed by QAD itself, Microsoft Access, and Microsoft Excel. Excel was finally chosen because most users are very familiar with it and because, although lightweight, it is powerful enough to generate a variety of reports.

With the database and the report generation tool selected, the next step was to deal with the audit data: sort it, analyze it, extract the useful information, and store that information correctly in the database. Audit data are generated automatically by running the product's function menus. Customers then package these reports in different ways according to their own habits, so the audit data the company receives from customers have a complex structure with no fixed regularity, which made extracting the historical audit information difficult. After a detailed analysis of the historical audit data, we found that only two kinds of reports are needed: the Application Detail Usage Profile Report and the Licensed Application Report. Other files, such as database log files, are not important for now. Our goal was therefore to locate these two files in each customer's audit data folder, analyze them, extract their data, and store the extracted information in the database.

In practice, we found that running the data processing procedure directly, without pre-processing the historical data first, is very inefficient. The reason is that every run has to traverse every file in the specified folders and open each one to check whether it is an Application Detail Usage Profile Report or a Licensed Application Report, which takes a lot of time. To solve this problem, a data pretreatment layer was added before the processing procedures. After the pretreatment process has run, the historical audit files that are not needed have been filtered out, and only the two kinds of reports mentioned above remain, kept in their original directory structure. With this pretreatment in place, the program runs much more efficiently.

Once this problem was solved, the analyzed data were loaded into the database. Six tables were designed according to the requirements, covering the historical data and the related audit data stored in an external database. The database populated with audit information provides an efficient data source for the OLAP work described below.

With a relational data source available, SQL Server Analysis Services can be used to build a dimensional model of the audit data. This thesis discusses in detail the conceptual model design and the logical model design for the audit data, including the design of measures, dimensions, and granularity, and the design of the fact table and dimension tables. It adopts a snowflake schema as the logical structure, generates a view of the audit information, and eventually generates a multidimensional cube of audit information, which provides a multidimensional data source for audit result reports and data mining. This puts OLAP into effect in QAD product audit. To generate audit result reports, we create a pivot table in Excel and connect it to the multidimensional data source, and can then read the contents of the multidimensional data. To facilitate customer review, more than ten report templates were defined for customers to choose from; they cover essentially all of the audit results. Customers can also drill down to inspect the content and the data, which is very convenient.

Finally, data mining technology was used to study some mining tasks on the audit data. Traditional data mining is often based on a relational database. This thesis instead details the application of OLAP-based data mining in product audit: the decision tree algorithm and the neural network algorithm were each applied to the same mining model, and a lift chart was built to compare the mining accuracy of the two algorithms. The mining model was established to find the key factors that influence customers' choice of different combinations of QAD products, in order to identify a best practice for using a combination of QAD products in a particular industry field and so offer suggestions to different types of customers when they choose product combinations. Owing to limited time and energy, I made only a rough exploration of the possibility of applying data mining to company product audit. Based on the existing results, it is believed that data mining technology can be used to uncover more knowledge of guiding significance for operational decisions.

This research and its results show that applying OLAP and data mining technology in QAD product audit is feasible, efficient, and convenient. It not only provides company administrators and sales personnel with more intelligent ways and means of analysis from the perspective of business management, but also provides the company's audit researchers with new means and methods from the perspective of audit research, helping them to create more useful knowledge hidden behind the audit data for the company's development.

Key words: data mining, Online Analytical Processing, multi-dimensional data analysis, SQL Server 2008 Analysis Services, product audit

Contents

Chapter 1 Introduction
1.1 Overview and Analysis of the QAD Product Audit Problem
1.2 Research Purpose and Significance
1.3 Research Status at Home and Abroad
1.4 Research Content of the Thesis
Chapter 2 Comparison and Selection of Solutions
2.1 Database Selection
2.1.1 Progress Database
2.1.2 MySQL Database
2.1.3 Access Database
2.1.4 SQL Server Database
2.2 Reporting Tool Selection
2.2.1 QAD Report Generation Framework
2.2.2 Microsoft Access Reports
2.2.3 Microsoft Excel Reports
2.3 Chapter Summary
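
The pretreatment step summarized in the abstract (filter out the unneeded historical files once, keeping only the two report types in their original directory layout) might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how a report is recognized or which language was used. Here we assume the report title appears as text within the first few lines of each file; the names `WANTED_TITLES`, `is_wanted_report`, and `pretreat` are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List

# Report titles named in the abstract. Matching them in a file's opening
# lines is an ASSUMPTION about the file layout, not a documented format.
WANTED_TITLES = (
    "Application Detail Usage Profile Report",
    "Licensed Application Report",
)


def is_wanted_report(path: Path, head_lines: int = 20) -> bool:
    """Return True if a wanted title appears in the first `head_lines` lines."""
    try:
        with path.open("r", encoding="utf-8", errors="ignore") as handle:
            for _, line in zip(range(head_lines), handle):
                lowered = line.lower()
                if any(title.lower() in lowered for title in WANTED_TITLES):
                    return True
    except OSError:
        # Unreadable files are treated as "not needed".
        return False
    return False


def pretreat(source_root: Path, target_root: Path) -> List[Path]:
    """Copy only the wanted reports, preserving each file's relative directory.

    `target_root` should lie outside `source_root`. Returns the relative
    paths of the files that were kept, in sorted order.
    """
    kept: List[Path] = []
    for path in sorted(source_root.rglob("*")):
        if path.is_file() and is_wanted_report(path):
            relative = path.relative_to(source_root)
            destination = target_root / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)
            kept.append(relative)
    return kept
```

Calling `pretreat(Path("audit_history"), Path("audit_filtered"))` (both folder names are placeholders) would leave a filtered copy of the tree, so later processing runs open only the files they actually need instead of re-checking every file on each run.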


