Plagiarism Detection for Multithreaded Software Based on Thread-Aware Software Birthmarks

Zhenzhou T
MOE Key Lab for Intelligent Networks and Network Security
Xi'an Jiaotong University, China
2022-5-31

Outline
1. Introduction
2. Thread-Aware Birthmark Methods
3. Evaluation
4. Unsolved Problems & Future Work

Introduction
Software plagiarism has become a serious threat to the healthy development of the software industry.
- Licenses are violated, either for commercial interests or unwittingly
- Code protection awareness is weak
- Powerful automated code obfuscation tools are available
- Software is distributed in binary form

A series of methods have been proposed for plagiarism detection.
- Software watermarking
  - Inserts extra data into the program
  - "a sufficiently determined attacker will eventually be able to defeat any watermark"
- Static and dynamic software birthmarks
  - Dynamic birthmarks are more resilient to semantics-preserving code obfuscations
- The increasingly popular trend towards multithreaded programming brings a new challenge to existing dynamic birthmark methods
  - Existing dynamic birthmarks remain optimized for sequential programs
  - They neglect the effect of thread scheduling
  - Two executions of a single program under the same input can be very different, rendering the existing methods ineffective

Metric        DKISB   SCSSB
Cosine        0.838   0.452
Jaccard       0.551   0.369
Dice          0.678   0.51
Containment   0.735   0.477

- DKISB: dynamic key instruction sequence birthmark
- SCSSB: system call short sequence birthmark

Contributions
- Two thread-aware dynamic birthmarks, TW-DKISB and TW-SCSSB, are proposed to detect software plagiarism
  - They operate directly on binary executables
  - They are not limited to specific operating systems or languages
  - They are resilient to various automated obfuscation techniques
- [Figure: similarity under the 29 different obfuscation techniques in SandMark; axis: Similarity, 0.0 to 1.0]
  ArrayFolder, ArraySplitter, BlockMarker, BludgeonSignatures, BooleanSplitter, BranchInverter, DuplicateRegisters, DynamicInliner, FalseRefactor, FieldAssignment, Inliner, IntegerArraySplitter, MergeLocalIntegers, MethodMerger, Objectify, OpaqueBranchInsertion, OverloadNames, ParamAlias, PromotePrimitiveRegisters, PromotePrimitiveTypes, PublicizeFields, RandomDeadCode, RenameRegisters, ReorderInstructions, ReorderParameters, SimpleOpaquePredicates, SplitClasses, StaticMethodsBodies, VariableReassigner
- A prototype is implemented using the Pin instrumentation framework, and extensive experiments are conducted
- A suite of benchmarks is compiled for researchers to conduct experiments and present their findings: http:/

Software Birthmark
- A set of characteristics extracted from a program that reflects intrinsic properties of the program and can be used to identify the program uniquely
- Two types: static and dynamic software birthmarks
- Dynamic birthmark as defined by Myles

Thread-Aware Dynamic Software Birthmark
- Predetermining a thread schedule is very difficult
- Instead of enforcing a thread schedule, try to shield executions from its influence

Thread-Aware Dynamic Software Birthmarks
- Main idea: split, then aggregate
  - The execution order within each thread is relatively stable
  - Project the trace on thread ids to obtain sub-traces, from which slice birthmarks are extracted
  - Aggregate all slice birthmarks
- [Figure: different traces of a program under the same input yield the same slices]

Slice Birthmark & Program Birthmark
- [Figure: k-grams, slice birthmarks, SA model, SS model]

Thread-Aware Birthmark based Plagiarism Detection
5 main modules:
- DAM: monitoring and recording
- PP: constitute valid traces
- BG: extract thread-aware birthmarks
- BSC: calculate similarity scores
- PD: determine detection result
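The "split, then aggregate" idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a trace is a list of `(thread_id, event)` pairs, that a slice birthmark is a k-gram frequency map over one thread's sub-trace, that the SA model merges all slice birthmarks into a single frequency map, and that the SS model keeps them as a set of per-thread birthmarks. All function names and the sample traces are hypothetical.

```python
from collections import Counter, defaultdict

def project_by_thread(trace):
    """Split a trace of (thread_id, event) pairs into per-thread sub-traces,
    preserving the order of events inside each thread."""
    slices = defaultdict(list)
    for thread_id, event in trace:
        slices[thread_id].append(event)
    return dict(slices)

def kgram_birthmark(events, k):
    """Slice birthmark (assumed form): frequency of each length-k window."""
    return Counter(tuple(events[i:i + k]) for i in range(len(events) - k + 1))

def sa_birthmark(trace, k):
    """SA-style birthmark (assumption): slice birthmarks summed into one map."""
    total = Counter()
    for events in project_by_thread(trace).values():
        total.update(kgram_birthmark(events, k))
    return total

def ss_birthmark(trace, k):
    """SS-style birthmark (assumption): one slice birthmark kept per thread."""
    return {tid: kgram_birthmark(events, k)
            for tid, events in project_by_thread(trace).items()}

# Two hypothetical runs of one program under different thread schedules.
run1 = [(1, "open"), (2, "read"), (1, "read"), (2, "write"), (1, "close"), (2, "close")]
run2 = [(2, "read"), (2, "write"), (1, "open"), (1, "read"), (2, "close"), (1, "close")]

# The interleaved traces differ, but the per-thread slices are identical.
assert sa_birthmark(run1, 2) == sa_birthmark(run2, 2)
assert ss_birthmark(run1, 2) == ss_birthmark(run2, 2)
```

A k-gram birthmark computed on the whole interleaved trace would differ between `run1` and `run2`, which is the scheduling effect the thread-aware birthmarks are meant to shield against.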

Dynamic Analysis Module
- Monitors the execution of a program using Pin
- DKISExtractor: performs dynamic taint analysis to identify and record key instructions
- SysTracer: records each execution of system calls

Pre-Processor & Birthmark Generator
- Pre-Processor: filters out noise and extracts valid traces
- Birthmark Generator: generates TW-DKISBs and TW-SCSSBs using the implemented SA model and SS model

Similarity Calculator & Plagiarism Decider

Similarity Calculator
For two SA model generated birthmarks

$$A = \{(k_1, v_1), (k_2, v_2), \ldots, (k_n, v_n)\} \quad\text{and}\quad B = \{(k'_1, v'_1), (k'_2, v'_2), \ldots, (k'_m, v'_m)\},$$

four similarity metrics are used:

$$\mathrm{ExCosine}(A, B) = \frac{A \cdot B}{|A|\,|B|} \times \theta$$

$$\mathrm{ExJaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|} \times \theta$$

$$\mathrm{ExDice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \times \theta$$

$$\mathrm{ExContainment}(A, B) = \frac{|A \cap B|}{|A|} \times \theta$$

$$\theta = \frac{\min(|A|, |B|)}{\max(|A|, |B|)}$$

$$Sim_c(A, B) = sim_c(A, B), \quad c \in \{\mathrm{ExCosine}, \mathrm{ExJaccard}, \mathrm{ExDice}, \mathrm{ExContainment}\}$$

For two SS model generated birthmarks

$$A = \{(t_1, Birth(t_1)), (t_2, Birth(t_2)), \ldots, (t_m, Birth(t_m))\} \quad\text{and}\quad B = \{(t'_1, Birth(t'_1)), (t'_2, Birth(t'_2)), \ldots, (t'_n, Birth(t'_n))\},$$

the similarity is computed over a bipartite matching of the slices:

$$Sim_c(A, B) = \frac{\sum_{(t_i, t'_j) \in MaxMatch(A, B)} sim_c(t_i, t'_j) \times \big(count(t_i) + count(t'_j)\big)}{\sum_{i=1}^{m} count(t_i) + \sum_{j=1}^{n} count(t'_j)}$$

$$count(t_i) = |keySet(Birth(t_i))|, \qquad count(t'_j) = |keySet(Birth(t'_j))|$$

Decision Maker
- Determines the detection result from the similarity scores
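The similarity calculation can be sketched in Python as follows. This is an illustrative reading, not the authors' code. It assumes each Ex- metric is the classical metric scaled by the size ratio min(|A|, |B|) / max(|A|, |B|), that set operations and sizes are taken over k-gram keys, and that the cosine term uses the frequency vectors. For SS birthmarks it assumes MaxMatch is the slice matching that maximizes the weighted sum, found here by brute force, which is only practical for a handful of threads. All function names are hypothetical.

```python
import math
from itertools import permutations

def theta(a, b):
    """Size-ratio factor over the number of distinct keys (assumed meaning of |A|)."""
    if not a or not b:
        return 0.0
    return min(len(a), len(b)) / max(len(a), len(b))

def ex_cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm * theta(a, b) if norm else 0.0

def ex_jaccard(a, b):
    union = len(a.keys() | b.keys())
    return len(a.keys() & b.keys()) / union * theta(a, b) if union else 0.0

def ex_dice(a, b):
    total = len(a) + len(b)
    return 2 * len(a.keys() & b.keys()) / total * theta(a, b) if total else 0.0

def ex_containment(a, b):
    return len(a.keys() & b.keys()) / len(a) * theta(a, b) if a else 0.0

def ss_similarity(a, b, sim=ex_cosine):
    """Similarity of two SS birthmarks (dicts: thread id -> slice birthmark)."""
    rows, cols = list(a.values()), list(b.values())
    total = sum(len(s) for s in rows) + sum(len(s) for s in cols)
    if total == 0:
        return 0.0

    def weight(x, y):
        # count(t) is taken as the number of distinct keys in the slice birthmark.
        return sim(x, y) * (len(x) + len(y))

    # Enumerate every one-to-one assignment of the smaller side to the larger.
    if len(rows) <= len(cols):
        candidates = ([(rows[i], cols[j]) for i, j in enumerate(p)]
                      for p in permutations(range(len(cols)), len(rows)))
    else:
        candidates = ([(rows[i], cols[j]) for j, i in enumerate(p)]
                      for p in permutations(range(len(rows)), len(cols)))
    best = max(sum(weight(x, y) for x, y in pairs) for pairs in candidates)
    return best / total
```

For larger thread counts, the brute-force search would normally be replaced by a polynomial-time assignment algorithm such as the Hungarian method.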

Evaluation
A high-quality birthmark is one for which the ratio of false classifications is rather low for a given threshold.
Two properties to check: resilience and credibility.

Evaluating Resilience Property
- Resilience to different compilers and optimization levels
  - [Figure: similarity scores between binaries of pigz]
  - [Figure: statistical differences for 20 versions of pigz]
- Resilience to special obfuscation tools
  - [Figure: cosine similarity between ConGzip and its 29 SandMark obfuscated versions; axis: Similarity, 0.0 to 1.0]

Evaluating Resilience Property (continued)
- Resilience to special obfuscation tools: Allatori, DashO, Jshrink, ProGuard and RetroGuard
  - [Figure: resilience to Allatori-series obfuscation tools]

Evaluating Credibility Property
- Similarity between independently implemented programs
  - 6 compression programs: lbzip, lrzip, pbzip2, pigz, plzip and rar
  - 5 audio players: cmus, mocp, mp3blaster, mplayer and sox
  - 10 web browsers: arora, chromium, dillo, dooble, epiphany, firefox, konqueror, luakit, midori and seaMonkey
  - [Figure: credibility evaluation of TW-SCSSBs using 10 web browsers]

Comparing with Traditional Birthmarks
- Performance evaluation metric
  - By varying the threshold $\varepsilon$ from 0 to 0.5, an F-Measure curve can be drawn
  - AUC: area under the F-Measure curve

$$Precision = \frac{|EP \cap JP| + |EI \cap JI|}{|JP| + |JI|}$$

$$Recall = \frac{|EP \cap JP| + |EI \cap JI|}{|EP| + |EI|}$$

$$F\text{-}Measure = \frac{2 \times Precision \times Recall}{Precision + Recall}$$

Here EP and EI presumably denote the program pairs expected to be plagiarism and independent, and JP and JI the pairs judged to be plagiarism and independent.

Detection criteria:

$$Sim(P_A, P_B) \begin{cases} \geq 1 - \varepsilon & P_A, P_B \text{ are classified as copies} \\ \leq \varepsilon & P_A, P_B \text{ are classified as independent} \\ \text{otherwise} & \text{inconclusive} \end{cases}$$
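The evaluation metric can be sketched in Python as follows. The sketch is illustrative only. It assumes the threshold-based criteria (similarity at least 1 - eps means copies, at most eps means independent, anything else inconclusive), treats each program pair as a `(similarity, expected_label)` tuple, and approximates the AUC with the trapezoidal rule over an evenly spaced grid. The function names, the label strings and the sample pairs are hypothetical.

```python
def classify(similarity, eps):
    """Apply the detection criteria for one program pair."""
    if similarity >= 1 - eps:
        return "copy"
    if similarity <= eps:
        return "independent"
    return "inconclusive"

def f_measure(pairs, eps):
    """pairs: list of (similarity, expected), expected in {"copy", "independent"}."""
    judged = [(classify(sim, eps), expected) for sim, expected in pairs]
    correct = sum(1 for j, e in judged if j == e)               # |EP ∩ JP| + |EI ∩ JI|
    decided = sum(1 for j, _ in judged if j != "inconclusive")  # |JP| + |JI|
    expected_total = len(judged)                                # |EP| + |EI|
    precision = correct / decided if decided else 0.0
    recall = correct / expected_total if expected_total else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def auc(pairs, steps=50):
    """Trapezoidal area under the F-Measure curve for eps in [0, 0.5]."""
    xs = [0.5 * i / steps for i in range(steps + 1)]
    ys = [f_measure(pairs, x) for x in xs]
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(steps))

# Hypothetical similarity scores with their expected labels.
sample = [(0.95, "copy"), (0.90, "copy"), (0.10, "independent"), (0.60, "independent")]
print(f_measure(sample, 0.2), auc(sample))
```

Because eps ranges over an interval of length 0.5, the unnormalized area computed here is at most 0.5. Dividing by 0.5 would rescale it to [0, 1] if a normalized score is preferred.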

Comparing with Traditional Birthmarks (continued)
- [Figure: F-Measure curves for TW-SCSSB(SA), TW-SCSSB(SS), and SCSSB]

Unsolved Problems & Future Work
- Problems
  - Partial plagiarism and library plagiarism are not handled
  - The tool is preliminary
  - The impact of K is not evaluated
- Future work
  - Conduct experiments using other kinds of tools, such as shelling tools (Upx, ASProtect, etc.), and on real plagiarism cases
  - Improve the method to support partial plagiarism detection
  - Evaluate the effect of K on detection ability
  - Develop a relatively mature tool

Q&A

Some Definitions