Parallel Computing: Lab Guide
肖健、金舟
China Research Laboratory

Outline
- Lab overview
- Lab content
- Technology walkthrough: Pthread, OpenMP, MPI, MapReduce

Lab overview
- Become familiar with mainstream parallel computing environments and lay a foundation for research work.
- Theory and practice carry equal weight: the labs account for 50% of the course grade.
- Weights of the individual labs (4 labs in total, each graded out of 100): the first two count 20% each, the last two 30% each.
- After all 4 labs are finished, write them up as one lab report (paper and electronic) and submit it, with the labs in order.

Electronic lab report requirements
- For each lab, the report should include: lab title, author, date, lab content, principle, program flowchart, implementation approach, results (data tables/charts, screenshots, etc.), a theoretical performance analysis together with an analysis of the actual results, and a summary/outlook (lessons learned, etc.).
- The electronic report must be accompanied by all source code, plus detailed instructions for compiling, running and deploying it.
- If the program was built with an integrated development tool, submit the complete project files together with the corresponding explanation.

Lab report submission
- Deadline: 12:00 noon, 2011.12.31.
- Submit both a paper copy and an electronic copy.
- Paper copy: Room 606, Area B, Building 25.
  - No cover page; do not print the code (a small amount of key code or pseudocode may be included).
  - Adjust the layout and fonts to keep the page count down; double-sided printing is recommended; a single staple is enough.
- Electronic copy: compress the attachment and name it 学号_姓名.rar (student ID_name); email subject: 并行计算_学号_姓名.

Notes
- The lab room is Room 710 of Building 25.
- You may bring your own laptop (wireless network is available).
- Observe the lab room rules.
- Lab sessions: the 3rd, 4th, 6th and 7th class meetings, 2:00-5:00 pm.
- Course website and related resources: tool downloads; the content is updated continuously, so check it regularly.
Lab environment
- Standard environment in the lab room:
  - OS: WinXP
  - Language: C/C++
  - IDE: CodeBlocks + MinGW (includes GCC, supports OpenMP)
  - Libraries: mpich.nt, pthreads_win32
- Personal environment: you may set up your own computer or server however you like, as long as the lab report describes it clearly.
- For environment setup, see 并行计算实验环境配置.doc.
- Doing the labs under Linux is encouraged and earns extra credit (up to 10 points).

Computing task
- In a set of Venus temperature records, find the highest temperature of each year.

Raw dataset description
- Data format: text files, one temperature record per line.
  - Station ID (4 digits), a space, temperature value (8 digits).
  - For example: 9721 62429744
- Data layout:
  - There are 500,000 temperature records per year, stored in a file named after the year (e.g. 1992.txt); the years range from 1900 to 2010.
  - The whole dataset is split into 4 equal parts; each part contains 1/4 of the observations for 1900-2010, i.e. each part holds 110 files and each file contains 125,000 records.
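As a point of reference before the parallel versions, a minimal sequential sketch of the task is shown below. It only illustrates the record format described above; the file name, treating the temperature as a plain integer, and the lack of error handling are assumptions, not part of the lab specification.

    /* Sequential sketch (illustrative only): scan one year file in the
     * "stationID temperature" format and report the maximum value. */
    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        FILE *fp = fopen("1992.txt", "r");      /* hypothetical input file */
        if (fp == NULL) { perror("fopen"); return 1; }

        int station, temp, max = INT_MIN;
        while (fscanf(fp, "%d %d", &station, &temp) == 2)
            if (temp > max)
                max = temp;

        fclose(fp);
        printf("1992 max temperature: %d\n", max);
        return 0;
    }

The technologies covered below (Pthread, OpenMP, MPI, MapReduce) offer different ways of parallelizing this kind of scan.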
Lab 1.1: Pthread multithreading
- Basic thread concepts
- Thread synchronization (races and deadlock)
- The POSIX Pthread API

Multithreading concepts
- A thread is a sequence of code that executes within the context of a process; it is also called a lightweight process.
- In a system that supports multithreading, the process is the unit of resource allocation, while the thread is the basic unit that is scheduled for execution.

Conflicts when accessing shared variables (figure)

Deadlock (figure)

Thread synchronization
- Race conditions
- Critical sections
- Synchronization mechanisms: semaphores, locks, condition variables

Locks
- A lock is similar to a semaphore; the difference is that at any moment a lock can be in use by only one holder.
- A lock has two atomic operations:
  - Acquire(): take the lock and change its state to locked; if the lock is already held by another thread, wait until its state becomes unlocked.
  - Release(): change the lock state from locked back to unlocked.
- A lock can be acquired by at most one thread at a time. Any thread must acquire the lock before it operates on the shared resource; otherwise it stays in the lock's wait queue until the lock is released.

Mutexes (mutual-exclusion locks)
- A mutex (short for MUTual EXclusion) is one way to synchronize threads.
- A mutex is a kind of lock: a thread must acquire it before accessing the shared resource; otherwise the thread waits until the lock becomes available.
- A thread can hold the mutex only when no other thread holds it, and no other thread can take it until the holder gives it up. Taking the lock is called locking, or acquiring the mutex.
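The slides illustrate the mutex with a small program; a minimal sketch along the same lines is given here, assuming two threads that increment a shared counter (the counter, the thread count and the loop length are illustrative choices, not taken from the slides).

    /* Sketch: two threads increment a shared counter; the mutex serializes the
     * read-modify-write, so the final value is always 2 * LOOPS. */
    #include <stdio.h>
    #include <pthread.h>

    #define LOOPS 100000

    long counter = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *worker(void *arg)
    {
        int i;
        for (i = 0; i < LOOPS; i++) {
            pthread_mutex_lock(&lock);      /* Acquire()        */
            counter++;                      /* critical section */
            pthread_mutex_unlock(&lock);    /* Release()        */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);
        return 0;
    }

Without the mutex the two read-modify-write sequences can interleave and the final value varies from run to run; with it, the result is deterministic.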
Mutex example: possible outputs (figure)

The POSIX Thread API
- POSIX: Portable Operating System Interface.
- POSIX is based on UNIX, and the standard aims at source-level software portability: a program written for one POSIX-compliant operating system should compile and run on any other POSIX operating system, even one from a different vendor.
- The POSIX standard defines the interface an operating system should provide to applications: the set of system calls.
- POSIX is developed by the IEEE (Institute of Electrical and Electronics Engineers) and standardized by ANSI (American National Standards Institute) and ISO (International Organization for Standardization).

Program example

Running Pthread threads

The Pthread thread life cycle
- using pthread_join() and pthread_exit()
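A minimal sketch of the life cycle just described, assuming a single worker thread (the thread function and its messages are illustrative):

    /* Sketch: create a thread, let it run, and reclaim it with pthread_join().
     * The worker ends itself explicitly with pthread_exit(). */
    #include <stdio.h>
    #include <pthread.h>

    void *worker(void *arg)
    {
        printf("worker thread running\n");
        pthread_exit(NULL);                 /* terminate this thread */
    }

    int main(void)
    {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, NULL) != 0) {
            perror("pthread_create");
            return 1;
        }
        pthread_join(tid, NULL);            /* wait for the worker to finish */
        printf("worker joined, main exits\n");
        return 0;
    }

pthread_join() blocks the caller until the target thread terminates, which is what keeps main() from exiting before the worker has run. With GCC this kind of program is usually built with the -pthread option, or linked against the pthreads_win32 library in the lab environment.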
Lab 1.2: OpenMP
- OpenMP overview
- Compiler directives

OpenMP overview
- OpenMP is a multithreaded parallel programming facility for shared-memory and distributed shared-memory multiprocessors.
- OpenMP is an application programming interface (API) for explicitly directed multithreaded, shared-memory parallelism.
- The OpenMP standard was born in 1997; its Architecture Review Board (ARB) has since produced and released OpenMP version 3.0.

OpenMP history
- OpenMP Fortran 1.0 (1997)
- OpenMP C/C++ 1.0 (1998)
- OpenMP Fortran 1.1 (1999)
- OpenMP Fortran 2.0 (2000)
- OpenMP C/C++ 2.0 (2002)
- OpenMP Fortran/C/C++ 2.5 (2005)
- OpenMP Fortran/C/C++ 3.0 (2008)

The OpenMP programming model: Fork-Join
- At the start of execution only the master thread exists.
- When the master thread reaches a point that needs parallel computation, it forks worker threads to execute the parallel task.
- During parallel execution, the master thread and the forked threads work together.
- When the parallel code finishes, the forked threads exit or are suspended and stop working, and control returns to the single master thread (the join).

How OpenMP is realized
- Compiler directives
- Runtime library functions
- Environment variables

Compiler directives
- A compiler directive is a special annotation that the compiler recognizes while compiling the program; these annotations carry the OpenMP semantics.
- In a C/C++ program, #pragma omp parallel marks a parallel block.
- An ordinary compiler that does not understand OpenMP treats these annotations as plain comments and ignores them.

Compiler directive example

Serial code:

    void main()
    {
        double Res[1000];
        for (int i = 0; i < 1000; i++)
            do_huge_comp(Res[i]);
    }

Parallel code (the loop iterations are split across multiple threads):

    #include "omp.h"
    void main()
    {
        double Res[1000];
        #pragma omp parallel for
        for (int i = 0; i < 1000; i++)
            do_huge_comp(Res[i]);
    }

Compiler directives: the parallel region

The parallel region
- Code inside a parallel region is executed by all threads.
- Syntax:

    #pragma omp parallel [clause[, clause] ...] newline

  where clause is one of:

    if (scalar-expression)
    private (list)
    firstprivate (list)
    default (shared | none)
    shared (list)
    copyin (list)
    reduction (operator : list)
    num_threads (integer-expression)
Parallel region example

    #include <omp.h>
    #include <stdio.h>

    int main()
    {
        int nthreads, tid;

        /* Fork a team of threads, giving them their own copies of variables */
        #pragma omp parallel private(tid)
        {
            /* Obtain and print the thread id */
            tid = omp_get_thread_num();
            printf("Hello World from thread = %d\n", tid);

            /* Only the master thread does this */
            if (tid == 0) {
                nthreads = omp_get_num_threads();
                printf("Number of threads = %d\n", nthreads);
            }
        }   /* All threads join the master thread and terminate */

        return 0;
    }

Compiler directives: data-scope attribute clauses
- These clauses control the scope of variables.
- The private clause: the variables it lists are local to each thread.
- Syntax: private(list)

private() example

    #include <stdio.h>
    #include <omp.h>

    int main()
    {
        int i, x = 100;

        #pragma omp parallel for private(x)
        for (i = 0; i < 8; i++) {
            x += i;
            printf("x=%d\n", x);
        }
        printf("global x=%d\n", x);
        return 1;
    }

One possible output with 4 threads:

    x=0 x=1 x=2 x=5 x=6 x=13 x=4 x=9
    global x=100

- With private(x), each thread works on its own copy of x (uninitialized by the clause, and evidently starting from 0 in this run), so the shared x is never touched and still holds 100 after the parallel loop.
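Tying this back to the course task, below is a hedged sketch of how the same private-copy idea can be used to find a maximum with OpenMP; the array name, its size and the use of a critical section to merge per-thread results are assumptions, not taken from the slides.

    /* Sketch: per-thread maxima are combined in a critical section. */
    #include <stdio.h>
    #include <limits.h>
    #include <omp.h>

    #define N 500000

    int temps[N];                 /* assume one year of temperatures is loaded here */

    int main()
    {
        int global_max = INT_MIN;

        #pragma omp parallel
        {
            int i, local_max = INT_MIN;     /* private to each thread */

            #pragma omp for
            for (i = 0; i < N; i++)
                if (temps[i] > local_max)
                    local_max = temps[i];

            #pragma omp critical            /* merge per-thread results one at a time */
            if (local_max > global_max)
                global_max = local_max;
        }

        printf("max temperature = %d\n", global_max);
        return 0;
    }

With the lab's MinGW GCC, OpenMP programs are normally compiled with the -fopenmp switch.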
Lab 2: Message Passing Interface
- MPI introduction
- The main MPI interface
- MPI examples

Message Passing Interface
- Derived from several earlier libraries: PVM, P4, Express.
- MPI is a standard for the message-passing programming model, not an implementation.
- A standard message-passing library: it provides interface calls for C/C++/FORTRAN.
- Available for free.
- Can be installed on networks of workstations and on parallel computers (Cray T3E, IBM SP2, Parsytec PowerXplorer, others).

Goals of the MPI standard
- Powerful functionality: MPI-1 has 128 routines, MPI-2 has 287 routines.
- Higher performance: avoid memory-to-memory copying, allow overlap of communication and computation.
- Portability: can be implemented on most vendors' platforms.
- Provide credible and extensible interfaces.

MPI basics
- A wide range of problems can be solved using just six of its functions:
  - Starting and finishing: MPI_INIT, MPI_FINALIZE
  - Identifying yourself: MPI_COMM_SIZE, MPI_COMM_RANK
  - Sending and receiving messages: MPI_SEND, MPI_RECV
MPI starting and finishing
- Before any other MPI code, every program needs: MPI_Init(&argc, &argv);
- The last MPI statement must be MPI_Finalize(); the program will not terminate without it.

Communicators
- A communicator is a collection of processes.
- Initially, all processes are enrolled in a "universe" called MPI_COMM_WORLD; MPI_COMM_SIZE(comm, size) can be used to find its size.
- A communicator determines the scope to which messages are relative.
- The identity of a process (its rank) is relative to a communicator: each process is given a unique rank, a number from 0 to n-1 when there are n processes; MPI_COMM_RANK(comm, rank) can be used to find the rank of (i.e. identify) the current process.
- A communicator also defines the scope of global communications (broadcast, etc.).

Types of MPI communication
- Point-to-point communication
- Group (collective) communication:
  - One to many: MPI_BCAST
  - Many to one: MPI_GATHER
  - Many to many: MPI_ALLTOALL
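As a small illustration of the collective style just listed, here is a hedged sketch of a one-to-many broadcast; the value being broadcast and the choice of root are illustrative, not from the slides.

    /* Sketch: rank 0 broadcasts one integer to every process in MPI_COMM_WORLD. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, value = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0)
            value = 42;                    /* only the root knows the value */

        /* every process makes the same collective call; afterwards all ranks hold 42 */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d has value %d\n", rank, value);

        MPI_Finalize();
        return 0;
    }

Unlike a point-to-point send, every process in the communicator makes the same call, and the root argument (0 here) selects the sender.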
Point-to-point communication
- The general formats of the parameters of the send/receive calls are:
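For reference, the C prototypes of the two calls as defined by the MPI standard are shown below; the protocol notes that follow refer to these parameters by position.

    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);

    int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm, MPI_Status *status);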
The MPI message protocol
- Send-receive is point-to-point; the destination process is specified by the fourth parameter (dest) of MPI_Send.
- Messages can be tagged with an integer, the fifth argument of MPI_Send and MPI_Recv, to distinguish messages with different purposes.
- count (the second parameter) of MPI_Recv specifies the capacity of the buffer (the number of items), in terms of the type given in the third parameter.
- MPI_Recv can name a specific source to receive from (the fourth parameter).
- MPI_Recv can also receive from any source or with any tag, using MPI_ANY_SOURCE and MPI_ANY_TAG.

The MPI message protocol (cont.)
- The status of the message received by MPI_Recv is returned in the seventh (status) parameter.
- The number of items actually received can be determined from the status with the function MPI_Get_count.
- For example, the following call returns, in the integer variable cnt, the number of characters that were sent:

    MPI_Get_count(&status, MPI_CHAR, &cnt);

Message passing example

    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>            /* includes the MPI library code specs */

    #define MAXSIZE 100

    int main(int argc, char *argv[])
    {
        int myRank;             /* rank (identity) of process       */
        int numProc;            /* number of processors             */
        int source;             /* rank of sender                   */
        int dest;               /* rank of destination              */
        int tag = 0;            /* tag to distinguish messages      */
        char mess[MAXSIZE];     /* message (other types possible)   */
        int count;              /* number of items in message       */
        MPI_Status status;      /* status of message received       */

Message passing example (cont.)

        MPI_Init(&argc, &argv);                     /* start MPI                */
        MPI_Comm_size(MPI_COMM_WORLD, &numProc);    /* get number of processes  */
        MPI_Comm_rank(MPI_COMM_WORLD, &myRank);     /* get rank of this process */

        /* ... code to send, receive and process messages ... */

        MPI_Finalize();                             /* shut down MPI            */

Message passing example (cont.): the send and receive code that goes in place of the comment above

        if (myRank != 0) {                  /* all processes send to root */
            /* create the message */
            sprintf(mess, "Hello from %d", myRank);
            dest = 0;                       /* destination is root        */
            count = strlen(mess) + 1;       /* include the '\0'           */
            MPI_Send(mess, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        } else {                            /* root process receives and prints */
                                            /* messages from each processor in  */
                                            /* rank order                       */
            for (source = 1; source < numProc; source++) {
                MPI_Recv(mess, MAXSIZE, MPI_CHAR, source, tag,
                         MPI_COMM_WORLD, &status);
                printf("%s\n", mess);
            }
        }

MapReduce
- Process massive amounts of data (>1 TB).
- Use hundreds or thousands of CPUs for parallel processing.
- Achieve the above in a simple way.
- Google Earth uses 70.5 TB: 70 TB for the raw imagery and 500 GB for the index data.

Divide and conquer (分而治之)
- Jeffrey Dean, the architect of Google MapReduce (photo)
MapReduce
- "MapReduce is a programming model, and an associated implementation, for processing and generating large data sets."

The MapReduce programming model
- It borrows from functional programming.
- The user only needs to implement two function interfaces:
  - map(in_key, in_value) -> (out_key, intermediate_value list)
  - reduce(out_key, intermediate_value list) -> out_value list

map
- Each record of the data source (a line of a text file, a database entry, etc.) is passed to the map function as a key/value pair, e.g. (document name, line).
- map() produces one or more intermediate values, together with an output key derived from the input.

reduce
- After the map phase has finished, all of the intermediate values for a given output key are combined into a list.
- reduce() combines those intermediate values into one or more final values for the same output key (in practice there is usually only one final value per output key).

The MapReduce logical flow (figure)
Parallelization
- map() functions can run in parallel, producing different intermediate results from different parts of the input data.
- reduce() functions can also run in parallel, each handling a different output key.
- No communication happens inside the map or reduce phases themselves; the bottleneck is that the reduce phase cannot start until the map phase has completely finished.

Parallel execution of MapReduce (figure)

Overall execution flow (figure)

Example: WordCount
- Source data:
  - Page 1: the weather is good
  - Page 2: today is good
  - Page 3: good weather is good

map output
- Worker 1: (the 1), (weather 1), (is 1), (good 1)
- Worker 2: (today 1), (is 1), (good 1)
- Worker 3: (good 1), (weather 1), (is 1), (good 1)

reduce input
- Worker 1: (the 1)
- Worker 2: (is 1), (is 1), (is 1)
- Worker 3: (weather 1), (weather 1)
- Worker 4: (today 1)
- Worker 5: (good 1), (good 1), (good 1), (good 1)

reduce output
- Worker 1: (the 1)
- Worker 2: (is 3)
- Worker 3: (weather 2)
- Worker 4: (today 1)
- Worker 5: (good 4)
WordCount pseudocode

    map(String input_key, String input_value):
        // input_key: document name
        // input_value: document contents
        for each word w in input_value:
            EmitIntermediate(w, "1");

    reduce(String output_key, Iterator intermediate_values):
        // output_key: a word
        // output_values: a list of counts
        int result = 0;
        for each v in intermediate_values:
            result += ParseInt(v);
        Emit(AsString(result));
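In the same pseudocode style, here is a hedged sketch of how the course's maximum-temperature task might be expressed; the key/value choices and the helper functions year(), temperature() and Max() are assumptions, not part of the lab handout.

    map(String input_key, String input_value):
        // input_key: file name, e.g. "1992.txt" (identifies the year)
        // input_value: one record, "stationID temperature"
        EmitIntermediate(year(input_key), temperature(input_value));

    reduce(String output_key, Iterator intermediate_values):
        // output_key: a year
        // output_values: all temperatures recorded in that year
        int max = INT_MIN;
        for each v in intermediate_values:
            max = Max(max, ParseInt(v));
        Emit(AsString(max));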
Inverted index
- Algorithm:
  - Mapper: for each word in (file, words), map to (word, file)
  - Reducer: identity function
- File contents:
  - foo: This page contains so much text
  - bar: My page contains text too

Inverted index: data flow (figure)
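Working the two files above through this mapper and reducer gives roughly the following flow (the grouping order is illustrative):
- Mapper output: (this, foo), (page, foo), (contains, foo), (so, foo), (much, foo), (text, foo), (my, bar), (page, bar), (contains, bar), (text, bar), (too, bar)
- Reducer output: (contains, [foo, bar]), (page, [foo, bar]), (text, [foo, bar]), (this, [foo]), (so, [foo]), (much, [foo]), (my, [bar]), (too, [bar])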
Other examples
- Reversed web-link graph:
  - For each source page, the map function examines the URLs the page links to and emits a (target, source) tuple for each of them.
  - The reduce function gathers all the sources that link to a given target into a list and outputs (target, list(source)).
- Term vector per host:
  - A term vector summarizes the most important words in a document or a set of documents, usually as (word, frequency) tuples.
  - The map function emits a (hostname, term vector) pair for each document, where the hostname is obtained by parsing the document's URL.
  - The reduce function discards infrequent terms, adds up the remaining ones, and produces the final (hostname, term vector) pair.

Hadoop overview
- Hadoop comes from the core ideas published by Google.
- Hadoop is an open-source platform for distributed parallel computing; it consists mainly of a MapReduce execution engine and a distributed file system.
- Hadoop originated as a sub-project of the Nutch search-engine project led by Doug Cutting; it is now an open-source project managed by the Apache Software Foundation.
- Doug Cutting (photo)

How Hadoop works (figure)

Analyzing the data with the MapReduce approach
- Main program: assigns the Map tasks; collects the Map results, shuffles them, and assigns the Reduce tasks; collects the Reduce results and prints them.
- Map program
- Reduce program

Main references
- Pthread Tutorial
- PThreads Primer: POSIX Multithread Programming
- MPI并行程序设计
- OpenMP Application Program Interface Specification, Version 3.0
- Hadoop: The Definitive Guide, Second Edition

The End: Questions & Answers