Big Data Analysis Technology: Exercise Answers, Project 5
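Task 3 exercises
1. Get the CSV source data.
2. Process and analyze the data, and draw a pie chart of each game genre's share.
3. From the yearly game sales data, draw a line chart, a scatter plot, and a bar chart.

# -*- coding: utf-8 -*-
# Visual analysis of video game sales data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Sales records covering 1980-2020
data = pd.read_csv('./vgsales.csv')
data.info()  # info() prints its summary itself and returns None

# Drop every row that has a missing value, reset the index,
# then convert the Year column to integers
data.dropna(how='any', inplace=True)
data.reset_index(drop=True, inplace=True)
data['Year'] = data['Year'].astype(int)
data.info()

# Group by genre to count the number of games in each genre
Genre_count = data.groupby(data['Genre']).count()
Genre_count.sort_values('Rank', ascending=False, inplace=True)
print(Genre_count)

labels = Genre_count.index.tolist()
pie_data = Genre_count['Rank'].values.tolist()
# SimHei is a Chinese font; it only matters when labels contain Chinese text
plt.rcParams['font.sans-serif'] = ['SimHei']
# One offset per slice: the four smallest genres are pulled out of the pie
explode = [0, 0, 0, 0, 0, 0, 0, 0, 0.2, 0.2, 0.2, 0.2]
fig = plt.figure(figsize=(20, 8), dpi=80)
plt.pie(x=pie_data, labels=labels, autopct='%1.1f%%', explode=explode, startangle=50)
plt.title('Pie chart example: share of games by genre')
plt.show()

# Total sales per year; numeric_only=True skips text columns such as Name
sale_year = data.groupby('Year').sum(numeric_only=True)
print(sale_year)
x_data = sale_year.loc[2005:2015, 'Global_Sales'].index.astype('int').tolist()
data_sale = sale_year.loc[2005:2015, 'Global_Sales'].values.astype(int).tolist()
print(data_sale, x_data)

fig = plt.figure(figsize=(10, 8), dpi=80)

# Line chart. subplot(221): 2 rows, 2 columns, position 1
plt.subplot(221)
# ls (or linestyle) sets the line style (e.g. '-', '-.'); lw (or linewidth) sets the line width
plt.plot(x_data, data_sale, color='r', ls='-', lw=3)
plt.xticks(x_data[::2])  # label every second year

# Scatter plot. s is the point size, marker is the point style
plt.subplot(222)
# The marker value is unreadable in the source; 'o' (matplotlib's default) is assumed here
plt.scatter(x_data, data_sale, label='Sales (millions)', marker='o', s=100)
plt.xticks(x_data[::2])
plt.legend()

# Bar chart across the whole bottom row. alpha sets transparency (0 to 1)
plt.subplot(212)
plt.bar(x_data, data_sale, color='g', alpha=0.6)
plt.xticks(x_data[::2])
plt.title('Sales trend')
plt.show()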
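The pie chart step passes a hard-coded 12-element explode list, and matplotlib's pie() requires that list to have exactly one entry per slice. If the CSV contains a different number of genres, the call fails. The helper below is a minimal sketch of one way to build the list from the data instead. The name build_explode and its parameters are hypothetical and are not part of the original answer.

```python
def build_explode(n_slices, n_highlight=4, offset=0.2):
    """Build an explode list for a pie chart with n_slices slices.

    The last n_highlight slices (the smallest ones when the data is
    sorted in descending order) are pulled out by `offset`; all other
    slices stay at 0.
    """
    if n_slices < 0 or n_highlight < 0:
        raise ValueError("n_slices and n_highlight must be non-negative")
    # Never highlight more slices than exist
    n_highlight = min(n_highlight, n_slices)
    return [0.0] * (n_slices - n_highlight) + [offset] * n_highlight


if __name__ == "__main__":
    # With 12 slices this reproduces the hard-coded list from the exercise
    print(build_explode(12))
```

In the exercise script this would be used as explode = build_explode(len(pie_data)), so the list always matches the number of genres found in the file.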