Ten Exercise Sets That Teach You Data Analysis with Pandas

惊泓 | Health Knowledge | 2024-12-13

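Contents

Exercise index
Viewing the dataset file paths

Exercise 1 - Getting to know your data
Exploring the Chipotle fast-food data
  Step 1. Import the necessary libraries
  Step 2. Import the dataset from the following address
  Step 3. Store the dataset in a DataFrame named chipo
  Step 4. View the first 10 rows
  Step 6. How many columns does the dataset have?
  Step 7. Print all the column names
  Step 8. What does the dataset's index look like?
  Step 9. Which item was ordered the most?
  Step 10. How many different items appear in the item_name column?
  Step 11. In choice_description, which item was ordered the most?
  Step 12. How many items were ordered in total?
  Step 13. Convert item_price to a float
  Step 14. What was the revenue for the period covered by the dataset?
  Step 15. How many orders were placed in the period covered by the dataset?
  Step 16. What is the average total price per order?
  Step 17. How many different items were sold?

Exercise 2 - Filtering and sorting
Exploring the Euro 2012 data
  Step 1. Import the necessary libraries
  Step 2. Import the dataset from the following address
  Step 3. Name the dataset euro12
  Step 4. Select only the Goals column
  Step 5. How many teams took part in Euro 2012?
  Step 6. How many columns does the dataset have?
  Step 7. Store the columns Team, Yellow Cards and Red Cards in a separate DataFrame named discipline
  Step 8. Sort discipline by Red Cards, then by Yellow Cards
  Step 9. Compute the mean number of yellow cards per team
  Step 10. Find the teams that scored more than 6 goals
  Step 11. Select the teams whose names start with G
  Step 12. Select the first 7 columns
  Step 13. Select all columns except the last 3
  Step 14. Find the Shooting Accuracy of England, Italy and Russia

Exercise 3 - Grouping
Exploring the alcohol consumption data
  Step 1. Import the necessary libraries
  Step 2. Import the data from the following address
  Step 3. Name the DataFrame drinks
  Step 4. Which continent drinks more beer on average?
  Step 5. Print the descriptive statistics of wine consumption (wine_servings) for each continent
  Step 6. Print the mean consumption of each type of alcohol for each continent
  Step 7. Print the median consumption of each type of alcohol for each continent
  Step 8. Print the mean, maximum and minimum spirit consumption for each continent

Exercise 4 - Apply
Exploring US crime data, 1960 - 2014
  Step 1. Import the necessary libraries
  Step 2. Import the dataset from the following address
  Step 3. Name the DataFrame crime
  Step 4. What is the data type of each column?
  Step 5. Convert the data type of Year to datetime64
  Step 6. Set the Year column as the index of the DataFrame
  Step 7. Delete the column named Total
  Step 8. Group the DataFrame by Year and sum
  Step 9. Which was the most dangerous decade to live in in US history?

Exercise 5 - Merging
Exploring fictitious name data
  Step 1. Import the necessary libraries
  Step 2. Create the DataFrames from the raw data below
  Step 3. Name the DataFrames data1, data2 and data3
  Step 4. Concatenate data1 and data2 along the rows and name the result all_data
  Step 5. Concatenate data1 and data2 along the columns and name the result all_data_col
  Step 6. Print data3
  Step 7. Merge all_data and data3 on subject_id
  Step 8. Join data1 and data2 on subject_id
  Step 9. Find all the matches after merging data1 and data2

Exercise 6 - Statistics
Exploring the wind speed data
  Step 1. Import the necessary libraries
  Step 2. Import the data from the following address
  Step 3. Store the data and set the first three columns as a proper index
  Step 4. The year 2061? Do we really have data for that year? Write a function and use it to fix this bug
  Step 5. Set the date as the index; mind the data type, which should be datetime64[ns]
  Step 6. For each location, how many values are missing?
  Step 7. For each location, how many values are complete?
  Step 8. Compute the mean wind speed over all the data
  Step 9. Create a DataFrame named loc_stats that stores the minimum, maximum, mean and standard deviation of the wind speed for each location
  Step 10. Create a DataFrame named day_stats that stores the minimum, maximum, mean and standard deviation of the wind speed across all locations
  Step 11. For each location, compute the mean wind speed in January
  Step 12. Resample the records at a yearly frequency
  Step 13. Resample the records at a monthly frequency

Exercise 7 - Visualization
Exploring the Titanic disaster data
  Step 1. Import the necessary libraries
  Step 2. Import the data from the following address
  Step 3. Name the DataFrame titanic
  Step 4. Set PassengerId as the index
  Step 5. Draw a pie chart showing the proportion of male and female passengers
  Step 6. Draw a scatter plot of Fare against passenger age and sex
  Step 7. How many people survived?
  Step 8. Draw a histogram of the ticket fares

Exercise 8 - Creating DataFrames
Exploring the Pokemon data
  Step 1. Import the necessary libraries
  Step 2. Create a data dictionary
  Step 3. Store the dictionary in a DataFrame named pokemon
  Step 4. The columns are in alphabetical order; reorder them as name, type, hp, evolution, pokedex
  Step 5. Add a column named place
  Step 6. Check the data type of each column

Exercise 9 - Time series
Exploring the Apple stock price data
  Step 1. Import the necessary libraries
  Step 2. Dataset address
  Step 3. Read the data into a DataFrame named apple
  Step 4. Check the data type of each column
  Step 5. Convert the Date column to datetime
  Step 6. Set Date as the index
  Step 7. Are there any duplicate dates?
  Step 8. Sort the index in ascending order
  Step 9. Find the last business day of each month
  Step 10. How many days apart are the earliest and the latest dates in the dataset?
  Step 11. How many months are there in the data?
  Step 12. Plot the Adj Close value over time

Exercise 10 - Deleting data
Exploring the Iris data
  Step 1. Import the necessary libraries
  Step 2. Dataset address
  Step 3. Store the dataset in a variable named iris
  Step 4. Create the column names for the DataFrame
  Step 5. Are there any missing values in the DataFrame?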
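  Step 6. Set rows 10 to 19 of the petal_length column to missing
  Step 7. Replace all missing values with 1.0
  Step 8. Delete the class column
  Step 9. Set the first three rows of the DataFrame to missing
  Step 10. Delete the rows that have missing values
  Step 11. Reset the index

Conclusion

Pandas is a library that anyone getting started with data analysis needs to master. This article was translated and compiled from GitHub by 科赛网. Readers are advised to first complete the 科赛网 tutorials 从零上手Python关键代码 (Key Python Code from Scratch) and Pandas基础命令速查表 (Pandas Basic Commands Cheat Sheet), and then click the Fork button at the top right of this Notebook to run and debug the tutorial code.

To republish this article, please contact 科赛网 for permission. 科赛网 is an online community that brings together data talent and industry problems. It built K-Lab, which it describes as the first online collaborative data analysis platform in China, to give data practitioners a new way to learn and work.

Exercise index

Exercise | Content | Dataset
Exercise 1 - Getting to know your data | Exploring the Chipotle fast-food data | chipotle.tsv
Exercise 2 - Filtering and sorting | Exploring the Euro 2012 data | Euro2012_stats.csv
Exercise 3 - Grouping | Exploring the alcohol consumption data | drinks.csv
Exercise 4 - Apply | Exploring US crime data, 1960 - 2014 | US_Crime_Rates_1960_2014.csv
Exercise 5 - Merging | Exploring fictitious name data | data built by hand in the exercise
Exercise 6 - Statistics | Exploring the wind speed data | wind.data
Exercise 7 - Visualization | Exploring the Titanic disaster data | train.csv
Exercise 8 - Creating DataFrames | Exploring the Pokemon data | data built by hand in the exercise
Exercise 9 - Time series | Exploring the Apple stock price data | Apple_stock.csv
Exercise 10 - Deleting data | Exploring the Iris data | iris.csv

Exercise 1 - Getting to know your data
Exploring the Chipotle fast-food data

Step 1. Import the necessary libraries
Step 2. Import the dataset from the following address
Step 3. Store the dataset in a DataFrame named chipo
Step 4. View the first 10 rows

 | order_id | quantity | item_name | choice_description | item_price
0 | 1 | 1 | Chips and Fresh Tomato Salsa | NaN | $2.39
1 | 1 | 1 | Izze | [Clementine] | $3.39
2 | 1 | 1 | Nantucket Nectar | [Apple] | $3.39
3 | 1 | 1 | Chips and Tomatillo-Green Chili Salsa | NaN | $2.39
4 | 2 | 2 | Chicken Bowl | [Tomatillo-Red Chili Salsa (Hot), [Black Beans... | $16.98
5 | 3 | 1 | Chicken Bowl | [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou... | $10.98
6 | 3 | 1 | Side of Chips | NaN | $1.69
7 | 4 | 1 | Steak Burrito | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | $11.75
8 | 4 | 1 | Steak Soft Tacos | [Tomatillo Green Chili Salsa, [Pinto Beans, Ch... | $9.25
9 | 5 | 1 | Steak Burrito | [Fresh Tomato Salsa, [Rice, Black Beans, Pinto... | $9.25

Step 6. How many columns does the dataset have?
Step 7. Print all the column names
Step 8. What does the dataset's index look like?
Step 9. Which item was ordered the most?

 | item_name | quantity
17 | Chicken Bowl | 761
18 | Chicken Burrito | 591
25 | Chips and Guacamole | 506
39 | Steak Burrito | 386
10 | Canned Soft Drink | 351

Step 10. How many different items appear in the item_name column?
Step 11. In choice_description, which item was ordered the most?
Step 12. How many items were ordered in total?
Step 13. Convert item_price to a float
Step 14. What was the revenue for the period covered by the dataset?
Step 15. How many orders were placed in the period covered by the dataset?
Step 16. What is the average total price per order?
Step 17. How many different items were sold?

Exercise 2 - Filtering and sorting
Exploring the Euro 2012 data

Step 1. Import the necessary libraries
Step 2. Import the dataset from the following address
Step 3. Name the dataset euro12

 | Team | Goals | Shots on target | Shots off target | Shooting Accuracy | % Goals-to-shots | Total shots (inc. Blocked) | Hit Woodwork | Penalty goals | Penalties not scored | ... | Saves made | Saves-to-shots ratio | Fouls Won | Fouls Conceded | Offsides | Yellow Cards | Red Cards | Subs on | Subs off | Players Used
0 | Croatia | 4 | 13 | 12 | 51.9% | 16.0% | 32 | 0 | 0 | 0 | ... | 13 | 81.3% | 41 | 62 | 2 | 9 | 0 | 9 | 9 | 16
1 | Czech Republic | 4 | 13 | 18 | 41.9% | 12.9% | 39 | 0 | 0 | 0 | ... | 9 | 60.1% | 53 | 73 | 8 | 7 | 0 | 11 | 11 | 19
2 | Denmark | 4 | 10 | 10 | 50.0% | 20.0% | 27 | 1 | 0 | 0 | ... | 10 | 66.7% | 25 | 38 | 8 | 4 | 0 | 7 | 7 | 15
3 | England | 5 | 11 | 18 | 50.0% | 17.2% | 40 | 0 | 0 | 0 | ... | 22 | 88.1% | 43 | 45 | 6 | 5 | 0 | 11 | 11 | 16
4 | France | 3 | 22 | 24 | 37.9% | 6.5% | 65 | 1 | 0 | 0 | ... | 6 | 54.6% | 36 | 51 | 5 | 6 | 0 | 11 | 11 | 19
5 | Germany | 10 | 32 | 32 | 47.8% | 15.6% | 80 | 2 | 1 | 0 | ... | 10 | 62.6% | 63 | 49 | 12 | 4 | 0 | 15 | 15 | 17
6 | Greece | 5 | 8 | 18 | 30.7% | 19.2% | 32 | 1 | 1 | 1 | ... | 13 | 65.1% | 67 | 48 | 12 | 9 | 1 | 12 | 12 | 20
7 | Italy | 6 | 34 | 45 | 43.0% | 7.5% | 110 | 2 | 0 | 0 | ... | 20 | 74.1% | 101 | 89 | 16 | 16 | 0 | 18 | 18 | 19
8 | Netherlands | 2 | 12 | 36 | 25.0% | 4.1% | 60 | 2 | 0 | 0 | ... | 12 | 70.6% | 35 | 30 | 3 | 5 | 0 | 7 | 7 | 15
9 | Poland | 2 | 15 | 23 | 39.4% | 5.2% | 48 | 0 | 0 | 0 | ... | 6 | 66.7% | 48 | 56 | 3 | 7 | 1 | 7 | 7 | 17
10 | Portugal | 6 | 22 | 42 | 34.3% | 9.3% | 82 | 6 | 0 | 0 | ... | 10 | 71.5% | 73 | 90 | 10 | 12 | 0 | 14 | 14 | 16
11 | Republic of Ireland | 1 | 7 | 12 | 36.8% | 5.2% | 28 | 0 | 0 | 0 | ... | 17 | 65.4% | 43 | 51 | 11 | 6 | 1 | 10 | 10 | 17
12 | Russia | 5 | 9 | 31 | 22.5% | 12.5% | 59 | 2 | 0 | 0 | ... | 10 | 77.0% | 34 | 43 | 4 | 6 | 0 | 7 | 7 | 16
13 | Spain | 12 | 42 | 33 | 55.9% | 16.0% | 100 | 0 | 1 | 0 | ... | 15 | 93.8% | 102 | 83 | 19 | 11 | 0 | 17 | 17 | 18
14 | Sweden | 5 | 17 | 19 | 47.2% | 13.8% | 39 | 3 | 0 | 0 | ... | 8 | 61.6% | 35 | 51 | 7 | 7 | 0 | 9 | 9 | 18
15 | Ukraine | 2 | 7 | 26 | 21.2% | 6.0% | 38 | 0 | 0 | 0 | ... | 13 | 76.5% | 48 | 31 | 4 | 5 | 0 | 9 | 9 | 18

16 rows × 35 columns

Step 4. Select only the Goals column
Step 5. How many teams took part in Euro 2012?
Step 6. How many columns does the dataset have?
Step 7. Store the columns Team, Yellow Cards and Red Cards in a separate DataFrame named discipline

 | Team | Yellow Cards | Red Cards
0 | Croatia | 9 | 0
1 | Czech Republic | 7 | 0
2 | Denmark | 4 | 0
3 | England | 5 | 0
4 | France | 6 | 0
5 | Germany | 4 | 0
6 | Greece | 9 | 1
7 | Italy | 16 | 0
8 | Netherlands | 5 | 0
9 | Poland | 7 | 1
10 | Portugal | 12 | 0
11 | Republic of Ireland | 6 | 1
12 | Russia | 6 | 0
13 | Spain | 11 | 0
14 | Sweden | 7 | 0
15 | Ukraine | 5 | 0

Step 8. Sort discipline by Red Cards, then by Yellow Cards

 | Team | Yellow Cards | Red Cards
6 | Greece | 9 | 1
9 | Poland | 7 | 1
11 | Republic of Ireland | 6 | 1
7 | Italy | 16 | 0
10 | Portugal | 12 | 0
13 | Spain | 11 | 0
0 | Croatia | 9 | 0
1 | Czech Republic | 7 | 0
14 | Sweden | 7 | 0
4 | France | 6 | 0
12 | Russia | 6 | 0
3 | England | 5 | 0
8 | Netherlands | 5 | 0
15 | Ukraine | 5 | 0
2 | Denmark | 4 | 0
5 | Germany | 4 | 0

Step 9. Compute the mean number of yellow cards per team
Step 10. Find the teams that scored more than 6 goals

 | Team | Goals | Shots on target | Shots off target | Shooting Accuracy | % Goals-to-shots | Total shots (inc. Blocked) | Hit Woodwork | Penalty goals | Penalties not scored | ... | Saves made | Saves-to-shots ratio | Fouls Won | Fouls Conceded | Offsides | Yellow Cards | Red Cards | Subs on | Subs off | Players Used
5 | Germany | 10 | 32 | 32 | 47.8% | 15.6% | 80 | 2 | 1 | 0 | ... | 10 | 62.6% | 63 | 49 | 12 | 4 | 0 | 15 | 15 | 17
13 | Spain | 12 | 42 | 33 | 55.9% | 16.0% | 100 | 0 | 1 | 0 | ... | 15 | 93.8% | 102 | 83 | 19 | 11 | 0 | 17 | 17 | 18

2 rows × 35 columns

Step 11. Select the teams whose names start with G

 | Team | Goals | Shots on target | Shots off target | Shooting Accuracy | % Goals-to-shots | Total shots (inc. Blocked) | Hit Woodwork | Penalty goals | Penalties not scored | ... | Saves made | Saves-to-shots ratio | Fouls Won | Fouls Conceded | Offsides | Yellow Cards | Red Cards | Subs on | Subs off | Players Used
5 | Germany | 10 | 32 | 32 | 47.8% | 15.6% | 80 | 2 | 1 | 0 | ... | 10 | 62.6% | 63 | 49 | 12 | 4 | 0 | 15 | 15 | 17
6 | Greece | 5 | 8 | 18 | 30.7% | 19.2% | 32 | 1 | 1 | 1 | ... | 13 | 65.1% | 67 | 48 | 12 | 9 | 1 | 12 | 12 | 20

2 rows × 35 columns

Step 12. Select the first 7 columns

 | Team | Goals | Shots on target | Shots off target | Shooting Accuracy | % Goals-to-shots | Total shots (inc. Blocked)
0 | Croatia | 4 | 13 | 12 | 51.9% | 16.0% | 32
1 | Czech Republic | 4 | 13 | 18 | 41.9% | 12.9% | 39
2 | Denmark | 4 | 10 | 10 | 50.0% | 20.0% | 27
3 | England | 5 | 11 | 18 | 50.0% | 17.2% | 40
4 | France | 3 | 22 | 24 | 37.9% | 6.5% | 65
5 | Germany | 10 | 32 | 32 | 47.8% | 15.6% | 80
6 | Greece | 5 | 8 | 18 | 30.7% | 19.2% | 32
7 | Italy | 6 | 34 | 45 | 43.0% | 7.5% | 110
8 | Netherlands | 2 | 12 | 36 | 25.0% | 4.1% | 60
9 | Poland | 2 | 15 | 23 | 39.4% | 5.2% | 48
10 | Portugal | 6 | 22 | 42 | 34.3% | 9.3% | 82
11 | Republic of Ireland | 1 | 7 | 12 | 36.8% | 5.2% | 28
12 | Russia | 5 | 9 | 31 | 22.5% | 12.5% | 59
13 | Spain | 12 | 42 | 33 | 55.9% | 16.0% | 100
14 | Sweden | 5 | 17 | 19 | 47.2% | 13.8% | 39
15 | Ukraine | 2 | 7 | 26 | 21.2% | 6.0% | 38

Step 13. Select all columns except the last 3

 | Team | Goals | Shots on target | Shots off target | Shooting Accuracy | % Goals-to-shots | Total shots (inc. Blocked) | Hit Woodwork | Penalty goals | Penalties not scored | ... | Clean Sheets | Blocks | Goals conceded | Saves made | Saves-to-shots ratio | Fouls Won | Fouls Conceded | Offsides | Yellow Cards | Red Cards
0 | Croatia | 4 | 13 | 12 | 51.9% | 16.0% | 32 | 0 | 0 | 0 | ... | 0 | 10 | 3 | 13 | 81.3% | 41 | 62 | 2 | 9 | 0
1 | Czech Republic | 4 | 13 | 18 | 41.9% | 12.9% | 39 | 0 | 0 | 0 | ... | 1 | 10 | 6 | 9 | 60.1% | 53 | 73 | 8 | 7 | 0
2 | Denmark | 4 | 10 | 10 | 50.0% | 20.0% | 27 | 1 | 0 | 0 | ... | 1 | 10 | 5 | 10 | 66.7% | 25 | 38 | 8 | 4 | 0
3 | England | 5 | 11 | 18 | 50.0% | 17.2% | 40 | 0 | 0 | 0 | ... | 2 | 29 | 3 | 22 | 88.1% | 43 | 45 | 6 | 5 | 0
4 | France | 3 | 22 | 24 | 37.9% | 6.5% | 65 | 1 | 0 | 0 | ... | 1 | 7 | 5 | 6 | 54.6% | 36 | 51 | 5 | 6 | 0
5 | Germany | 10 | 32 | 32 | 47.8% | 15.6% | 80 | 2 | 1 | 0 | ... | 1 | 11 | 6 | 10 | 62.6% | 63 | 49 | 12 | 4 | 0
6 | Greece | 5 | 8 | 18 | 30.7% | 19.2% | 32 | 1 | 1 | 1 | ... | 1 | 23 | 7 | 13 | 65.1% | 67 | 48 | 12 | 9 | 1
7 | Italy | 6 | 34 | 45 | 43.0% | 7.5% | 110 | 2 | 0 | 0 | ... | 2 | 18 | 7 | 20 | 74.1% | 101 | 89 | 16 | 16 | 0
8 | Netherlands | 2 | 12 | 36 | 25.0% | 4.1% | 60 | 2 | 0 | 0 | ... | 0 | 9 | 5 | 12 | 70.6% | 35 | 30 | 3 | 5 | 0
9 | Poland | 2 | 15 | 23 | 39.4% | 5.2% | 48 | 0 | 0 | 0 | ... | 0 | 8 | 3 | 6 | 66.7% | 48 | 56 | 3 | 7 | 1
10 | Portugal | 6 | 22 | 42 | 34.3% | 9.3% | 82 | 6 | 0 | 0 | ... | 2 | 11 | 4 | 10 | 71.5% | 73 | 90 | 10 | 12 | 0
11 | Republic of Ireland | 1 | 7 | 12 | 36.8% | 5.2% | 28 | 0 | 0 | 0 | ... | 0 | 23 | 9 | 17 | 65.4% | 43 | 51 | 11 | 6 | 1
12 | Russia | 5 | 9 | 31 | 22.5% | 12.5% | 59 | 2 | 0 | 0 | ... | 0 | 8 | 3 | 10 | 77.0% | 34 | 43 | 4 | 6 | 0
13 | Spain | 12 | 42 | 33 | 55.9% | 16.0% | 100 | 0 | 1 | 0 | ... | 5 | 8 | 1 | 15 | 93.8% | 102 | 83 | 19 | 11 | 0
14 | Sweden | 5 | 17 | 19 | 47.2% | 13.8% | 39 | 3 | 0 | 0 | ... | 1 | 12 | 5 | 8 | 61.6% | 35 | 51 | 7 | 7 | 0
15 | Ukraine | 2 | 7 | 26 | 21.2% | 6.0% | 38 | 0 | 0 | 0 | ... | 0 | 4 | 4 | 13 | 76.5% | 48 | 31 | 4 | 5 | 0

16 rows × 32 columns

Step 14. Find the Shooting Accuracy of England, Italy and Russia

 | Team | Shooting Accuracy
3 | England | 50.0%
7 | Italy | 43.0%
12 | Russia | 22.5%

Exercise 3 - Grouping
Exploring the alcohol consumption data

Step 1. Import the necessary libraries
Step 2. Import the data from the following address
Step 3. Name the DataFrame drinks

 | country | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol | continent
0 | Afghanistan | 0 | 0 | 0 | 0.0 | AS
1 | Albania | 89 | 132 | 54 | 4.9 | EU
2 | Algeria | 25 | 0 | 14 | 0.7 | AF
3 | Andorra | 245 | 138 | 312 | 12.4 | EU
4 | Angola | 217 | 57 | 45 | 5.9 | AF

Step 4. Which continent drinks more beer on average?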
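One way to answer this question is to group by continent and average the beer column. The sketch below is illustrative only: instead of loading drinks.csv it rebuilds a tiny DataFrame from the five preview rows shown above, so its numbers are not the answer for the full dataset. The variable names `beer_by_continent` and `top_continent` are made up for this example.

```python
import pandas as pd

# The five preview rows shown above (not the full dataset).
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["AS", "EU", "AF", "EU", "AF"],
})

# Mean beer servings per continent, largest first.
beer_by_continent = (
    drinks.groupby("continent")["beer_servings"]
    .mean()
    .sort_values(ascending=False)
)
print(beer_by_continent)

# Label of the continent with the highest mean.
top_continent = beer_by_continent.idxmax()
print(top_continent)
```

On this five-row sample the top continent is EU. The same two lines should work unchanged on the full DataFrame.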
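Step 5. Print the descriptive statistics of wine consumption (wine_servings) for each continent

continent | count | mean | std | min | 25% | 50% | 75% | max
AF | 53.0 | 16.264151 | 38.846419 | 0.0 | 1.0 | 2.0 | 13.00 | 233.0
AS | 44.0 | 9.068182 | 21.667034 | 0.0 | 0.0 | 1.0 | 8.00 | 123.0
EU | 45.0 | 142.222222 | 97.421738 | 0.0 | 59.0 | 128.0 | 195.00 | 370.0
OC | 16.0 | 35.625000 | 64.555790 | 0.0 | 1.0 | 8.5 | 23.25 | 212.0
SA | 12.0 | 62.416667 | 88.620189 | 1.0 | 3.0 | 12.0 | 98.50 | 221.0

Step 6. Print the mean consumption of each type of alcohol for each continent

continent | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol
AF | 61.471698 | 16.339623 | 16.264151 | 3.007547
AS | 37.045455 | 60.840909 | 9.068182 | 2.170455
EU | 193.777778 | 132.555556 | 142.222222 | 8.617778
OC | 89.687500 | 58.437500 | 35.625000 | 3.381250
SA | 175.083333 | 114.750000 | 62.416667 | 6.308333

Step 7. Print the median consumption of each type of alcohol for each continent

continent | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol
AF | 32.0 | 3.0 | 2.0 | 2.30
AS | 17.5 | 16.0 | 1.0 | 1.20
EU | 219.0 | 122.0 | 128.0 | 10.00
OC | 52.5 | 37.0 | 8.5 | 1.75
SA | 162.5 | 108.5 | 12.0 | 6.85

Step 8. Print the mean, maximum and minimum spirit consumption for each continent

continent | mean | min | max
AF | 16.339623 | 0 | 152
AS | 60.840909 | 0 | 326
EU | 132.555556 | 0 | 373
OC | 58.437500 | 0 | 254
SA | 114.750000 | 25 | 302

Exercise 4 - Apply
Exploring US crime data, 1960 - 2014

Step 1. Import the necessary libraries
Step 2. Import the dataset from the following address
Step 3. Name the DataFrame crime

 | Year | Population | Total | Violent | Property | Murder | Forcible_Rape | Robbery | Aggravated_assault | Burglary | Larceny_Theft | Vehicle_Theft
0 | 1960 | 179323175 | 3384200 | 288460 | 3095700 | 9110 | 17190 | 107840 | 154320 | 912100 | 1855400 | 328200
1 | 1961 | 182992000 | 3488000 | 289390 | 3198600 | 8740 | 17220 | 106670 | 156760 | 949600 | 1913000 | 336000
2 | 1962 | 185771000 | 3752200 | 301510 | 3450700 | 8530 | 17550 | 110860 | 164570 | 994300 | 2089600 | 366800
3 | 1963 | 188483000 | 4109500 | 316970 | 3792500 | 8640 | 17650 | 116470 | 174210 | 1086400 | 2297800 | 408300
4 | 1964 | 191141000 | 4564600 | 364220 | 4200400 | 9360 | 21420 | 130390 | 203050 | 1213200 | 2514400 | 472800

Step 4. What is the data type of each column?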
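As a rough illustration of this step and of the conversion that the next two steps ask for, the sketch below inspects the dtypes and then parses Year as a date. It is a minimal example under stated assumptions: the DataFrame holds only three of the preview rows and three of the columns, and `dtypes_before` is a name invented here.

```python
import pandas as pd

# First three preview rows, restricted to three columns (not the full dataset).
crime = pd.DataFrame({
    "Year": [1960, 1961, 1962],
    "Population": [179323175, 182992000, 185771000],
    "Total": [3384200, 3488000, 3752200],
})

# Step 4: one dtype per column; Year is a plain integer at this point.
dtypes_before = crime.dtypes
print(dtypes_before)

# Steps 5 and 6 (sketch): parse Year as a date, then use it as the index.
crime["Year"] = pd.to_datetime(crime["Year"].astype(str), format="%Y")
crime = crime.set_index("Year")
print(crime.index)
```

Parsing with `format="%Y"` places every row on 1 January of its year, which matches the index values shown in the tables of the following steps.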
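Did you notice? Year has the data type int64, but pandas has a dedicated data type for time series. Let's take a look.

Step 5. Convert the data type of Year to datetime64
Step 6. Set the Year column as the index of the DataFrame

Year | Population | Total | Violent | Property | Murder | Forcible_Rape | Robbery | Aggravated_assault | Burglary | Larceny_Theft | Vehicle_Theft
1960-01-01 | 179323175 | 3384200 | 288460 | 3095700 | 9110 | 17190 | 107840 | 154320 | 912100 | 1855400 | 328200
1961-01-01 | 182992000 | 3488000 | 289390 | 3198600 | 8740 | 17220 | 106670 | 156760 | 949600 | 1913000 | 336000
1962-01-01 | 185771000 | 3752200 | 301510 | 3450700 | 8530 | 17550 | 110860 | 164570 | 994300 | 2089600 | 366800
1963-01-01 | 188483000 | 4109500 | 316970 | 3792500 | 8640 | 17650 | 116470 | 174210 | 1086400 | 2297800 | 408300
1964-01-01 | 191141000 | 4564600 | 364220 | 4200400 | 9360 | 21420 | 130390 | 203050 | 1213200 | 2514400 | 472800

Step 7. Delete the column named Total

Year | Population | Violent | Property | Murder | Forcible_Rape | Robbery | Aggravated_assault | Burglary | Larceny_Theft | Vehicle_Theft
1960-01-01 | 179323175 | 288460 | 3095700 | 9110 | 17190 | 107840 | 154320 | 912100 | 1855400 | 328200
1961-01-01 | 182992000 | 289390 | 3198600 | 8740 | 17220 | 106670 | 156760 | 949600 | 1913000 | 336000
1962-01-01 | 185771000 | 301510 | 3450700 | 8530 | 17550 | 110860 | 164570 | 994300 | 2089600 | 366800
1963-01-01 | 188483000 | 316970 | 3792500 | 8640 | 17650 | 116470 | 174210 | 1086400 | 2297800 | 408300
1964-01-01 | 191141000 | 364220 | 4200400 | 9360 | 21420 | 130390 | 203050 | 1213200 | 2514400 | 472800

Step 8. Group the DataFrame by Year and sum

Note the Population column: summing it directly is not correct.

(Population summed directly)

Year | Population | Violent | Property | Murder | Forcible_Rape | Robbery | Aggravated_assault | Burglary | Larceny_Theft | Vehicle_Theft
1960-01-01 | 1915053175 | 4134930 | 45160900 | 106180 | 236720 | 1633510 | 2158520 | 13321100 | 26547700 | 5292100
1970-01-01 | 2121193298 | 9607930 | 91383800 | 192230 | 554570 | 4159020 | 4702120 | 28486000 | 53157800 | 9739900
1980-01-01 | 2371370069 | 14074328 | 117048900 | 206439 | 865639 | 5383109 | 7619130 | 33073494 | 72040253 | 11935411
1990-01-01 | 2612825258 | 17527048 | 119053499 | 211664 | 998827 | 5748930 | 10568963 | 26750015 | 77679366 | 14624418
2000-01-01 | 2947969117 | 13968056 | 100944369 | 163068 | 922499 | 4230366 | 8652124 | 21565176 | 67970291 | 11412834
2010-01-01 | 1570146307 | 6072017 | 44095950 | 72867 | 421059 | 1749809 | 3764142 | 10125170 | 30401698 | 3569080
2020-01-01 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

(Population corrected)

Year | Population | Violent | Property | Murder | Forcible_Rape | Robbery | Aggravated_assault | Burglary | Larceny_Theft | Vehicle_Theft
1960-01-01 | 201385000.0 | 4134930 | 45160900 | 106180 | 236720 | 1633510 | 2158520 | 13321100 | 26547700 | 5292100
1970-01-01 | 220099000.0 | 9607930 | 91383800 | 192230 | 554570 | 4159020 | 4702120 | 28486000 | 53157800 | 9739900
1980-01-01 | 248239000.0 | 14074328 | 117048900 | 206439 | 865639 | 5383109 | 7619130 | 33073494 | 72040253 | 11935411
1990-01-01 | 272690813.0 | 17527048 | 119053499 | 211664 | 998827 | 5748930 | 10568963 | 26750015 | 77679366 | 14624418
2000-01-01 | 307006550.0 | 13968056 | 100944369 | 163068 | 922499 | 4230366 | 8652124 | 21565176 | 67970291 | 11412834
2010-01-01 | 318857056.0 | 6072017 | 44095950 | 72867 | 421059 | 1749809 | 3764142 | 10125170 | 30401698 | 3569080
2020-01-01 | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Step 9. Which was the most dangerous decade to live in in US history?

Exercise 5 - Merging
Exploring fictitious name data

Step 1. Import the necessary libraries
Step 2. Create the DataFrames from the raw data below
Step 3. Name the DataFrames data1, data2 and data3
Step 4. Concatenate data1 and data2 along the rows and name the result all_data

 | subject_id | first_name | last_name
0 | 1 | Alex | Anderson
1 | 2 | Amy | Ackerman
2 | 3 | Allen | Ali
3 | 4 | Alice | Aoni
4 | 5 | Ayoung | Atiches
0 | 4 | Billy | Bonder
1 | 5 | Brian | Black
2 | 6 | Bran | Balwner
3 | 7 | Bryce | Brice
4 | 8 | Betty | Btisan

Step 5. Concatenate data1 and data2 along the columns and name the result all_data_col

 | subject_id | first_name | last_name | subject_id | first_name | last_name
0 | 1 | Alex | Anderson | 4 | Billy | Bonder
1 | 2 | Amy | Ackerman | 5 | Brian | Black
2 | 3 | Allen | Ali | 6 | Bran | Balwner
3 | 4 | Alice | Aoni | 7 | Bryce | Brice
4 | 5 | Ayoung | Atiches | 8 | Betty | Btisan

Step 6. Print data3

 | subject_id | test_id
0 | 1 | 51
1 | 2 | 15
2 | 3 | 15
3 | 4 | 61
4 | 5 | 16
5 | 7 | 14
6 | 8 | 15
7 | 9 | 1
8 | 10 | 61
9 | 11 | 16

Step 7. Merge all_data and data3 on subject_id

 | subject_id | first_name | last_name | test_id
0 | 1 | Alex | Anderson | 51
1 | 2 | Amy | Ackerman | 15
2 | 3 | Allen | Ali | 15
3 | 4 | Alice | Aoni | 61
4 | 4 | Billy | Bonder | 61
5 | 5 | Ayoung | Atiches | 16
6 | 5 | Brian | Black | 16
7 | 7 | Bryce | Brice | 14
8 | 8 | Betty | Btisan | 15

Step 8. Join data1 and data2 on subject_id

 | subject_id | first_name_x | last_name_x | first_name_y | last_name_y
0 | 4 | Alice | Aoni | Billy | Bonder
1 | 5 | Ayoung | Atiches | Brian | Black

Step 9. Find all the matches after merging data1 and data2

 | subject_id | first_name_x | last_name_x | first_name_y | last_name_y
0 | 1 | Alex | Anderson | NaN | NaN
1 | 2 | Amy | Ackerman | NaN | NaN
2 | 3 | Allen | Ali | NaN | NaN
3 | 4 | Alice | Aoni | Billy | Bonder
4 | 5 | Ayoung | Atiches | Brian | Black
5 | 6 | NaN | NaN | Bran | Balwner
6 | 7 | NaN | NaN | Bryce | Brice
7 | 8 | NaN | NaN | Betty | Btisan

Exercise 6 - Statistics
Exploring the wind speed data

Step 1. Import the necessary libraries
Step 2. Import the data from the following address
Step 3. Store the data and set the first three columns as a proper index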
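The sketch below illustrates how the first three columns could be combined into one date column, why the year comes out as 2061, and one way to repair it in the following two steps. Several things here are assumptions rather than facts from the source: the raw layout (a two-digit year, a month and a day in columns named Yr, Mo and Dy), the helper name `fix_century`, and the cutoff year 1989. Only three rows and three locations from the preview are used.

```python
from io import StringIO

import pandas as pd

# Three rows in the ASSUMED raw layout: two-digit year, month, day, readings.
raw = StringIO(
    "Yr Mo Dy RPT VAL ROS\n"
    "61 1 1 15.04 14.96 13.17\n"
    "61 1 2 14.71 NaN 10.83\n"
    "61 1 3 18.50 16.88 12.33\n"
)
data = pd.read_csv(raw, sep=r"\s+")

# Build a zero-padded YYMMDD string and parse it; %y maps 61 to 2061.
stamp = (
    data["Yr"].astype(str).str.zfill(2)
    + data["Mo"].astype(str).str.zfill(2)
    + data["Dy"].astype(str).str.zfill(2)
)
data["Yr_Mo_Dy"] = pd.to_datetime(stamp, format="%y%m%d")
data = data.drop(columns=["Yr", "Mo", "Dy"])
parsed_years = data["Yr_Mo_Dy"].dt.year.tolist()
print(parsed_years)


def fix_century(ts):
    """Move a date that was parsed into the future back by one century."""
    # Assumption: no reading is later than 1989, so a later year is an artifact.
    return ts.replace(year=ts.year - 100) if ts.year > 1989 else ts


data["Yr_Mo_Dy"] = data["Yr_Mo_Dy"].apply(fix_century)
data = data.set_index("Yr_Mo_Dy")
print(data)
```

With a two-digit year, `%y` follows the usual convention of mapping 00 to 68 onto the 2000s, which is where 2061 comes from. After the fix, `data.isnull().sum()` and `data.notnull().sum()` give the per-location counts of missing and complete values.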
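
 | Yr_Mo_Dy | RPT | VAL | ROS | KIL | SHA | BIR | DUB | CLA | MUL | CLO | BEL | MAL
0 | 2061-01-01 | 15.04 | 14.96 | 13.17 | 9.29 | NaN | 9.87 | 13.67 | 10.25 | 10.83 | 12.58 | 18.50 | 15.04
1 | 2061-01-02 | 14.71 | NaN | 10.83 | 6.50 | 12.62 | 7.67 | 11.50 | 10.04 | 9.79 | 9.67 | 17.54 | 13.83
2 | 2061-01-03 | 18.50 | 16.88 | 12.33 | 10.13 | 11.17 | 6.17 | 11.25 | NaN | 8.50 | 7.67 | 12.75 | 12.71
3 | 2061-01-04 | 10.58 | 6.63 | 11.75 | 4.58 | 4.54 | 2.88 | 8.63 | 1.79 | 5.83 | 5.88 | 5.46 | 10.88
4 | 2061-01-05 | 13.33 | 13.25 | 11.42 | 6.17 | 10.71 | 8.21 | 11.92 | 6.54 | 10.92 | 10.34 | 12.92 | 11.83

Step 4. The year 2061? Do we really have data for that year? Write a function and use it to fix this bug

 | Yr_Mo_Dy | RPT | VAL | ROS | KIL | SHA | BIR | DUB | CLA | MUL | CLO | BEL | MAL
0 | 1961-01-01 | 15.04 | 14.96 | 13.17 | 9.29 | NaN | 9.87 | 13.67 | 10.25 | 10.83 | 12.58 | 18.50 | 15.04
1 | 1961-01-02 | 14.71 | NaN | 10.83 | 6.50 | 12.62 | 7.67 | 11.50 | 10.04 | 9.79 | 9.67 | 17.54 | 13.83
2 | 1961-01-03 | 18.50 | 16.88 | 12.33 | 10.13 | 11.17 | 6.17 | 11.25 | NaN | 8.50 | 7.67 | 12.75 | 12.71
3 | 1961-01-04 | 10.58 | 6.63 | 11.75 | 4.58 | 4.54 | 2.88 | 8.63 | 1.79 | 5.83 | 5.88 | 5.46 | 10.88
4 | 1961-01-05 | 13.33 | 13.25 | 11.42 | 6.17 | 10.71 | 8.21 | 11.92 | 6.54 | 10.92 | 10.34 | 12.92 | 11.83

Step 5. Set the date as the index; mind the data type, which should be datetime64[ns]

Yr_Mo_Dy | RPT | VAL | ROS | KIL | SHA | BIR | DUB | CLA | MUL | CLO | BEL | MAL
1961-01-01 | 15.04 | 14.96 | 13.17 | 9.29 | NaN | 9.87 | 13.67 | 10.25 | 10.83 | 12.58 | 18.50 | 15.04
1961-01-02 | 14.71 | NaN | 10.83 | 6.50 | 12.62 | 7.67 | 11.50 | 10.04 | 9.79 | 9.67 | 17.54 | 13.83
1961-01-03 | 18.50 | 16.88 | 12.33 | 10.13 | 11.17 | 6.17 | 11.25 | NaN | 8.50 | 7.67 | 12.75 | 12.71
1961-01-04 | 10.58 | 6.63 | 11.75 | 4.58 | 4.54 | 2.88 | 8.63 | 1.79 | 5.83 | 5.88 | 5.46 | 10.88
1961-01-05 | 13.33 | 13.25 | 11.42 | 6.17 | 10.71 | 8.21 | 11.92 | 6.54 | 10.92 | 10.34 | 12.92 | 11.83

Step 6. For each location, how many values are missing?
Step 7. For each location, how many values are complete?
Step 8. Compute the mean wind speed over all the data
Step 9. Create a DataFrame named loc_stats that stores the minimum, maximum, mean and standard deviation of the wind speed for each location

 | min | max | mean | std
RPT | 0.67 | 35.80 | 12.362987 | 5.618413
VAL | 0.21 | 33.37 | 10.644314 | 5.267356
ROS | 1.50 | 33.84 | 11.660526 | 5.008450
KIL | 0.00 | 28.46 | 6.306468 | 3.605811
SHA | 0.13 | 37.54 | 10.455834 | 4.936125
BIR | 0.00 | 26.16 | 7.092254 | 3.968683
DUB | 0.00 | 30.37 | 9.797343 | 4.977555
CLA | 0.00 | 31.08 | 8.495053 | 4.499449
MUL | 0.00 | 25.88 | 8.493590 | 4.166872
CLO | 0.04 | 28.21 | 8.707332 | 4.503954
BEL | 0.13 | 42.38 | 13.121007 | 5.835037
MAL | 0.67 | 42.54 | 15.599079 | 6.699794

Step 10. Create a DataFrame named day_stats that stores the minimum, maximum, mean and standard deviation of the wind speed across all locations

Yr_Mo_Dy | min | max | mean | std
1961-01-01 | 9.29 | 18.50 | 13.018182 | 2.808875
1961-01-02 | 6.50 | 17.54 | 11.336364 | 3.188994
1961-01-03 | 6.17 | 18.50 | 11.641818 | 3.681912
1961-01-04 | 1.79 | 11.75 | 6.619167 | 3.198126
1961-01-05 | 6.17 | 13.33 | 10.630000 | 2.445356

Step 11. For each location, compute the mean wind speed in January

Note that January 1961 and January 1962 should be treated separately.

Step 12. Resample the records at a yearly frequency

Yr_Mo_Dy | RPT | VAL | ROS | KIL | SHA | BIR | DUB | CLA | MUL | CLO | BEL | MAL | date | month | year | day
1961-01-01 | 15.04 | 14.96 | 13.17 | 9.29 | NaN | 9.87 | 13.67 | 10.25 | 10.83 | 12.58 | 18.50 | 15.04 | 1961-01-01 | 1 | 1961 | 1
1962-01-01 | 9.29 | 3.42 | 11.54 | 3.50 | 2.21 | 1.96 | 10.41 | 2.79 | 3.54 | 5.17 | 4.38 | 7.92 | 1962-01-01 | 1 | 1962 | 1
1963-01-01 | 15.59 | 13.62 | 19.79 | 8.38 | 12.25 | 10.00 | 23.45 | 15.71 | 13.59 | 14.37 | 17.58 | 34.13 | 1963-01-01 | 1 | 1963 | 1
1964-01-01 | 25.80 | 22.13 | 18.21 | 13.25 | 21.29 | 14.79 | 14.12 | 19.58 | 13.25 | 16.75 | 28.96 | 21.00 | 1964-01-01 | 1 | 1964 | 1
1965-01-01 | 9.54 | 11.92 | 9.00 | 4.38 | 6.08 | 5.21 | 10.25 | 6.08 | 5.71 | 8.63 | 12.04 | 17.41 | 1965-01-01 | 1 | 1965 | 1
1966-01-01 | 22.04 | 21.50 | 17.08 | 12.75 | 22.17 | 15.59 | 21.79 | 18.12 | 16.66 | 17.83 | 28.33 | 23.79 | 1966-01-01 | 1 | 1966 | 1
1967-01-01 | 6.46 | 4.46 | 6.50 | 3.21 | 6.67 | 3.79 | 11.38 | 3.83 | 7.71 | 9.08 | 10.67 | 20.91 | 1967-01-01 | 1 | 1967 | 1
1968-01-01 | 30.04 | 17.88 | 16.25 | 16.25 | 21.79 | 12.54 | 18.16 | 16.62 | 18.75 | 17.62 | 22.25 | 27.29 | 1968-01-01 | 1 | 1968 | 1
1969-01-01 | 6.13 | 1.63 | 5.41 | 1.08 | 2.54 | 1.00 | 8.50 | 2.42 | 4.58 | 6.34 | 9.17 | 16.71 | 1969-01-01 | 1 | 1969 | 1
1970-01-01 | 9.59 | 2.96 | 11.79 | 3.42 | 6.13 | 4.08 | 9.00 | 4.46 | 7.29 | 3.50 | 7.33 | 13.00 | 1970-01-01 | 1 | 1970 | 1
1971-01-01 | 3.71 | 0.79 | 4.71 | 0.17 | 1.42 | 1.04 | 4.63 | 0.75 | 1.54 | 1.08 | 4.21 | 9.54 | 1971-01-01 | 1 | 1971 | 1
1972-01-01 | 9.29 | 3.63 | 14.54 | 4.25 | 6.75 | 4.42 | 13.00 | 5.33 | 10.04 | 8.54 | 8.71 | 19.17 | 1972-01-01 | 1 | 1972 | 1
1973-01-01 | 16.50 | 15.92 | 14.62 | 7.41 | 8.29 | 11.21 | 13.54 | 7.79 | 10.46 | 10.79 | 13.37 | 9.71 | 1973-01-01 | 1 | 1973 | 1
1974-01-01 | 23.21 | 16.54 | 16.08 | 9.75 | 15.83 | 11.46 | 9.54 | 13.54 | 13.83 | 16.66 | 17.21 | 25.29 | 1974-01-01 | 1 | 1974 | 1
1975-01-01 | 14.04 | 13.54 | 11.29 | 5.46 | 12.58 | 5.58 | 8.12 | 8.96 | 9.29 | 5.17 | 7.71 | 11.63 | 1975-01-01 | 1 | 1975 | 1
1976-01-01 | 18.34 | 17.67 | 14.83 | 8.00 | 16.62 | 10.13 | 13.17 | 9.04 | 13.13 | 5.75 | 11.38 | 14.96 | 1976-01-01 | 1 | 1976 | 1
1977-01-01 | 20.04 | 11.92 | 20.25 | 9.13 | 9.29 | 8.04 | 10.75 | 5.88 | 9.00 | 9.00 | 14.88 | 25.70 | 1977-01-01 | 1 | 1977 | 1
1978-01-01 | 8.33 | 7.12 | 7.71 | 3.54 | 8.50 | 7.50 | 14.71 | 10.00 | 11.83 | 10.00 | 15.09 | 20.46 | 1978-01-01 | 1 | 1978 | 1

Step 13. Resample the records at a monthly frequency

Yr_Mo_Dy | RPT | VAL | ROS | KIL | SHA | BIR | DUB | CLA | MUL | CLO | BEL | MAL | date | month | year | day
1961-01-01 | 15.04 | 14.96 | 13.17 | 9.29 | NaN | 9.87 | 13.67 | 10.25 | 10.83 | 12.58 | 18.50 | 15.04 | 1961-01-01 | 1 | 1961 | 1
1961-02-01 | 14.25 | 15.12 | 9.04 | 5.88 | 12.08 | 7.17 | 10.17 | 3.63 | 6.50 | 5.50 | 9.17 | 8.00 | 1961-02-01 | 2 | 1961 | 1
1961-03-01 | 12.67 | 13.13 | 11.79 | 6.42 | 9.79 | 8.54 | 10.25 | 13.29 | NaN | 12.21 | 20.62 | NaN | 1961-03-01 | 3 | 1961 | 1
1961-04-01 | 8.38 | 6.34 | 8.33 | 6.75 | 9.33 | 9.54 | 11.67 | 8.21 | 11.21 | 6.46 | 11.96 | 7.17 | 1961-04-01 | 4 | 1961 | 1
1961-05-01 | 15.87 | 13.88 | 15.37 | 9.79 | 13.46 | 10.17 | 9.96 | 14.04 | 9.75 | 9.92 | 18.63 | 11.12 | 1961-05-01 | 5 | 1961 | 1
1961-06-01 | 15.92 | 9.59 | 12.04 | 8.79 | 11.54 | 6.04 | 9.75 | 8.29 | 9.33 | 10.34 | 10.67 | 12.12 | 1961-06-01 | 6 | 1961 | 1
1961-07-01 | 7.21 | 6.83 | 7.71 | 4.42 | 8.46 | 4.79 | 6.71 | 6.00 | 5.79 | 7.96 | 6.96 | 8.71 | 1961-07-01 | 7 | 1961 | 1
1961-08-01 | 9.59 | 5.09 | 5.54 | 4.63 | 8.29 | 5.25 | 4.21 | 5.25 | 5.37 | 5.41 | 8.38 | 9.08 | 1961-08-01 | 8 | 1961 | 1
1961-09-01 | 5.58 | 1.13 | 4.96 | 3.04 | 4.25 | 2.25 | 4.63 | 2.71 | 3.67 | 6.00 | 4.79 | 5.41 | 1961-09-01 | 9 | 1961 | 1
1961-10-01 | 14.25 | 12.87 | 7.87 | 8.00 | 13.00 | 7.75 | 5.83 | 9.00 | 7.08 | 5.29 | 11.79 | 4.04 | 1961-10-01 | 10 | 1961 | 1
1961-11-01 | 13.21 | 13.13 | 14.33 | 8.54 | 12.17 | 10.21 | 13.08 | 12.17 | 10.92 | 13.54 | 20.17 | 20.04 | 1961-11-01 | 11 | 1961 | 1
1961-12-01 | 9.67 | 7.75 | 8.00 | 3.96 | 6.00 | 2.75 | 7.25 | 2.50 | 5.58 | 5.58 | 7.79 | 11.17 | 1961-12-01 | 12 | 1961 | 1
1962-01-01 | 9.29 | 3.42 | 11.54 | 3.50 | 2.21 | 1.96 | 10.41 | 2.79 | 3.54 | 5.17 | 4.38 | 7.92 | 1962-01-01 | 1 | 1962 | 1
1962-02-01 | 19.12 | 13.96 | 12.21 | 10.58 | 15.71 | 10.63 | 15.71 | 11.08 | 13.17 | 12.62 | 17.67 | 22.71 | 1962-02-01 | 2 | 1962 | 1
1962-03-01 | 8.21 | 4.83 | 9.00 | 4.83 | 6.00 | 2.21 | 7.96 | 1.87 | 4.08 | 3.92 | 4.08 | 5.41 | 1962-03-01 | 3 | 1962 | 1
1962-04-01 | 14.33 | 12.25 | 11.87 | 10.37 | 14.92 | 11.00 | 19.79 | 11.67 | 14.09 | 15.46 | 16.62 | 23.58 | 1962-04-01 | 4 | 1962 | 1
1962-05-01 | 9.62 | 9.54 | 3.58 | 3.33 | 8.75 | 3.75 | 2.25 | 2.58 | 1.67 | 2.37 | 7.29 | 3.25 | 1962-05-01 | 5 | 1962 | 1
1962-06-01 | 5.88 | 6.29 | 8.67 | 5.21 | 5.00 | 4.25 | 5.91 | 5.41 | 4.79 | 9.25 | 5.25 | 10.71 | 1962-06-01 | 6 | 1962 | 1
1962-07-01 | 8.67 | 4.17 | 6.92 | 6.71 | 8.17 | 5.66 | 11.17 | 9.38 | 8.75 | 11.12 | 10.25 | 17.08 | 1962-07-01 | 7 | 1962 | 1
1962-08-01 | 4.58 | 5.37 | 6.04 | 2.29 | 7.87 | 3.71 | 4.46 | 2.58 | 4.00 | 4.79 | 7.21 | 7.46 | 1962-08-01 | 8 | 1962 | 1
1962-09-01 | 10.00 | 12.08 | 10.96 | 9.25 | 9.29 | 7.62 | 7.41 | 8.75 | 7.67 | 9.62 | 14.58 | 11.92 | 1962-09-01 | 9 | 1962 | 1
1962-10-01 | 14.58 | 7.83 | 19.21 | 10.08 | 11.54 | 8.38 | 13.29 | 10.63 | 8.21 | 12.92 | 18.05 | 18.12 | 1962-10-01 | 10 | 1962 | 1
1962-11-01 | 16.88 | 13.25 | 16.00 | 8.96 | 13.46 | 11.46 | 10.46 | 10.17 | 10.37 | 13.21 | 14.83 | 15.16 | 1962-11-01 | 11 | 1962 | 1
1962-12-01 | 18.38 | 15.41 | 11.75 | 6.79 | 12.21 | 8.04 | 8.42 | 10.83 | 5.66 | 9.08 | 11.50 | 11.50 | 1962-12-01 | 12 | 1962 | 1
1963-01-01 | 15.59 | 13.62 | 19.79 | 8.38 | 12.25 | 10.00 | 23.45 | 15.71 | 13.59 | 14.37 | 17.58 | 34.13 | 1963-01-01 | 1 | 1963 | 1
1963-02-01 | 15.41 | 7.62 | 24.67 | 11.42 | 9.21 | 8.17 | 14.04 | 7.54 | 7.54 | 10.08 | 10.17 | 17.67 | 1963-02-01 | 2 | 1963 | 1
1963-03-01 | 16.75 | 19.67 | 17.67 | 8.87 | 19.08 | 15.37 | 16.21 | 14.29 | 11.29 | 9.21 | 19.92 | 19.79 | 1963-03-01 | 3 | 1963 | 1
1963-04-01 | 10.54 | 9.59 | 12.46 | 7.33 | 9.46 | 9.59 | 11.79 | 11.87 | 9.79 | 10.71 | 13.37 | 18.21 | 1963-04-01 | 4 | 1963 | 1
1963-05-01 | 18.79 | 14.17 | 13.59 | 11.63 | 14.17 | 11.96 | 14.46 | 12.46 | 12.87 | 13.96 | 15.29 | 21.62 | 1963-05-01 | 5 | 1963 | 1
1963-06-01 | 13.37 | 6.87 | 12.00 | 8.50 | 10.04 | 9.42 | 10.92 | 12.96 | 11.79 | 11.04 | 10.92 | 13.67 | 1963-06-01 | 6 | 1963 | 1
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
1976-07-01 | 8.50 | 1.75 | 6.58 | 2.13 | 2.75 | 2.21 | 5.37 | 2.04 | 5.88 | 4.50 | 4.96 | 10.63 | 1976-07-01 | 7 | 1976 | 1
1976-08-01 | 13.00 | 8.38 | 8.63 | 5.83 | 12.92 | 8.25 | 13.00 | 9.42 | 10.58 | 11.34 | 14.21 | 20.25 | 1976-08-01 | 8 | 1976 | 1
1976-09-01 | 11.87 | 11.00 | 7.38 | 6.87 | 7.75 | 8.33 | 10.34 | 6.46 | 10.17 | 9.29 | 12.75 | 19.55 | 1976-09-01 | 9 | 1976 | 1
1976-10-01 | 10.96 | 6.71 | 10.41 | 4.63 | 7.58 | 5.04 | 5.04 | 5.54 | 6.50 | 3.92 | 6.79 | 5.00 | 1976-10-01 | 10 | 1976 | 1
1976-11-01 | 13.96 | 15.67 | 10.29 | 6.46 | 12.79 | 9.08 | 10.00 | 9.67 | 10.21 | 11.63 | 23.09 | 21.96 | 1976-11-01 | 11 | 1976 | 1
1976-12-01 | 13.46 | 16.42 | 9.21 | 4.54 | 10.75 | 8.67 | 10.88 | 4.83 | 8.79 | 5.91 | 8.83 | 13.67 | 1976-12-01 | 12 | 1976 | 1
1977-01-01 | 20.04 | 11.92 | 20.25 | 9.13 | 9.29 | 8.04 | 10.75 | 5.88 | 9.00 | 9.00 | 14.88 | 25.70 | 1977-01-01 | 1 | 1977 | 1
1977-02-01 | 11.83 | 9.71 | 11.00 | 4.25 | 8.58 | 8.71 | 6.17 | 5.66 | 8.29 | 7.58 | 11.71 | 16.50 | 1977-02-01 | 2 | 1977 | 1
1977-03-01 | 8.63 | 14.83 | 10.29 | 3.75 | 6.63 | 8.79 | 5.00 | 8.12 | 7.87 | 6.42 | 13.54 | 13.67 | 1977-03-01 | 3 | 1977 | 1
1977-04-01 | 21.67 | 16.00 | 17.33 | 13.59 | 20.83 | 15.96 | 25.62 | 17.62 | 19.41 | 20.67 | 24.37 | 30.09 | 1977-04-01 | 4 | 1977 | 1
1977-05-01 | 6.42 | 7.12 | 8.67 | 3.58 | 4.58 | 4.00 | 6.75 | 6.13 | 3.33 | 4.50 | 19.21 | 12.38 | 1977-05-01 | 5 | 1977 | 1
1977-06-01 | 7.08 | 5.25 | 9.71 | 2.83 | 2.21 | 3.50 | 5.29 | 1.42 | 2.00 | 0.92 | 5.21 | 5.63 | 1977-06-01 | 6 | 1977 | 1
1977-07-01 | 15.41 | 16.29 | 17.08 | 6.25 | 11.83 | 11.83 | 12.29 | 10.58 | 10.41 | 7.21 | 17.37 | 7.83 | 1977-07-01 | 7 | 1977 | 1
1977-08-01 | 4.33 | 2.96 | 4.42 | 2.33 | 0.96 | 1.08 | 4.96 | 1.87 | 2.33 | 2.04 | 10.50 | 9.83 | 1977-08-01 | 8 | 1977 | 1
1977-09-01 | 17.37 | 16.33 | 16.83 | 8.58 | 14.46 | 11.83 | 15.09 | 13.92 | 13.29 | 13.88 | 23.29 | 25.17 | 1977-09-01 | 9 | 1977 | 1
1977-10-01 | 16.75 | 15.34 | 12.25 | 9.42 | 16.38 | 11.38 | 18.50 | 13.92 | 14.09 | 14.46 | 22.34 | 29.67 | 1977-10-01 | 10 | 1977 | 1
1977-11-01 | 16.71 | 11.54 | 12.17 | 4.17 | 8.54 | 7.17 | 11.12 | 6.46 | 8.25 | 6.21 | 11.04 | 15.63 | 1977-11-01 | 11 | 1977 | 1
1977-12-01 | 13.37 | 10.92 | 12.42 | 2.37 | 5.79 | 6.13 | 8.96 | 7.38 | 6.29 | 5.71 | 8.54 | 12.42 | 1977-12-01 | 12 | 1977 | 1
1978-01-01 | 8.33 | 7.12 | 7.71 | 3.54 | 8.50 | 7.50 | 14.71 | 10.00 | 11.83 | 10.00 | 15.09 | 20.46 | 1978-01-01 | 1 | 1978 | 1
1978-02-01 | 27.25 | 24.21 | 18.16 | 17.46 | 27.54 | 18.05 | 20.96 | 25.04 | 20.04 | 17.50 | 27.71 | 21.12 | 1978-02-01 | 2 | 1978 | 1
1978-03-01 | 15.04 | 6.21 | 16.04 | 7.87 | 6.42 | 6.67 | 12.29 | 8.00 | 10.58 | 9.33 | 5.41 | 17.00 | 1978-03-01 | 3 | 1978 | 1
1978-04-01 | 3.42 | 7.58 | 2.71 | 1.38 | 3.46 | 2.08 | 2.67 | 4.75 | 4.83 | 1.67 | 7.33 | 13.67 | 1978-04-01 | 4 | 1978 | 1
1978-05-01 | 10.54 | 12.21 | 9.08 | 5.29 | 11.00 | 10.08 | 11.17 | 13.75 | 11.87 | 11.79 | 12.87 | 27.16 | 1978-05-01 | 5 | 1978 | 1
1978-06-01 | 10.37 | 11.42 | 6.46 | 6.04 | 11.25 | 7.50 | 6.46 | 5.96 | 7.79 | 5.46 | 5.50 | 10.41 | 1978-06-01 | 6 | 1978 | 1
1978-07-01 | 12.46 | 10.63 | 11.17 | 6.75 | 12.92 | 9.04 | 12.42 | 9.62 | 12.08 | 8.04 | 14.04 | 16.17 | 1978-07-01 | 7 | 1978 | 1
1978-08-01 | 19.33 | 15.09 | 20.17 | 8.83 | 12.62 | 10.41 | 9.33 | 12.33 | 9.50 | 9.92 | 15.75 | 18.00 | 1978-08-01 | 8 | 1978 | 1
1978-09-01 | 8.42 | 6.13 | 9.87 | 5.25 | 3.21 | 5.71 | 7.25 | 3.50 | 7.33 | 6.50 | 7.62 | 15.96 | 1978-09-01 | 9 | 1978 | 1
1978-10-01 | 9.50 | 6.83 | 10.50 | 3.88 | 6.13 | 4.58 | 4.21 | 6.50 | 6.38 | 6.54 | 10.63 | 14.09 | 1978-10-01 | 10 | 1978 | 1
1978-11-01 | 13.59 | 16.75 | 11.25 | 7.08 | 11.04 | 8.33 | 8.17 | 11.29 | 10.75 | 11.25 | 23.13 | 25.00 | 1978-11-01 | 11 | 1978 | 1
1978-12-01 | 21.29 | 16.29 | 24.04 | 12.79 | 18.21 | 19.29 | 21.54 | 17.21 | 16.71 | 17.83 | 17.75 | 25.70 | 1978-12-01 | 12 | 1978 | 1

216 rows × 16 columns

Exercise 7 - Visualization
Exploring the Titanic disaster data

Step 1. Import the necessary libraries
Step 2. Import the data from the following address
Step 3. Name the DataFrame titanic

 | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S

Step 4. Set PassengerId as the index

PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S
2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C
3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S
4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S
5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S

Step 5. Draw a pie chart showing the proportion of male and female passengers
Step 6. Draw a scatter plot of Fare against passenger age and sex
Step 7. How many people survived?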
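Because Survived is coded as 0 or 1, summing the column counts the survivors. The sketch below shows this on the five preview passengers only, so the result is not the figure for the full dataset. It also computes the per-sex counts that a pie chart for step 5 could be drawn from. The names `survivors` and `by_sex` are made up for this example.

```python
import pandas as pd

# The five preview passengers shown above (not the full dataset).
titanic = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Survived": [0, 1, 1, 1, 0],
    "Sex": ["male", "female", "female", "female", "male"],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
}).set_index("PassengerId")

# Survived is 0 or 1, so the sum is the number of survivors.
survivors = int(titanic["Survived"].sum())
print(survivors)

# Passengers per sex; these counts are what a pie chart would display.
by_sex = titanic["Sex"].value_counts()
print(by_sex)
```

If matplotlib is installed, `by_sex.plot.pie()` should draw the pie chart, and `titanic["Fare"].plot.hist()` a fare histogram.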
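Step 8. Draw a histogram of the ticket fares

Exercise 8 - Creating DataFrames
Exploring the Pokemon data

Step 1. Import the necessary libraries
Step 2. Create a data dictionary
Step 3. Store the dictionary in a DataFrame named pokemon

 | evolution | hp | name | pokedex | type
0 | Ivysaur | 45 | Bulbasaur | yes | grass
1 | Charmeleon | 39 | Charmander | no | fire
2 | Wartortle | 44 | Squirtle | yes | water
3 | Metapod | 45 | Caterpie | no | bug

Step 4. The columns are in alphabetical order; reorder them as name, type, hp, evolution, pokedex

 | name | type | hp | evolution | pokedex
0 | Bulbasaur | grass | 45 | Ivysaur | yes
1 | Charmander | fire | 39 | Charmeleon | no
2 | Squirtle | water | 44 | Wartortle | yes
3 | Caterpie | bug | 45 | Metapod | no

Step 5. Add a column named place

 | name | type | hp | evolution | pokedex | place
0 | Bulbasaur | grass | 45 | Ivysaur | yes | park
1 | Charmander | fire | 39 | Charmeleon | no | street
2 | Squirtle | water | 44 | Wartortle | yes | lake
3 | Caterpie | bug | 45 | Metapod | no | forest

Step 6. Check the data type of each column

Exercise 9 - Time series
Exploring the Apple stock price data

Step 1. Import the necessary libraries
Step 2. Dataset address
Step 3. Read the data into a DataFrame named apple

 | Date | Open | High | Low | Close | Volume | Adj Close
0 | 2014-07-08 | 96.27 | 96.80 | 93.92 | 95.35 | 65130000 | 95.35
1 | 2014-07-07 | 94.14 | 95.99 | 94.10 | 95.97 | 56305400 | 95.97
2 | 2014-07-03 | 93.67 | 94.10 | 93.20 | 94.03 | 22891800 | 94.03
3 | 2014-07-02 | 93.87 | 94.06 | 93.09 | 93.48 | 28420900 | 93.48
4 | 2014-07-01 | 93.52 | 94.07 | 93.13 | 93.52 | 38170200 | 93.52

Step 4. Check the data type of each column
Step 5. Convert the Date column to datetime
Step 6. Set Date as the index

Date | Open | High | Low | Close | Volume | Adj Close
2014-07-08 | 96.27 | 96.80 | 93.92 | 95.35 | 65130000 | 95.35
2014-07-07 | 94.14 | 95.99 | 94.10 | 95.97 | 56305400 | 95.97
2014-07-03 | 93.67 | 94.10 | 93.20 | 94.03 | 22891800 | 94.03
2014-07-02 | 93.87 | 94.06 | 93.09 | 93.48 | 28420900 | 93.48
2014-07-01 | 93.52 | 94.07 | 93.13 | 93.52 | 38170200 | 93.52

Step 7. Are there any duplicate dates?
Step 8. Sort the index in ascending order

Date | Open | High | Low | Close | Volume | Adj Close
1980-12-12 | 28.75 | 28.87 | 28.75 | 28.75 | 117258400 | 0.45
1980-12-15 | 27.38 | 27.38 | 27.25 | 27.25 | 43971200 | 0.42
1980-12-16 | 25.37 | 25.37 | 25.25 | 25.25 | 26432000 | 0.39
1980-12-17 | 25.87 | 26.00 | 25.87 | 25.87 | 21610400 | 0.40
1980-12-18 | 26.63 | 26.75 | 26.63 | 26.63 | 18362400 | 0.41

Step 9. Find the last business day of each month

Date | Open | High | Low | Close | Volume | Adj Close
1980-12-31 | 30.481538 | 30.567692 | 30.443077 | 30.443077 | 2.586252e+07 | 0.473077
1981-01-30 | 31.754762 | 31.826667 | 31.654762 | 31.654762 | 7.249867e+06 | 0.493810
1981-02-27 | 26.480000 | 26.572105 | 26.407895 | 26.407895 | 4.231832e+06 | 0.411053
1981-03-31 | 24.937727 | 25.016818 | 24.836364 | 24.836364 | 7.962691e+06 | 0.387727
1981-04-30 | 27.286667 | 27.368095 | 27.227143 | 27.227143 | 6.392000e+06 | 0.423333

Step 10. How many days apart are the earliest and the latest dates in the dataset?
Step 11. How many months are there in the data?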
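Both questions can be answered from the index alone. The sketch below is a minimal illustration that uses only five of the dates shown in the previews above, which happen to include the earliest and the latest ones displayed (1980-12-12 and 2014-07-08). Counting calendar months with `pd.period_range` is a choice made for this example, not necessarily how the original notebook did it.

```python
import pandas as pd

# A few of the dates shown in the previews above (not the full dataset).
dates = pd.to_datetime(
    ["2014-07-08", "2014-07-07", "2014-07-01", "1980-12-16", "1980-12-12"]
)
apple = pd.DataFrame(
    {"Adj Close": [95.35, 95.97, 93.52, 0.39, 0.45]}, index=dates
).sort_index()

# Step 7: a unique index means there are no duplicate dates.
print(apple.index.is_unique)

# Step 10: the difference of two Timestamps is a Timedelta.
span_days = (apple.index.max() - apple.index.min()).days
print(span_days)

# Step 11: calendar months from the first to the last date, inclusive.
months = len(pd.period_range(apple.index.min(), apple.index.max(), freq="M"))
print(months)
```

For these two end dates the span is 12261 days and 404 calendar months. Step 12 then amounts to something like `apple["Adj Close"].plot()` on the sorted DataFrame, assuming matplotlib is available.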
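Step 12. Plot the Adj Close value over time

Exercise 10 - Deleting data
Exploring the Iris data

Step 1. Import the necessary libraries
Step 2. Dataset address
Step 3. Store the dataset in a variable named iris

 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
0 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
1 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa
2 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa
3 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa
4 | 5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa

Step 4. Create the column names for the DataFrame

 | sepal_length | sepal_width | petal_length | petal_width | class
0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa
3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa
4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa

Step 5. Are there any missing values in the DataFrame?
Step 6. Set rows 10 to 19 of the petal_length column to missing

 | sepal_length | sepal_width | petal_length | petal_width | class
0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa
3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa
4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa
5 | 5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa
6 | 4.6 | 3.4 | 1.4 | 0.3 | Iris-setosa
7 | 5.0 | 3.4 | 1.5 | 0.2 | Iris-setosa
8 | 4.4 | 2.9 | 1.4 | 0.2 | Iris-setosa
9 | 4.9 | 3.1 | 1.5 | 0.1 | Iris-setosa
10 | 5.4 | 3.7 | NaN | 0.2 | Iris-setosa
11 | 4.8 | 3.4 | NaN | 0.2 | Iris-setosa
12 | 4.8 | 3.0 | NaN | 0.1 | Iris-setosa
13 | 4.3 | 3.0 | NaN | 0.1 | Iris-setosa
14 | 5.8 | 4.0 | NaN | 0.2 | Iris-setosa
15 | 5.7 | 4.4 | NaN | 0.4 | Iris-setosa
16 | 5.4 | 3.9 | NaN | 0.4 | Iris-setosa
17 | 5.1 | 3.5 | NaN | 0.3 | Iris-setosa
18 | 5.7 | 3.8 | NaN | 0.3 | Iris-setosa
19 | 5.1 | 3.8 | NaN | 0.3 | Iris-setosa

Step 7. Replace all missing values with 1.0

 | sepal_length | sepal_width | petal_length | petal_width | class
0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa
3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa
4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa
5 | 5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa
6 | 4.6 | 3.4 | 1.4 | 0.3 | Iris-setosa
7 | 5.0 | 3.4 | 1.5 | 0.2 | Iris-setosa
8 | 4.4 | 2.9 | 1.4 | 0.2 | Iris-setosa
9 | 4.9 | 3.1 | 1.5 | 0.1 | Iris-setosa
10 | 5.4 | 3.7 | 1.0 | 0.2 | Iris-setosa
11 | 4.8 | 3.4 | 1.0 | 0.2 | Iris-setosa
12 | 4.8 | 3.0 | 1.0 | 0.1 | Iris-setosa
13 | 4.3 | 3.0 | 1.0 | 0.1 | Iris-setosa
14 | 5.8 | 4.0 | 1.0 | 0.2 | Iris-setosa
15 | 5.7 | 4.4 | 1.0 | 0.4 | Iris-setosa
16 | 5.4 | 3.9 | 1.0 | 0.4 | Iris-setosa
17 | 5.1 | 3.5 | 1.0 | 0.3 | Iris-setosa
18 | 5.7 | 3.8 | 1.0 | 0.3 | Iris-setosa
19 | 5.1 | 3.8 | 1.0 | 0.3 | Iris-setosa
20 | 5.4 | 3.4 | 1.7 | 0.2 | Iris-setosa
21 | 5.1 | 3.7 | 1.5 | 0.4 | Iris-setosa
22 | 4.6 | 3.6 | 1.0 | 0.2 | Iris-setosa
23 | 5.1 | 3.3 | 1.7 | 0.5 | Iris-setosa
24 | 4.8 | 3.4 | 1.9 | 0.2 | Iris-setosa
25 | 5.0 | 3.0 | 1.6 | 0.2 | Iris-setosa
26 | 5.0 | 3.4 | 1.6 | 0.4 | Iris-setosa
27 | 5.2 | 3.5 | 1.5 | 0.2 | Iris-setosa
28 | 5.2 | 3.4 | 1.4 | 0.2 | Iris-setosa
29 | 4.7 | 3.2 | 1.6 | 0.2 | Iris-setosa
... | ... | ... | ... | ... | ...
120 | 6.9 | 3.2 | 5.7 | 2.3 | Iris-virginica
121 | 5.6 | 2.8 | 4.9 | 2.0 | Iris-virginica
122 | 7.7 | 2.8 | 6.7 | 2.0 | Iris-virginica
123 | 6.3 | 2.7 | 4.9 | 1.8 | Iris-virginica
124 | 6.7 | 3.3 | 5.7 | 2.1 | Iris-virginica
125 | 7.2 | 3.2 | 6.0 | 1.8 | Iris-virginica
126 | 6.2 | 2.8 | 4.8 | 1.8 | Iris-virginica
127 | 6.1 | 3.0 | 4.9 | 1.8 | Iris-virginica
128 | 6.4 | 2.8 | 5.6 | 2.1 | Iris-virginica
129 | 7.2 | 3.0 | 5.8 | 1.6 | Iris-virginica
130 | 7.4 | 2.8 | 6.1 | 1.9 | Iris-virginica
131 | 7.9 | 3.8 | 6.4 | 2.0 | Iris-virginica
132 | 6.4 | 2.8 | 5.6 | 2.2 | Iris-virginica
133 | 6.3 | 2.8 | 5.1 | 1.5 | Iris-virginica
134 | 6.1 | 2.6 | 5.6 | 1.4 | Iris-virginica
135 | 7.7 | 3.0 | 6.1 | 2.3 | Iris-virginica
136 | 6.3 | 3.4 | 5.6 | 2.4 | Iris-virginica
137 | 6.4 | 3.1 | 5.5 | 1.8 | Iris-virginica
138 | 6.0 | 3.0 | 4.8 | 1.8 | Iris-virginica
139 | 6.9 | 3.1 | 5.4 | 2.1 | Iris-virginica
140 | 6.7 | 3.1 | 5.6 | 2.4 | Iris-virginica
141 | 6.9 | 3.1 | 5.1 | 2.3 | Iris-virginica
142 | 5.8 | 2.7 | 5.1 | 1.9 | Iris-virginica
143 | 6.8 | 3.2 | 5.9 | 2.3 | Iris-virginica
144 | 6.7 | 3.3 | 5.7 | 2.5 | Iris-virginica
145 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica
146 | 6.3 | 2.5 | 5.0 | 1.9 | Iris-virginica
147 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica
148 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica
149 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica

150 rows × 5 columns

Step 8. Delete the class column

 | sepal_length | sepal_width | petal_length | petal_width
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2
2 | 4.7 | 3.2 | 1.3 | 0.2
3 | 4.6 | 3.1 | 1.5 | 0.2
4 | 5.0 | 3.6 | 1.4 | 0.2

Step 9. Set the first three rows of the DataFrame to missing

 | sepal_length | sepal_width | petal_length | petal_width
0 | NaN | NaN | NaN | NaN
1 | NaN | NaN | NaN | NaN
2 | NaN | NaN | NaN | NaN
3 | 4.6 | 3.1 | 1.5 | 0.2
4 | 5.0 | 3.6 | 1.4 | 0.2

Step 10. Delete the rows that have missing values

 | sepal_length | sepal_width | petal_length | petal_width
3 | 4.6 | 3.1 | 1.5 | 0.2
4 | 5.0 | 3.6 | 1.4 | 0.2
5 | 5.4 | 3.9 | 1.7 | 0.4
6 | 4.6 | 3.4 | 1.4 | 0.3
7 | 5.0 | 3.4 | 1.5 | 0.2

Step 11. Reset the index

 | sepal_length | sepal_width | petal_length | petal_width
0 | 4.6 | 3.1 | 1.5 | 0.2
1 | 5.0 | 3.6 | 1.4 | 0.2
2 | 5.4 | 3.9 | 1.7 | 0.4
3 | 4.6 | 3.4 | 1.4 | 0.3
4 | 5.0 | 3.4 | 1.5 | 0.2

Congratulations, you have completed all 10 exercise sets. For more quality learning material, browse the projects and datasets contributed by 科赛网 users.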
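As a short recap of the last steps of Exercise 10 (steps 9 to 11), the sketch below blanks out rows, drops them, and renumbers the index. It is illustrative only: it uses the first five preview rows without the class column instead of the full iris.csv, and `missing_per_column` is a name invented here.

```python
import numpy as np
import pandas as pd

# First five preview rows, without the class column (not the full dataset).
iris = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal_width": [3.5, 3.0, 3.2, 3.1, 3.6],
    "petal_length": [1.4, 1.4, 1.3, 1.5, 1.4],
    "petal_width": [0.2, 0.2, 0.2, 0.2, 0.2],
})

# Step 9: set the first three rows to missing.
iris.iloc[0:3, :] = np.nan
missing_per_column = iris.isnull().sum()
print(missing_per_column)

# Step 10: drop incomplete rows. Step 11: renumber the index from 0.
iris = iris.dropna(how="any").reset_index(drop=True)
print(iris)
```

Passing `drop=True` to `reset_index` discards the old labels instead of keeping them as a new column, which matches the renumbered table shown in step 11.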