R Language Exercises
A group of students took exams in mathematics, science, and English. To give every student a single performance measure, the scores in these subjects need to be combined. In addition, the top 20% of students are to be graded A, the next 20% B, and so on. Finally, the students are to be sorted alphabetically.

Data table in Excel

StuId  StuName          Math  Science  English
1      John Davis       502   95       25
2      Angela Williams  465   67       12
3      Bull Jones       621   78       22
4      Cheryl Cushing   575   66       18
5      Reuven Ytzrhak   454   96       15
6      Joel Knox        634   89       30
7      Mary Rayburn     576   78       37
8      Greg England     421   56       12
9      Brad Tmac        599   68       22
10     Tracy Mcgrady    666   100      38

1: Enter the data
Import the xlsx file into R
> install.packages("xlsx")
> library(xlsx)
> workbook <- "D:/R /StuScore.xlsx"
> StuScore <- read.xlsx(workbook, 1)   # read the first worksheet
> StuScore

2: Preprocess the data
Standardize the variables
> options(digits=2)                    # print about 2 significant digits
> afterscale <- scale(StuScore[, 3:5]) # standardize columns 3 to 5 (Math, Science, English)
> afterscale
       Math Science English
 [1,] -0.58   1.040    0.20
 [2,] -1.02  -0.815   -1.17
 [3,]  0.82  -0.086   -0.12
 [4,]  0.28  -0.881   -0.54
 [5,] -1.15   1.106   -0.86
 [6,]  0.98   0.643    0.73
 [7,]  0.29  -0.086    1.47
 [8,] -1.54  -1.544   -1.17
 [9,]  0.56  -0.749   -0.12
[10,]  1.35   1.372    1.57
attr(,"scaled:center")
   Math Science English
    551      79      23
attr(,"scaled:scale")
   Math Science English
   84.7    15.1     9.5

3: Use mean() to compute the mean of each row as the composite score, and use cbind() to add it to the roster
> # compute the mean of the standardized scores in afterscale and add it to StuScore
> score <- apply(afterscale, 1, mean)  # 1 means by row; mean is the function applied
> StuScore <- cbind(StuScore, score)
> StuScore
   StuId         StuName Math Science English score
1      1      John Davis  502      95      25  0.22
2      2 Angela Williams  465      67      12 -1.00
3      3      Bull Jones  621      78      22  0.21
4      4  Cheryl Cushing  575      66      18 -0.38
5      5  Reuven Ytzrhak  454      96      15 -0.30
6      6       Joel Knox  634      89      30  0.78
7      7    Mary Rayburn  576      78      37  0.56
8      8    Greg England  421      56      12 -1.42
9      9       Brad Tmac  599      68      22 -0.10
10    10   Tracy Mcgrady  666     100      38  1.43

4: The function quantile() gives the percentiles of the students' composite scores
quantile(x, probs) computes quantiles, where x is the numeric vector whose quantiles are wanted and probs is a numeric vector of probabilities between 0 and 1
> afterquantile <- quantile(score, c(.8, .6, .4, .2))
> afterquantile
  80%   60%   40%   20%
 0.60  0.21 -0.18 -0.50

5: Use logical operators to convert score into a grade (a discrete variable)
> # use the vectorized & here; && compares only a single value and cannot index a vector
> StuScore$grade[score >= afterquantile[1]] <- "A"
> StuScore$grade[score < afterquantile[1] & score >= afterquantile[2]] <- "B"
> StuScore$grade[score < afterquantile[2] & score >= afterquantile[3]] <- "C"
> StuScore$grade[score < afterquantile[3] & score >= afterquantile[4]] <- "D"
> StuScore$grade[score < afterquantile[4]] <- "E"
> StuScore
   StuId         StuName Math Science English score grade
1      1      John Davis  502      95      25  0.22     B
2      2 Angela Williams  465      67      12 -1.00     E
3      3      Bull Jones  621      78      22  0.21     C
4      4  Cheryl Cushing  575      66      18 -0.38     D
5      5  Reuven Ytzrhak  454      96      15 -0.30     D
6      6       Joel Knox  634      89      30  0.78     A
7      7    Mary Rayburn  576      78      37  0.56     B
8      8    Greg England  421      56      12 -1.42     E
9      9       Brad Tmac  599      68      22 -0.10     C
10    10   Tracy Mcgrady  666     100      38  1.43     A

6: Use strsplit() to split each student's name at the space into first name and last name
> StuScore$StuName <- as.character(StuScore$StuName)
> is.character(StuScore$StuName)
[1] TRUE
> name <- strsplit(StuScore$StuName, " ")
> name
[[1]]
[1] "John"  "Davis"

[[2]]
[1] "Angela"   "Williams"

[[3]]
[1] "Bull"  "Jones"

[[4]]
[1] "Cheryl"  "Cushing"

[[5]]
[1] "Reuven"  "Ytzrhak"

[[6]]
[1] "Joel" "Knox"

[[7]]
[1] "Mary"    "Rayburn"

[[8]]
[1] "Greg"    "England"

[[9]]
[1] "Brad" "Tmac"

[[10]]
[1] "Tracy"   "Mcgrady"

7: Split name into FirstName and LastName and add them to StuScore
> FirstName <- sapply(name, "[", 1)   # first element of each split name
> LastName <- sapply(name, "[", 2)    # second element of each split name
> StuScore <- cbind(FirstName, LastName, StuScore[, -1])  # drop the StuId column
> StuScore
   FirstName LastName         StuName Math Science English score grade
1       John    Davis      John Davis  502      95      25  0.22     B
2     Angela Williams Angela Williams  465      67      12 -1.00     E
3       Bull    Jones      Bull Jones  621      78      22  0.21     C
4     Cheryl  Cushing  Cheryl Cushing  575      66      18 -0.38     D
5     Reuven  Ytzrhak  Reuven Ytzrhak  454      96      15 -0.30     D
6       Joel     Knox       Joel Knox  634      89      30  0.78     A
7       Mary  Rayburn    Mary Rayburn  576      78      37  0.56     B
8       Greg  England    Greg England  421      56      12 -1.42     E
9       Brad     Tmac       Brad Tmac  599      68      22 -0.10     C
10     Tracy  Mcgrady   Tracy Mcgrady  666     100      38  1.43     A

8: Sort with order()
> StuScore[order(LastName, FirstName), ]
   FirstName LastName         StuName Math Science English score grade
4     Cheryl  Cushing  Cheryl Cushing  575      66      18 -0.38     D
1       John    Davis      John Davis  502      95      25  0.22     B
8       Greg  England    Greg England  421      56      12 -1.42     E
3       Bull    Jones      Bull Jones  621      78      22  0.21     C
6       Joel     Knox       Joel Knox  634      89      30  0.78     A
10     Tracy  Mcgrady   Tracy Mcgrady  666     100      38  1.43     A
7       Mary  Rayburn    Mary Rayburn  576      78      37  0.56     B
9       Brad     Tmac       Brad Tmac  599      68      22 -0.10     C
2     Angela Williams Angela Williams  465      67      12 -1.00     E
5     Reuven  Ytzrhak  Reuven Ytzrhak  454      96      15 -0.30     D

9: Draw a bar chart of StuScore, colored by score band
install.packages("vcd")
library(vcd)
fill_colors <- c()   # a different color for each score band
for (i in 1:length(StuScore$Science)) {
  if (StuScore$Science[i] == 100)
    fill_colors <- c(fill_colors, "red")
  else if (StuScore$Science[i] < 100 && StuScore$Science[i] >= 80)
    fill_colors <- c(fill_colors, "yellow")
  else if (StuScore$Science[i] < 80 && StuScore$Science[i] >= 60)
    fill_colors <- c(fill_colors, "blue")
  else
    fill_colors <- c(fill_colors, "green")
}
barplot(StuScore$Science,                 # bar chart
        main="Science Score",
        xlab="Name", ylab="ScienceScore",
        col=fill_colors,
        names.arg=paste(substr(FirstName, 1, 1), ".", LastName),  # labels on the horizontal axis
        border=NA,                        # no border around the bars
        font.main=4,
        font.lab=3,
        beside=TRUE)
legend(x=8.8, y=100,                      # coordinates of the top-left corner
       cex=.8,                            # scaling factor
       inset=5,
       c("Excellent", "Good", "Ordinary", "Bad"),
       pch=c(15, 16, 17, 19),             # symbols shown in the legend
       col=c("red", "yellow", "blue", "green"),
       bg="#821122",                      # background color
       xpd=TRUE,                          # allow drawing outside the plot region
       text.font=8,
       text.width=.8,
       text.col=c("red", "yellow", "blue", "green"))

10: The heights and weights of 6 patients are given. Test whether weight divided by the square of height equals 22.5.

No.          1     2     3     4     5     6
Height (m)   1.75  1.80  1.65  1.90  1.74  1.91
Weight (kg)  60    72    57    90    95    72

height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
weight <- c(60, 72, 57, 90, 95, 72)
sq.height <- height^2
ratio <- weight / sq.height
t.test(ratio, mu=22.5)   # one-sample t test

11: Three strains of the typhoid pathogen, a, b, and c, were inoculated into 10, 9, and 11 mice respectively, and the number of days each mouse survived was recorded. Do the mean survival times differ significantly among the three strains?

Strain a: 2, 4, 3, 2, 4, 7, 7, 2, 5, 4
Strain b: 5, 6, 8, 5, 10, 7, 12, 6, 6
Strain c: 7, 11, 6, 6, 7, 9, 5, 10, 6, 3, 10

Prepare a data table with day and type as one column each.
# read the data: store the contents of anova.data.txt in bac; header=T treats the first row as column names
bac <- read.table("D:/anova.data.txt", header=T)
# convert type in the data frame bac to a factor
bac$type <- as.factor(bac$type)
ba.an <- aov(day ~ type, data=bac)   # one-way analysis of variance
summary(ba.an)
boxplot(day ~ type, data=bac, col="red")

12: Calculate the first 50 powers of 2, i.e. 2, 2*2, 2*2*2, etc. Calculate the squares of the integer numbers from 1 to 50. Which pairs are equal, i.e. which integer numbers n fulfill the condition 2^n = n^2? How many pairs are there? (Use R to solve all these questions!)
> n <- 1:50
> a <- 2^n
> b <- n^2
> x <- a - b
> n[x == 0]
[1] 2 4
> sum(x == 0)
[1] 2
> n[!((x > 0) | (x < 0))]
[1] 2 4
> sum(!((x > 0) | (x < 0)))
[1] 2

13: Calculate the sine, cosine, and the tangent for numbers ranging from 0 to 2π (with distance 0.1 between them). Remember that tan(x)=sin(x)/cos(x). Now calculate the difference between tan(x) and sin(x)/cos(x) for the values above. Which values are exactly equal? What is the maximum difference? What is the cause of the differences?
> A <- seq(0, 2*pi, 0.1)
> for (x in A)
+   if (sin(x)/cos(x) == tan(x))
+     print(x)
[1] 0
[1] 0.4
[1] 0.5
[1] 0.8
[1] 1.4
[1] 1.6
[1] 1.7
[1] 1.8
[1] 1.9
[1] 2
[1] 2.1
[1] 2.3
[1] 2.4
[1] 2.5
[1] 2.7
[1] 2.8
[1] 2.9
[1] 3
[1] 3.1
[1] 3.2
[1] 3.3
[1] 3.4
[1] 3.6
[1] 3.7
[1] 3.8
[1] 4
[1] 4.1
[1] 4.2
[1] 4.5
[1] 4.6
[1] 4.8
[1] 4.9
[1] 5
[1] 5.1
[1] 5.2
[1] 5.3
[1] 5.4
[1] 5.5
[1] 5.7
[1] 5.8
[1] 6
[1] 6.1
[1] 6.2
> x <- seq(from=0, to=2*pi, by=0.1)
> s <- sin(x)
> c <- cos(x)
> t <- tan(x)
> d <- s/c - t
> x[abs(d) == max(abs(d))]   # where the difference is largest
[1] 4.7
> x[d == 0]                  # where the two results are exactly equal
 [1] 0.0 0.4 0.5 0.8 1.4 1.6 1.7 1.8 1.9 2.0 2.1 2.3 2.4 2.5 2.7 2.8 2.9 3.0
[19] 3.1 3.2 3.3 3.4 3.6 3.7 3.8 4.0 4.1 4.2 4.5 4.6 4.8 4.9 5.0 5.1 5.2 5.3
[37] 5.4 5.5 5.7 5.8 6.0 6.1 6.2
The differences are floating-point rounding errors: sin(x)/cos(x) and tan(x) are each computed to finite precision. The largest difference occurs at 4.7, close to 3π/2, where cos(x) is near zero and tan(x) is large.

14: Use the R help routines (not the manuals) to find out how to use the functions floor, trunc, round, ceiling, and what they do. Predict what each of these functions will give as an answer for the numbers -3.7 and +3.8.
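Use R to test your predictions.

ceiling  round up to the nearest integer
floor    round down to the nearest integer
trunc    truncate toward zero
round    round to the given number of decimal places
signif   round to the given number of significant digits

> ceiling(-3.7)
[1] -3
> ceiling(-3.3)
[1] -3
> ceiling(3.1)
[1] 4
> floor(-3.7)
[1] -4
> floor(3.8)
[1] 3
> trunc(-3.7)
[1] -3
> trunc(-3.3)
[1] -3
> trunc(3.8)
[1] 3
> round(-3.7)
[1] -4
> round(3.8)
[1] 4
> round(-3.74, digits=1)
[1] -3.7
> round(-3.79, digits=1)
[1] -3.8
> round(3.89, digits=1)
[1] 3.9
> round(3.84, digits=1)
[1] 3.8
> signif(-3.7)
[1] -3.7
> signif(3.8)
[1] 3.8
> signif(3.8, digits=2)
[1] 3.8
> signif(-3.7, digits=1)
[1] -4
> signif(-3.3, digits=1)
[1] -3
> signif(3.1, digits=1)
[1] 3
> signif(3.8, digits=1)
[1] 4

15: Write a function
Define the function:
rcal <- function(x, y) {
  z <- x^2 + y^2
  result <- sqrt(z)
  result            # the last expression is the return value
}
Call the function:
rcal(3, 4)

16: Add elements to an existing plot
x <- rnorm(100)                          # generate random numbers
hist(x, freq=F)                          # draw a histogram on the density scale
curve(dnorm(x), add=T)                   # add the normal density curve
h <- hist(x, plot=F)                     # compute the histogram without drawing it
ylim <- range(0, h$density, dnorm(0))    # set the range of the vertical axis
hist(x, freq=F, ylim=ylim)               # draw the histogram
curve(dnorm(x), add=T, col="red")        # add the curve

17: Generate 50 random numbers between 0 and 2, twice, and name them x and y
x <- runif(50, 0, 2)
y <- runif(50, 0, 2)
Plot: set the main title to "Scatter plot", the horizontal axis label to "x coordinate", and the vertical axis label to "y coordinate"
plot(x, y, main="Scatter plot", xlab="x coordinate", ylab="y coordinate")
text(0.6, 0.6, "text at (0.6,0.6)")
abline(h=.6, v=.6)

18: Build a plot step by step:
plot(x, y, type="n", xlab="", ylab="", axes=F)   # open the plot window without drawing anything
points(x, y)                                     # add the points
axis(at=seq(0.2, 1.8, 0.2), side=3)              # add an axis along the top (side=3)
box()                                            # draw the frame around the scatter plot
title(main="main title", sub="subtitle", xlab="x-label", ylab="y-label")   # add the title, subtitle, and axis labels

19: Several plots on one page (par())
par(mfrow=c(2, 2))

20: A batch of paint is studied to determine the effect of stirring speed on impurity content. The data are given below; carry out a regression analysis.

Table: Effect of stirring speed on the impurity in paint
Speed (rpm)    20   22   24    26    28    30    32    34    36    38    40    42
Impurity (%)   8.4  9.5  11.8  10.4  13.3  14.8  13.2  14.7  16.4  16.5  18.9  18.5

# paste the following code into an editor and save it as regression.r
rate <- c(20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42)
impurity <- c(8.4, 9.5, 11.8, 10.4, 13.3, 14.8, 13.2, 14.7, 16.4, 16.5, 18.9, 18.5)
plot(impurity ~ rate)
reg <- lm(impurity ~ rate)
abline(reg, col="red")
summary(reg)

Three ways to run it
1 Run it with the source() function
source("D:/regression.r")
2 Run it from the R script editor
Path: RGui > File > Open Script   # Ctrl+R runs the code
3 Paste it directly into the R console
Ctrl+C, Ctrl+V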
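
As a follow-up to exercise 20, the sketch below shows one way the fitted line might be used once summary(reg) has been inspected: it pulls out the intercept and slope with coef() and predicts the impurity at one new stirring speed with predict(). The speed of 33 rpm and the names coefs, new_rate, and pred are illustrative assumptions and do not appear in the original exercise.

```r
# Data from exercise 20
rate <- c(20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42)
impurity <- c(8.4, 9.5, 11.8, 10.4, 13.3, 14.8, 13.2, 14.7, 16.4, 16.5, 18.9, 18.5)

# Fit the simple linear regression impurity = a + b * rate
reg <- lm(impurity ~ rate)

# Named vector holding the intercept ("(Intercept)") and the slope ("rate")
coefs <- coef(reg)

# Hypothetical new stirring speed, chosen only for illustration
new_rate <- data.frame(rate = 33)

# Predicted impurity (%) at that speed
pred <- predict(reg, newdata = new_rate)

print(coefs)
print(pred)
```

The column name in new_rate has to match the predictor name used in the formula (rate); otherwise predict() cannot find the variable.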