Fundamentals of Probability Theory and Statistics: chapt2english.ppt
CH2 Methods for Describing Sets of Data

Learning Objectives
1. Describe data graphically
2. Describe data numerically

Example APHASIA
The researchers want to determine whether one type of aphasia occurs more often than any other and, if so, how often.

Describing Qualitative Data
Qualitative data are nonnumerical in nature, so the value of a qualitative variable can only be classified into categories called classes. We can summarize such data numerically in two ways: (1) by counting the class frequency, the number of observations in the data set that fall into each class, or (2) by calculating the class relative frequency, the proportion of the total number of observations that fall into each class.

DEF 2.1 A class is one of the categories into which qualitative data can be classified.
DEF 2.2 The class frequency is the number of observations in the data set falling into a particular class.
DEF 2.3 The class relative frequency is the class frequency divided by the total number of observations in the data set (denoted n, the size of the data set), i.e., class relative frequency = class frequency / n.

Bar Graph and Pie Chart
The most widely used graphical methods for summarizing qualitative data are bar graphs and pie charts.
A bar graph shows the amount of data that belongs to each class as proportionally sized rectangular areas.
A pie chart shows the amount of data that belongs to each class as a proportional part of a circle.

Bar Graph
Bar length shows frequency or percent. All bars have equal widths. Axes: Type, Frequency.

Pie Chart
1. Shows the breakdown of a total quantity into categories
2. Useful for showing relative differences
3. Angle size = 360° × percent (for example, a class holding 25% of the data gets a 90° slice)

Graphical Methods for Describing Quantitative Data
Quantitative data sets consist of data that are recorded on a meaningful numerical scale.
To describe and summarize such data sets, we introduce three graphical methods: dot plots, stem-and-leaf displays, and histograms.

Example EPAGAS
EPA Mileage Ratings on 100 Cars

Dot Plots
A dot plot condenses the data by grouping all equal values together in the plot. The horizontal axis is a scale for the quantitative variable, and the numerical value of each measurement in the data set is marked on that scale by a dot. When data values repeat, the dots are stacked above one another. See the figure in the example below.

[Figure: Dot Plot]

Stem-and-Leaf Display
A stem-and-leaf display combines a graphic technique with a sorting technique. It is very popular for summarizing numerical data.
1. Divide each observation into a stem value and a leaf value: the leading digit(s) become the stem, and the trailing digit(s) become the leaf. For example, 26 has stem 2 and leaf 6.
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Example: Construct a stem-and-leaf display of the following set of 20 test scores.
82 74 88 66 58 74 78 84 96 76 62 68 72 92 86 76 52 76 82 78

Figure 1: 20 Exam Scores
5 | 2 8
6 | 2 6 8
7 | 2 4 4 6 6 6 8 8
8 | 2 2 4 6 8
9 | 2 6

Stem-and-leaf of MPG  N=100  Leaf Unit=0.10
30 | 0
31 | 8
32 | 5 7 9 9
33 | 1 2 6 8 9 9
34 | 0 2 4 5 8 8
35 | 0 1 2 3 5 6 6 7 8 9 9
36 | 0 1 2 3 3 4 4 5 5 6 6 7 7 8 9 9
37 | 0 0 0 0 1 1 1 2 2 3 3 4 4 5 6 6 7 7 8 9 9
38 | 0 1 2 2 3 4 5 6 7 8
39 | 0 0 3 4 5 7 8 9
40 | 0 1 2 3 5 5 7
41 | 0 0 2
42 | 1
43 |
44 | 9

Histograms
Divide the data set into class intervals of equal size. Count the class frequency or calculate the class relative frequency. Construct the histogram.
1. Determine the range
2. Compute the class intervals (width)
3. Select the number of classes, usually between 5 ...

... that is, x = μ + σ. If z = -2, the corresponding value of x is two standard deviations less than the mean; that is, x = μ - 2σ.

Z-Score
Interpretation of z-scores for mound-shaped distributions of data:
1. Approximately 68% of the measurements will have a z-score between -1 and 1.
2. Approximately 95% of the measurements will have a z-score between -2 and 2.
3. Approximately 99.7% (almost all) of the measurements will have a z-score between -3 and 3.

Distorting the Truth with Descriptive Techniques
In fact, both the pictures in statistics (histograms, bar graphs) and the numerical descriptive measures are susceptible to distortion, so we have to examine each of them carefully.
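
Worked sketch: z-scores
The z-score interpretation for mound-shaped data can be checked numerically. The sketch below is illustrative only and is not taken from the slides. It reuses the 20 test scores from the stem-and-leaf example and assumes the sample form of the z-score, (x - mean) / s, with s the sample standard deviation. The helper names z_scores and fraction_within are hypothetical.

```python
import statistics


def z_scores(data):
    """Return the sample z-score (x - mean) / s for each measurement."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # assumption: sample standard deviation (n - 1)
    return [(x - mean) / s for x in data]


def fraction_within(zs, k):
    """Fraction of z-scores strictly between -k and k."""
    return sum(1 for z in zs if -k < z < k) / len(zs)


# The 20 test scores from the stem-and-leaf example.
scores = [82, 74, 88, 66, 58, 74, 78, 84, 96, 76,
          62, 68, 72, 92, 86, 76, 52, 76, 82, 78]

zs = z_scores(scores)
for k in (1, 2, 3):
    print(f"z between -{k} and {k}: {fraction_within(zs, k):.0%} of the scores")
```

For these scores the three fractions come out to 70%, 95%, and 100%, reasonably close to the approximate 68%, 95%, and 99.7% figures even for a small data set. Replacing statistics.stdev with statistics.pstdev gives the population version of the same calculation.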