INFO 370 - L07 - October 20, 2004
Notes by: Aaron, Yaptinchay, Sean, Prins

Tomorrow's lab to start @ 5PM
Google Talk in EE105

Mean is greater than the median on a positive skew
Median is greater than the mean on a negative skew

Central Limit Theorem:
> Samples were drawn from various distributions and their means were computed
> 70-80 observations give you a pretty good estimate of the mean

// Begin Slides

Generalization
> Sample generalization: from a sample to the whole population
  ex. using info students to generalize to UW students
> Cross-population generalization: from one population to another
  ex. using UW students to generalize to other university students

Evaluating Sampling Quality
> How clearly was the population defined?
> What methods were used to select the sample?
> Do the cases selected represent the population from which they were selected?
> Is cross-population generalization valid?

Factors Acting Against Quality
> The sampling frame is incomplete
  - ex.: "Dewey defeats Truman" newspaper flop
  - people may have been left out
  - cell phone users nowadays
> Low participation rate
  - self-selection bias
  - make sure your questions aren't too probing or personal; otherwise the subject gets turned off and stops answering
  - once a subject gets past 50% of the survey, momentum carries them to the finish and they are more willing to answer tougher questions
> Unwarranted cross-population generalizations
  - include all socioeconomic biases

Sample Size
Factors determining the selection of size:
> The degree of confidence required
> The homogeneity of the population (they all respond in a similar fashion)
> The complexity of data analysis

Conclusion
> Selecting units in such a way that what we learn about the sample holds true for the population
> Sampling method determines the degree of generalizability

Random samples *CHECK*
Non-random samples *X*

# END #
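
Simulation sketch for the skew and Central Limit Theorem notes at the top of these notes. This is not from the lecture; it is a minimal illustration under stated assumptions: the population is a made-up exponential distribution (positively skewed), and the sample sizes (5 and 75), the number of trials (2000), and the seed are arbitrary choices. It checks two things: the mean sits above the median on a positive skew, and means of 75-observation samples cluster much more tightly around the population mean than means of 5-observation samples.

```python
import random
import statistics

random.seed(370)  # arbitrary fixed seed so the run is repeatable

# Hypothetical positively skewed population (exponential, true mean 1.0).
population = [random.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_median = statistics.median(population)


def sample_means(n, trials=2000):
    """Draw `trials` simple random samples of size n; return each sample's mean."""
    return [statistics.fmean(random.sample(population, n)) for _ in range(trials)]


means_5 = sample_means(5)
means_75 = sample_means(75)

# Spread of the sample means: smaller spread = better estimate of the mean.
spread_5 = statistics.stdev(means_5)
spread_75 = statistics.stdev(means_75)

print(f"population mean   = {pop_mean:.3f}")
print(f"population median = {pop_median:.3f}")
print(f"spread of sample means, n=5  : {spread_5:.3f}")
print(f"spread of sample means, n=75 : {spread_75:.3f}")
```

The samples here are drawn with random.sample, i.e. simple random sampling from a complete sampling frame. A non-random or incomplete frame would not give the same guarantee, which is the point of the slides above.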