Categorical Data Analysis

Categorical data arise whenever counts (as opposed to measurements) are made. Subjects (sample items) are classified as belonging to one of a set of categories, and the numbers in the categories (the frequencies) are recorded.
Example: Eye colours
Eye colours of males visiting an optician, in four categories.

Colour   Frequency observed
A        89
B        66
C        60
D        85

Example: Tonsils
Relationship between nasal carrier status for Streptococcus pyogenes and size of tonsils among 1398 children aged 0-15 years.

              Normal  Enlarged  Much enlarged  Total
Carriers          19        29             24     72
Noncarriers      497       560            269   1326
Total            516       589            293   1398

Example: Prussian cavalry deaths
Numbers of cavalry soldiers killed by horsekicks in each of 14 units of the Prussian army over a 20-year period (1875-1894).

Number killed         0    1    2    3    4    5   Total
Frequency observed  144   91   32   11    2    0     280
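As an illustration of how a two-way table such as the tonsils example might be handled in R, the sketch below enters the counts as a matrix, works out the expected counts under independence, and sums (Observed - Expected)^2 / Expected over the cells. It then compares the result with R's built-in chisq.test. The object names (tonsils, expected, x2) are illustrative choices and not taken from the original slides.

```r
# Tonsils example: rows are carrier status, columns are tonsil size.
tonsils <- matrix(c(19, 29, 24,
                    497, 560, 269),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Status = c("Carriers", "Noncarriers"),
                                  Tonsils = c("Normal", "Enlarged", "Much enlarged")))

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(tonsils), colSums(tonsils)) / sum(tonsils)

# Chi-squared statistic: sum over all cells of (Observed - Expected)^2 / Expected.
x2 <- sum((tonsils - expected)^2 / expected)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df <- (nrow(tonsils) - 1) * (ncol(tonsils) - 1)
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

# The built-in test should agree with the hand calculation.
result <- chisq.test(tonsils)
print(c(by_hand = x2, built_in = unname(result$statistic)))
print(p_value)
```

Doing the calculation by hand alongside chisq.test makes it easier to see where the statistic comes from; in practice the built-in call alone would normally be enough for a table like this.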
Often we wish to decide whether a categorical variable follows some well-known distribution. A chi-squared test provides a method of testing the hypothesis that a data set follows a particular distribution. It works by summing the quantity (Observed - Expected)^2 / Expected over all the categories.

The chi-squared test in R is fairly limited: it copes well with testing whether there is a significant relationship between nasal carrier status for Streptococcus pyogenes and size of tonsils (as in the second example), but gives us a problem with the other two.

Consider now data from the Standard and Poor's 500, an index of 500 of the largest, most actively traded stocks on the New York Stock Exchange. These data are available in R as sp500.R from the module website.

Technique: To look at any one of the variables in a data frame such as sp500, the $ sign is helpful.
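A minimal sketch of the $ technique follows. The module's sp500 data are not reproduced here, so the data frame below is a hypothetical stand-in with made-up values; only the column name adjclose comes from the text.

```r
# Hypothetical stand-in for the module's sp500 data frame; the values are made up.
sp500 <- data.frame(day = 1:5,
                    adjclose = c(1100.5, 1102.3, 1098.7, 1105.1, 1107.9))

# The $ operator extracts a single column from the data frame as a vector.
prices <- sp500$adjclose

# Two equivalent ways to reach the same column without attaching the data.
same_a <- sp500[["adjclose"]]
same_b <- with(sp500, adjclose)

print(names(sp500))  # lists the variables available in the data frame
print(prices)
```

Using $ (or with) keeps it explicit which data frame a variable comes from, which avoids name clashes that can arise when data frames are attached.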
Without attaching the data, typing adjclose on its own produces an error, because R cannot find the object. Instead use
>sp500$adjclose
or
>plot(sp500$adjclose)

We are interested in the distribution of the day-to-day changes in the index. We suspect that the changes in the log of the adjusted closing price may follow a normal distribution. These are placed in an R vector by using the command
>d=diff(log(sp500$adjclose))

The chisq.test function that is pre-defined in R cannot be applied directly to the values of d to see if they conform to a normal distribution, so a program is written instead. We wish to test whether a normal distribution with the same mean and standard deviation as d looks similar to the histogram of d. Calculate, for example, the approximate expected number between -0.04 and -0.02 from this fitted normal distribution. This can be repeated and made more sophisticated, with more than 4 comparisons, by writing a program. The one considered has 100 comparisons.
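The calculation described above might be sketched as follows. This is not the module's program: the vector d is simulated here as a hypothetical stand-in for diff(log(sp500$adjclose)), the number of bins k is set to 10 instead of 100 to keep expected counts large, and the choice of equal-probability bins is an assumption made for illustration.

```r
set.seed(1)
# Hypothetical stand-in for d = diff(log(sp500$adjclose)); simulated, not real data.
d <- rnorm(1000, mean = 0, sd = 0.012)

m <- mean(d)
s <- sd(d)
n <- length(d)

# Expected number of values between -0.04 and -0.02 under a normal
# distribution with the same mean and standard deviation as d.
expected_one <- n * (pnorm(-0.02, m, s) - pnorm(-0.04, m, s))
observed_one <- sum(d > -0.04 & d <= -0.02)

# The same idea over k bins: interior cut points plus open-ended tails,
# chosen so that each bin has equal probability under the fitted normal.
k <- 10
breaks <- c(-Inf, qnorm((1:(k - 1)) / k, m, s), Inf)
observed <- as.vector(table(cut(d, breaks)))
expected <- n * diff(pnorm(breaks, m, s))

# Chi-squared statistic: sum of (Observed - Expected)^2 / Expected over the bins.
x2 <- sum((observed - expected)^2 / expected)

# Mean and sd were estimated from the data, so k - 1 - 2 degrees of freedom
# is used here as a common approximation.
p_value <- pchisq(x2, df = k - 3, lower.tail = FALSE)

print(c(observed = observed_one, expected = expected_one))
print(c(statistic = x2, p_value = p_value))
```

A small p-value would suggest that the normal distribution is a poor description of d. Replacing the simulated vector with the real d, and raising k, would bring the sketch closer to the 100-comparison version mentioned in the text.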