Intro to Statistical Inference - Google Sites


Intro to Statistical Inference: Estimation

Each slide has its own narration in an audio file. For the explanation of any slide, click on the audio icon to start it.

Professor Friedman's Statistics Course by H & L Friedman is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Statistical inference involves:
  • Estimation
  • Hypothesis Testing

Both activities use sample statistics (for example, X̄) to make inferences about a population parameter (μ).

Estimation

Why don't we just use a single number (a point estimate) such as X̄ to estimate a population parameter, μ? The problem with using a single point (or value) is that it will very probably be wrong. In fact, with a continuous random variable, the probability that the variable equals any particular value is zero. So, P(X̄ = μ) = 0. This is why we use an interval estimator.
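One way to see why a point estimate is almost never exactly right is to ask how likely X̄ is to land within a small tolerance ε of μ. The sketch below assumes X̄ is normally distributed with standard error σ/√n. The values σ = 2.5 and n = 100 and the function name `prob_within` are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def prob_within(eps, sigma, n):
    """P(|X-bar - mu| <= eps), assuming X-bar ~ Normal(mu, sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    z = eps / standard_error
    # Area under the standard normal curve between -z and +z.
    return 2 * NormalDist().cdf(z) - 1


# Illustrative (assumed) values: sigma = 2.5, n = 100.
for eps in (0.5, 0.1, 0.01, 0.001, 0.0):
    p = prob_within(eps, 2.5, 100)
    print(f"eps = {eps:<6} P(|X-bar - mu| <= eps) = {p:.4f}")
```

As the tolerance shrinks toward zero, so does the probability, which is the sense in which a single-number estimate "will very probably be wrong".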

We can examine the probability that the interval includes the population parameter.

Confidence Interval Estimators

How wide should the interval be? That depends upon how much confidence you want in the estimate. For instance, say you wanted a confidence interval estimator for the mean income of a college graduate:

You might have          That the mean income is between
100% confidence         $0 and $∞
95% confidence          $35,000 and $41,000
90% confidence          $36,000 and $40,000
80% confidence          $37,500 and $38,500
0% confidence           $38,000 (a point estimate)

The wider the interval, the greater the confidence you will have in it as containing the true population parameter μ.

To construct a confidence interval estimator of μ, we use:

    X̄ ± Zα · σ/√n        [(1 − α) confidence]

where we get Zα from the Z table.

When we don't know σ, we should really be using a different table (future lectures will cover this), but often, if n is large (say n ≥ 30), we may use s instead, since we assume that it is close to the value of σ.

To be more precise, the α is split in half since we are constructing a two-sided confidence interval. However, for the sake of simplicity, we call the z-value Zα rather than Zα/2.
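The formula X̄ ± Zα · σ/√n can be sketched in code. This is a minimal sketch, not part of the slides: the function name `z_interval` is hypothetical, it takes the z-value from Python's `statistics.NormalDist` rather than from a printed Z table, and it uses the two-sided cutoff (α/2 in each tail). The numbers in the usage lines are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, std_dev, n, confidence):
    """Two-sided z confidence interval for a population mean.

    std_dev is sigma if known; for large n the sample s is used in its place.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # Two-sided interval: alpha is split in half, so look up 1 - alpha/2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * std_dev / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical numbers: mean 50, standard deviation 10, n = 64.
low, high = z_interval(50, 10, 64, 0.95)
print(f"95% CIE: {low:.2f} to {high:.2f}")
```

The function rejects a confidence level of exactly 1 because the corresponding z-value is infinite, which matches the point that 100% confidence is not available from a sample.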

[Figure: cutoffs marked at −Zα/2 and Zα/2]

Question

You work for a company that makes smart TVs, and your boss asks you to determine with certainty the exact life of a smart TV. She tells you to take a random sample of 100 TVs. What is the exact life of a smart TV made by this company?

Sample evidence:
n = 100
X̄ = 11.50 years
s = 2.50 years

Answer: Take 1

Since your boss has asked for 100% confidence, the only answer you can accurately provide is: −∞ to +∞ years.

After you are fired, perhaps you can get your job back by explaining to your boss that statisticians cannot work with 100% confidence if they are working with data from a sample. If you want 100% confidence, you must take a census. With a sample, you can never be absolutely certain as to the value of the population parameter.

This is exactly what statistical inference is: using sample statistics to draw conclusions (e.g., estimates) about population parameters.

The Better Answer

n = 100
X̄ = 11.50 years
s = 2.50 years

At 95% confidence:
11.50 ± 1.96 × (2.50/√100)
11.50 ± 1.96 × (0.25)

11.50 ± 0.49

The 95% CIE is: 11.01 years to 11.99 years

[Note: Ideally we should be using σ, but since n is large we assume that s is close to the true population standard deviation.]

The Better Answer: Interpretation

We are 95% confident that the interval from 11.01 years to 11.99 years contains the true population parameter, μ.

Another way to put this: in the long run, about 95 out of every 100 samples would produce intervals, constructed by the same procedure (same n and same α), that contain the population mean.

Remember that the population parameter (μ) is fixed; it is not a random variable. Thus, it is incorrect to say that there is a 95% chance that the population mean will fall in this interval.

EXAMPLE: Life of a Refrigerator

The sample:
n = 100
X̄ = 18 years
s = 4 years

Construct a confidence interval estimator (CIE) of the true population mean life (μ) at each of the following levels of confidence:
(a) 100%  (b) 99%  (c) 95%  (d) 90%  (e) 68%

Again, in this example, we should ideally be using σ, but since n is large we assume that s is close to the true population standard deviation.

It should be noted that s² is an unbiased estimator of σ²: E(s²) = σ², where s² = Σ(Xᵢ − X̄)² / (n − 1).
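The unbiasedness claim E(s²) = σ² can be checked with a small simulation. The sketch below is an illustration only: the normal population with mean 18 and standard deviation 4, the sample size of 10, the number of repetitions, and the seed are all assumptions chosen for the demonstration. It averages the sample variance over many samples, once with the n − 1 divisor and once with the divisor n.

```python
import random
from statistics import fmean, pvariance, variance

random.seed(0)  # arbitrary seed so the run is repeatable

MU, SIGMA = 18.0, 4.0   # assumed population mean and standard deviation
N, REPS = 10, 10_000    # assumed sample size and number of samples

s2_values = []       # sample variance, divisor n - 1
biased_values = []   # same sum of squares, divisor n
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    s2_values.append(variance(sample))
    biased_values.append(pvariance(sample))

print(f"true variance        = {SIGMA ** 2:.2f}")
print(f"mean of s^2 (n - 1)  = {fmean(s2_values):.2f}")
print(f"mean with divisor n  = {fmean(biased_values):.2f}")
```

The average with the n − 1 divisor should land close to the true variance of 16, while the divisor-n version comes out systematically low, by a factor of (n − 1)/n.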

EXAMPLE: Life of a Refrigerator

(a) 100% confidence [α = 0, Zα = ∞]
100% CIE: −∞ years to +∞ years

(b) 99% confidence
α = .01, Zα = 2.575 (from Z table)
18 ± 2.575 (4/√100)
18 ± 1.03
99% CIE: 16.97 years to 19.03 years

(c) 95% confidence
α = .05, Zα = 1.96 (from Z table)
18 ± 1.96 (4/√100)
18 ± 0.78
95% CIE: 17.22 years to 18.78 years

(d) 90% confidence
α = .10, Zα = 1.645 (from Z table)
18 ± 1.645 (4/√100)
18 ± 0.66
90% CIE: 17.34 years to 18.66 years

(e) 68% confidence
α = .32, Zα = 1.0 (from Z table)
18 ± 1.0 (4/√100)
18 ± 0.4
68% CIE: 17.60 years to 18.40 years
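The refrigerator intervals for parts (b) through (e) can be reproduced in a few lines. This sketch hard-codes the z-values quoted from the Z table in the example rather than computing them, and the variable names are illustrative. Part (a) is left out because its z-value is infinite.

```python
from math import sqrt

n, x_bar, s = 100, 18.0, 4.0   # sample evidence from the example

# Confidence level -> z-value as read from the Z table in the example.
z_table = {0.99: 2.575, 0.95: 1.96, 0.90: 1.645, 0.68: 1.0}

standard_error = s / sqrt(n)   # 4 / 10 = 0.4
intervals = {}
for confidence, z in z_table.items():
    half_width = z * standard_error
    low, high = x_bar - half_width, x_bar + half_width
    intervals[confidence] = (low, high)
    print(f"{confidence:.0%} CIE: 18 ± {half_width:.2f} -> {low:.2f} to {high:.2f} years")
```

Only the z-value changes from one row to the next, so the interval narrows as the confidence level drops.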

Balancing Confidence and Width in a CIE

How can we keep the same level of confidence and still construct a narrower CIE? Let's look at the formula one more time:

    X̄ ± Zα · σ/√n

The sample mean is in the center. The more confidence you want, the higher the value of Z and the larger the half-width of the interval. The larger the sample size, the smaller the half-width, since we divide by √n.

So, what can we do? If you want a narrower interval, take a larger sample.

What about a smaller standard deviation? Of course, this depends on the variability of the population. However, a more efficient sampling procedure (e.g., stratification) may help. That topic is for a more advanced statistics course.

Key Points

Once you are working with a sample, not the entire population, you cannot be 100% certain of population parameters. If you need to know the value of a parameter with certainty, take a census.

The more confidence you want to have in the estimator, the larger the interval is going to be.

Traditionally, statisticians work with 95% confidence. However, you should be able to use the Z-table to construct a CIE at any level of confidence.

More Homework for You

Do the rest of the problems in the lecture notes.
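For the trade-off discussed under "Balancing Confidence and Width in a CIE", the sketch below holds the confidence level at 95% and varies n. It reuses s = 4 from the refrigerator example as a stand-in for σ. The sample sizes, the 0.5-year target, and the helper names `half_width` and `sample_size_for` are assumptions made for illustration. The second helper inverts the half-width formula to find the smallest n that achieves a target half-width.

```python
from math import ceil, sqrt

Z_95 = 1.96   # z-value for 95% confidence, from the Z table
SIGMA = 4.0   # assumed standard deviation (s from the refrigerator example)


def half_width(n, z=Z_95, sigma=SIGMA):
    """Half-width of the interval: z * sigma / sqrt(n)."""
    return z * sigma / sqrt(n)


def sample_size_for(target, z=Z_95, sigma=SIGMA):
    """Smallest n whose half-width is at most target: n >= (z * sigma / target)^2."""
    return ceil((z * sigma / target) ** 2)


for n in (25, 100, 400, 1600):
    print(f"n = {n:>4}: half-width = {half_width(n):.3f} years")

print("n needed for a half-width of 0.5 years:", sample_size_for(0.5))
```

Because the half-width shrinks with √n, quadrupling the sample size only halves the width of the interval.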
