- Go here to read how the Gallup Organization draws its sample and deals with sample size.
- Go here to read about a post-September 11 poll of the public's international attitudes by the Pew Research Center.
- Types of
samples
- HAPHAZARD or
accidental:
- Chosen at "whim"
rather than "at random"
- Does not produce
representative samples because of subtle bias in
selection
- PURPOSIVE samples
- Chosen with some
controls in mind: e.g., four states east of the
Mississippi and four west
- Designed to achieve
"representativeness" but mathematical estimates of
error do not apply
- QUOTA samples
- More systematic
attempt to reflect a population distribution: sex,
race, etc.
- May be
"representative" on these factors but biased on
others
- CLUSTER or STRATIFIED
samples
- Divide population
into groups (e.g., census areas) and sample the
groups
- Then do a simple
random sample within groups
- SYSTEMATIC RANDOM
samples
- Choose every Nth
entity from an ordered list
- Depends on the
principles used in ordering the list
- SIMPLE RANDOM
samples
- Every case has an
equal chance of being drawn, due to purely chance
factors
- Your samples of 10
states were supposed to be drawn this way
- Statistically, the
most efficient form of sample, but not the most
economically efficient.
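The random designs in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `state_NN` labels, and the east/west grouping are hypothetical and are not part of the class exercise.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement; every case has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(ordered_list, n, seed=None):
    """Take every k-th entity after a random start, with k = len(list) // n."""
    rng = random.Random(seed)
    k = len(ordered_list) // n          # sampling interval (the "N" in "every Nth")
    start = rng.randrange(k)            # random starting point inside the first interval
    return [ordered_list[start + i * k] for i in range(n)]

def sample_within_groups(groups, n_per_group, seed=None):
    """Draw a simple random sample of n_per_group cases inside each group."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_group)
            for name, members in groups.items()}

# Hypothetical population: 50 labelled cases standing in for the states.
states = [f"state_{i:02d}" for i in range(1, 51)]

srs = simple_random_sample(states, 10, seed=1)
every_fifth = systematic_sample(states, 10, seed=1)

# Hypothetical grouping; a real design would use meaningful groups (e.g., census areas).
groups = {"east": states[:25], "west": states[25:]}
grouped = sample_within_groups(groups, 5, seed=1)

print(srs)
print(every_fifth)
print(grouped)
```

Note that the systematic sample inherits whatever pattern exists in the ordering of the list, which is why the result depends on how the list was ordered.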
- Computing Examples
An example of computing a
confidence interval for a population parameter
- A confidence interval is
a RANGE OF VALUES around a point estimate, stated with
the probability that the population parameter lies
between the lower and upper limits of the
interval.
- A confidence
interval for a mean is computed by adding and subtracting
a multiple of the standard error around the sample mean.
- A confidence interval
is always associated with a LEVEL OF CONFIDENCE that
the estimate will be correct.
- Example: a point estimate and
confidence interval appeared in an 11/15/85 article on the Geneva
summit in the NEW YORK TIMES, based on a NYT/CBS
telephone poll of 1,659 adults in the 48 contiguous
states.
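To get a feel for the size of interval a poll of 1,659 adults can support, here is a rough sketch. It assumes a simple random sample, a 95% level of confidence, and the worst-case proportion of 0.5; the article's own figures are not reproduced here, and real telephone polls usually report somewhat wider margins because their designs are not simple random samples. The function name `margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of a confidence interval for a proportion (simple random sample assumed)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    standard_error = math.sqrt(p * (1 - p) / n)
    return z * standard_error

moe = margin_of_error(1659)
print(f"+/- {100 * moe:.1f} percentage points")  # about 2.4 points
```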
Factors in the confidence
interval estimate of a population mean
- Variability in the
sampling distribution of the mean, based on
- variation of values
in the population: the standard deviation, $\sigma$
- size of the
sample: $n$
- both factors are
combined in the standard error of the
mean: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
- Degree of confidence in
making the estimate
- Level of confidence:
e.g., .95, .99, .999
- Complement of the
alpha value: e.g., .05, .01, .001
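A minimal sketch of how these factors combine, assuming normal (z) critical values, which is reasonable for large samples or a known population standard deviation; small samples would call for the t distribution instead. The mean, standard deviation, and sample size below are made-up illustration values, and `confidence_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper) limits: mean +/- z * standard error of the mean."""
    alpha = 1 - confidence                      # alpha is the complement of the level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    se = sd / math.sqrt(n)                      # standard error of the mean
    return mean - z * se, mean + z * se

# Made-up values: sample mean 50, standard deviation 10, sample size 100.
for level in (0.95, 0.99, 0.999):
    lower, upper = confidence_interval(mean=50.0, sd=10.0, n=100, confidence=level)
    print(f"{level}: {lower:.2f} to {upper:.2f}")
```

Raising the level of confidence widens the interval, while a larger sample or a less variable population narrows it.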
- Note that the PROPORTION
that the sample is of the population is NOT a major
factor in the accuracy of the estimate, which is
counter-intuitive (i.e., goes against what most people expect)
- Two forms of simple
random sampling
- Sampling with
replacement
- After each case is
selected, it is returned to the population before the next draw.
- If replacement is
not done with small populations (e.g., a deck of 52
cards), probability calculations can be materially
affected.
- Sampling
without replacement.
- The population
decreases by one each time a case is
drawn.
- If the population
is very large (e.g., thousands of cases),
probability calculations will not be materially
affected.
- This is because
accuracy of inferences from samples to populations
is due primarily to the AMOUNT of information
(i.e., the size of the sample) and not the
PROPORTION of information (i.e., the percent the
sample is of the population).
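The effect of replacement can be checked with exact fractions. The sketch below compares the probability of drawing two aces in a row from a 52-card deck with the same calculation in a hypothetical population 1,000 times larger that has the same share of "aces"; the function name `p_two_successes` is an assumption made for illustration.

```python
from fractions import Fraction

def p_two_successes(pop_size, n_success, replace):
    """Probability that two consecutive draws are both 'successes'."""
    first = Fraction(n_success, pop_size)
    if replace:
        second = first                                   # population is restored
    else:
        second = Fraction(n_success - 1, pop_size - 1)   # population shrinks by one
    return first * second

deck_with = p_two_successes(52, 4, replace=True)       # 1/169
deck_without = p_two_successes(52, 4, replace=False)   # 1/221

# Hypothetical large population with the same proportion of successes.
big_with = p_two_successes(52_000, 4_000, replace=True)
big_without = p_two_successes(52_000, 4_000, replace=False)

print(deck_with, deck_without, float(deck_without / deck_with))
print(float(big_without / big_with))
```

For the deck, skipping replacement lowers the probability by roughly a quarter; for the large population the two answers differ by less than a tenth of one percent.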
- In truth, the s.e. of
the mean can be lowered by multiplying the standard error
by a correction factor based on p (p = the proportion the
sample is of the population)
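- $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{1 - p}$
--where $\sqrt{1 - p}$ is the correction factor, $\sigma$ is the
population standard deviation, and $n$ is the size of the
sample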
- But this correction
factor has little effect unless p > .20.
- Because most samples do
not approach this figure, this correction factor is
usually ignored in computing the s.e. of the
mean.
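A short sketch of why the correction is usually ignored, assuming the common approximate form of the correction factor, the square root of (1 - p); the exact finite-population form uses (population size - sample size) / (population size - 1), which is nearly identical for populations of any size. The function names are hypothetical.

```python
import math

def correction_factor(p):
    """Approximate finite population correction for sampling fraction p (assumed form)."""
    return math.sqrt(1 - p)

def corrected_se(sd, n, pop_size):
    """Standard error of the mean multiplied by the correction factor."""
    p = n / pop_size                      # proportion the sample is of the population
    return sd / math.sqrt(n) * correction_factor(p)

for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f}  factor = {correction_factor(p):.3f}")
# p = 0.01  factor = 0.995
# p = 0.05  factor = 0.975
# p = 0.20  factor = 0.894
# p = 0.50  factor = 0.707
```

Under this assumption, a sample that is 1% of the population trims the standard error by about half a percent, while a sample that is 20% of the population trims it by about 10%.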