Math Portal
Introductory Statistics
Section 3.2 - Sampling Techniques
It is important for the researcher to clearly define the target population. There are no strict rules to follow, and the researcher must rely on logic and judgment. The population is defined in keeping with the objectives of the study.
Sometimes, the entire population will be sufficiently small, and the researcher can include the entire population in the study. This type of research is called a census or census study because data is gathered on every member of the population.
Usually, the population is too large for the researcher to attempt to survey all of its members. A small but carefully chosen sample can be used to represent the population. The sample should reflect the characteristics of the population from which it is drawn.
Sampling methods are classified as either probability or nonprobability. In probability samples, each member of the population has a known non-zero probability of being selected. Probability methods include random sampling, systematic sampling, stratified sampling, and cluster sampling. In nonprobability sampling, members are selected from the population in some nonrandom manner. Nonprobability methods include convenience sampling, self-selection sampling, judgment sampling, and quota sampling. The advantage of probability sampling is that sampling error can be calculated. Sampling error is the degree to which a sample might differ from the population. When inferring to the population, results are reported plus or minus the sampling error. In nonprobability sampling, the degree to which the sample differs from the population remains unknown, and in many cases the results are less accurate or valid.
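As an illustration of reporting a result "plus or minus the sampling error", the sketch below computes the usual approximate margin of error for a sample proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96), and a sample large enough for the normal approximation; the function name and the example numbers are hypothetical and do not come from the text.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation.
    z = 1.96 corresponds to a 95% confidence level (an assumption).
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

# Hypothetical survey: 50% of 400 sampled members answer "yes".
moe = margin_of_error(0.5, 400)
print(f"50% plus or minus {moe * 100:.1f} percentage points")
```

With these made-up numbers the margin is about 4.9 percentage points. No comparable calculation is available for a nonprobability sample, because the selection probabilities are unknown.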
Probability Methods
Random sampling is the purest form of probability sampling. Each member of the population has an equal and known chance of being selected. When there are very large populations, it is often difficult or impossible to identify every member of the population, so the pool of available subjects becomes biased.
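A minimal sketch of simple random sampling, assuming the full list of population members is available. The member names and the helper function are hypothetical; the seed is only there to make the example repeatable.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

# Hypothetical population of 200 members.
members = [f"member_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(members, 10, seed=42)
print(sample)
```

The sketch also shows where the practical difficulty lies: the `members` list must contain every member of the population before the draw can be made.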
Systematic sampling is often used instead of random sampling. It is also called an Nth name selection technique. After the required sample size has been calculated, every Nth record is selected from a list of population members. As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method. Its only advantage over the random sampling technique is simplicity. Systematic sampling is frequently used to select a specified number of records from a computer file.
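The "every Nth record" rule can be sketched as follows. This version assumes the interval is the list length divided by the required sample size (rounded down) and that the starting record is chosen at random within the first interval; both choices are common conventions rather than something the text specifies, and the names are hypothetical.

```python
import random

def systematic_sample(records, n, seed=None):
    """Select n records by taking every k-th record after a random start."""
    if not 0 < n <= len(records):
        raise ValueError("n must be between 1 and the number of records")
    k = len(records) // n                     # sampling interval (the "N" in "Nth")
    start = random.Random(seed).randrange(k)  # random start within the first interval
    return records[start::k][:n]

# Hypothetical computer file of 1,000 record IDs.
records = list(range(1000))
picked = systematic_sample(records, 50, seed=1)
print(picked[:5])
```

If the list had a hidden order that repeats every k records, every selected record would fall at the same point in that cycle, which is why the method depends on the list being free of such patterns.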
Stratified sampling is a commonly used probability method that is often superior to random sampling because it can reduce sampling error. A stratum is a subset of the population that shares at least one common characteristic. Examples of strata might be males and females, or freshmen, sophomores, juniors, and seniors. The researcher first identifies the relevant strata and their actual representation in the population. Random sampling is then used to select a sufficient number of subjects from each stratum. "Sufficient" refers to a sample size large enough for us to be reasonably confident that the subjects drawn from a stratum represent that stratum. Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata.
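The two steps described above (identify each stratum and its share of the population, then sample randomly within it) can be sketched as below. The sketch assumes proportional allocation with at least one subject per stratum, which is only one possible rule; a researcher concerned about a low-incidence stratum might deliberately sample it more heavily. The class-year data and function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n, seed=None):
    """Randomly sample within each stratum, in proportion to its size.

    Because of rounding, the total may differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    total = len(population)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(n * len(members) / total))  # at least one per stratum
        sample[name] = rng.sample(members, min(k, len(members)))
    return sample

# Hypothetical student body: (student_id, class_year) pairs.
sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
students = [(f"{year}_{i}", year) for year, count in sizes.items() for i in range(count)]

result = stratified_sample(students, lambda s: s[1], 50, seed=0)
print({year: len(chosen) for year, chosen in result.items()})
```

For this made-up population of 1,000 students and a target of 50, the allocation is 20 freshmen, 15 sophomores, 10 juniors, and 5 seniors.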
Cluster sampling selects groups of population members rather than individuals. Suppose an organization wants to poll voters in a town. It might first select some streets at random in the town, then select some households at random on these streets, and then poll everyone in these households. This sample is not a convenience sample, because at no time does the interviewer decide whom to include in the sample. However, it is also not a (simple) random sample, even if each voter in the town has an equal chance of being part of the sample. The reason it is not a random sample is that the people are not chosen independently of one another. If one person is in the sample, every voter in his or her household will be, too; moreover, neighbors on that person's street are more likely to be included than are residents on other streets. This type of sampling is called cluster sampling, because the items enter the sample in clusters, not individually.
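The street-and-household procedure in this example can be sketched as a two-stage draw. The town layout below (10 streets, 8 households per street, 3 voters per household) is entirely hypothetical, as are the function and variable names.

```python
import random

def cluster_sample(town, n_streets, n_households, seed=None):
    """Pick streets at random, then households on those streets,
    then include every voter in each chosen household."""
    rng = random.Random(seed)
    polled = []
    for street in rng.sample(sorted(town), n_streets):
        households = town[street]
        chosen = rng.sample(sorted(households), min(n_households, len(households)))
        for household in chosen:
            polled.extend(households[household])  # whole household enters together
    return polled

# Hypothetical town: street -> household -> list of voters.
town = {
    f"street_{s}": {
        f"house_{s}_{h}": [f"voter_{s}_{h}_{v}" for v in range(3)]
        for h in range(8)
    }
    for s in range(10)
}

polled = cluster_sample(town, n_streets=3, n_households=2, seed=7)
print(len(polled), "voters polled")
```

Every voter from a chosen household appears in the result together, which is the lack of independence described above.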
Nonprobability Methods
Convenience sampling is used in exploratory research where the researcher is interested in getting an inexpensive approximation of the truth. As the name implies, the sample members are selected because they are convenient to reach. This nonprobability method is often used during preliminary research efforts to get a rough estimate of the results, without incurring the cost or time required to select a random sample.
Self-selection sampling occurs when people decide for themselves whether to take part. When people participate in a survey by voluntarily returning a form printed in a newspaper or magazine, they make up a self-selected sample, which is one type of convenience sample. People who care enough to respond may not be representative of the whole population. For example, in a mail-in survey of 5,400 USA Today readers, an amazing 43% of the respondents in Delaware, Indiana, Kentucky, Michigan, New York, Ohio, and Pennsylvania reported symptoms that pointed to a serious risk for clinical depression. The newspaper notes, however, that "Mail-in surveys always attract the most concerned and motivated. It's not a random sample . . . ." (USA Today, July 12, 1985). Such a study cannot reliably tell us the percentage of the overall population at risk for depression.
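A small worked calculation shows how self-selection can inflate a percentage. All numbers below are hypothetical and are not taken from the USA Today survey: the sketch assumes that 10% of a population has some condition, that people with the condition return the form half the time, and that everyone else returns it 5% of the time.

```python
def share_among_respondents(true_share, p_respond_with, p_respond_without):
    """Share of respondents who have the condition, given unequal response rates."""
    with_condition = true_share * p_respond_with
    without_condition = (1.0 - true_share) * p_respond_without
    return with_condition / (with_condition + without_condition)

# Hypothetical values chosen only for illustration.
observed = share_among_respondents(
    true_share=0.10, p_respond_with=0.50, p_respond_without=0.05
)
print(f"true share: 10%, share among respondents: {observed:.0%}")
```

Under these assumptions about 53% of respondents have the condition, even though only 10% of the population does. Since real response rates are unknown, the distortion in a self-selected sample cannot be corrected in this way.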
Judgment sampling is a common nonprobability method. The researcher selects the sample based on judgment. This is usually an extension of convenience sampling. For example, a researcher may decide to draw the entire sample from one "representative" city, even though the population includes all cities. When using this method, the researcher must be confident that the chosen sample is truly representative of the entire population.
Quota sampling is the nonprobability equivalent of stratified sampling. As in stratified sampling, the researcher first identifies the strata and their proportions as they are represented in the population. Then convenience or judgment sampling is used to select the required number of subjects from each stratum. This differs from stratified sampling, where the strata are filled by random sampling.
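A minimal sketch of quota sampling, assuming the quotas are filled by convenience: subjects are taken in whatever order they happen to be encountered until each quota is met. The names, the arrival order, and the quotas are hypothetical.

```python
def quota_sample(arrivals, stratum_of, quotas):
    """Accept subjects in arrival order until every stratum's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        stratum = stratum_of(person)
        if remaining.get(stratum, 0) > 0:
            sample.append(person)
            remaining[stratum] -= 1
        if not any(remaining.values()):  # all quotas filled
            break
    return sample

# Hypothetical passers-by as (name, stratum) pairs, in the order they were met.
arrivals = [("Ann", "F"), ("Bob", "M"), ("Cy", "M"),
            ("Dee", "F"), ("Ed", "M"), ("Flo", "F")]
quotas = {"F": 2, "M": 1}

chosen = quota_sample(arrivals, lambda p: p[1], quotas)
print(chosen)
```

No random draw appears anywhere in the function. Who enters the sample depends only on who is encountered first, so the sample matches the population on the quota variable but carries no known selection probabilities.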