How Stratified Random Sampling Works, With Examples

What Is Stratified Random Sampling?

Stratified random sampling is a method of sampling that involves the division of a population into smaller subgroups known as strata. In stratified random sampling, or stratification, the strata are formed based on members’ shared attributes or characteristics, such as income or educational attainment. Stratified random sampling has numerous applications, such as studying population demographics and life expectancy.

Stratified random sampling is also called proportional random sampling or quota random sampling, although quota sampling in the strict sense selects members within each group non-randomly.

Key Takeaways

  • Stratified random sampling allows researchers to obtain a sample population that best represents the entire population being studied.
  • Sampling involves statistical inference made using a subset of a population.
  • Stratified random sampling is done by dividing the entire population into homogeneous groups called strata.
  • Proportional stratified random sampling involves taking random samples from stratified groups, in proportion to the population. In disproportionate sampling, the sample sizes of the strata are not proportional to their occurrence in the population.
  • Stratified random sampling differs from simple random sampling, which involves the random selection of data from an entire population, so each possible sample is equally likely to occur.

Understanding Stratified Random Sampling

When completing an analysis or research on a group of entities with similar characteristics, a researcher may find that the population size is too large to complete research on it. To save time and money, an analyst may take on a more feasible approach by selecting a small group from the population.

The small group is referred to as a sample, which is a subset of the population used to represent the entire population; the number of members it contains is the sample size. A sample may be selected from a population in many ways, one of which is the stratified random sampling method.

Stratified random sampling involves dividing the entire population into homogeneous groups called strata (plural for stratum). Random samples are then selected from each stratum. For example, consider an academic researcher who would like to know the number of MBA students in a specific graduating year who received a job offer within three months of graduation.

The researcher will soon find that there were almost 200,000 MBA graduates for the year. They might decide just to take a simple random sample of 50,000 graduates and run a survey. Better still, they could divide the population into strata and take a random sample from the strata.

To do this, they would create population groups based on gender, age range, race, country of nationality, and career background. A random sample from each stratum is taken in a number proportional to the stratum’s size compared with the population. These subsets of the strata are then pooled to form a random sample.
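The procedure described above (partition the population, sample each stratum in proportion to its size, pool the results) can be sketched in Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function name `proportional_stratified_sample`, the toy population of 1,000 graduates, and the `career` field are all assumptions made for the example.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw a proportional stratified sample (illustrative sketch).

    population:  list of records
    stratum_of:  function mapping a record to its stratum label
    sample_size: total number of records wanted
    """
    rng = random.Random(seed)

    # Step 1: partition the population into strata.
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    # Step 2: sample each stratum in proportion to its share of the
    # population, then pool the per-stratum samples.
    fraction = sample_size / len(population)
    pooled = []
    for members in strata.values():
        k = round(fraction * len(members))
        pooled.extend(rng.sample(members, k))
    return pooled


# Hypothetical population: 1,000 graduates with a made-up "career" attribute.
population = (
    [{"id": i, "career": "finance"} for i in range(500)]
    + [{"id": i, "career": "tech"} for i in range(500, 800)]
    + [{"id": i, "career": "other"} for i in range(800, 1000)]
)
sample = proportional_stratified_sample(population, lambda r: r["career"], 100)
```

Because each stratum's count is rounded separately, the pooled sample can differ from the requested size by a few records when the proportions do not divide evenly.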

Stratified sampling is used to highlight differences among groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.

Example of Stratified Random Sampling

Suppose a research team wants to determine the grade point average (GPA) of college students across the United States. The research team has difficulty collecting data from all 21 million college students; it decides to take a random sample of the population by using 4,000 students.

Now assume that the team looks at the different attributes of the sample participants and wonders if there are any differences in GPAs relative to students’ majors. Suppose it finds that 560 students are English majors, 1,135 are science majors, 800 are computer science majors, 1,090 are engineering majors, and 415 are math majors. The team wants to use a proportional stratified random sample, in which each stratum’s share of the sample matches its share of the population.

Assume the team researches the demographics of college students in the U.S. and finds the percentage of what students major in: 12% major in English, 28% major in science, 24% major in computer science, 21% major in engineering, and 15% major in mathematics. Thus, five strata are created from the stratified random sampling process.

The team then compares each stratum’s share of the sample with its share of the population and finds that the proportions do not match. It therefore needs to resample 4,000 students from the population, randomly selecting 480 English, 1,120 science, 960 computer science, 840 engineering, and 600 mathematics students.

With those groups, it has a proportionate stratified random sample of college students, which provides a better representation of students’ college majors in the U.S. The researchers can then highlight specific strata, observe the varying types of studies of U.S. college students, and observe the various GPAs.
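The target counts in this example come from multiplying the 4,000-student sample size by each major's share of the population. The short Python sketch below reproduces that arithmetic and shows how far the first sample was from the targets; the variable names are illustrative only, and the figures are the ones given in the example.

```python
sample_size = 4_000

# Share of each major in the population, in percent (from the example).
population_share_pct = {
    "English": 12,
    "science": 28,
    "computer science": 24,
    "engineering": 21,
    "mathematics": 15,
}

# Counts found in the first (unstratified) random sample.
observed = {
    "English": 560,
    "science": 1_135,
    "computer science": 800,
    "engineering": 1_090,
    "mathematics": 415,
}

# Proportional target for each stratum: sample size x population share.
target = {
    major: sample_size * pct // 100
    for major, pct in population_share_pct.items()
}

# Positive gap = over-represented in the first sample; negative = under-represented.
gap = {major: observed[major] - target[major] for major in target}
```

In the first sample, engineering is the most over-represented major and mathematics the most under-represented, which is why the team resamples.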

Simple vs. Stratified Random Samples

Simple random samples and stratified random samples are both statistical measurement tools. A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, or strata, based on shared characteristics. However, stratified sampling is more complicated, time-consuming, and potentially more expensive to carry out than simple random sampling.

The simple random sample is often used when there is very little information available about the data population, when the data population has far too many differences to divide into various subsets, or when there is only one distinct characteristic among the data population.

For instance, a candy company may want to study the buying habits of its customers to determine the future of its product line. If there are 10,000 customers, it may choose 100 of those customers as a random sample. It can then apply what it finds from those 100 customers to the rest of its base. Unlike in stratified sampling, the company draws these 100 customers purely at random, without regard for their individual characteristics.
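For comparison with the stratified approach, a simple random sample of 100 out of 10,000 customers might be drawn as follows. This is only a sketch: the customer IDs are hypothetical placeholders, and the fixed seed is there so the draw can be repeated.

```python
import random

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Hypothetical customer IDs 1 through 10,000.
customer_ids = range(1, 10_001)

# Every customer has the same chance of selection; no attributes are consulted.
sample_ids = rng.sample(customer_ids, 100)
```

`random.sample` draws without replacement, so no customer can appear in the sample twice.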

Proportionate vs. Disproportionate Stratification

Stratified random sampling ensures that each subgroup of a given population is adequately represented within the whole sample population of a research study. Stratification can be proportionate or disproportionate.

In a proportionate stratified method, the sample size of each stratum is proportionate to the population size of the stratum. This type of stratified random sampling is often a more precise metric because it’s a better representation of the overall population.

For example, if the researcher wanted a sample of 50,000 graduates using age range, the proportionate stratified random sample will be obtained using this formula: (sample size/population size) × stratum size. The table below assumes a population size of 180,000 MBA graduates per year.

Age group | 24–28 | 29–33 | 34–37 | Total
Number of people in stratum | 90,000 | 60,000 | 30,000 | 180,000
Strata sample size | 25,000 | 16,667 | 8,333 | 50,000

The strata sample size for MBA graduates in the age range of 24 to 28 years old is calculated as (50,000/180,000) × 90,000 = 25,000. The same method is used for the other age-range groups. Now that the strata sample size is known, the researcher can perform simple random sampling in each stratum to select their survey participants.

In other words, 25,000 graduates will be selected randomly from the 24 to 28 age group, 16,667 graduates will be selected randomly from the 29 to 33 age group, and so on.
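The allocation formula, (sample size/population size) × stratum size, can be checked with a few lines of Python. The function name `proportional_allocation` is an assumption made for this sketch; the stratum sizes are the ones from the MBA example.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Apply (sample size / population size) x stratum size to each stratum."""
    population_size = sum(stratum_sizes.values())
    return {
        label: round(sample_size * size / population_size)
        for label, size in stratum_sizes.items()
    }


# Stratum sizes from the MBA example (population of 180,000 graduates).
stratum_sizes = {"24-28": 90_000, "29-33": 60_000, "34-37": 30_000}
allocation = proportional_allocation(stratum_sizes, 50_000)
```

Here the rounded values happen to add up to exactly 50,000. With other figures, independent rounding can leave the total off by one or two, in which case the researcher would adjust one stratum by hand.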

In a disproportional stratified sample, the size of each stratum is not proportional to its size in the population. The researcher may decide to sample half of the graduates within the 34 to 37 age group and one-third of the graduates within the 29 to 33 age group.
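As a rough illustration, the sketch below computes the stratum sample sizes implied by those two sampling fractions, using the stratum sizes from the MBA example. It also computes a design weight for each stratum (the number of graduates each sampled person stands for). The weights are not part of the example above; they are included as an assumption about how such a sample would typically be rebalanced when estimating population-wide figures. No fraction is given for the 24 to 28 group, so it is left out.

```python
from fractions import Fraction

# Stratum sizes from the MBA example.
stratum_sizes = {"29-33": 60_000, "34-37": 30_000}

# Sampling fractions chosen by the researcher in the example.
sampling_fraction = {"29-33": Fraction(1, 3), "34-37": Fraction(1, 2)}

# Number of graduates drawn from each stratum.
sample_sizes = {
    group: int(stratum_sizes[group] * sampling_fraction[group])
    for group in stratum_sizes
}

# Assumed follow-up step: each sampled graduate represents
# 1 / sampling_fraction members of their stratum.
design_weights = {
    group: 1 / sampling_fraction[group] for group in stratum_sizes
}
```

The older group is sampled more heavily, so each of its respondents carries a smaller weight (2) than a respondent from the 29 to 33 group (3).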

It is important to note that one person cannot fit into multiple strata. Each entity must only fit in one stratum. Having overlapping subgroups means that some individuals will have higher chances of being selected for the survey, which completely negates the concept of stratified sampling as a type of probability sampling.
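One way to verify that requirement before sampling is to check that the proposed strata neither overlap nor leave anyone out. The following is a minimal sketch with made-up IDs and stratum labels; `check_partition` is a hypothetical helper, not part of any library.

```python
def check_partition(population_ids, strata):
    """Return (overlapping, unassigned) ID sets for a proposed stratification."""
    seen = set()
    overlapping = set()
    for members in strata.values():
        members = set(members)
        overlapping |= seen & members  # IDs already placed in an earlier stratum
        seen |= members
    unassigned = set(population_ids) - seen  # IDs that belong to no stratum
    return overlapping, unassigned


# Hypothetical population of 10 people and a flawed stratification.
population_ids = range(1, 11)
strata = {
    "under_30": [1, 2, 3, 4],
    "30_and_over": [4, 5, 6, 7, 8, 9],  # ID 4 appears twice; ID 10 is missing
}
overlapping, unassigned = check_partition(population_ids, strata)
```

Both returned sets should be empty before any per-stratum sampling begins.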

Note

Portfolio managers can use stratified random sampling to create portfolios by replicating an index such as a bond index.

Advantages and Disadvantages of Stratified Random Sampling

Advantages

The main advantage of stratified random sampling is that it captures key population characteristics in the sample. Similar to a weighted average, this method of sampling produces characteristics in the sample that are proportional to the overall population. Stratified random sampling works well for populations with a variety of attributes but is otherwise ineffective if subgroups cannot be formed.

Stratification gives a smaller error in estimation and greater precision than the simple random sampling method. The greater the differences among the strata, the greater the gain in precision.
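A small simulation can make this precision gain concrete. The sketch below builds a hypothetical population with two strata whose values differ sharply, then repeatedly estimates the population mean with a simple random sample and with a proportional stratified sample of the same size. All numbers (stratum sizes, means, spreads, trial count) are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for repeatability

# Hypothetical population: two strata with very different typical values.
stratum_a = [rng.gauss(50, 5) for _ in range(600)]
stratum_b = [rng.gauss(100, 5) for _ in range(400)]
population = stratum_a + stratum_b


def srs_mean(n):
    """Mean of a simple random sample of size n."""
    return statistics.fmean(rng.sample(population, n))


def stratified_mean(n):
    """Mean of a proportional stratified sample of size n."""
    n_a = round(n * len(stratum_a) / len(population))  # 60% from stratum A
    n_b = n - n_a                                      # 40% from stratum B
    pooled = rng.sample(stratum_a, n_a) + rng.sample(stratum_b, n_b)
    return statistics.fmean(pooled)


trials = 2_000
var_srs = statistics.pvariance([srs_mean(50) for _ in range(trials)])
var_stratified = statistics.pvariance([stratified_mean(50) for _ in range(trials)])
```

With strata this far apart, `var_stratified` should come out far smaller than `var_srs`, because the stratified design removes the chance variation in how many members of each stratum end up in the sample.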

Disadvantages

Unfortunately, this method of research cannot be used in every study. The method’s disadvantage is that several conditions must be met for it to be used properly. Researchers must identify every member of a population being studied and classify each of them into one, and only one, subpopulation.

As a result, stratified random sampling is disadvantageous when researchers can’t confidently classify every member of the population into a subgroup. Also, finding an exhaustive and definitive list of an entire population can be challenging.

Overlapping can be an issue if there are subjects that fall into multiple subgroups. When simple random sampling is then performed within each subgroup, those who belong to multiple subgroups are more likely to be chosen. The result could be a misrepresentation or inaccurate reflection of the population.

The above examples make it easy: age range and college major are clearly defined groups. In other situations, however, it might be far more difficult. Imagine incorporating characteristics such as race, ethnicity, or religion. The sorting process becomes more difficult, which can make stratified random sampling a less effective method.

When Would You Use Stratified Random Sampling?

Stratified random sampling is often used when researchers want to know about different subgroups or strata based on the entire population being studied—for instance, if one is interested in differences among groups based on race, gender, or education.

Which Sampling Method Is Best?

The best method of sampling to use will depend on the nature of the analysis and the data being used. In general, simple random sampling is often the easiest and cheapest, but stratified sampling can produce a more accurate sample relative to the population under study.

What Are the Two Types of Stratified Random Sampling?

There are two main types of stratified random sampling: proportionate and disproportionate sampling. Proportionate sampling takes each stratum in the sample as proportionate to the population size of the stratum. In disproportionate sampling, the analyst will over- or under-sample certain strata based on the research question or study design being employed.

How Are Strata Chosen for Stratified Random Sampling?

The strata will depend on the subgroups in which you are interested that appear in your population. These subgroups are based on shared characteristics among participants such as gender, race, educational attainment, geographic location, or age group.

The Bottom Line

Stratified random sampling is the process of creating subgroups in a dataset according to various factors, such as age, gender, income level, or education. Subsequently, a random sample is taken from each of the strata, which allows researchers to obtain samples from various subgroups, including those that may be under-represented.

In this way, a stratified random sample may provide a more comprehensive picture of a broader dataset. However, using this method may not be possible across all studies based on the size, level of information, and the time and resources available. Overall, the benefit of stratified random sampling is that it allows for a more accurate and nuanced representation of a population, compared with simple random sampling.
