Trends in total fertility rates and changes in perceptions of marital fertility: Visualize in R

The total fertility rate trend is an important factor in predicting the future of a society, and public opinion and changing attitudes toward marriage and childbearing are key to setting policy direction and understanding cultural trends.

In this post, we'll use R's ggplot2 and patchwork packages to track total fertility rate trends and changing perceptions of childbearing within marriage, and analyze the correlation between the two. The examples below use fictional data modeled on real-world figures to show the trends and correlations visually.

Generate data on total fertility rate trends and changes in perceptions of marital fertility

This analysis uses data on the total fertility rate from 2018 to 2024 and on changes in public opinion about marriage and childbearing. We simply generate the data and visualize it to analyze the trends (the real-world figures are available from the national statistics office).

R visualization code

# Load the necessary libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(patchwork)
library(ggrepel)

# Generate data: Total fertility rate
tfr_data <- data.frame(
    Year = 2018:2024,
    TFR = c(0.977, 0.918, 0.837, 0.808, 0.778, 0.720, 0.701)
)

# Generate data: Changes in perceptions of marital childbearing
opinion_data <- data.frame(
    Year = c(2018, 2020, 2022, 2024),
    Strongly_Agree = c(25.4, 25.5, 21.6, 23.4),
    Somewhat_Agree = c(44.1, 42.6, 43.8, 44.9),
    Somewhat_Disagree = c(21.9, 22.1, 23.9, 22.7),
    Strongly_Disagree = c(8.6, 9.8, 10.8, 9.0)
)

# Convert to long form
opinion_long <- opinion_data %>%
    pivot_longer(
        cols = starts_with("Strongly") | starts_with("Somewhat"),
        names_to = "Opinion",
        values_to = "Percentage"
    )

# Create the combined data
merged_data <- left_join(opinion_long, tfr_data, by = "Year")

# Prepare data for correlation analysis
correlation_data <- opinion_data %>%
    select(-Year) %>%
    mutate(TFR = tfr_data$TFR[match(opinion_data$Year, tfr_data$Year)])

# Calculate the correlation coefficient
cor_matrix <- cor(correlation_data, use = "complete.obs")
cor_data <- as.data.frame(cor_matrix["TFR", ])
cor_data$Variable <- rownames(cor_data)
colnames(cor_data)[1] <- "TFR"
cor_data <- cor_data %>%
    filter(Variable != "TFR") %>%
    arrange(desc(abs(TFR)))

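# Visualize the correlation
# (fill colors and label rounding are assumptions: green for positive, red for negative)
correlation_plot <- ggplot(cor_data, aes(x = reorder(Variable, abs(TFR)), y = TFR)) +
    geom_bar(aes(fill = TFR), stat = "identity") +
    scale_fill_gradient2(low = "red", mid = "white", high = "green", midpoint = 0) +
    geom_text(
        aes(label = round(TFR, 3)),
        vjust = ifelse(cor_data$TFR >= 0, -0.5, 1.5),
        size = 4
    ) +
    labs(
        title = "Correlation Analysis: TFR vs Opinion Categories",
        x = "Opinion Category",
        y = "Correlation Coefficient"
    ) +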
    coord_flip() +
    theme_minimal() +
    theme(
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 12, face = "bold"),
        plot.title = element_text(size = 14, face = "bold"),
        legend.position = "none"
    )

# Create a scatter plot
scatter_matrix <- ggplot(merged_data, aes(x = TFR, y = Percentage)) +
    geom_point(aes(color = Opinion), size = 3, alpha = 0.6) +
    geom_smooth(aes(color = Opinion), method = "lm", se = FALSE, linetype = "dashed") +
    scale_color_manual(
        values = c(
            "Strongly_Agree" = "#1F78B4",
            "Somewhat_Agree" = "#33A02C",
            "Somewhat_Disagree" = "#FB9A99",
            "Strongly_Disagree" = "#E31A1C"
        )
    ) +
    labs(
        title = "TFR vs Opinion Categories Relationship",
        x = "Total Fertility Rate",
        y = "Percentage (%)"
    ) +
    theme_minimal() +
    theme(
        legend.position = "bottom",
        plot.title = element_text(size = 14, face = "bold"),
        axis.title = element_text(size = 12, face = "bold")
    )

# Output the visualizations
print(correlation_plot)
print(scatter_matrix)

Analysis and insights

(Image: correlation coefficients between the total fertility rate trend and changes in perceptions of marital childbearing)

1. Correlate TFR with social perception

  • Correlation with positive opinions (Agree):
    • "Strongly Agree" has the highest correlation coefficient with TFR (0.644): years with a larger share of strongly positive views on marriage and childbearing also had higher fertility rates.
    • "Somewhat Agree" has a correlation coefficient of -0.314, a weak negative association, indicating that positive but reserved opinions do not move together with TFR the way strong agreement does.
  • Correlation with negative opinions (Disagree):
    • "Somewhat Disagree" and "Strongly Disagree" have correlation coefficients of -0.589 and -0.388, respectively: a larger share of negative opinions coincides with a lower TFR.
    • Notably, "Somewhat Disagree" has the stronger negative correlation of the two, suggesting that partial disagreement may be more closely tied to fertility decline than strong disagreement.

(Image: scatter plot of the total fertility rate against the marital childbearing perception data)

2. Relationship between social awareness and TFR through correlation analysis

  • Correlation plot: Visualizes the results of the correlation analysis and highlights the positive correlation between "Strongly Agree" and TFR.
  • Scatter plot: Shows the distribution of, and relationship between, TFR and each opinion category. A larger share of strongly positive opinions goes with a higher TFR, while a larger share of negative opinions goes with a lower TFR.
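
Because the opinion survey covers only four years, each coefficient rests on four data points. The sketch below is not part of the original analysis; it uses base R's cor.test() on the "Strongly Agree" series to show how wide the uncertainty is at that sample size. The variable names are chosen here for illustration.

```r
# TFR in the four survey years (2018, 2020, 2022, 2024), same fictional values as the post
tfr <- c(0.977, 0.837, 0.778, 0.701)
# Share answering "Strongly Agree" in those years
strongly_agree <- c(25.4, 25.5, 21.6, 23.4)

# Pearson correlation test (two-sided by default)
result <- cor.test(tfr, strongly_agree)

# Point estimate, p-value, and 95% confidence interval
print(result$estimate)
print(result$p.value)
print(result$conf.int)
```

With four observations the confidence interval is very wide, so the coefficients are better read as a description of this small sample than as firm evidence.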

Conclusions and policy recommendations

(Image: summary of approaches to addressing the fertility rate problem)

1. A multi-pronged approach to fertility decline is needed

  • Declining fertility is a multi-layered problem that cannot be solved with economic assistance alone. Changing social perceptions and policy support must proceed in parallel.
  • Key policy directions:
    • Housing stabilization: Reducing housing cost burden to create a stable family environment.
    • Expanding child support: Strengthening public child care systems and subsidizing child care.
    • Supporting work-family balance: improving parental leave and preventing women's career breaks.

2. Work to increase positive perceptions

  • Campaigns and social discourse that spread positive messages about marriage and childbirth should be strengthened.
  • We need to promote the positive aspects of childbearing and marriage, especially among young people, and create an environment that respects choice.

3. Mitigate negative perceptions

  • A fundamental approach is needed to analyze and address the social, economic, and cultural factors that make people hesitant to marry and have children.
  • Ensure that childbearing and marriage are perceived as processes that bring personal growth and fulfillment, not as personal sacrifices.

4. Build data-driven policies

  • The correlation analysis suggests that changes in social perception are associated with changes in fertility rates.
  • Continuous monitoring of data and policy formulation based on it will go a long way to stabilizing fertility rates in the future.

The analysis and visualizations above provide a clearer understanding of how changing attitudes toward marriage and childbearing affect fertility rates. Based on this, we hope that social discussions and policies will develop more deeply and effectively.

Code explained

Load the required libraries

library(ggplot2)
library(dplyr)
library(tidyr)
library(patchwork)
library(ggrepel)
  • ggplot2: R's flagship graphing package for data visualization.
  • dplyr: A package that makes data manipulation and transformation easy.
  • tidyr: Used to convert or clean data to a long format.
  • patchwork: Used to combine multiple graphs into a single layout.
  • ggrepel: Used to add non-overlapping labels in a graph.

Data generation: Total fertility rate

tfr_data <- data.frame(
    Year = 2018:2024,
    TFR = c(0.977, 0.918, 0.837, 0.808, 0.778, 0.720, 0.701)
)
  • Year: Years from 2018 to 2024.
  • TFR: Fictitious total fertility rate values for each year.
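
The listing in this post goes straight to the correlation plots. As a minimal sketch of how the yearly trend itself could be drawn from tfr_data, the following adds a simple line chart and computes the overall change. The names tfr_change_pct and tfr_trend_plot and the plot styling are illustrative choices, not part of the original code.

```r
library(ggplot2)

# Same fictional TFR values as in the post
tfr_data <- data.frame(
    Year = 2018:2024,
    TFR = c(0.977, 0.918, 0.837, 0.808, 0.778, 0.720, 0.701)
)

# Overall change from the first to the last year, in percent
tfr_change_pct <- (tail(tfr_data$TFR, 1) / head(tfr_data$TFR, 1) - 1) * 100
print(round(tfr_change_pct, 1))

# Line chart of the yearly values
tfr_trend_plot <- ggplot(tfr_data, aes(x = Year, y = TFR)) +
    geom_line() +
    geom_point(size = 2) +
    scale_x_continuous(breaks = tfr_data$Year) +
    labs(title = "Total Fertility Rate, 2018-2024", x = "Year", y = "TFR") +
    theme_minimal()

print(tfr_trend_plot)
```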

Data generation: Changing marriage and fertility perceptions

opinion_data <- data.frame(
    Year = c(2018, 2020, 2022, 2024),
    Strongly_Agree = c(25.4, 25.5, 21.6, 23.4),
    Somewhat_Agree = c(44.1, 42.6, 43.8, 44.9),
    Somewhat_Disagree = c(21.9, 22.1, 23.9, 22.7),
    Strongly_Disagree = c(8.6, 9.8, 10.8, 9.0)
)
  • Year: The year in which the opinion change was measured.
  • The four opinion categories (Strongly_Agree, Somewhat_Agree, Somewhat_Disagree, Strongly_Disagree) hold the share (%) of respondents in each category.
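
Since the four categories are shares of the same survey, each year's values should add up to roughly 100%. A quick sanity check along these lines, using base R only, might look like the sketch below. The 0.15-point tolerance for rounding is an assumption made here.

```r
opinion_data <- data.frame(
    Year = c(2018, 2020, 2022, 2024),
    Strongly_Agree = c(25.4, 25.5, 21.6, 23.4),
    Somewhat_Agree = c(44.1, 42.6, 43.8, 44.9),
    Somewhat_Disagree = c(21.9, 22.1, 23.9, 22.7),
    Strongly_Disagree = c(8.6, 9.8, 10.8, 9.0)
)

# Sum the four opinion columns for each year (drop the Year column first)
totals <- rowSums(opinion_data[, -1])
names(totals) <- opinion_data$Year
print(totals)

# Allow a small deviation from 100 caused by rounding to one decimal place
rounding_ok <- all(abs(totals - 100) <= 0.15)
print(rounding_ok)
```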

Convert to long form

opinion_long <- opinion_data %>%
    pivot_longer(
        cols = starts_with("Strongly") | starts_with("Somewhat"),
        names_to = "Opinion",
        values_to = "Percentage"
    )
  • pivot_longer(): Converts the data to "long form", storing the opinion category in the Opinion column and the percentage for each opinion in the Percentage column.
    • cols = starts_with("Strongly") | starts_with("Somewhat"): Selects the columns whose names begin with Strongly or Somewhat as the columns to convert.
    • names_to = "Opinion": Stores the opinion category names in the Opinion column.
    • values_to = "Percentage": Stores the percentage value of each category in the Percentage column.
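
To make the reshaping concrete, the sketch below runs the same pivot_longer() call and compares the dimensions before and after: 4 rows by 5 columns becomes 16 rows by 3 columns, one row per year and opinion category.

```r
library(dplyr)
library(tidyr)

opinion_data <- data.frame(
    Year = c(2018, 2020, 2022, 2024),
    Strongly_Agree = c(25.4, 25.5, 21.6, 23.4),
    Somewhat_Agree = c(44.1, 42.6, 43.8, 44.9),
    Somewhat_Disagree = c(21.9, 22.1, 23.9, 22.7),
    Strongly_Disagree = c(8.6, 9.8, 10.8, 9.0)
)

opinion_long <- opinion_data %>%
    pivot_longer(
        cols = starts_with("Strongly") | starts_with("Somewhat"),
        names_to = "Opinion",
        values_to = "Percentage"
    )

# Wide form: one row per year, one column per category
print(dim(opinion_data))
# Long form: one row per year-category pair
print(dim(opinion_long))
# The first four rows are the four categories for 2018
print(head(opinion_long, 4))
```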

Create combined data

merged_data <- left_join(opinion_long, tfr_data, by = "Year")
  • left_join: Combines the two data frames (opinion_long and tfr_data) on Year.
  • As a result, each row holds the opinion percentage (Percentage) and the total fertility rate (TFR) for its year.
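
A left join keeps every row of the left table and fills in NA where the right table has no match. The sketch below illustrates this with a small hypothetical survey table. The 2026 row and its value are invented here purely to show the NA behavior and are not part of the post's data.

```r
library(dplyr)

tfr_data <- data.frame(
    Year = 2018:2024,
    TFR = c(0.977, 0.918, 0.837, 0.808, 0.778, 0.720, 0.701)
)

# Hypothetical left table: 2026 does not exist in tfr_data
survey <- data.frame(
    Year = c(2018, 2024, 2026),
    Percentage = c(25.4, 23.4, 24.0)  # the 2026 value is made up for illustration
)

joined <- left_join(survey, tfr_data, by = "Year")

# All three survey rows are kept; TFR is NA for the unmatched year
print(joined)
```

In the post's data every survey year falls inside 2018 to 2024, so merged_data contains no missing TFR values.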

Preparing data for correlation analysis

correlation_data <- opinion_data %>%
    select(-Year) %>%
    mutate(TFR = tfr_data$TFR[match(opinion_data$Year, tfr_data$Year)])
  1. select(-Year): Creates a data frame that excludes the Year column.
  2. mutate(TFR = ...): Adds a TFR column. match() looks up each year of opinion_data in tfr_data and retrieves the corresponding TFR value.
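
The match() lookup can be followed step by step with base R alone. In this sketch, the vector names are illustrative stand-ins for opinion_data$Year, tfr_data$Year, and tfr_data$TFR.

```r
survey_years <- c(2018, 2020, 2022, 2024)   # stands in for opinion_data$Year
tfr_years <- 2018:2024                       # stands in for tfr_data$Year
tfr_values <- c(0.977, 0.918, 0.837, 0.808, 0.778, 0.720, 0.701)

# Position of each survey year within the TFR years
idx <- match(survey_years, tfr_years)
print(idx)

# Use those positions to pick the TFR of each survey year
picked <- tfr_values[idx]
print(picked)
```

Every second year is selected, so the correlation is computed on four TFR values rather than all seven.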

Calculate the correlation coefficient

cor_matrix <- cor(correlation_data, use = "complete.obs")
cor_data <- as.data.frame(cor_matrix["TFR", ])
cor_data$Variable <- rownames(cor_data)
colnames(cor_data)[1] <- "TFR"
cor_data <- cor_data %>%
    filter(Variable != "TFR") %>%
    arrange(desc(abs(TFR)))
  1. cor(): Computes the correlation coefficient matrix. use = "complete.obs" excludes rows with missing values from the calculation.
  2. cor_matrix["TFR", ]: Extracts only the correlation coefficients between TFR and the other columns.
  3. filter(Variable != "TFR"): Removes the correlation of TFR with itself.
  4. arrange(desc(abs(TFR))): Sorts the correlations by absolute value in descending order.
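
For reference, the coefficient that cor() returns by default is the Pearson correlation. The following sketch recomputes it by hand for the "Strongly Agree" column and compares the result with cor(). The variable names are illustrative.

```r
# TFR in the four survey years and the "Strongly Agree" share in those years
tfr <- c(0.977, 0.837, 0.778, 0.701)
strongly_agree <- c(25.4, 25.5, 21.6, 23.4)

# Deviations from the respective means
dx <- tfr - mean(tfr)
dy <- strongly_agree - mean(strongly_agree)

# Pearson r: sum of cross-products divided by the product of the root sums of squares
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_builtin <- cor(tfr, strongly_agree)

print(round(r_manual, 3))
print(round(r_builtin, 3))
```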

Visualize correlations

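# (fill colors and label rounding are assumptions: green for positive, red for negative)
correlation_plot <- ggplot(cor_data, aes(x = reorder(Variable, abs(TFR)), y = TFR)) +
    geom_bar(aes(fill = TFR), stat = "identity") +
    scale_fill_gradient2(low = "red", mid = "white", high = "green", midpoint = 0) +
    geom_text(
        aes(label = round(TFR, 3)),
        vjust = ifelse(cor_data$TFR >= 0, -0.5, 1.5),
        size = 4
    ) +
    labs(
        title = "Correlation Analysis: TFR vs Opinion Categories",
        x = "Opinion Category",
        y = "Correlation Coefficient"
    ) +
    coord_flip() +
    theme_minimal()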
  1. aes(x = reorder(Variable, abs(TFR)), y = TFR): Sort the correlations by absolute value and place them on the graph axis.
  2. geom_bar(): Create a bar graph.
  3. scale_fill_gradient2(): Maps the fill color to the correlation coefficient value (green for positive, red for negative).
  4. geom_text(): Add correlation coefficients as text to the graph.
  5. coord_flip(): Flip the x and y axes to create a horizontal bar graph.
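
The effect of reorder() is easiest to see on its own. This sketch rebuilds a small cor_data from the coefficients reported in this post and prints the resulting factor levels. The name ordered_variable is illustrative.

```r
# Coefficients as reported in the analysis section
cor_data <- data.frame(
    Variable = c("Strongly_Agree", "Somewhat_Agree", "Somewhat_Disagree", "Strongly_Disagree"),
    TFR = c(0.644, -0.314, -0.589, -0.388)
)

# reorder() turns Variable into a factor whose levels are sorted
# by increasing absolute correlation
ordered_variable <- reorder(cor_data$Variable, abs(cor_data$TFR))
print(levels(ordered_variable))
```

After coord_flip() the first level is drawn at the bottom, so the category with the largest absolute correlation ends up at the top of the chart.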

Create a scatter plot

scatter_matrix <- ggplot(merged_data, aes(x = TFR, y = Percentage)) +
    geom_point(aes(color = Opinion), size = 3, alpha = 0.6) +
    geom_smooth(aes(color = Opinion), method = "lm", se = FALSE, linetype = "dashed") +
    geom_text_repel(
        aes(
            label = paste0("(", round(TFR, 3), ", ", round(Percentage, 1), ")"),
            color = Opinion
        ),
        size = 3,
        box.padding = 0.5,
        point.padding = 0.3,
        force = 2,
        max.overlaps = Inf,
        show.legend = FALSE
    ) +
    labs(
        title = "TFR vs Opinion Categories Relationship",
        x = "Total Fertility Rate",
        y = "Percentage (%)"
    ) +
    theme_minimal()
  1. geom_point(): Draws a point for each observation, colored by opinion category.
  2. geom_smooth(): Adds a linear regression line for each opinion category.
  3. geom_text_repel(): Adds a non-overlapping text label to each point. The label shows the TFR and Percentage values.
  4. scale_color_manual(): Assigns a distinct color to each opinion category (it appears in the full listing above).

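
The dashed lines drawn by geom_smooth(method = "lm") are ordinary least-squares fits of Percentage on TFR within each category. As a sketch of what one of those lines represents, the code below fits the "Strongly Agree" series with lm() and checks the slope against the textbook formula cov(x, y) / var(x). The variable names are illustrative.

```r
# TFR in the four survey years and the "Strongly Agree" share in those years
tfr <- c(0.977, 0.837, 0.778, 0.701)
strongly_agree <- c(25.4, 25.5, 21.6, 23.4)

# Least-squares line: strongly_agree = intercept + slope * tfr
fit <- lm(strongly_agree ~ tfr)
slope <- unname(coef(fit)["tfr"])

# The same slope from the closed-form expression
slope_manual <- cov(tfr, strongly_agree) / var(tfr)

print(slope)
print(slope_manual)
```

A positive slope here matches the positive correlation reported for "Strongly Agree"; with four points per category, each line is only a rough summary.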
Output

print(correlation_plot)
print(scatter_matrix)
  • Prints the correlation bar chart and the scatter plot in turn so the data can be examined visually.
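
The listing loads patchwork but prints the two plots separately. One possible way to place them in a single figure is sketched below, using two small placeholder plots so that the example runs on its own. The demo data and the names plot_a, plot_b, and combined are illustrative; in the post's code the two operands would be correlation_plot and scatter_matrix.

```r
library(ggplot2)
library(patchwork)

# Placeholder data and plots standing in for correlation_plot and scatter_matrix
demo <- data.frame(x = 1:4, y = c(2, 4, 3, 5))
plot_a <- ggplot(demo, aes(x, y)) + geom_col() + labs(title = "Plot A")
plot_b <- ggplot(demo, aes(x, y)) + geom_point() + labs(title = "Plot B")

# "/" stacks plots vertically ("|" would place them side by side)
combined <- plot_a / plot_b +
    plot_annotation(title = "Combined layout with patchwork")

print(combined)
```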
