This analysis uses data from the National Financial Well-Being Survey, released in 2017 by the Consumer Financial Protection Bureau (CFPB), to learn more about how a wide range of factors relate to consumers’ financial well-being.
# load packages to use
library(tidyr)
library(ggplot2)
library(dplyr)
This analysis seeks to find out:
The relationship between household size and gender
The relationship between a respondent's financial well-being and household size
A visualization of the average financial well-being score against the household income reported in the survey
Import data via URL
# import the data
fwb_data <- read.csv("https://www.consumerfinance.gov/documents/5614/NFWBS_PUF_2016_data.csv")
View(fwb_data)
Our analysis will focus on responses from individuals in lower-income households, i.e. households that earn less than $50,000. To identify them, we are guided by the data dictionary in the Public Use File Codebook.
# subset respondents whose household income is less than $50,000
low_inc <- fwb_data |>
  filter(PPINCIMP <= 4)

# validate the subset by tabulating the household income variable
table(low_inc$PPINCIMP)
1 2 3 4
719 506 614 467
# To narrow down the analysis, keep the respondent's gender, household size,
# and household income, plus the financial well-being score and the survey weight
low_income_ndwn <- low_inc |>
  select(PPGENDER, PPHHSIZE, PPINCIMP, FWBscore, finalwt)
Creating a binary variable
We will achieve this by focusing on the Household Size variable.
First, we use the data dictionary in the Public Use File Codebook to look up the name of the variable of interest, in this case Household Size.
Our focus will also be on the PPGENDER variable, which records the gender of each respondent in the survey. According to the data dictionary, the value 1 represents male and the value 2 represents female. With this in mind, we can use the recode() function from dplyr to convert the numeric codes into labels.
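The recoding and binary-variable steps described here are not shown in the text, so the following is only a minimal sketch of what they might look like. It runs on a small made-up data frame (`toy`, a hypothetical stand-in for `low_income_ndwn`). The 1 = male, 2 = female coding comes from the data dictionary as described above; the cut-off of two or fewer people for a "smaller" household is purely an assumption for illustration, since the threshold actually used is not stated.

```r
library(dplyr)

# Hypothetical toy data standing in for low_income_ndwn
toy <- data.frame(
  PPGENDER = c(1, 2, 2, 1),
  PPHHSIZE = c(1, 4, 2, 5)
)

toy <- toy |>
  mutate(
    # 1 = male, 2 = female, per the data dictionary
    PPGENDER = recode(PPGENDER, `1` = "Male", `2` = "Female"),
    # ASSUMPTION: households of two or fewer people count as "smaller"
    smaller_house = if_else(PPHHSIZE <= 2, 1, 0)
  )

toy
```

With the gender labels and the 0/1 `smaller_house` flag in place, the two columns can be combined into a single descriptive category.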
In our survey data, let’s create a variable that combines the gender variable and the smaller-house variable.
low_income_ndwn$gen_small_house <- if_else(
  low_income_ndwn$PPGENDER == "Female" & low_income_ndwn$smaller_house == 1,
  "Female living in a small house",
  if_else(
    low_income_ndwn$PPGENDER == "Female" & low_income_ndwn$smaller_house == 0,
    "Female living in a large house",
    if_else(
      low_income_ndwn$PPGENDER == "Male" & low_income_ndwn$smaller_house == 1,
      "Male living in a small house",
      "Male living in a large house"
    )
  )
)
Recode the income variable to the actual values in dollars
# these values are referenced from the documentation
# recode the levels of the income variable
low_income_ndwn$PPINCIMP <- recode(
  low_income_ndwn$PPINCIMP,
  "1" = "less than $20,000",
  "2" = "$20,000 to $29,000",
  "3" = "$30,000 to $39,000",
  "4" = "$40,000 to $49,000"
)
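Because the recoded labels are plain character strings, grouped output sorts them alphabetically, which places "less than $20,000" after the "$40,000 to $49,000" bracket. One possible alternative, sketched here on a made-up vector rather than the survey data, is to store the labels as a factor with explicit levels so the brackets keep their natural order. The names `income_labels`, `toy_income`, and `toy_income_f` are hypothetical.

```r
# Labels in ascending income order, matching codes 1 to 4
income_labels <- c(
  "less than $20,000",
  "$20,000 to $29,000",
  "$30,000 to $39,000",
  "$40,000 to $49,000"
)

# Hypothetical income codes standing in for PPINCIMP
toy_income <- c(3, 1, 4, 2, 1)

# factor() maps each code to its label and fixes the level order
toy_income_f <- factor(toy_income, levels = 1:4, labels = income_labels)

levels(toy_income_f)
table(toy_income_f)
```

Whether this is preferable depends on how the variable is used later; the character labels are sufficient if row order does not matter.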
Let’s create a statistical summary table to estimate respondents' financial well-being score, abbreviated FWBscore in the reference document.
A higher score denotes more satisfaction with one's finances, and a lower score denotes less.
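The code that builds the summary table is not shown in the text. Judging by the column names in the printed output, it is probably a grouped summary along the following lines. This is a sketch on made-up data (`toy_scores` is hypothetical), and using `weighted.mean()` with `finalwt` is an assumption about how the weighted column was computed.

```r
library(dplyr)

# Hypothetical toy data with the same column names as low_income_ndwn
toy_scores <- data.frame(
  gen_small_house = c("Female living in a small house", "Female living in a small house",
                      "Male living in a small house", "Male living in a small house"),
  PPINCIMP = rep("less than $20,000", 4),
  FWBscore = c(40, 60, 50, 70),
  finalwt  = c(1, 3, 1, 1)
)

toy_summary <- toy_scores |>
  group_by(gen_small_house, PPINCIMP) |>
  summarise(
    average_FWBscore          = mean(FWBscore),
    # ASSUMPTION: the weighted average uses the survey weight finalwt
    average_FWBscore_weighted = weighted.mean(FWBscore, finalwt),
    median_FWBscore           = median(FWBscore),
    SD_FWBscore               = sd(FWBscore),
    .groups = "drop_last"  # keep the result grouped by gen_small_house only
  )

toy_summary
```

In the first toy group the unweighted mean is 50, while the weighted mean is 55 because the respondent scoring 60 carries three times the weight. This is why the weighted and unweighted columns can differ.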
summ_stat_tbl
# A tibble: 16 × 6
# Groups: gen_small_house [4]
gen_small_house PPINCIMP average_FWBscore average_FWBscore_wei…¹
<chr> <chr> <dbl> <dbl>
1 Female living in a large ho… $20,000… 46.3 46.7
2 Female living in a large ho… $30,000… 47.5 47.7
3 Female living in a large ho… $40,000… 48.2 48.8
4 Female living in a large ho… less th… 46.7 47.4
5 Female living in a small ho… $20,000… 52.2 50.8
6 Female living in a small ho… $30,000… 53.8 52.7
7 Female living in a small ho… $40,000… 56.2 55.4
8 Female living in a small ho… less th… 45.3 45.2
9 Male living in a large house $20,000… 45.6 45.5
10 Male living in a large house $30,000… 49.2 47.4
11 Male living in a large house $40,000… 49.6 49.6
12 Male living in a large house less th… 44.7 43.6
13 Male living in a small house $20,000… 50.2 49.0
14 Male living in a small house $30,000… 52.7 51.6
15 Male living in a small house $40,000… 56.4 54.0
16 Male living in a small house less th… 47.0 47.1
# ℹ abbreviated name: ¹average_FWBscore_weighted
# ℹ 2 more variables: median_FWBscore <dbl>, SD_FWBscore <dbl>