Lab 7: 📝 General Social Survey
Probability and Statistics
All results should be interpreted in the context of the task.
Input
The General Social Survey (GSS) is a sociological survey used to collect data on demographic characteristics and attitudes of residents of the United States. The GSS has been conducted since 1972 by the National Opinion Research Center (NORC) at the University of Chicago, annually in most years through 1994 and roughly every other year since then. The GSS sample is designed as a multistage stratified sample.
For the exercises in this part, we’ll use the gss16 data set from the dsbox package. You can find out more about the dataset by inspecting its documentation, which you can access by running ?gss16 in the Console or using the Help menu in RStudio to search for gss16. You can also find this information here.
In 2016, the GSS added a new question on harassment at work. The question is phrased as follows.
Over the past five years, have you been harassed by your superiors or co-workers at your job, for example, have you experienced any bullying, physical or psychological abuse?
Answers to this question are stored in the harass5 variable of the gss16 data set.
Task
1. Analysis of the variable harass5
- Review the data structure and familiarize yourself with the variable harass5.
- Build a frequency table for the variable harass5.
- Using the \(\chi^2\) test, test whether the response categories of the question about harassment at work (harass5) differ significantly in frequency.
- Interpret the \(\chi^2\) test results in the context of the assignment.
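The steps in this part can be sketched in R as follows. This is a minimal sketch, not a reference solution: the responses vector is a hypothetical stand-in with invented values, and the category labels are assumptions rather than the actual levels of harass5. The null hypothesis of equally likely categories is simply the default of chisq.test(); a different set of reference proportions could be supplied through its p argument.

```r
# Hypothetical stand-in for gss16$harass5. The values below are invented
# for illustration only and are NOT GSS results.
responses <- factor(c(
  rep("Yes", 12),
  rep("No", 60),
  rep("Does not apply", 8),
  rep(NA, 20)
))

# With the dsbox package installed, the real variable could be used instead:
# responses <- dsbox::gss16$harass5

# Inspect the structure and the categories of the variable.
str(responses)
print(levels(responses))

# Frequency table; table() drops missing values (NA) by default.
freq <- table(responses)
print(freq)

# Relative frequencies of the non-missing responses.
print(round(prop.table(freq), 3))

# Chi-squared goodness-of-fit test.
# H0 (default): all response categories are equally likely.
gof <- chisq.test(freq)
print(gof)

# Expected counts under H0, useful when interpreting the result.
print(gof$expected)
```

When interpreting the output, it may help to note how missing answers were handled and whether a category such as "does not apply" should be part of the comparison at all, since both choices change the hypothesis being tested.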
2. Analyzing the relationship between variables
- Select another categorical variable from the gss16 dataset.
- Construct a contingency table for the variables harass5 and the selected variable.
- Using the \(\chi^2\) test, check for a statistically significant relationship between the variables.
- Interpret the \(\chi^2\) test results in the context of the problem.
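A possible outline of this part in R is shown below. It is a sketch under stated assumptions: the toy data frame and the column name other are hypothetical placeholders for harass5 and whichever second categorical variable is selected, and all values are invented. For a 2 x 2 table, chisq.test() applies Yates' continuity correction by default.

```r
# Hypothetical toy data standing in for two categorical columns of gss16.
# The values are invented for illustration only and are NOT GSS results.
toy <- data.frame(
  harass5 = rep(c("Yes", "No"), times = c(30, 70)),
  other   = c(rep(c("A", "B"), times = c(20, 10)),   # rows with harass5 == "Yes"
              rep(c("A", "B"), times = c(30, 40)))   # rows with harass5 == "No"
)

# Contingency table (rows: harass5, columns: the selected variable).
tab <- table(toy$harass5, toy$other, dnn = c("harass5", "other"))
print(tab)

# Row-wise proportions: distribution of `other` within each harass5 answer.
print(round(prop.table(tab, margin = 1), 3))

# Chi-squared test of independence.
# H0: the two variables are independent.
ind <- chisq.test(tab)
print(ind)

# Expected counts under H0; very small expected counts make the
# chi-squared approximation unreliable.
print(ind$expected)
```

With the real data, the same calls would take the two gss16 columns in place of the toy columns. If some expected counts turn out to be very small, merging sparse categories is one option to consider before interpreting the test.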