Lab 7: 📝 General Social Survey

Probability and Statistics

Important

We expect all the results to be interpreted in the task context.

Input

The General Social Survey (GSS) is a sociological survey used to collect data on demographic characteristics and attitudes of residents of the United States. The GSS has been conducted each year since 1972 by the National Opinion Research Center at the University of Chicago. The GSS collects data on demographic characteristics and attitudes of residents of the United States. The GSS sample is designed as a multistage stratified sample.

For the exercises in this part, we’ll use the gss16 data set from the dsbox package. You can find out more about the dataset by inspecting its documentation, which you can access by running ?gss16 in the Console or using the Help menu in RStudio to search for gss16. You can also find this information here.

In 2016, the GSS added a new question on harassment at work. The question is phrased as the following.

Over the past five years, have you been harassed by your superiors or co-workers at your job, for example, have you experienced any bullying, physical or psychological abuse?

Answers to this question are stored in the harass5 variable in gss16 set.

Task

1. Analysis of the variable `harass5`

Review the data structure and familiarize yourself with the variable harass5.
Build a frequency table for the variable harass5.
Using the \(\chi^2\)-squared test, test whether there are statistically significant differences in the responses to the question about harassment at work (harass5).
Interpret the \(\chi^2\)-square test results in the context of the assignment.

2. Analyzing the relationship between variables

Select another categorical variable from the gss16 dataset.
Construct a contingency table for the variables harass5 and the selected variable.
Using the \(\chi^2\)-square test, check for a statistically significant relationship between the variables.
Interpret the \(\chi^2\)-square test results in the context of the problem.

Input

Task

1. Analysis of the variable harass5

2. Analyzing the relationship between variables

1. Analysis of the variable `harass5`