I'm trying to understand how three categorical variables affect several binary variables. I am roughly following these instructions. Here is what my data look like (not my real data):
Binary answers for multiple questions: "do you like dogs?", "are you allergic to peanuts?" (yes/no)
Three categorical variables for each respondent: sex ("male/female"), area ("area1/area2/area3"), and occupation ("occupation1/occupation2/occupation3")
I am planning to use log-linear models to explore the effect of the three categorical variables on each binary variable separately. That is, I want to know whether some combinations of sex/area/occupation answer "yes" to "do you like dogs?" at a significantly higher frequency than expected (the null hypothesis being that none of the categorical variables has an effect on the binary variable), whether some combinations answer "yes" to "are you allergic to peanuts?" at a significantly higher frequency than expected, and so on. The answers to some of the questions may or may not be independent of each other (e.g., people who like dogs might be more likely to be allergic to peanuts), but I'm not particularly interested in that.
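To make this concrete, here is a minimal Python sketch of the null model I have in mind for a single question. It compares the observed yes/no counts in each sex/area/occupation cell with the counts expected if the answer were independent of the cell, and sums the differences into the likelihood-ratio statistic G^2. The function name `g2_independence` and the counts are made up for illustration; this is my understanding of the approach, not a definitive implementation.

```python
import math
from collections import Counter

def g2_independence(records):
    """Likelihood-ratio statistic G^2 for the null hypothesis that the
    answer is independent of the (sex, area, occupation) combination.

    records: iterable of (sex, area, occupation, answer) tuples.
    Returns (G2, degrees_of_freedom).
    """
    records = list(records)
    n = len(records)
    cell_counts = Counter(records)                   # observed count per group and answer
    group_totals = Counter(r[:3] for r in records)   # respondents per group
    answer_totals = Counter(r[3] for r in records)   # respondents per answer

    g2 = 0.0
    for group, group_total in group_totals.items():
        for answer, answer_total in answer_totals.items():
            observed = cell_counts.get(group + (answer,), 0)
            expected = group_total * answer_total / n
            if observed > 0:  # a cell with zero observations contributes nothing
                g2 += 2.0 * observed * math.log(observed / expected)
    df = (len(group_totals) - 1) * (len(answer_totals) - 1)
    return g2, df

# Made-up data (not my real data): 10 respondents per group, half answering
# "yes", except one group that answers "yes" far more often.
records = []
for sex in ("male", "female"):
    for area in ("area1", "area2", "area3"):
        for occ in ("occupation1", "occupation2", "occupation3"):
            skewed = (sex, area, occ) == ("male", "area1", "occupation1")
            n_yes = 9 if skewed else 5
            records += [(sex, area, occ, "yes")] * n_yes
            records += [(sex, area, occ, "no")] * (10 - n_yes)

g2, df = g2_independence(records)
print(f"G^2 = {g2:.2f} on {df} degrees of freedom")
```

As I understand it, G^2 would then be compared with a chi-square distribution on `df` degrees of freedom, and I would repeat this once per question.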
Two questions:
- Is the approach of using log-linear models (as in the link above) appropriate for me? Or is there a better way?
- Do I need to correct for multiple comparisons because I'm testing each of my questions (i.e., binary variables) separately? If so, how?
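For the second question, one candidate I have come across is the Holm-Bonferroni step-down procedure applied to the per-question p-values. Below is a minimal sketch of how I assume it would work; the function name `holm_adjust` and the p-values are made up, and I am not sure this is the right correction for my situation.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Made-up p-values, one per question (not my real results)
p_by_question = {
    "do you like dogs?": 0.01,
    "are you allergic to peanuts?": 0.04,
    "hypothetical third question": 0.03,
}
adjusted = dict(zip(p_by_question, holm_adjust(list(p_by_question.values()))))
for question, p_adj in adjusted.items():
    print(f"{question}: adjusted p = {p_adj:.3f}")
```

Each question would then be judged against the usual threshold using its adjusted p-value rather than the raw one.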
Any help much appreciated!