
Questions tagged [categorical-data]

Categorical (also called nominal) data can take on a limited number of possible values called categories. Categorical values "label"; they do not "measure". Please use the [ordinal-data] tag for discrete but ordered data types.
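As a rough illustration of what "label, not measure" means in practice, the sketch below turns a nominal variable into 0/1 indicator columns and omits one reference level, which is the usual way to avoid perfectly collinear indicators in a regression with an intercept. The function name `dummy_code` and the example answers are hypothetical and not taken from any question on this page.

```python
def dummy_code(values, reference=None):
    """Return (kept_levels, rows) of 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: default to the first level alphabetically
    # One indicator column per non-reference level.
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical questionnaire answers; the labels carry no numeric meaning.
answers = ["agree", "neutral", "disagree", "agree"]
columns, rows = dummy_code(answers, reference="neutral")
print(columns)
print(rows)
```

A row of all zeros corresponds to the reference level, so each coefficient in a fitted model would be read as a difference from that level.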

0 votes
0 answers
36 views

In our questionnaire the answers are categorical, so we used dummy variables for the regression part. However, we are unsure which of the following two approaches to use: (i) For models ...
Gayuth Waidyaratne's user avatar
2 votes
1 answer
115 views

I’m trying to use the R poly() function with degree 1 to force glm to interpret a factor linearly. I’m puzzled by the fact that the size of the sample seems to increase the coefficient of the ...
Guillaume's user avatar
1 vote
0 answers
24 views

I am trying to perform a correspondence analysis on a dataset of anatomical measurements of ecologically relevant features. Most of these variables are ordered factor variables representing binning of ...
user2352714's user avatar
1 vote
0 answers
13 views

I'm trying to understand how three categorical variables affect several binary variables. I am roughly following these instructions. Here is what my data look like (not my real data): Binary answers ...
Hapless ankylosaur's user avatar
2 votes
1 answer
314 views

I've been following the method illustrated here: Polynomial contrasts for regression to transform the results .L, .Q, .C, etc. of a glm ordinal factor regression into the values for each of the levels ...
Guillaume's user avatar
0 votes
0 answers
102 views

I'm struggling to understand the linearity assumption when running OLS with continuous dependent var and categorical independent variables that have been mean-encoded (simple group mean per category). ...
0 votes
0 answers
70 views

I am new to testing for differences with the chi-square test. When I build a contingency table for the chi-square test (classical and Bayesian), I get some phenomena that would be ...
Student coding's user avatar
2 votes
1 answer
59 views

I am trying to analyze some survey data in R but I am a bit confused about how to run the right type of analysis. In the survey of college students, the participants were put in a hypothetical ...
Alex Fischer's user avatar
0 votes
0 answers
65 views

I have individual level data with a performance measure (good/bad) and characteristic variables for the individual (e.g. gender). I usually analyse this using a chi-squared test to see if the ...
Rob Green's user avatar
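For readers unfamiliar with the chi-squared test mentioned in this excerpt, here is a minimal sketch of the Pearson statistic for a 2x2 table (for example, good/bad performance by gender). The counts are made up for illustration, the function name is hypothetical, and no continuity correction is applied. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper tail of chi-squared with 1 df: P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows are two groups, columns are good/bad.
table = [[30, 20], [20, 30]]
stat, p_value = chi_square_2x2(table)
print(stat, p_value)
```

With small expected counts an exact test is usually preferred over this approximation.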
4 votes
1 answer
151 views

Problem in brief: I would like to generate several samples of iid categorical data. The standard approach does not work because the potential number of categories is large, and I do not want to impose ...
g g's user avatar
  • 2,954
0 votes
0 answers
68 views

The free Statistics package "JASP" has a data library that illustrates various tests and operations. One of them is Factor Analysis. They use the data from Spearman's 1904 "General ...
David's user avatar
  • 11
0 votes
0 answers
47 views

I am running a GLM (Gaussian Family; Identity link) on some medical data. I intend to find out if the level of disease severity has any effect on task performance. A minimum reproducible example (...
AvadaMouse's user avatar
6 votes
2 answers
163 views

I’ve had a reviewer suggest that I use ethnicity as a covariate in a linear regression. Some ethnic groups in the sample are small enough that I am a little worried that I will overfit if I do this. ...
Mohan's user avatar
  • 1,091
2 votes
1 answer
124 views

I am trying to do a GLM with a dataset. My dataset consists of days individuals go on a social outing, and whether the outing was "better than average" (subjective). I have recorded the ...
MisterMonster314's user avatar
9 votes
2 answers
291 views

I am attempting to do analysis on a dataset using a GLM. In this dataset I have two columns with codes about individuals, and I am trying to infer whether an individual passes. For example: ...
MisterMonster314's user avatar
4 votes
2 answers
260 views

I am working on doing a path analysis and using lavaan(). One of my endogenous variables is an ordered factor HOWEVER, the difference between each group is not ...
Mike Thompson's user avatar
1 vote
0 answers
32 views

I am fitting a mixed effect model where some levels of the categorical variable are correlated with the intercept for the following formula, resulting in a singular fit: ...
MCH's user avatar
  • 21
0 votes
1 answer
135 views

My general rule of thumb is that histograms should be used for continuous data, and bar charts for categorical data. (obviously not my rule) What about dates? They are non-continuous (unlike, say, ...
Bob's user avatar
  • 11
1 vote
0 answers
62 views

This is the first time I have used a mixed-effects logit model with effect coding, and I am very confused. I have been trying to understand this for a few weeks, and would be deeply grateful for your ...
silverfox_33's user avatar
2 votes
1 answer
173 views

I'm trying to set up a custom contrast using emmeans but am a bit unsure how to do so properly. I have two factors, let's call them A and B, with three levels each. I want to test the following ...
Adam S's user avatar
  • 21
0 votes
0 answers
35 views

I just have a quick question. I am trying to make a panel analysis, comparing different EU member-states over multiple years. My dependent variable is 'trust in EU institutions', and my independent ...
adisiki's user avatar
4 votes
1 answer
242 views

In my study, students are divided into three groups and each group read one text. After that, all three groups completed the same reading comprehension test. There are 15 test items, consisting of ...
silverfox_33's user avatar
3 votes
3 answers
369 views

I have the following data: I am trying to analyze it by applying a chi-square test in Excel with the CHITEST(Data B, Data E) function: I also tried using only the &...
rnso's user avatar
  • 10.4k
0 votes
0 answers
34 views

I'm doing comparative research on how the relation between $X$ (independent variables) and $Y$ (dependent variables) differs between urban and rural groups. The design is cross-sectional. Here are several ...
Lyn's user avatar
  • 1
0 votes
0 answers
82 views

I’m new to statistics and working in SPSS. I have a 5-level categorical independent variable and several categorical dependent variables, some binary (yes/no), some with more than two levels. I also ...
Psych Col's user avatar
0 votes
0 answers
128 views

I am interested in how the relationship between two traits (trait1 and trait2) varies between groups (A and B) and treatments (C and T). Specifically, I want to know whether the relationship between ...
Teagan LeVar's user avatar
4 votes
1 answer
199 views

How can an SEM model be fitted when the dataset includes both continuous and categorical variables?
SABAH's user avatar
  • 49
5 votes
2 answers
176 views

I am struggling with how to define a random variable which represents complete random treatment assignment in an experiment when there are $K$ levels of treatment. To define some terms: Simple random ...
Vefeagins's user avatar
  • 1,106
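To make the term in this excerpt concrete, here is a minimal sketch of complete random assignment with $K$ treatment levels: the group sizes are fixed in advance and the labels are randomly permuted across units. This contrasts with simple random assignment, where each unit's treatment is drawn independently and the group sizes are themselves random. The function name, group sizes, and seed are illustrative assumptions, not part of the question.

```python
import random
from collections import Counter


def complete_random_assignment(group_sizes, seed=None):
    """Assign treatments 0..K-1 so that treatment k gets exactly group_sizes[k] units."""
    # Build the multiset of labels, then shuffle it: every arrangement of
    # this fixed multiset is equally likely.
    labels = [k for k, size in enumerate(group_sizes) for _ in range(size)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    return labels


# Hypothetical experiment: 6 units, K = 3 treatments, 2 units each.
assignment = complete_random_assignment([2, 2, 2], seed=1)
counts = Counter(assignment)
print(assignment, counts)
```

Under this scheme the assignment vector is a uniformly random permutation of a fixed label multiset, so the individual assignments are dependent across units even though each unit has the same marginal probability of each treatment.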
0 votes
0 answers
56 views

I am looking to determine if the percent cover distribution across canopy/growth form classes differs significantly between ForestType1 and ForestType2. In each forest type, I have about 10 canopy/...
Abigail's user avatar
1 vote
1 answer
321 views

I have a dataset that includes both numeric and categorical variables, and I want to perform cluster analysis. Thus, I chose the Gower distance as the distance metric. Next, I perform agglomerative ...
Elena O.'s user avatar
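As background for the Gower distance named in this excerpt, the sketch below computes it for two records with mixed variable types: numeric variables contribute the absolute difference divided by the variable's range, categorical variables contribute 0 for a match and 1 for a mismatch, and the contributions are averaged with equal weights. The function name, the equal weighting, and the example records are assumptions made for illustration. Missing values are not handled.

```python
def gower_distance(x, y, kinds, ranges):
    """Gower distance between two records with numeric ("num") and categorical ("cat") fields."""
    total = 0.0
    for xi, yi, kind, value_range in zip(x, y, kinds, ranges):
        if kind == "num":
            # Range-scaled absolute difference; a constant variable contributes 0.
            total += abs(xi - yi) / value_range if value_range else 0.0
        else:
            # Simple matching for categorical fields.
            total += 0.0 if xi == yi else 1.0
    return total / len(kinds)


# Hypothetical records: (area in hectares, soil type); the area range is assumed to be 40.
kinds = ["num", "cat"]
ranges = [40.0, None]
d = gower_distance((10.0, "clay"), (30.0, "sand"), kinds, ranges)
print(d)
```

The resulting pairwise distances can be passed to any clustering method that accepts a precomputed dissimilarity matrix.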
0 votes
0 answers
95 views

As part of my research I am running ANCOVAs, using the baseline score of the measure of interest as the covariate. 2 of my outcome measures are subscales made up of 2 items and after testing for ...
Simone's user avatar
  • 21
2 votes
1 answer
153 views

First ever question here so I apologize if I miss any appropriate information. I'm working on some ecological count data of different vegetation classifications (Oaks, pines, grasses, forbs, etc...) ...
Saunders 's user avatar
0 votes
0 answers
46 views

I have a species occurrence dataset (community matrix) where I analyze beetle preferences for tree species and treatment using a GLMM + Tukey HSD. My issue is that species absent in some tree species (...
Claudio Sbaraglia's user avatar
0 votes
0 answers
73 views

I am struggling with collinearity. I have a dataset including 10,000 observations, and all the independent variables are factor variables, such as age group, household size group, ...
Chao's user avatar
  • 333
0 votes
0 answers
53 views

I had to learn about statistical models to approach a genetics project that I inherited: we obtained genotypes for hundreds of biallelic SNPs (possible values for each SNP: 0 = non-carrier, 1 = ...
txen's user avatar
  • 1
6 votes
0 answers
320 views

Assume we are only able to observe a two-way table counting the number of observations of a pair of categorical features $x_i,x_j$. $$ \begin{array}{c|ccc} & & x_j & \\ \hline ...
Three Diag's user avatar
0 votes
0 answers
75 views

I have a population with $k$ categorical variables. I know the distribution of these categories. I would like to randomly choose a sample from my population so that the marginals are uniform. I don'...
Him's user avatar
  • 2,517
1 vote
1 answer
92 views

I ran a series of models in R with two factors involved using the "lmer" function, testing also for the factor interaction. For example: ...
Alice's user avatar
  • 51
1 vote
1 answer
112 views

Comparing differences in preference with 3 values including neutral. Scenario: Analyzing preference data with 3 values (For example: Which do you prefer: Football, Baseball, or no preference (i.e., ...
Brannan's user avatar
  • 13
1 vote
0 answers
31 views

I have data of 2 players playing a card game 21 times. Each game consists of 30 turns. Data has been collected at the end of each turn, specifically, how long the turn took and what type of action was ...
Joshua's user avatar
  • 11
0 votes
0 answers
37 views

I'm modeling a phenomenon which has 10 nominal categorical outcomes. The relative probabilities of these categories are affected by a handful of variables (that I know of and can record). Question 1 ...
NomNomNomenclature's user avatar
5 votes
1 answer
124 views

I am new to working with regression analyses and I am planning to calculate an ordinal regression model. I have an ordinal dependent variable, with 5 possible outcomes. I have several predictors, some ...
user459821's user avatar
0 votes
0 answers
31 views

I was advised to use Multilevel modeling for my data analysis on this platform. The model speaks to the data structure and my study's theoretical framework. I thus find it suitable. Here is the model ...
Amelia Nicodemus's user avatar
0 votes
0 answers
70 views

I have data on 4 roughly but not perfectly balanced groups, about 60 subjects per group. For each subject, I observe each of 10 binary variables with overall prevalences between 10% and 90%, no ...
mathducky's user avatar
0 votes
2 answers
73 views

I don't think I can show the data, but in a linear regression model, I have (in addition to a couple of other variables) an interaction term between the continuous variable age and the categorical variable health. ...
Matthew's user avatar
3 votes
2 answers
223 views

I am trying to run a binary logistic on different factors, to establish which factors shape the study phenomena. This data is for the people above a specified age. This means the number of people in ...
Amelia Nicodemus's user avatar
2 votes
2 answers
165 views

I have a numeric variable with village sizes (in hectares) and a categorical variable with four soil types. I would like to investigate if soil type is associated with village size using R. I have ...
Kelly_Lee's user avatar
3 votes
1 answer
207 views

A MANOVA test seems to be a good fit for the following study except that the dependent variables are not interval as required. I want to examine how heritage and non-heritage learners differ on two ...
Binnan G's user avatar
1 vote
0 answers
52 views

I'm reading the documentation of the Amelia R package. The Ordinal section of the documentation states that ordinal variables include dichotomous variables, and one example is gender, ...
robertspierre's user avatar
3 votes
1 answer
130 views

I have categorized my education dataset for the analysis below. However, I have one respondent who attended a missionary school whose level I do not know, and I am unsure where to ...
Amelia Nicodemus's user avatar
