@shashwat.2014 wrote:
Hello everyone,
I am working on a dataset where I want to compare the distribution of a categorical variable (Minimum_Duration) across two sets of rows.
Set 1: All rows of the dataset (total).
Set 2: The rows for which another categorical variable (Preferred_location) takes particular values (the ones with maximum frequency).
I want to do this analysis to determine whether the two variables are related differently within this particular subset. Here is what I did:
This is a snapshot of the 2 variables I am dealing with right now.
I arranged the variable in decreasing order of frequency and formed a dataset of the top 10 levels.
loc<-as.data.frame(table(total$Preferred_location))
loc<-arrange(loc,desc(Freq))
top<-loc[1:10,]
Now the data frame top looks like this:
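As a rough illustration of the step above, here is a self-contained sketch on made-up data. The toy `total` data frame, its city names, and its duration labels are assumptions for illustration only, not the real dataset. It uses base R `order()` in place of `arrange()`, so it runs without dplyr.

```r
# Made-up stand-in for the real `total` data frame (illustration only).
set.seed(1)
total <- data.frame(
  Preferred_location = sample(paste0("City", 1:15), 500, replace = TRUE),
  Minimum_Duration   = sample(c("1 month", "3 months", "6 months"), 500, replace = TRUE)
)

# Frequency table of locations as a data frame with columns Var1 and Freq.
loc <- as.data.frame(table(total$Preferred_location))

# Sort by decreasing frequency (base R counterpart of arrange(loc, desc(Freq))).
loc <- loc[order(loc$Freq, decreasing = TRUE), ]

# Keep the ten most frequent levels.
top <- head(loc, 10)
print(top)
```

The result has one row per location, with the level in `Var1` and its count in `Freq`.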
Now I want to see the frequencies of all the levels of Minimum_Duration for the rows with these 10 Preferred_locations, compared with the frequencies of those levels across the whole dataset.
However, the command to plot them is giving errors:
ggplot(total,aes(x=total$Minimum_Duration[which(total$Preferred_location %in% top$Var1)],fill=as.factor(total$Minimum_Duration)))+geom_bar(position="dodge")
Error: Aesthetics must be either length 1 or the same as the data (300010): x, fill
Please suggest an alternate way to do this.
Thanks in advance!
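One possible reading of the error: the vector passed as `x` inside `aes()` covers only the filtered rows, while `ggplot()` was given the full `total` data frame, so the lengths do not match. A minimal sketch of one way around this is to subset the rows first and refer to bare column names inside `aes()`. The toy data, the names `subset_rows`, `both`, and `props`, and the choice to plot proportions rather than raw counts are all assumptions for illustration. The plot is drawn only if ggplot2 is installed.

```r
# Made-up stand-in for the real `total` data frame (illustration only).
set.seed(1)
total <- data.frame(
  Preferred_location = sample(paste0("City", 1:15), 500, replace = TRUE),
  Minimum_Duration   = sample(c("1 month", "3 months", "6 months"), 500, replace = TRUE)
)

# Ten most frequent locations, as in the question.
loc <- as.data.frame(table(total$Preferred_location))
top <- head(loc[order(loc$Freq, decreasing = TRUE), ], 10)

# Subset the rows BEFORE plotting, so each aesthetic has one value per plotted row.
subset_rows <- total[total$Preferred_location %in% top$Var1, ]

# Stack both sets with a label column so they can be drawn side by side.
both <- rbind(
  data.frame(Minimum_Duration = total$Minimum_Duration, set = "All rows"),
  data.frame(Minimum_Duration = subset_rows$Minimum_Duration, set = "Top 10 locations")
)

# Proportions within each set, so sets of different sizes are comparable.
props <- as.data.frame(prop.table(table(both$Minimum_Duration, both$set), margin = 2))
names(props) <- c("Minimum_Duration", "set", "proportion")

# Dodged bars: one bar per set for each level of Minimum_Duration.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(props, ggplot2::aes(x = Minimum_Duration, y = proportion, fill = set)) +
    ggplot2::geom_col(position = "dodge")
  print(p)
}
```

If raw counts are preferred, dropping the `prop.table()` call and plotting the `Freq` column from `as.data.frame(table(...))` should work the same way, though the subset bars will then always be shorter than the full-data bars.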