Why do rows increase when I try to do feature engineering?
@premsheth wrote: Hello, I am a beginner in data science and I am working on the Big market sale dataset using R. Now when I do data manipulation and add "Outlet_year" columns from...
Gini criterion in decision tree classifier
@syed.danish wrote: Hi, I am trying to apply decision trees for classification using the following code: from sklearn import tree; clf = tree.DecisionTreeClassifier(criterion='entropy'); clf.fit(X, Y) I am...
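For context on the question above: scikit-learn's DecisionTreeClassifier uses criterion='gini' by default, and the posted code switches it to 'entropy'. The two impurity measures can be sketched in plain Python as follows. This is a minimal illustration, not scikit-learn's implementation, and the function names gini_impurity and entropy_impurity are hypothetical.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy_impurity(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k)) over the class proportions."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


# A perfectly mixed two-class node is the worst case for both measures.
mixed = ["yes", "yes", "no", "no"]
print(gini_impurity(mixed), entropy_impurity(mixed))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why the two criteria often produce similar trees.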
How to identify the best-suited algorithm for classification? (Discriminant...
@shashwat.2014 wrote: Hello everyone, Could anyone please explain the criteria for selecting LDA (Linear Discriminant Analysis) over logistic regression in certain kinds of problems? While logistic...
Text categorization
@Sunil0108 wrote: Hi everyone, I need help categorizing texts. I have a list of merchants like this, and we can see that the first few belong to CENTURYLINK and the next ones to SMART ATT. Is there a way to...
What is the default threshold when applying logistic regression in sklearn?
@syed.danish wrote: I am trying logistic regression on the Titanic data set. The code that I used is: from sklearn.linear_model import...
Eliminating Homoscedasticity from the data set before linear regression
@Prateek123 wrote: Hello everyone, There are several assumptions that we make before applying linear regression. These include: 1) Eliminating multicollinearity. 2) Normalization of the variables. 3)...
Downloading images from different URLs and saving them
@Tapojyoti_Paul wrote: I have a list of URLs, each containing an image. I want to download the image from each URL and save it in a folder using R. I saw that there are some solutions for Linux...
Confusion in determining significance of categorical variable using GLM function
@newton2304 wrote: Hi, When trying to predict the Dependents variable using the Married categorical variable (with values YES and NO), passing it through the GLM function gives the following output: Coefficients: Estimate...
How to tune C parameter in SVM in R?
@Corporate_Cowboy wrote: Howdy! I was using SVM in a classification problem. Changes in the cost parameter change the decision boundary. Can you please help me out in getting the optimum cost...
Printing the tree after applying decision tree algorithm for classification in sklearn
@ravi_6767 wrote: Hi all, After applying a decision tree for classification using the following code: from sklearn import tree; clf_tree = tree.DecisionTreeClassifier(); clf_tree.fit(train_x, label_x) I want to print the...
RFdist for unsupervised learning
@nehak wrote: I am trying to use Random Forest for unsupervised learning (creating clusters), but I am receiving the error "could not find Rfdist". I have tried to google the error but couldn't understand and get...
What is the reason for effect of gamma parameter in non-linear classification...
@Corporate_Cowboy wrote: Howdy, The kernel equations in Support Vector Machines are given by: [image: kernel equations] In the RBF kernel, the parameter 'gamma' is present. I found online that as we...
Visualizing K-Means Clustering Algorithm
@syed.danish wrote: This website visualizes the two steps of k-means clustering: 1. Assign: assigning every point in the data to the cluster whose centroid is nearest to it. 2. Optimize:...
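The two steps named in the post above can be sketched in plain Python. The excerpt cuts off before describing the second step, so the version here is the standard k-means update (move each centroid to the mean of its assigned points), which is an assumption about what the post goes on to say. The function names assign, update, and kmeans are hypothetical, and points and centroids are assumed to be tuples of floats.

```python
def assign(points, centroids):
    """Assign step: for every point, the index of its nearest centroid."""
    def sq_dist(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def update(points, labels, centroids):
    """Update step (assumed standard form): centroid = mean of its points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centroids.append(old)  # leave an empty cluster where it is
            continue
        new_centroids.append(
            tuple(sum(dim) / len(members) for dim in zip(*members)))
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the centroids stop moving (max_iter >= 1)."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels)
```

The result depends on the starting centroids, so in practice the procedure is usually run from several initializations.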
Plotting two plots in a single row using ggplot in R
@NSS wrote: Hi there, I encountered a problem today while trying to plot two ggplots side by side. I tried using the par command, but it seemed to work only for the base plotting package and not lattice and...
Extracting coefficients of a linear model fitted to a ggplot using...
@NSS wrote: Hi, So, I was doing a quick analysis on a data set and I plotted a scatter plot between two variables and a regression line using geom_smooth(method="lm") to see the degree of relationship...
Analytics career advice for undergrad Java professionals moving into analytics?
@tarunjain07 wrote: Hello Everyone, I am a complete novice in analytics and desperately looking for some answers. It would be really great if someone could answer the questions below for me. Thanks a...
How to run decision tree algorithm on Spark R?
@Prateek123 wrote: Hi everyone, I am new to the Spark R interface and was trying to implement classification using decision trees. While we do have MLlib in Spark to implement decision trees, I could...
Predictive Modelling on Large Data Set
@ravi_6767 wrote: Hi everybody, I want to do predictive modelling on a Kaggle data set with 29 million observations. When I try to apply KNN or logistic regression in Python, my screen freezes, after...
Best ways to pick up Deep Learning skills
@jalFaizy wrote: Thought I should share this resource I found helpful. Greg Brockman, Co-Founder & CTO at OpenAI, shares his views on how to go about understanding deep learning. Posts: 2...
What is feature space and how to map into feature spaces?
@shashwat.2014 wrote: While studying the KNN algorithm, I came across the fact that KNN assumes data to be in feature space. This fact was given under the assumptions section for KNN. Is there any...