Channel: Data Science, Analytics and Big Data discussions - Latest topics
Browsing all 4448 articles

McKinsey Analytics Online Hackathon

@him4318 wrote: Where can I get a solution to the McKinsey Analytics Online Hackathon problem statement? Posts: 1 Participants: 1

Collinearity among categorical variables

@kpksr wrote: Hi, how do I check for multicollinearity among categorical variables? My target variable is also categorical, with 2 possible values. Can I use VIF as in a linear regression model...

Test file loan status missing

@atybzz wrote: Hey guys, I am new to data science and very excited to start learning! I took on this loan prediction challenge. So I cleaned the data, applied a feature extraction technique (PCA) and...

Outer Join in Pandas

@kailash_negi wrote: Hi, I have two pandas DataFrames and need to perform an outer join on them, but I am not getting the desired results. What I want is that when I perform the join, missing data...

How do I use the parameters after I have found them using multiple linear...

@Tan_Moy wrote: I wrote an algorithm for multiple linear regression from scratch using Python, which works as expected. My question is, how do I use the parameters that I got to predict values for the...

Analysis of road accidents in New York

@raj6218a wrote: Identify a New York dataset. Apply a data model algorithm using the RapidMiner tool. Apply the same data model (the one applied in RapidMiner) in Python-related tools. Do a comparison of...

R studio - Read a file in unix from R

@Hari3289 wrote: Is it possible to connect to a remote SSH server (Unix) with a username and password and read a file? I’m using Windows Server and I need to do data profiling on that file. Need your...

How to avoid self-fulfilling prediction in recommendation systems?

@vjk wrote: I have two questions about ML recommendation systems. 1. If an ML system predicts that a user is likely to buy another item, wouldn’t that mean the user is going to buy the recommended item anyway...

Why Does Cleaning and Collecting Data Take So Long?

@peters64s wrote: It is said a vast majority of a data scientist’s time is spent gathering and cleaning data. Why don’t data scientists use data cleaning and wrangling software to save time? What are...

Rstudio importing problem

@premsheth wrote: I am Premal Sheth. I am using RStudio on a Windows 7 machine. I was trying to import text files into RStudio using the VCorpus command for text analysis. But in some files there is the word " 'll...

How to deal with Input(text written by human) -> LOCODE

@mhadjis wrote: What algorithm (machine learning) do you use for solving this problem: input(text written by human) -> United Nations Code for Trade and Transport Locations (UN/LOCODE). For...

How to connect TM1 REST API to R

@mukund840 wrote: Hi all, I have recently been working on Cognos TM1, which is a planning, budgeting and forecasting tool, but it is limited in statistical calculations. To add to its capability I want to...

Dealing with zeros in log transformation

@santoshb7 wrote: How can we deal with the 0s in data when transforming it to log scale? Also, we don’t want to discard the 0s. Posts: 1 Participants: 1

How to implement the DBSCAN algorithm in Python without using any packages?

@owenhe wrote: I want to use Python for a data analysis task that will use the DBSCAN algorithm, because I prefer Python to other languages or tools. However, I am not allowed to use any...

t-SNE and subsequent clustering

@bici_sancta wrote: Hello, I have a question about the validity of, or problems associated with, using clustering methods (e.g., k-means, spectral, or DBSCAN, …) on a data set that has been dimensionally...

Are PCA eigenvalues equivalent to semipartial correlation coefficients in...

@Mah1510 wrote: Hi! I am a PhD student in my third month with a background in cognitive neuroscience. I loved statistics and thus developed a rather deep understanding of regression analyses. Now, I...

Regular intervals - Time series analysis

@karanam6uday wrote: Hi, I have two columns, with dates in the first column and values in the second. I would like to check whether the first column has dates at regular intervals. Data: Jan-2017 234, Mar-2017 567...

Interactive visuals

@Fahim_Ahmad wrote: Hi dears, I have a data set with 4 variables; below is how my data set looks. Columns: m8, m7, q1, percent. Rows: 2007 Kabul right 43.60567; 2007 Kapisa right 37; 2007 Parwan right 62.68935; 2007 Wardak...

Logistic Regression in R

@amrita4friends wrote: In SAS, to model 1s rather than 0s, we use the descending option. We do this because, by default, proc logistic models 0s rather than 1s. What happens in the case of R’s glm function?...

Using SMOTE function to handle imbalanced data set

@milind275 wrote: I am working on a loan default prediction problem for financial risk assessment. I would like to know a good approach for using the SMOTE function to handle the imbalanced dataset...
