Channel: Data Science, Analytics and Big Data discussions - Latest topics
Browsing all 4448 articles
Browse latest View live

Natural Language Processing

@rishu4398 wrote: I am working on a dataset that has links to some tweets instead of the actual tweets in text format. Can I apply NLP to this dataset, and if so, how can I achieve this? Posts: 1...

View Article


Copy of files from one HDFS to another

@sandeepkrishna1 wrote: We are trying to make a copy of HDFS. How do we ensure that the copy is the same as the source? Will there be any intermediate checks? Posts: 1 Participants: 1 Read full topic

View Article


Assistance on where to start for analyzing employee engagement data

@chadh712 wrote: I don’t have a background in data science, I am an HR guy who was just pushed into a role of taking our HR data and turning it into something meaningful. I am not totally worthless, I...

View Article

Addressing seasonality in predictive modelling (specifically decision trees)

@roma1 wrote: Can anyone tell me which techniques are used in industry to handle seasonality in data while building predictive models using decision trees? I came across this article on Analytics Vidhya, but it...

View Article

Training Object Detection Algorithm

@cuhsailus wrote: Hello, I am a student working on a project to create an autonomous race car. In order to stay on track, the car needs to be able to detect the cones which mark the lane. I am new to...

View Article


NLP Tasks & Techniques

@jainayush007 wrote: I found the below post very useful. However, I am sure there are newer developments for each of these tasks and many more applications since the release of this post. Can we have...

View Article

Test statistic best for questionnaire

@querida wrote: In the case of a questionnaire, there are 4 dependent variables and 1 independent variable. Which test statistic can I use to analyse it? Posts: 1 Participants: 1 Read full topic

View Article

Which log transformation to use for moderately skewed data with zero values

@sajjid wrote: Hello, I am geospatially analyzing a school district’s data that includes a variety of variables such as salary, turnover percentage, number of schools within district, students count,...

View Article
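
For the zero-value question above, one commonly used option is the log(1 + x) transform, which stays defined at zero. A minimal sketch, assuming the values are non-negative; the helper name `log1p_transform` and the sample numbers are made up for illustration and do not come from the poster's data.

```python
import math

def log1p_transform(values):
    """Apply log(1 + x) to each value; defined for x = 0, unlike log(x)."""
    if any(v < 0 for v in values):
        raise ValueError("log1p_transform expects non-negative values")
    return [math.log1p(v) for v in values]

# Hypothetical counts including zeros (illustrative only).
sample = [0, 1, 9, 99]
transformed = log1p_transform(sample)
```

Zeros map to zero and the ordering of the values is preserved; whether this is the right transform for moderately skewed data is still a judgment call for the analyst.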


Seeking guidance to create a knowledge base for a chatbot using ML

@genigaus wrote: Hi, please guide me on detecting questions and answers in PDF documents containing medicinal drug-related information, to create a knowledge base for a chatbot, as I am completely new in...

View Article


Statistically summarizing data

@kynda wrote: I have a big data frame for omics data. Samples are named as Genotype_Time_Replicate (e.g. AOX_1h_4). Each sample has 4 replicates for each time point. # Sample data frame given df <-...

View Article

Regression analysis wrong predictions? Help appreciated (full analysis...

@dujegilja wrote: Hi all, I created a web scraper that would collect car data from a particular site. I collected a little more than 1000 cars (different Audi models) and some information about them. I...

View Article

How can we protect roads from getting damaged by water

@mandora3 wrote: What alternatives can be implemented in road construction so that the damage can be reduced? Posts: 1 Participants: 1 Read full topic

View Article

Getting a NaN error when working on the loan prediction problem

@sravya97 wrote: I am getting a NaN error when working on the loan prediction problem. Could anyone help with this? Posts: 1 Participants: 1 Read full topic

View Article


Number of features of the model must match the input

@Jayanti_Bhanushali wrote: I have developed a Linear Regression model using SKlearn that involves dummy variables (due to categorical variables as input). I created a pickle file and loaded it in...

View Article
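
A frequent cause of this kind of mismatch is that dummy-encoding new input produces different columns than the training data did. The post's code is not shown, so this is only a minimal sketch of one workaround, assuming the training-time column list was saved alongside the model. The helper `encode_row` and the column names are hypothetical, and plain Python is used rather than any particular library call.

```python
def encode_row(row, training_columns):
    """One-hot encode a single record against a fixed, saved column list.

    String-valued features become "<feature>_<value>" columns. Columns seen
    in training but absent from this row are filled with 0, and categories
    unseen in training are dropped, so the vector length always matches.
    """
    encoded = {}
    for key, value in row.items():
        if isinstance(value, str):
            encoded[f"{key}_{value}"] = 1
        else:
            encoded[key] = value
    return [encoded.get(col, 0) for col in training_columns]

# Hypothetical column order saved at training time.
training_columns = ["age", "city_Delhi", "city_Mumbai", "city_Pune"]

vector = encode_row({"age": 30, "city": "Pune"}, training_columns)
```

The design point is that the column list is fixed once, at training time, and every later input is aligned to it instead of being encoded on its own.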

One vs one SVM for classes 1 and 5 but I have 10 classes in total. Should I...

@shounakrockz47 wrote: I want to use soft margin SVM for my dataset. My dataset contains 10 digits - 0 to 9. I need to train using a one-vs-one SVM classifier (one digit is class +1 and another digit is...

View Article


AutoSpearman, feature selection R

@m.social.insurance wrote: I am still new to the field of ML. I am now working on the NASA Database (JM1) and want to use feature selection; one of the software metrics is called AutoSpearman in R....

View Article

Judging whether a dataset is balanced

@mayona wrote: When can we say a dataset is unbalanced? My dataset is quite small: I have 527 rows, with 354 in class 1 and 173 in class 0. Is this considered an unbalanced dataset? Also, I wonder how to...

View Article
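
To put numbers on the question above, the class shares and the majority-to-minority ratio can be computed directly from the counts quoted in the post. This is a small sketch; the helper name `class_balance` is hypothetical, and whether the resulting ratio counts as "unbalanced" remains a judgment call that depends on the task.

```python
from collections import Counter

def class_balance(labels):
    """Return each class's share of the data and the majority/minority ratio."""
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {cls: n / total for cls, n in counts.items()}
    ratio = max(counts.values()) / min(counts.values())
    return shares, ratio

# Class counts quoted in the post: 354 rows of class 1, 173 rows of class 0.
labels = [1] * 354 + [0] * 173
shares, ratio = class_balance(labels)
```

For these counts, class 1 makes up about 67% of the rows and the ratio is roughly 2:1.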


Negative Adjusted R squared

@borahdurlove wrote: Can adjusted R squared be negative? If yes, how? Posts: 1 Participants: 1 Read full topic

View Article
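
Adjusted R squared can indeed go below zero: with n samples and p predictors it is 1 - (1 - R^2)(n - 1)/(n - p - 1), and the penalty factor can outweigh a small R^2. A short sketch of that arithmetic follows; the function name `adjusted_r2` and the example numbers are illustrative assumptions, not taken from the post.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Illustrative numbers: a weak fit with many predictors for few rows.
weak_fit = adjusted_r2(r2=0.05, n_samples=20, n_predictors=5)

# A strong fit with plenty of rows stays close to its plain R^2.
strong_fit = adjusted_r2(r2=0.90, n_samples=100, n_predictors=2)
```

Here `weak_fit` comes out negative because (1 - 0.05) * 19 / 14 is greater than 1, which typically signals that the predictors explain less than would be expected by chance for that many terms.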

A Beginner’s Guide to Channel Attribution Modeling in Marketing (using Markov...

@ashrayber wrote: Hi all! I am trying to deploy the channel attribution package in R, following the steps provided in the article “A Beginner’s Guide to Channel Attribution Modeling in Marketing...

View Article

How to display the count during reduceByKey

@vidhkarthigeyan wrote: Hi, I have written a function in PySpark to count the occurrences of each word in a text file. I have mostly followed one of the articles on your site, but I am unable to display the count...

View Article
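
In PySpark, `reduceByKey` is a lazy transformation, so the counts are typically only materialized by an action such as `collect()`. Since the post's code is not shown, the sketch below only mimics the map, reduce-by-key, and collect pattern in plain Python. The helper `reduce_by_key` and the sample lines are hypothetical stand-ins, not Spark APIs.

```python
from operator import add

def reduce_by_key(pairs, func):
    """Merge the values of each key with func, like a local reduceByKey."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged

# Hypothetical input lines standing in for the text file.
lines = ["to be or not to be", "to see or not to see"]

# Map step: emit (word, 1) for every word.
pairs = [(word, 1) for line in lines for word in line.split()]

# Reduce step, then iterate over the result so it can be displayed.
counts = reduce_by_key(pairs, add)
for word, count in sorted(counts.items()):
    print(word, count)
```

With an actual RDD, the equivalent last step would usually be to loop over the list returned by `collect()` on the reduced RDD and print each (word, count) pair.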