Channel: Data Science, Analytics and Big Data discussions - Latest topics

Is there any data set that contains duplicate documents and is also categorized?

@shipika wrote:

I have built a model that takes new documents and predicts which category each one belongs to. In addition, if the same document keeps arriving, the model detects it as a duplicate.
To check how well the model performs, I want a large document data set that contains duplicates and is also categorized by label. Please tell me if anyone knows where to get such a data set.
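
If no ready-made data set turns up, one possible fallback is to take any labeled corpus and inject duplicates into it, keeping a record of which rows are copies so that both the category predictions and the duplicate detection can be scored. The sketch below is only an illustration under that assumption: `inject_duplicates`, `dup_fraction`, and the toy documents are hypothetical names and data, not part of the model, and only exact copies are added (no near-duplicates).

```python
import random


def inject_duplicates(docs, labels, dup_fraction=0.2, seed=0):
    """Return shuffled (text, label, dup_of) records with exact copies added.

    dup_of is the index of the source document in `docs` for an injected
    copy, or None for an original document. (Hypothetical helper.)
    """
    if len(docs) != len(labels):
        raise ValueError("docs and labels must have the same length")
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    records = [(text, label, None) for text, label in zip(docs, labels)]
    n_dups = round(len(docs) * dup_fraction)
    for _ in range(n_dups):
        i = rng.randrange(len(docs))  # pick a document to copy
        records.append((docs[i], labels[i], i))
    rng.shuffle(records)  # mix the copies in among the originals
    return records


# Toy labeled corpus, made up for illustration only.
docs = [
    "stocks fell sharply today",
    "the team won the final",
    "new phone released this week",
    "election results announced",
    "rain expected tomorrow",
]
labels = ["business", "sports", "tech", "politics", "weather"]

records = inject_duplicates(docs, labels, dup_fraction=0.4, seed=42)
for text, label, dup_of in records:
    print(label, "| duplicate of", dup_of, "|", text)
```

The `dup_of` field acts as the ground truth for the duplicate detector, while `label` stays available for checking the category predictions.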

Posts: 1

Participants: 1
