Channel: Data Science, Analytics and Big Data discussions - Latest topics

Is it possible to classify text with Regex?


@vitaly1 wrote:

I’m trying to build a decision-tree-style model that takes a series of strings as input.

I’m using WEKA with the J48 classifier and StringToWordVector as a filter.

As far as I know, many classifiers (regression, for example) work with numbers rather than strings, and for now I don’t want to maintain a mapping between strings and numbers.

I’ve created .arff files for the training data and the test data. Training data:

@relation test

@attribute class-att {OUTPUT_1,OUTPUT_2,OUTPUT_3}
@attribute Text1 string
@attribute Text2 string
@attribute Text3 string
@attribute Text4 string
@attribute Text5 string

@data 
OUTPUT_1,'a','b','c','d','e'
OUTPUT_2,'a','b','c','d','?' 
OUTPUT_2,'a','b','?','?','?'

OUTPUT_3,'f','g','h','i','j'
OUTPUT_3,'f','g','h','i','?'

% Here, instead of '?', I want something like a regex wildcard that matches any character.

Test data:

@relation test

@attribute class-att {OUTPUT_1,OUTPUT_2,OUTPUT_3}
@attribute Text1 string
@attribute Text2 string
@attribute Text3 string
@attribute Text4 string
@attribute Text5 string

@data
?,'a','b','c','d','e'
?,'a','b','c','d','x'
?,'a','b','q','w','r'
?,'f','g','h','i','j'
?,'f','g','h','i','x'

How can I get the classifier to treat a ‘?’ as a regex-style wildcard wherever it appears?

Any suggestions, please? :slight_smile:
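
One possible reading of the question, sketched outside WEKA: treat each training row as a rule, turn every '?' field into a regex wildcard, and label a test row with the first rule that matches it. This is only an illustration in plain Python, not a J48 or StringToWordVector feature. The names `row_to_regex`, `specificity`, and `classify` are hypothetical, and two choices are assumptions of the sketch: a '?' matches any field value, and the rule with the fewest wildcards wins.

```python
import re

# Training rows copied from the post: (class label, five string fields).
# Assumption of this sketch: a field equal to "?" means "any value".
TRAINING_ROWS = [
    ("OUTPUT_1", ["a", "b", "c", "d", "e"]),
    ("OUTPUT_2", ["a", "b", "c", "d", "?"]),
    ("OUTPUT_2", ["a", "b", "?", "?", "?"]),
    ("OUTPUT_3", ["f", "g", "h", "i", "j"]),
    ("OUTPUT_3", ["f", "g", "h", "i", "?"]),
]


def row_to_regex(fields):
    """Build one regex per row; '?' becomes 'any run of non-comma characters'."""
    parts = [r"[^,]*" if f == "?" else re.escape(f) for f in fields]
    return re.compile(",".join(parts))


def specificity(fields):
    """Count the literal (non-wildcard) fields in a row."""
    return sum(1 for f in fields if f != "?")


# Most specific rules first, so an exact row beats a wildcard row.
RULES = sorted(
    ((label, row_to_regex(fields), specificity(fields))
     for label, fields in TRAINING_ROWS),
    key=lambda rule: rule[2],
    reverse=True,
)


def classify(fields):
    """Return the label of the first matching rule, or None if nothing matches."""
    joined = ",".join(fields)  # assumes field values contain no commas
    for label, pattern, _ in RULES:
        if pattern.fullmatch(joined):
            return label
    return None


if __name__ == "__main__":
    test_rows = [
        ["a", "b", "c", "d", "e"],
        ["a", "b", "c", "d", "x"],
        ["a", "b", "q", "w", "r"],
        ["f", "g", "h", "i", "j"],
        ["f", "g", "h", "i", "x"],
    ]
    for row in test_rows:
        print(row, "->", classify(row))
```

With these rules the five test rows come out as OUTPUT_1, OUTPUT_2, OUTPUT_2, OUTPUT_3, OUTPUT_3. Note that this is rule matching rather than learning: a row that matches no pattern gets no label, whereas a trained tree would always predict some class.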
