
5th Place Solution: AmExpert

@kanav wrote:

Thanks to American Express and Analytics Vidhya for organizing such a great learning hackathon. The dataset was really interesting and offered countless new things to explore.
Approach:

Step 1

I started with a very basic approach: converting all the features (user_id, product, webpage_id, campaign_id and the other user features) to the category type, building common time features like hour, weekday and day, and fitting a simple CatBoost model. Using a simple train/test split, I initially got a local CV of 0.61 and a public score of 0.601.
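A minimal sketch of that baseline, assuming a timestamp column named DateTime and a target named is_click (my assumptions, not confirmed by the post); the id columns are cast to strings so CatBoost treats them as categorical, mirroring the "convert to category" step:

import pandas as pd
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

train = pd.read_csv("train.csv", parse_dates=["DateTime"])

# Treat the id-like columns as categorical
cat_cols = ["user_id", "product", "webpage_id", "campaign_id"]
for col in cat_cols:
    train[col] = train[col].astype(str)

# Simple time features
train["hour"] = train["DateTime"].dt.hour
train["weekday"] = train["DateTime"].dt.weekday
train["day"] = train["DateTime"].dt.day

X = train.drop(columns=["is_click", "DateTime"])
y = train["is_click"]
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

clf = CatBoostClassifier(iterations=500, eval_metric="AUC", verbose=100)
clf.fit(X_tr, y_tr, cat_features=cat_cols, eval_set=(X_val, y_val))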

Then I introduced groupby count features such as: how many times a user has appeared, how many times they have come across a particular product, webpage or campaign, how many times they appeared in a day, and so on. Trying various combinations in this manner, I made 16 new features in total.
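A sketch of how such groupby count features can be built with pandas; the specific column combinations below are illustrative, not the exact 16 used:

def add_count_feature(df, cols):
    # Number of rows sharing the same values in `cols`
    name = "count_" + "_".join(cols)
    df[name] = df.groupby(cols)[cols[0]].transform("size")
    return df

for combo in [["user_id"], ["user_id", "product"], ["user_id", "webpage_id"],
              ["user_id", "campaign_id"], ["user_id", "day"]]:
    train = add_count_feature(train, combo)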

Using history data:

I made similar count features again using {day, minute, date, user_id, product, week_day} as above. In addition, I made features such as how many times a person showed interest (mean, sum and count). Taking various combinations of features in the same way, I made 19 new intuitive variables in total. This boosted my local CV to 0.6460 and my public leaderboard score to 0.631+.
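A sketch of the interest aggregates from the history log; the file name historical_user_logs.csv and an action column where "interest" marks the positive action are my assumed schema:

hist = pd.read_csv("historical_user_logs.csv", parse_dates=["DateTime"])
hist["interest"] = (hist["action"] == "interest").astype(int)  # assumed schema

# Per-user mean, sum and count of the interest flag, merged back onto train
agg = hist.groupby("user_id")["interest"].agg(["mean", "sum", "count"])
agg.columns = ["hist_interest_mean", "hist_interest_sum", "hist_interest_count"]
train = train.merge(agg.reset_index(), on="user_id", how="left")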

At this stage I was pretty sure that my local and public scores were well synced and that feature engineering was the key to winning:

I started brainstorming plenty of new features; some worked and some didn't.

Time (in seconds) since the previous appearance: for each row, if the same user has a previous row, how much time has elapsed since it.

Similarly, the time since the same product or webpage was last seen, and so on; I did this for various combinations of user, product and webpage. (Local CV = 0.647, leaderboard almost the same.)

Then I added the next-click time in the same way, and voilà, my score jumped: a local score of 0.652 and a public score of 0.637.
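A sketch covering both features, the gap to the user's previous click and to their next one; doing the same per user-product or user-webpage pair just means grouping by more columns:

# Sort by user and time so that diffs run along each user's timeline
train = train.sort_values(["user_id", "DateTime"])
g = train.groupby("user_id")["DateTime"]

# Seconds since this user's previous row, and until their next row
train["prev_click_gap"] = g.diff().dt.total_seconds()
train["next_click_gap"] = (-g.diff(-1)).dt.total_seconds()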

At this stage I was really happy with my score and thought of starting prep for my exam the next day, but thanks to the Russian masters and Mohsin sir _/_, the cycle of feature engineering started again.

I created new features such as: which category (product/webpage/product_category) the user interacted with previously (7 combinations).
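A sketch of these "previous value" features via a grouped shift; the column list is illustrative:

# train must already be sorted by user_id and DateTime (as above)
for col in ["product", "webpage_id", "product_category"]:
    train[f"prev_{col}"] = train.groupby("user_id")[col].shift(1)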

Target encoding using the trick described here: https://kaggle2.blob.core.windows.net/forum-message-attachments/225952/7441/high%20cardinality%20categoricals.pdf

I calculated each category's contribution to the target, weighted by a confidence factor based on its count.
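A sketch of the smoothing scheme from that paper (Micci-Barreca-style encoding): blend each category's target mean with the global prior using a count-based confidence weight. The smoothing constants k and f below are my own illustrative choices, and in practice the encoding should be computed out-of-fold to avoid leakage:

import numpy as np

def smooth_target_encode(cat_col, target, k=20, f=10):
    # Global prior plus per-category mean and count
    prior = target.mean()
    stats = target.groupby(cat_col).agg(["mean", "count"])
    # Confidence weight: sigmoid in the category count
    lam = 1.0 / (1.0 + np.exp(-(stats["count"] - k) / f))
    mapping = lam * stats["mean"] + (1.0 - lam) * prior
    return cat_col.map(mapping)

train["product_te"] = smooth_target_encode(train["product"], train["is_click"])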

Concatenated the history and the merged file and recalculated the total counts and all the features built so far.
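A sketch of how I read this step: stack the history rows with the train rows and recompute the counts over the union (the columns used are illustrative):

both = pd.concat([hist[["user_id", "product"]],
                  train[["user_id", "product"]]], ignore_index=True)
total = (both.groupby(["user_id", "product"]).size()
             .rename("total_user_product_count").reset_index())
train = train.merge(total, on=["user_id", "product"], how="left")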

These features helped me get into the 0.64 range on the leaderboard, along with a very strong local score of 0.659.

Finally, I tuned the LightGBM model using Bayesian optimization and got the following optimal parameters:

clf2 = lgb.LGBMClassifier(max_depth=9, num_leaves=44, n_estimators=200,
                          learning_rate=0.1, reg_alpha=0.5914, subsample=0.8747,
                          colsample_bytree=0.3668, reg_lambda=0.14,
                          min_split_gain=0.0083, min_child_weight=36)
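A sketch of how such a search could be run with the bayesian-optimization package; the parameter bounds and CV setup are my assumptions, not the exact ones used:

import lightgbm as lgb
from sklearn.model_selection import cross_val_score
from bayes_opt import BayesianOptimization  # pip install bayesian-optimization

def lgb_auc(max_depth, num_leaves, reg_alpha, subsample, colsample_bytree):
    # X, y: the feature matrix and target built above
    # (categoricals as pandas category dtype so LightGBM accepts them)
    clf = lgb.LGBMClassifier(
        max_depth=int(max_depth), num_leaves=int(num_leaves),
        n_estimators=200, learning_rate=0.1, reg_alpha=reg_alpha,
        subsample=subsample, colsample_bytree=colsample_bytree)
    return cross_val_score(clf, X, y, cv=3, scoring="roc_auc").mean()

opt = BayesianOptimization(lgb_auc, pbounds={
    "max_depth": (4, 12), "num_leaves": (20, 64),
    "reg_alpha": (0.0, 1.0), "subsample": (0.5, 1.0),
    "colsample_bytree": (0.2, 1.0)}, random_state=42)
opt.maximize(init_points=5, n_iter=25)
print(opt.max)  # best AUC and the parameters that achieved it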

Training 5 LightGBM, 4 CatBoost and 1 XGBoost models and ensembling them crossed 0.661 locally and 0.6418 on the public LB. At this stage I realized that I had built a very robust model, which proved very stable on the private LB as well, with a score of 0.6433.
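A sketch of the final blend, assuming models holds the ten fitted classifiers and X_test the test features (both my placeholders): a plain average of predicted probabilities. Rank-averaging or weighted blends are common variants.

import numpy as np

# Average the positive-class probability across all fitted models
final_pred = np.mean([m.predict_proba(X_test)[:, 1] for m in models], axis=0)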
Thanks again AV & Amex!
