Porto Seguro Challenge

João Paulo Martins
Data Scientist at XNV

Introduction:

Competition for marketing space is fierce. Nowadays, every company that wants even a slight advantage needs AI to select the best customers and increase the ROI of its marketing campaigns. Our team at Amalgam.ai is developing solutions for this field.

The Challenge:

In this competition we were challenged to build a model that predicts the probability of a customer purchasing a product.

The metric chosen to score the predictions was the F1 score, which is the harmonic mean of the precision and the recall of the predictions.
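To make the metric concrete, here is a minimal sketch that computes precision, recall and F1 for binary labels in plain Python. The function name `f1_score_binary` and the toy labels are ours for illustration; they are not part of the competition material.

```python
def f1_score_binary(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so we report 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for this example.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

Note that F1 is computed on hard 0/1 labels, so predicted probabilities have to be turned into labels with a threshold before they can be scored.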

The Dataset:

The dataset for this challenge had 70 columns, in the following order:

1. One target column;

2. One id column;

3. 68 columns holding variables of different types.

All the columns of this dataset were anonymized for this challenge, which added a layer of complexity to the problem because the column names gave no hint of what each variable meant.

Our Approach:

Feature Engineering:

To tackle this problem we focused heavily on feature engineering. As a first step, we tried to locate and analyze the variables that contributed the most to predicting the label.

Once we had identified these variables, we started to generate new features from them. These features were generated by extracting statistics, creating relations with other features, and grouping the data with different sets of parameters.
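The three kinds of features mentioned above can be sketched with pandas as follows. This is only an illustration under our own assumptions: the column names `var_a`, `var_b` and `var_c` are hypothetical (the real columns were anonymized), and the specific statistics are examples rather than the exact features used in the competition.

```python
import pandas as pd


def add_engineered_features(df):
    """Return a copy of df with a few illustrative engineered features."""
    out = df.copy()

    # 1. Statistics: row-wise summary over a set of numeric columns.
    num_cols = ["var_a", "var_b"]  # hypothetical column names
    out["num_mean"] = out[num_cols].mean(axis=1)
    out["num_std"] = out[num_cols].std(axis=1)

    # 2. Relations between features: a simple ratio.
    # The +1.0 guards against division by zero when var_b is non-negative.
    out["a_over_b"] = out["var_a"] / (out["var_b"] + 1.0)

    # 3. Grouping: aggregate one column within the groups of another,
    # then compare each row with its group.
    out["a_mean_by_c"] = out.groupby("var_c")["var_a"].transform("mean")
    out["a_minus_group_mean"] = out["var_a"] - out["a_mean_by_c"]
    return out


# Toy data, made up for this example.
toy = pd.DataFrame({
    "var_a": [1.0, 3.0, 5.0, 7.0],
    "var_b": [1.0, 1.0, 3.0, 3.0],
    "var_c": ["x", "x", "y", "y"],
})
features = add_engineered_features(toy)
```

Using `transform` keeps the group statistic aligned with the original rows, so the new column can be added without a merge.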

Modeling:

Once the feature engineering was in good shape, we started the modeling part. This was done using classical machine learning models, such as XGBoost, LightGBM and CatBoost.

One approach that was very effective was to initialize the same model with different seeds and average their predictions. This method produced very robust predictions in a simple manner.

Another approach that worked very well was to use AutoGluon, an AutoML library that builds stacked models and ensembles automatically.

Once the models were trained, we combined all of them into an ensemble. Our solution used an ensemble of LightGBM + XGBoost + CatBoost + AutoGluon stacking. To build the ensemble we used majority voting between all the models.
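A minimal sketch of seed averaging is shown below. To keep it self-contained it uses scikit-learn's `GradientBoostingClassifier` on synthetic data as a stand-in; the helper `seed_average_proba`, the factory `make_model`, the seed list and the hyperparameters are all our assumptions, not the settings used in the competition.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split


def seed_average_proba(make_model, seeds, X_train, y_train, X_test):
    """Train one model per seed and average the positive-class probabilities."""
    probas = []
    for seed in seeds:
        model = make_model(seed)
        model.fit(X_train, y_train)
        probas.append(model.predict_proba(X_test)[:, 1])
    return np.mean(probas, axis=0)


def make_model(seed):
    # Stand-in model. subsample < 1 makes training stochastic,
    # so different seeds give different models.
    return GradientBoostingClassifier(
        n_estimators=50, subsample=0.8, random_state=seed
    )


# Synthetic data, only to make the example runnable.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

avg_proba = seed_average_proba(make_model, [0, 1, 2, 3, 4], X_train, y_train, X_test)
```

The factory takes the seed as its only argument, so it can return any estimator that exposes `fit` and `predict_proba`.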
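Majority voting over several models can be sketched as below. The post does not say how probabilities were turned into votes or how ties were broken, so the 0.5 threshold and the rule that ties go to the negative class are assumptions made for this example, as are the function name `majority_vote` and the toy probabilities.

```python
import numpy as np


def majority_vote(probas_by_model, threshold=0.5):
    """Combine per-model positive-class probabilities by majority vote.

    probas_by_model: list of 1-D sequences, one per model, same length.
    """
    # Each model casts a 0/1 vote per sample; shape is (n_models, n_samples).
    votes = np.stack([np.asarray(p) >= threshold for p in probas_by_model])
    positive_votes = votes.sum(axis=0)
    # Strict majority: with an even number of models a tie is possible,
    # and here a tie is resolved in favour of class 0 (an assumption).
    return (positive_votes * 2 > votes.shape[0]).astype(int)


# Toy probabilities from four hypothetical models for four samples.
preds = majority_vote([
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.4, 0.6],
    [0.7, 0.3, 0.7, 0.2],
    [0.6, 0.6, 0.3, 0.1],
])
```

With four models, as in the ensemble described above, two-against-two splits can occur, so the tie rule does affect the final labels and is worth choosing deliberately.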

Conclusion:

This was a very fun and interesting challenge. We learned a lot, mostly about how to handle anonymized datasets. With the approach presented here, we finished the competition in 15th place out of 174 teams.

Posted in Amalgam on 15/10/21.
