Biography

I am a Data Scientist working for ING Wholesale Bank Advanced Analytics (WBAA) in Amsterdam.

Interests

  • Natural Language Processing
  • Machine Learning
  • Evolutionary Computation

Education

  • PhD in Computer Science, 2016

    Federal University of Minas Gerais

  • MSc in Computer Science, 2012

    Universidade Federal do Rio Grande do Sul

  • BSc in Computer Science, 2010

    Universidade Federal de Itajuba

Experience

Data Scientist

ING WBAA

Feb 2019 – Present Amsterdam, Netherlands
  • Name matching: Fuzzy matching of millions of names in multiple stages (cosine similarity + MLP on (Py)Spark)
  • Name screening: Reducing false positives using Levenshtein distance, cosine similarity and Jaro-Winkler similarity for feature extraction with LightGBM
  • Anomaly detection (One-class SVM, Isolation Forest), address parsing (libpostal), database statistics monitor
  • Snorkel labelling: Predicting label dependencies using robust PCA

Data Scientist

Corl Financial Technologies

Mar 2018 – Nov 2018 Toronto, Canada

I helped build predictive models for investment risk in startups. The process involved:

  • Retrieving data from different sources, including scraping data from the web (using Selenium with Beautiful Soup).
  • Analysing the data using Jupyter notebooks, pandas and matplotlib.
  • Building Machine Learning models using Random Forest and SVM to fit our prediction problems.
  • Feature selection/engineering.

Project Manager

Universidade Federal de Minas Gerais

Mar 2018 – May 2018 Belo Horizonte, Brazil

I worked as Project Manager on the EU-Brazil project ATMOSPHERE.

I managed resources, tracked the status of deliverables and delegated activities across three Brazilian universities: UFMG, Unicamp and UFAM.

Contributing Researcher

Universidade Federal de Minas Gerais

Aug 2017 – Jan 2019 Belo Horizonte, Brazil

I worked on two projects with the LaIC (Computational Intelligence Laboratory) research group:

  1. Reducing the exponential size of solutions generated by the Geometric Semantic Genetic Programming (GSGP) framework. The main objective is to reduce the solutions in order to improve their interpretability and reduce memory and computational cost.

  2. Analysing datasets used as benchmarks for Genetic Programming (GP)-based methods from a Data Science perspective: we intend to gather datasets employed by the main publications in the GP field in the last five years and analyse the viability of using GP to induce regression models.

Postdoctoral Researcher

University College Cork

Aug 2017 – Dec 2017 Cork, Ireland

I worked on the development of autonomous tugs capable of towing aircraft on the airport ground, from the runway to the gates during arrivals and from the gates to the runway during departures.

Activities performed:

  1. Surveyed current advances in autonomous vehicles and worked on optimization of ground routes in airports;

  2. Worked in coordination with United Technologies;

  3. Worked with Python (built a parser for XML airport maps) and Java/CPLEX (route optimization).

Postdoctoral Researcher

Universidade Federal de Minas Gerais

Oct 2016 – Aug 2017 Belo Horizonte, Brazil

I worked with Geometric Semantic Genetic Programming (GSGP) on two main projects:

  1. A study investigating aspects related to the semantic distribution of the functions employed by geometric semantic operators;

  2. An investigation of the impact of different instance selection techniques on GSGP and its robustness to noisy data.

This involved:

  • Management of a research team (three researchers);
  • Development using Java (Genetic Programming framework), R (hypothesis tests and plotting) and shell script (text/data manipulation).

Accomplishments

Nomination for Best Paper Award

Solving the Exponential Growth of Symbolic Regression Trees in Geometric Semantic Genetic Programming

Joao Francisco Martins, Luiz Otavio V. B. Oliveira, Luis Fernando Miranda, Felipe Casadei, Gisele Pappa

Nomination for Best Paper Award

How Noisy Data Affects Geometric Semantic Genetic Programming

Luis F. Miranda, Luiz Otavio V. B. Oliveira, Joao Francisco B. S. Martins, Gisele L. Pappa

Best Paper Award

A Dispersion Operator for Geometric Semantic Genetic Programming

Luiz Otavio V. B. Oliveira, Fernando E. B. Otero, Gisele Lobo Pappa

Nomination for Best Paper Award

The Effect of Distinct Geometric Semantic Crossover Operators in Regression Problems

Julio Albinati, Gisele Lobo Pappa, Luiz Otavio V. B. Oliveira, Fernando Otero

Contact

  • Frankemaheerd 2, Amsterdam, 1102AN
  • ING Cedar, B Tower