About me
I'm a graduate student at the University of Toronto and Vector Institute for Artificial Intelligence researching Machine Learning and Information Theory with my advisors Ashish Khisti and Alireza Makhzani. I have a B.Sc. in Electronics Engineering from the Federal University of Santa Catarina (Brazil) where I was advised by Danilo Silva. Previously, I was an engineer at 37.78 working with Machine Learning for healthcare.

I am originally from Florianópolis, Brazil, but I’ve also lived in New Jersey, Orlando, Toronto and São Paulo, as well as smaller cities in the south of Brazil. I enjoy reading, playing American football and KSP.


  • 16/Oct/2020 I’ve joined the Vector Institute as a graduate student researcher.
  • 14/May/2020 I am a Vector Scholarship in Artificial Intelligence Recipient 2020-21.
  • 27/Feb/2020 Starting graduate studies at University of Toronto in Fall/2020.
  • 03/Dec/2019 Proof of Novelty was awarded by Blockchain@UBC.
  • 15/Nov/2019 Finished writing Proof of Novelty.
  • 21/Oct/2019 Preprint of Ward2ICU posted on arXiv.
  • 02/Oct/2019 This page was created.


Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical Notes (BRACIS 2020)

Arthur D Reys, Danilo Silva, Daniel Severo, Saulo Pedro, Marcia M Sá, Guilherme AC Salgado


ICD coding from electronic clinical records is a manual, time-consuming and expensive process. Code assignment is, however, an important task for billing purposes and database organization. While many works have studied the problem of automated ICD coding from free text using machine learning techniques, most use records in the English language, especially from the MIMIC-III public dataset. This work presents results for a dataset with Brazilian Portuguese clinical notes. We develop and optimize a Logistic Regression model, a Convolutional Neural Network (CNN), a Gated Recurrent Unit Neural Network and a CNN with Attention (CNN-Att) for prediction of diagnosis ICD codes. We also report our results for the MIMIC-III dataset, which outperform previous work among models of the same families, as well as the state of the art. Compared to MIMIC-III, the Brazilian Portuguese dataset contains far fewer words per document when only discharge summaries are used. We experiment with concatenating additional documents available in this dataset, achieving a substantial boost in performance. The CNN-Att model achieves the best results on both datasets, with a micro-averaged F1 score of 0.537 on MIMIC-III and 0.485 on our dataset with additional documents.


Proof of Novelty

Daniel Severo


We propose a design for securing novelty of archived content in distributed ledgers, called Proof of Novelty. What counts as novel is decided through a consensus mechanism together with a similarity function, which is selected according to the content type (e.g. full-motion videos, textual documents). Scalability is guaranteed by forming a validation committee with cryptographic sortition, which uses statistical hypothesis testing to decide on the probability of a piece of content being novel or not. The system can trade off computational against statistical performance by manipulating parameters. We discuss the usage of this design to secure the novelty of full-motion videos and end with a proposal of future lines of research that can extend the system's capabilities.

Ward2ICU: A Vital Signs Dataset of Inpatients from the General Ward

Daniel Severo, Flávio Amaro, Estevam R Hruschka Jr, André Soares de Moura Costa


We present Ward2ICU, a proxy dataset of vital signs with class labels indicating patient transitions from the ward to intensive care units. Patient privacy is protected using a Wasserstein Generative Adversarial Network to implicitly learn an approximation of the data distribution, allowing us to sample synthetic data. The quality of data generation is assessed directly on the binary classification task by comparing specificity and sensitivity of an LSTM classifier on proxy and original datasets. We initiate a discussion of unintentionally disclosing commercially sensitive information and propose a solution for a special case through class label balancing.

A Report on the Ziggurat Method

Daniel Severo


This report outlines a highly efficient pseudo-random number generator, the Ziggurat Method, and provides a mathematical proof of its functionality. Simple, ready-to-use code has been provided by previous authors. We contribute a speed test on a modern Intel processor, as well as a Python script that generates all the necessary information to implement a specific version of the algorithm.


Vector Scholarship in Artificial Intelligence Recipient 2020-21

The Vector Scholarship in Artificial Intelligence supports the recruitment of top students to AI-related master’s programs in Ontario. Valued at $17,500 for one year of full-time study at an Ontario university, these merit-based entrance awards recognize exceptional candidates pursuing a master’s program recognized by the Vector Institute or who are following an individualized study path that is demonstrably AI-focused.


NSERC Applied Research Rapid Response to COVID-19 Grant

Our project titled “Canadian Hospital Simulator For Management of COVID19 Cases and Contact Tracing” was awarded $75,000.00.


Virtual Design Challenge Winner 2019

Won 1st place at the VDC hosted by The University of British Columbia with my paper Proof of Novelty. Received a cash prize of $3,000.00.


Student Merit Award & Medal 2015

Graduated with the highest GPA ever obtained (at the time) for my major. Elected “Best Student” by the faculty of Electrical & Electronics Engineering at the Federal University of Santa Catarina.


Science Without Borders Scholarship 2013

Awarded a full scholarship that covered tuition, transportation, necessary materials and living costs to study for two academic semesters at the University of Toronto.

Talks and Media

You may reach me at