Posts

An introduction to ROC curves with animated examples

Overview: Receiver operating characteristic (ROC) curves are one of the concepts I have struggled with most. Personally, I do not find them intuitive or clear at first glance.

An introduction to uncertainty with Bayesian models

Overview: In this post, we will take a first look at the concept of “uncertainty”. First, we will train two models, logistic regression and its “Bayesian version”, and compare their performance.

Poisson distribution applied in genomics

Overview: In this post, I will briefly discuss what the Poisson distribution is and describe two examples drawn from research articles in genomics. One is based on the distribution of structural variants across the genome, and the other concerns de novo variants in a patient cohort.

Estimating pi value with Monte Carlo simulation

# Load of libraries
library(tidyverse)
library(sp)
library(gganimate)
n_simulations <- 3000
df <- tibble(
  values_x = runif(n_simulations, 0, 1),
  values_y = runif(n_simulations, 0, 1)
)
circleFun <- function(center = c(0, 0), diameter = 1, npoints = 100, start = 0, end = 2) {
  tt <- seq(start*pi, end*pi, length.
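The excerpt above is cut off before the estimate itself is computed. As a minimal sketch of the general idea, and not necessarily the approach the full post takes, the base R snippet below draws the same kind of uniform points and counts how many fall inside a quarter circle of radius 1. The fixed seed, the quarter-circle geometry, and the names `inside` and `pi_estimate` are assumptions made for illustration.

```r
# Minimal sketch (assumptions: fixed seed, quarter circle of radius 1
# inside the unit square; not necessarily the geometry used in the post).
set.seed(42)

n_simulations <- 3000
values_x <- runif(n_simulations, 0, 1)
values_y <- runif(n_simulations, 0, 1)

# A point lies inside the quarter circle when x^2 + y^2 <= 1.
inside <- values_x^2 + values_y^2 <= 1

# The quarter circle covers pi / 4 of the unit square, so the fraction
# of points inside, multiplied by 4, approximates pi.
pi_estimate <- 4 * mean(inside)
print(pi_estimate)
```

With 3000 points, the estimate typically lands within a few hundredths of pi, and the error shrinks roughly with the square root of the number of simulations.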

Exploring world flights using a network approach

Introduction: Recently, I started reading this freely accessible book by Albert-László Barabási. In Chapter 4, he uses the US airport network to illustrate scale-free networks.

Can we predict cases of dengue with climate variables?

Recently, I discovered a new competition website that is not called Kaggle! Its name is DrivenData. DrivenData offers competitions across multiple fields, such as health (oh yes!