Human genetics as a tool for drug discovery

For children with a rare disease, an accurate diagnosis is crucial to provide advice and possible therapies, and to assess the potential risk for family members in future generations. Public initiatives such as the International Rare Diseases Research Consortium (IRDiRC) set the goal for 2017-2027 to “enable all people living with a rare disease to receive an accurate diagnosis, care, and available therapy soon after seeking medical care” (1).

How many genes have been associated with cancer in PubMed?

In the biomedical literature, it is common to find sentences like: “Besides, the gene [gene symbol] has been associated with [type of cancer(s)] [References].” The structure of these sentences changes from article to article, but the underlying idea and goal are the same.

Extracting gene panels from the Genomics England PanelApp

The Genomics England PanelApp provides panels of genes related to human disorders, manually curated by healthcare experts. From a clinical and research perspective, this is a remarkable resource. At the time of writing, over 320 panels had been published.

An introduction to ROC curves with animated examples

Overview: Receiver operating characteristic (ROC) curves are one of the concepts I have struggled with most. Personally, I do not find them intuitive or clear at first glance.

An introduction to uncertainty with Bayesian models

Overview: In this post, we will take a first look at the concept of “uncertainty”. First, we will train two models, a logistic regression and its “Bayesian version”, and compare their performance.

Poisson distribution applied in genomics

Overview: In this post, I will briefly discuss what the Poisson distribution is and describe two examples taken from research articles in the genomics field. One is based on the distribution of structural variants across the genome, and the other on de novo variants in a patient cohort.

Estimating pi value with Monte Carlo simulation

    # Load of libraries
    library(tidyverse)
    library(sp)
    library(gganimate)

    n_simulations <- 3000

    df <- tibble(
      values_x = runif(n_simulations, 0, 1),
      values_y = runif(n_simulations, 0, 1)
    )

    circleFun <- function(center = c(0, 0), diameter = 1, npoints = 100, start = 0, end = 2) {
      tt <- seq(start * pi, end * pi, length.
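The excerpt above is cut off before the estimate itself is computed. As a minimal sketch of the general Monte Carlo idea (not necessarily the exact steps used in the post), the same uniform points can be checked against the quarter circle of radius 1: the fraction that falls inside approximates pi/4. The names `inside` and `pi_estimate` and the fixed seed are assumptions added for illustration; only base R is used.

```r
# Minimal sketch: estimate pi from uniform random points in the unit square.
# `inside`, `pi_estimate` and the seed are illustrative assumptions,
# not names taken from the original post.
set.seed(42)  # fixed seed only so the run is reproducible

n_simulations <- 3000

# Draw points uniformly in [0, 1] x [0, 1], as in the excerpt above
values_x <- runif(n_simulations, 0, 1)
values_y <- runif(n_simulations, 0, 1)

# A point lies inside the quarter circle of radius 1 if x^2 + y^2 <= 1
inside <- values_x^2 + values_y^2 <= 1

# Area of the quarter circle is pi / 4, so scale the observed fraction by 4
pi_estimate <- 4 * mean(inside)

print(pi_estimate)
```

With 3000 points the estimate is typically within a few hundredths of pi; increasing `n_simulations` narrows that spread roughly in proportion to the square root of the sample size.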

Exploring world flights using a network approach

Introduction: Recently, I started reading this freely accessible book by Albert-László Barabási. In Chapter 4, he uses the US airport network to illustrate scale-free networks.

Can we predict cases of dengue with climate variables?

Recently, I discovered a new competition website that is not called Kaggle! Its name is DrivenData. DrivenData offers competitions across multiple fields, such as health (oh yes!