Jose R. Zapata

Data Science Technical Leader
PhD in Information and Communication Technologies

Mercado Libre

Data Science Technical Leader at Mercado Libre. Researcher at GIDATIC and Professor at the Faculty of Information and Communication Technologies (TIC), Universidad Pontificia Bolivariana (UPB).

My interests center on data science, MLOps, and audio and music information technologies, including music information retrieval, machine learning, CI/CD/CT, audio analysis, and data analysis.

  • Data science
  • MLOps
  • Music Information Retrieval
  • Audio signal processing
  • Python
  • Guitar
  • PhD in Information and Communication Technologies, 2013

    Universitat Pompeu Fabra (UPF), Spain

  • MEng in Telecommunications, 2008

    Universidad Pontificia Bolivariana (UPB), Colombia

  • BSc in Electronic Engineering, 2002

    Universidad Pontificia Bolivariana (UPB), Colombia



Pandas, NumPy, Matplotlib, Plotly, Seaborn, BeautifulSoup, Selenium

Data Science

Python, SQL, Pandas, BigQuery

Machine Learning & Deep Learning

Scikit-learn, PyTorch, NLP, spaCy


Kedro, MLflow, Deepchecks, DVC, Great Expectations

Signal Processing

and Frequency Analysis of signals


SQL, Git, CI/CD, Bash, Linux


In Spanish


Base Structure for Data Science Projects

Process of creating a base structure for data science projects with Python, taking into account good software development practices and MLOps tools.

Lofi Album with Python, Version Zero

Automatic generation of LoFi music and videos with Python using artificial intelligence and audio processing techniques.

Step by Step in a Machine Learning Project

Checklist and questions for carrying out a machine learning project.

Worldwide Coronavirus (COVID-19) Data Visualization with Plotly

Visualizations of worldwide COVID-19 coronavirus data with Python and Plotly.

PySpark with Google Colab

Configuring Google Colab to use PySpark.

Audio Production Tutorial with Reaper

Video tutorial and material on basic audio production with Reaper.

Tips for Presenting the Degree Project

Tips for creating the slides and delivering the presentation of the degree project proposal.


Data science project template
Template for data science projects with software development tools
Research project exploring the use of machine learning techniques for computational musicology, digital music archive management, and music information retrieval. Two main elements are at the core of the project: 1. An emphasis on semi-supervised and unsupervised machine learning techniques that rely minimally on the availability of annotated data for a specific task. 2. Traditional Colombian music as the main focus of the study.
Multifeature Beat tracker
MATLAB implementation of the Multi-Feature Beat Tracker (Information Gain and Regularity), the beat tracker used in Essentia. More details in Multi-Feature Beat Tracking.
Essentia
Open-source C++ library for audio analysis and audio-based music information retrieval. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. More details in Essentia: An Audio Analysis Library for Music Information Retrieval
SMC Beat tracking Dataset
This beat tracking dataset contains 217 excerpts of around 40 s each, of which 19 are easy and the remaining 198 are hard. It was designed to support radically new techniques that can contend with challenging beat tracking situations such as quiet accompaniment, expressive timing, changes in time signature, slow tempo, and poor sound quality. More details in Selective Sampling for Beat Tracking Evaluation.


(2016). Tempo Estimation. In Music Data Analysis: Foundations and Applications (pp. 493-510), Claus Weihs, Dietmar Jannach, Igor Vatolkin, Guenter Rudolph (Eds.), Chapman & Hall/CRC, 2016, ISBN 9781498719568.


(2014). Multi-Feature Beat Tracking. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 4, pp. 816-825.

(2013). Using voice suppression algorithms to improve beat tracking in the presence of highly predominant vocals. The 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), pp. 51-55, Vancouver, Canada.

(2012). Selective Sampling for Beat Tracking Evaluation. IEEE Transactions on Audio, Speech and Language Processing, Vol. 20, No. 9, pp. 2539-2548.

(2012). Assigning a confidence threshold on automatic beat annotation in large datasets. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), pp. 157-162, Porto, Portugal.

(2012). Improving Beat Tracking in the presence of highly predominant vocals using source separation techniques: Preliminary study. The 9th International Symposium on Computer Music Modeling and Retrieval (CMMR 2012), London, UK.

(2012). On the automatic identification of difficult examples for beat tracking: towards building new evaluation datasets. The 37th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), pp. 89-92, Kyoto, Japan.

(2011). Comparative Evaluation and Combination of Audio Tempo Estimation Approaches. 42nd AES International Conference, Semantic Audio (Audio Engineering Society), pp. 198-207, Ilmenau, Germany.

(2008). Efficient Detection of Exact Redundancies in Audio Signals. Audio Engineering Society Convention 125, Paper 7504, San Francisco, USA.

(2004). Control de Rango Dinamico en Audio con Logica Difusa. 3ra Conferencia Iberoamericana en Sistemas, Cibernética e Informática CISCI 2004, Orlando, USA.



Current and past, postgraduate and undergraduate courses at the Universidad Pontificia Bolivariana (UPB)

Postgraduate Courses

  • Python for Data Science
  • Audio Data Mining
  • Principles of Image Processing
  • Audio Processing
  • Signal Processing
  • Telecommunications Principles
  • Matlab

Undergraduate Courses

  • Machine Learning
  • Signals and Systems
  • Data Mining
  • Principles of Audio
  • Audio Applications
  • Telecommunications Theory & Lab
  • Tx Lines and antennas


Mercado Libre
Data Science Technical Leader
Sep 2021 – Present Medellin - Colombia
Data Scientist Sr
Dec 2020 – Sep 2021 Medellin - Colombia
Universidad Pontificia Bolivariana
Full Professor
Sep 2013 – Dec 2021 Medellin - Colombia
Music Technology Group - Universitat Pompeu Fabra
PhD Student
Sep 2009 – Sep 2013 Barcelona - Spain
Thesis: Comparative evaluation and combination of automatic rhythm description systems. Advisor: Emilia Gomez
PhD Internship
Apr 2011 – Jun 2011 Porto - Portugal
Advisors: Fabien Gouyon, Matthew E.P. Davies
Universidad Pontificia Bolivariana
Assistant Professor
Jun 2003 – Sep 2009 Medellin - Colombia