Jose R. Zapata

Data Science Technical Leader
PhD in Information and Communication Technologies

Mercado Libre

Data Science Technical Leader at Mercado Libre. Researcher at GIDATIC and Professor at the Faculty of Information and Communication Technologies (TIC), Universidad Pontificia Bolivariana (UPB).

My interests center on data science, MLOps, and audio and music information technologies, including music information retrieval, machine learning, CI/CD/CT, audio analysis, and data analysis.

Interests
  • Data science
  • MLOps
  • Music Information Retrieval
  • Audio signal processing
  • Python
  • Guitar
Education
  • PhD in Information and Communication Technologies, 2013

    Universitat Pompeu Fabra (UPF), Spain

  • MEng in Telecommunications, 2008

    Universidad Pontificia Bolivariana (UPB), Colombia

  • BSc in Electronic Engineering, 2002

    Universidad Pontificia Bolivariana (UPB), Colombia

Skills

Python

Pandas, Polars, NumPy, Matplotlib, Plotly, Seaborn

Data Science

Python, SQL, Pandas, Polars, BigQuery

Machine Learning & DL

Deep Learning, scikit-learn, PyTorch, NLP, spaCy

MLOps

Kedro, MLflow, Deepchecks, DVC, Great Expectations

Signal Processing

Frequency analysis of signals

Programming

Git, CI/CD, Bash, pre-commit, Linux

Blog

In Spanish

Base Structure for Data Science Projects

How to create a base structure for data science projects with Python, taking into account good software development practices and MLOps tools.

Lofi Album with Python Version Zero

Automatic generation of lo-fi music and videos with Python, using artificial intelligence and audio processing techniques.

Step by Step in a Machine Learning Project

Checklist and questions for carrying out a machine learning project.

Worldwide Coronavirus (COVID-19) Data Visualization with Plotly

Visualizations of worldwide COVID-19 coronavirus data with Python and Plotly.

PySpark with Google Colab

Configuring Google Colab to use PySpark.

Audio Production Tutorial with Reaper

Video tutorial and materials for basic audio production with Reaper.

Tips for Presenting the Degree Project

Tips for creating the slides and presenting the degree project proposal.

Projects

Data science project template
Template for data science projects with software development tools
Social Media Behaviour with exclusive Facebook data
Research project that combines exclusive Facebook data (Condor Dataset, CrowdTangle, Ads) with public data to analyze social media behavior and determine whether there is coordinated inauthentic behavior. This project is funded by the Social Media and Democracy Research Grants from the Social Science Research Council, with access to Facebook data via Social Science One.
Acmus
Research project to explore the use of machine learning techniques for computational musicology, digital music archive management, and music information retrieval. Two main elements are the core of our project: 1. Emphasis on semi-supervised and unsupervised machine learning techniques that minimally rely on the availability of annotated data for a specific task. 2. Traditional Colombian music as the main focus of our study.
Multifeature Beat tracker
MATLAB implementation of the Multi-Feature Beat Tracker (Information Gain and Regularity), the beat tracker used in Essentia. More details in Multi-Feature Beat Tracking.
Essentia
Open-source C++ library for audio analysis and audio-based music information retrieval. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. More details in Essentia: An Audio Analysis Library for Music Information Retrieval
SMC Beat tracking Dataset
This beat tracking dataset contains 217 excerpts of around 40 s each, of which 19 are easy and the remaining 198 are hard. The dataset has been designed for radically new techniques that can contend with challenging beat tracking situations such as quiet accompaniment, expressive timing, changes in time signature, slow tempo, and poor sound quality. More details in Selective Sampling for Beat Tracking Evaluation

Talks

Teaching

Current and past, postgraduate and undergraduate courses at the Universidad Pontificia Bolivariana (UPB)

Postgraduate Courses
  • Python for Data Science
  • Audio Data Mining
  • R for Data Science
  • Audio Processing
  • Signal Processing
  • Telecommunications Principles
  • Matlab
Undergraduate Courses
  • Machine Learning
  • Signals and Systems
  • Data Mining
  • Principles of Audio
  • Audio Applications
  • Telecommunications Theory & Lab
  • Tx Lines and antennas

Experience
 
Mercado Libre
Data Science Technical Leader
September 2021 – Present, Medellin, Colombia
 
Globant
Data Scientist Sr
December 2020 – September 2021, Medellin, Colombia
 
Universidad Pontificia Bolivariana
Full Professor
September 2013 – December 2021, Medellin, Colombia
 
Music Technology Group - Universitat Pompeu Fabra
PhD Student
September 2009 – September 2013, Barcelona, Spain
Thesis: Comparative evaluation and combination of automatic rhythm description systems. Advisor: Emilia Gomez
 
SMC Group - INESC TEC
PhD Internship
April 2011 – June 2011, Porto, Portugal
Advisors: Fabien Gouyon, Matthew E.P. Davies
 
Universidad Pontificia Bolivariana
Assistant Professor
June 2003 – September 2009, Medellin, Colombia