Exploratory Data Analysis (EDA)

By Jose R. Zapata

Last update: 19/Feb/2026

Exploratory Data Analysis (EDA) aims to understand the structure, patterns, and main characteristics of a dataset before applying any model or advanced technique. Through visualizations and descriptive statistics, EDA allows us to detect outliers, missing data, unexpected distributions, and possible errors in the data, providing a solid foundation for decision-making in later stages of the project.
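As a minimal sketch of that first look at a dataset, the snippet below uses pandas on a small invented table. The column names (`age`, `income`, `segment`) and all values are hypothetical and serve only to illustrate counting missing data and computing descriptive statistics.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset: names and values are invented for illustration.
df = pd.DataFrame(
    {
        "age": [23, 35, 41, np.nan, 29, 52],
        "income": [2100.0, 3400.0, 3900.0, 2800.0, np.nan, 45000.0],
        "segment": ["a", "b", "b", "a", None, "c"],
    }
)

# Size of the dataset: number of rows and columns.
n_rows, n_cols = df.shape

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Descriptive statistics (count, mean, std, min, quartiles, max).
# By default describe() summarizes only the numeric columns.
numeric_summary = df.describe()

print(f"{n_rows} rows x {n_cols} columns")
print(missing_per_column)
print(numeric_summary)
```

The unusually large maximum of `income` in the summary is the kind of signal that would prompt a closer look before modeling.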

Univariate analysis focuses on examining each variable individually to understand its distribution, central tendency, dispersion, and shape. This makes it possible to identify how each feature behaves on its own, detect outliers, and assess whether transformations need to be applied before moving on to more complex analyses.
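One way to sketch a univariate analysis is shown below for a single hypothetical numeric variable, `income`, with invented values. It computes central tendency, dispersion, and skewness, and flags outliers with the common 1.5 × IQR rule. That rule is one convention among several, chosen here only for illustration.

```python
import pandas as pd

# Hypothetical variable with one extreme value; data invented for illustration.
income = pd.Series(
    [2100, 3400, 3900, 2800, 3100, 45000], name="income", dtype="float64"
)

# Central tendency.
mean = income.mean()
median = income.median()

# Dispersion.
std = income.std()
q1 = income.quantile(0.25)
q3 = income.quantile(0.75)
iqr = q3 - q1

# Shape: positive skewness indicates a long right tail.
skewness = income.skew()

# Outlier detection with the 1.5 * IQR rule (a convention, not the only option).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = income[(income < lower) | (income > upper)]

print(f"mean={mean:.1f}, median={median:.1f}, std={std:.1f}, iqr={iqr:.1f}")
print(f"skewness={skewness:.2f}")
print("outliers:", outliers.tolist())
```

A mean far above the median together with positive skewness suggests a right-skewed distribution, which is a typical case where a transformation such as a logarithm might be considered.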

Bivariate analysis studies the relationship between two variables simultaneously, seeking to identify associations, correlations, or dependencies between them. This type of analysis is essential for discovering which variables might influence the target variable and for formulating hypotheses about possible cause-and-effect relationships, keeping in mind that an association on its own does not establish causation.
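A bivariate analysis could be sketched as follows, assuming a hypothetical table with two numeric columns (`hours_studied`, `exam_score`) and one categorical column (`group`); all values are invented. The example measures linear association with Pearson correlation, monotonic association with Spearman correlation (computed here as Pearson correlation on the ranks), and compares a numeric variable across categories.

```python
import pandas as pd

# Hypothetical data; names and values are invented for illustration.
df = pd.DataFrame(
    {
        "hours_studied": [1, 2, 3, 4, 5, 6, 7, 8],
        "exam_score": [52, 55, 61, 64, 70, 74, 79, 85],
        "group": ["x", "x", "x", "x", "y", "y", "y", "y"],
    }
)

# Numeric vs numeric: Pearson correlation measures linear association.
pearson = df["hours_studied"].corr(df["exam_score"])

# Spearman correlation is the Pearson correlation of the ranks,
# so it captures any monotonic relationship, linear or not.
spearman = df["hours_studied"].rank().corr(df["exam_score"].rank())

# Categorical vs numeric: compare the mean of a variable per category.
score_by_group = df.groupby("group")["exam_score"].mean()

print(f"pearson={pearson:.3f}, spearman={spearman:.3f}")
print(score_by_group)
```

A high correlation here describes how the two columns move together in this sample; it does not by itself show that one variable causes the other.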

Multivariate analysis examines the interaction between three or more variables at the same time, allowing us to capture complex relationships that are not visible in lower-dimensional analyses. This approach is key to understanding how multiple factors act together, identifying groups or segments within the data, and reducing the dimensionality of the problem before the modeling stage.
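The sketch below illustrates two common multivariate tools on synthetic data: a correlation matrix across all variables and a principal component analysis (PCA) computed with NumPy's singular value decomposition. The variables `x1`, `x2`, and `x3` are hypothetical and generated at random, with `x1` and `x2` driven by a shared latent factor. PCA is used here as one example of dimensionality reduction, not as the only option.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption for illustration): x1 and x2 share a latent
# factor, while x3 is independent noise.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
df = pd.DataFrame(
    {
        "x1": latent + rng.normal(scale=0.1, size=n),
        "x2": 2 * latent + rng.normal(scale=0.1, size=n),
        "x3": rng.normal(size=n),
    }
)

# Pairwise correlations between all variables at once.
corr_matrix = df.corr()

# PCA via SVD on standardized data: each squared singular value is
# proportional to the variance captured by one principal component.
standardized = (df - df.mean()) / df.std()
_, singular_values, _ = np.linalg.svd(standardized.to_numpy(), full_matrices=False)
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

print(corr_matrix.round(2))
print("explained variance ratio:", explained_variance_ratio.round(3))
```

Because `x1` and `x2` carry nearly the same information, the first component captures most of their joint variance, which suggests that two components could describe these three variables with little loss.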
