Summaries
Regression - I
18 October 2023, 08:30 • André Maria da Silva Dias Moitinho de Almeida
* Typical problem in data science: dataset + model + cost function
* Model determination: Training and testing samples
* Bias/Variance trade-off
* Overfitting
* Regression as a special case of general model fitting and selection
* Linear Regression: Linear models and norms (L2 and L1)
* Ordinary least squares (OLS)
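As an illustration of the topics above, here is a minimal sketch of ordinary least squares evaluated on separate training and testing samples. The synthetic data, the 150/50 split, and the helper names `fit_ols` and `mse` are assumptions made for this example and are not taken from the lecture material.

```python
import numpy as np

def fit_ols(X, y):
    """Return the coefficients that minimise the L2 norm of (y - X @ beta)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def mse(X, y, beta):
    """Mean squared error: the cost function used to score the model."""
    residuals = y - X @ beta
    return float(np.mean(residuals ** 2))

# Synthetic dataset (assumed for illustration): y = 1 + 2x plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=200)

# Linear model: a column of ones for the intercept, then the feature itself.
X = np.column_stack([np.ones_like(x), x])

# Split into training and testing samples.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

beta_hat = fit_ols(X_train, y_train)
train_mse = mse(X_train, y_train, beta_hat)
test_mse = mse(X_test, y_test, beta_hat)
print("coefficients:", beta_hat)
print("train MSE:", train_mse, "test MSE:", test_mse)
```

A testing error much larger than the training error would be the usual symptom of overfitting. With a two-parameter model and 150 training points, the two errors should be similar here.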
Dimensionality Reduction
11 October 2023, 11:00 • André Maria da Silva Dias Moitinho de Almeida
Computational exercises in dimensionality reduction and its use as input to clustering
Dimensionality Reduction
11 October 2023, 08:30 • André Maria da Silva Dias Moitinho de Almeida
* High dimensional data in physics
* The curse (and blessing) of dimensionality
* Methods for dimensionality reduction
- Principal component analysis (PCA)
- Kernel PCA
- Manifold learning: t-distributed Stochastic Neighbor Embedding (t-SNE)
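As an illustration of the first method listed above, here is a minimal sketch of principal component analysis computed from the singular value decomposition of the centred data. The function name `pca` and the synthetic three-dimensional dataset are assumptions made for this example. Kernel PCA and t-SNE are not covered by it.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its leading principal components.

    Returns the projected data, the component directions (as rows),
    and the variance explained by each retained component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    projected = X_centered @ components.T
    return projected, components, explained_variance

# Synthetic data (assumed for illustration): three measured features that
# all depend on a single hidden variable t, plus a small amount of noise.
rng = np.random.default_rng(1)
t = rng.normal(0.0, 3.0, size=500)
X = np.column_stack([t, 2.0 * t, -t]) + rng.normal(0.0, 0.1, size=(500, 3))

Z, components, variances = pca(X, n_components=2)
print("first principal direction:", components[0])
print("explained variances:", variances)
```

Because the data are intrinsically one-dimensional, almost all of the variance should fall on the first component, which is the situation where reducing the dimension loses little information.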
Structure in data
4 October 2023, 11:00 • André Maria da Silva Dias Moitinho de Almeida
Computational exercises in clustering
Structure in data
4 October 2023, 08:30 • André Maria da Silva Dias Moitinho de Almeida
* Exploratory data analysis: why and how
* Statistical description of the observed structure
* Density estimation: inferring the Probability Density Function
- Kernel Density Estimation
- Nearest-Neighbor Density Estimation
* Finding groups: clustering
- K-means
- Hierarchical: Agglomerative clustering
- Density-based: DBSCAN
- Going further: UPMASK, and more
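As an illustration of the first clustering method listed above, here is a minimal sketch of K-means using Lloyd's iterations. The function names, the deterministic farthest-first initialisation (chosen here only to make the example reproducible), and the two synthetic blobs are all assumptions made for this example. They are not taken from the lecture material.

```python
import numpy as np

def farthest_first_init(X, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far (a simple heuristic)."""
    centroids = [X[0]]
    for _ in range(1, k):
        chosen = np.array(centroids)
        dists = np.linalg.norm(X[:, None, :] - chosen[None, :, :], axis=2)
        centroids.append(X[np.argmax(dists.min(axis=1))])
    return np.array(centroids)

def kmeans(X, k, n_iter=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic data (assumed for illustration): two well-separated 2D blobs.
rng = np.random.default_rng(2)
blob_a = rng.normal([0.0, 0.0], 0.5, size=(100, 2))
blob_b = rng.normal([10.0, 10.0], 0.5, size=(100, 2))
X = np.vstack([blob_a, blob_b])

labels, centroids = kmeans(X, k=2)
print("centroids:", centroids)
```

K-means assumes roughly compact, similarly sized groups. The hierarchical and density-based methods listed above relax that assumption in different ways.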