Summaries

Regression - I

18 October 2023, 08:30 André Maria da Silva Dias Moitinho de Almeida


* Typical problem in data science: dataset + model + cost function
* Model determination: Training and testing samples
* Bias/Variance trade-off
* Overfitting
* Regression as a special case of general model fitting and selection
* Linear Regression: Linear models and norms (L2 and L1)
* Ordinary least squares (OLS)
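
The topics above can be illustrated with a minimal sketch, which is not taken from the lecture material: a straight-line model fitted by OLS on a training sample, with the L2 and L1 costs evaluated on a separate testing sample. The synthetic dataset, the even/odd split, and the function names (`fit_ols_line`, `l2_cost`, `l1_cost`) are assumptions made for illustration only. In practice a library such as NumPy or scikit-learn would typically be used.

```python
import random


def fit_ols_line(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def l2_cost(x, y, intercept, slope):
    """Sum of squared residuals (squared L2 norm of the residual vector)."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def l1_cost(x, y, intercept, slope):
    """Sum of absolute residuals (L1 norm of the residual vector)."""
    return sum(abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y))


# Synthetic dataset (assumption): y = 1 + 2x plus Gaussian noise.
rng = random.Random(0)
x = [i / 10 for i in range(50)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.1) for xi in x]

# Training sample: even indices. Testing sample: odd indices.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

intercept, slope = fit_ols_line(x_train, y_train)
test_l2 = l2_cost(x_test, y_test, intercept, slope)
test_l1 = l1_cost(x_test, y_test, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
print(f"test L2 cost={test_l2:.3f} test L1 cost={test_l1:.3f}")
```

Evaluating the cost on data not used in the fit is what exposes overfitting: a model that is too flexible lowers the training cost while the testing cost grows.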

Dimensionality Reduction

11 October 2023, 11:00 André Maria da Silva Dias Moitinho de Almeida


Computational exercises in dimensionality reduction, including its use as input to clustering

Dimensionality Reduction

11 October 2023, 08:30 André Maria da Silva Dias Moitinho de Almeida


* High dimensional data in physics
* The curse (and blessing) of dimensionality
* Methods for dimensionality reduction
- Principal component analysis (PCA)
- Kernel PCA
- Manifold learning: t-distributed Stochastic Neighbor Embedding (t-SNE)
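
As a rough illustration of the first method listed, the sketch below finds the first principal component of a small two-dimensional dataset. It is not the lecture's own code: the synthetic data, the helper names (`covariance_matrix`, `leading_eigenvector`), and the use of power iteration to obtain the leading eigenvector of the covariance matrix are assumptions chosen to keep the example free of dependencies. A library routine (for instance an eigendecomposition or SVD) would normally be preferred.

```python
import math
import random


def covariance_matrix(data):
    """Return the sample covariance matrix and the mean-centred data."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def leading_eigenvector(cov, rng, n_iter=200):
    """Power iteration: unit eigenvector with the largest eigenvalue."""
    d = len(cov)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random starting direction
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    # Rayleigh quotient: variance of the data along v.
    eigenvalue = sum(v[i] * cov[i][j] * v[j]
                     for i in range(d) for j in range(d))
    return v, eigenvalue


# Synthetic data (assumption): points scattered along the diagonal x = y.
rng = random.Random(0)
data = []
for _ in range(200):
    t = rng.gauss(0.0, 3.0)
    data.append([t + rng.gauss(0.0, 0.3), t + rng.gauss(0.0, 0.3)])

cov, centered = covariance_matrix(data)
pc1, variance_pc1 = leading_eigenvector(cov, rng)

# Fraction of the total variance captured by the first component.
explained_fraction = variance_pc1 / sum(cov[i][i] for i in range(len(cov)))

# Reduced (one-dimensional) representation of each point.
scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
print("first component:", pc1)
print("explained variance fraction:", explained_fraction)
```

The sign of the component is arbitrary. Kernel PCA and t-SNE are nonlinear and are not covered by this sketch.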

Structure in data

4 October 2023, 11:00 André Maria da Silva Dias Moitinho de Almeida


Computational exercises in clustering

Structure in data

4 October 2023, 08:30 André Maria da Silva Dias Moitinho de Almeida


* Exploratory data analysis: why and how
* Statistical description of the observed structure
* Density estimation: inferring the Probability Density Function
- Kernel Density Estimation
- Nearest-Neighbor Density Estimation
* Finding groups: clustering
- K-means
- Hierarchical: agglomerative clustering
- Density-based: DBSCAN
- Going further: UPMASK, and more
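
Two of the items above, kernel density estimation and K-means, can be sketched on the same one-dimensional sample. This is an illustrative example under stated assumptions, not the course's implementation: the bimodal synthetic sample, the bandwidth of 0.4, the initial centres, and the function names (`gaussian_kde`, `kmeans_1d`) are all hypothetical choices.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function estimating the PDF as a sum of Gaussian kernels."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
                          for xi in sample)

    return density


def kmeans_1d(sample, centers, n_iter=50):
    """Lloyd's algorithm in one dimension, starting from the given centres."""
    centers = list(centers)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centre.
        groups = [[] for _ in centers]
        for x in sample:
            nearest = min(range(len(centers)),
                          key=lambda k: abs(x - centers[k]))
            groups[nearest].append(x)
        # Update step: move each centre to the mean of its group
        # (an empty group keeps its previous centre).
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Synthetic bimodal sample (assumption): two Gaussian groups.
rng = random.Random(1)
sample = ([rng.gauss(-2.0, 0.5) for _ in range(150)]
          + [rng.gauss(3.0, 0.5) for _ in range(150)])

density = gaussian_kde(sample, bandwidth=0.4)
centers = kmeans_1d(sample, centers=[min(sample), max(sample)])
print("estimated density at -2, 0.5, 3:",
      density(-2.0), density(0.5), density(3.0))
print("K-means centres:", sorted(centers))
```

The density estimate is high near the two group centres and low between them, and K-means with two centres recovers the same two groups. The bandwidth controls the smoothness of the estimate, and K-means requires the number of clusters to be chosen in advance, unlike DBSCAN.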