Summaries

Spark RDDs and DataFrames

15 May 2024, 15:00 Mário João Barata Calha


Running a Spark container in Cloud Shell. Spark word count. Using Spark's REPL. RDDs and DataFrames. Mounting a volume on the Spark container to make Python programs accessible to spark-submit.

Spark RDDs and DataFrames

9 May 2024, 15:00 Mário João Barata Calha


Running a Spark container in Cloud Shell. Spark word count. Using Spark's REPL. RDDs and DataFrames. Mounting a volume on the Spark container to make Python programs accessible to spark-submit.

Monitoring

8 May 2024, 18:30 Mário João Barata Calha


Deploying the metrics server. Collecting metrics. Monitoring a Kubernetes deployment with Prometheus and Grafana.

Spark. RDDs, SQL, DataFrames and Datasets

8 May 2024, 16:30 Mário João Barata Calha


Spark RDD transformations and actions. Broadcast variables. Accumulators. Spark SQL operations. DataFrames and Datasets. Relationship to RDDs.

Monitoring

8 May 2024, 15:00 Mário João Barata Calha


Deploying the metrics server. Collecting metrics. Monitoring a Kubernetes deployment with Prometheus and Grafana.