How to Set Up PySpark in a Jupyter Notebook in 2 Steps on a Mac

  1. Download Anaconda
  2. Open your Terminal and run conda install -c conda-forge pyspark=3.0 openjdk=8
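
If the install succeeds, a quick optional sanity check (assuming you are still in the same Terminal session) is to confirm the versions that conda pulled in:

pyspark --version   # should report Spark 3.0.x
java -version       # should report OpenJDK 1.8 (Java 8)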

Now you are all set to launch a Jupyter notebook from Anaconda. Once your notebook is open, you can run the following to get Spark running:

import pyspark
from pyspark.sql import DataFrame, SparkSession
from typing import List
import pyspark.sql.types as T
import pyspark.sql.functions as F

# Initiate a local Spark session
spark = SparkSession \
    .builder \
    .appName("Spark App") \
    .getOrCreate()

# Display the session details to confirm Spark is up
spark
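
Evaluating spark on its own line displays the session details (Spark version, master, app name), which confirms the session is live. As one further check, here is a minimal sketch that builds a tiny DataFrame and runs a filter using the F alias imported above; the column names and values are made up purely for illustration:

# Build a small DataFrame to confirm the session works end to end
df = spark.createDataFrame(
    [("Alice", 34), ("Bob", 45)],
    ["name", "age"],
)

# Run a simple transformation and show the result
df.filter(F.col("age") > 40).show()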

Final Thoughts

Check out more Python tricks in this Colab Notebook or in my recent Python Posts.

Thanks for reading!

