Install jenv with a Homebrew command:
>> brew install jenv
Install Java 8:
>> brew install --cask adoptopenjdk8
List all the Java environments:
>> ls /Library/Java/JavaVirtualMachines/
Check if Java 8 is installed:
>> ls /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
Add Java 8 to jenv:
>> jenv add /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
Check the Java environments again and you can see the new environment is added:
>> jenv versions
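Once the new JDK appears in the list, you can make Java 8 the default for your user. The version name below is an example; use whichever string jenv versions prints on your machine:
>> jenv global 1.8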
Then create and update the zsh (or bash) profile:
>> touch ~/.zprofile
>> open ~/.zprofile
Add the lines below to enable the jEnv configuration:
#jEnv Configurations
export PATH="$HOME/.jenv/bin:$PATH"
eval "$(jenv init -)"
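Reload the profile so the jEnv configuration takes effect in the current shell, then confirm the active Java version:
>> source ~/.zprofile
>> java -version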
Step 3 : Setup Scala 2.13.0
Normally I put all my big data development related software in a specific directory.
The usual location is: /Users/surarajpradhan/bigdata_projects/softwares/
So download the Scala binaries for macOS from this link and unzip the file to put Scala 2.13.0 in /Users/surarajpradhan/bigdata_projects/softwares/scala-2.13.0.
Finally, update the PATH environment variable with SCALA_HOME in ~/.zprofile:
>> open ~/.zprofile
Add the lines below:
export SCALA_HOME=/Users/surarajpradhan/bigdata_projects/softwares/scala-2.13.0
export PATH=$SCALA_HOME/bin:$SCALA_HOME/lib:$PATH
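After reloading the profile, a quick sanity check confirms Scala is on the PATH (it should report version 2.13.0):
>> source ~/.zprofile
>> scala -version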
Step 4 : Setup Python 3.9
Normally, Python 3.9 is installed automatically along with Homebrew.
To check whether Python 3.9 is installed, execute the command below:
>> brew list
Then update ~/.zprofile with the lines below:
# Setting PATH for Python 3.9
# The original version is saved in .profile.pysave
PATH="/usr/local/Frameworks/Python.framework/Versions/3.9/bin:${PATH}"
export PATH
export PYSPARK_PYTHON=python3.9
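To confirm the interpreter that PySpark will use, check the Python version (this assumes the Homebrew install put a python3.9 binary on the PATH):
>> source ~/.zprofile
>> python3.9 --version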
Step 5 : Setup Spark 3.2.0
Finally, update the PATH environment variable with SPARK_HOME and the other variables in ~/.zprofile:
>> open ~/.zprofile
Add the lines below:
export SPARK_HOME=/Users/surarajpradhan/bigdata_projects/softwares/spark-3.2.0-bin-hadoop3.2
export PATH=$SPARK_HOME:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
export PYSPARK_PYTHON=python3.9
Set up the local IP address in the environment:
export SPARK_LOCAL_IP=localhost
export SPARK_MASTER_HOST=localhost
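Reload the profile once more and confirm that Spark is picked up from SPARK_HOME:
>> source ~/.zprofile
>> spark-submit --version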
After all this configuration, Spark 3.2.0 should be ready, and you can launch either shell to execute code:
>> pyspark
>> spark-shell
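As a final smoke test, the small PySpark snippet below (a minimal sketch; the app name and sample data are arbitrary) starts a local SparkSession and shows a tiny DataFrame:

from pyspark.sql import SparkSession

# Start a local Spark session (the app name is arbitrary)
spark = SparkSession.builder.master("local[*]").appName("setup-check").getOrCreate()

# Build a tiny DataFrame and display it to confirm the setup works
df = spark.createDataFrame([(1, "spark"), (2, "scala")], ["id", "name"])
df.show()

spark.stop()

Save it as, for example, setup_check.py and run it with spark-submit setup_check.py, or paste the lines into the pyspark shell (there a SparkSession named spark already exists, so the builder line is not needed).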
If you have any difficulties, please let me know and I will try to help you set up the environment.