Unity Catalog OSS — 01

Ganesh Chandrasekaran
2 min read · Mar 1, 2025

This blog series explores the open-source version of Unity Catalog and how you can use it on your local machine for:

  • Registering tables in the OSS Unity Catalog using local Spark
  • Managing models in the OSS Unity Catalog with open-source MLflow
  • Using it with databases such as DuckDB

In this part, we’ll walk through installing Java and PySpark on macOS and Linux. Windows users should first set up a Linux-compatible environment and then follow these steps.

Installing Java with SDKMAN

SDKMAN simplifies Java installation and version management. Follow these steps:

Install SDKMAN

curl -s "https://get.sdkman.io" | bash

Initialize SDKMAN

source "$HOME/.sdkman/bin/sdkman-init.sh"
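
If the init script is missing (for example, because the install step was skipped), `source` fails with a bare "No such file" error. Here is a minimal sketch of a guarded version. The function name `init_sdkman` is hypothetical, and the fallback path assumes SDKMAN's default install location under `$HOME/.sdkman`.

```shell
# Source SDKMAN's init script only if it exists, so a missing install
# produces a clear message instead of a "No such file" error.
init_sdkman() {
  # SDKMAN_DIR is set by SDKMAN itself; fall back to the default location.
  sdkman_init="${SDKMAN_DIR:-$HOME/.sdkman}/bin/sdkman-init.sh"
  if [ -s "$sdkman_init" ]; then
    . "$sdkman_init"
  else
    echo "SDKMAN init script not found at $sdkman_init" >&2
    return 1
  fi
}

init_sdkman || echo "Run the install step first, then retry." >&2
```

The installer normally adds a similar line to your shell profile, so new terminal sessions should pick up the `sdk` command without this step.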

Verify SDKMAN Installation

sdk version

List Available Java Versions

sdk list java

Install Java (Recommended Version: 17.0.13-tem)

sdk install java 17.0.13-tem

Check Active Java Version

sdk current java

Get Java Home Path

sdk home java 17.0.13-tem

Verify Java Installation

java --version
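
If you have several JDKs installed, it can help to confirm that the active one really is Java 17. The following is a rough sketch, not part of SDKMAN: `java_major` is a hypothetical helper, and the parsing assumes the usual `java -version` output where the version is the first quoted string.

```shell
# Print the major version from a Java version string.
# Handles the modern scheme (17.0.13) and the legacy one (1.8.0_292 -> 8).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy "1.x" scheme: drop the leading "1."
  esac
  echo "${v%%.*}"
}

# Only query java if it is on the PATH.
if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the first quoted string.
  installed="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  if [ "$(java_major "$installed")" = "17" ]; then
    echo "Java 17 is active ($installed)"
  else
    echo "Active Java is $installed, expected 17.x" >&2
  fi
fi
```

If the check reports a different version, `sdk use java 17.0.13-tem` switches the current shell to the version installed above.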

Installing PySpark

Check Python and pip Versions

python3 --version

pip --version

Install PySpark

pip install --user pyspark
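
A quick way to confirm the install landed in the same interpreter you plan to use is to import the package from `python3`. This is a small sketch under that assumption; `pyspark_version` is a hypothetical helper name.

```shell
# Print the installed PySpark version, or fail if python3 cannot import it.
pyspark_version() {
  python3 -c 'import pyspark; print(pyspark.__version__)' 2>/dev/null
}

if pv="$(pyspark_version)"; then
  echo "PySpark $pv is importable"
else
  echo "python3 cannot import pyspark; check which pip installed it" >&2
fi
```

If the import fails even though the install succeeded, `pip` and `python3` probably point at different Python installations. Running `python3 -m pip install --user pyspark` ties the install to the interpreter you are testing.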

Set Environment Variables

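# macOS example path; replace <your-username> and 3.x with your own values.
# On Linux, a --user install typically lands under ~/.local/lib/python3.x/site-packages/pyspark
export SPARK_HOME=/Users/<your-username>/Library/Python/3.x/lib/python/site-packages/pyspark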
export PYSPARK_PYTHON=$(which python3)
export PYSPARK_DRIVER_PYTHON=$(which python3)
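
Hard-coding the site-packages path is easy to get wrong, since it varies by OS and Python version. As an alternative sketch, you could ask Python where the `pyspark` package lives and export that. The helper name `find_spark_home` is hypothetical, and this assumes `python3` is the interpreter PySpark was installed for.

```shell
# Print the directory of the installed pyspark package, or fail quietly.
find_spark_home() {
  python3 -c 'import os, pyspark; print(os.path.dirname(pyspark.__file__))' 2>/dev/null
}

if spark_home="$(find_spark_home)"; then
  export SPARK_HOME="$spark_home"
  export PYSPARK_PYTHON="$(command -v python3)"
  export PYSPARK_DRIVER_PYTHON="$PYSPARK_PYTHON"
  echo "SPARK_HOME=$SPARK_HOME"
else
  echo "pyspark not found; install it before setting SPARK_HOME" >&2
fi
```

To make the variables persist across terminal sessions, add the export lines to your shell profile (for example `~/.zshrc` or `~/.bashrc`).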
