[BUG] Local docker failed to run spark-shell on Mac M1 #115

@thinh2

Description

Willingness to contribute

Yes. I can contribute a fix for this bug independently.

OpenHouse version

v0.5.62

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.0): Apple M1, macOS Sonoma 14.5, Docker Desktop v4.30
  • JDK version: 1.8

Describe the problem

While running the spark-shell command from SETUP.md, the JVM always crashes with a fatal error reported by the Java Runtime Environment.

After investigation, I found that this is a common Docker issue on Apple Silicon MacBooks, caused by a bug in Rosetta (the x86/amd64 emulation layer on Apple Silicon).

More details about this issue can be found in https://github.com/docker/for-mac/issues/7006

While waiting for a fix from Apple, there are several workarounds. For me, downgrading Docker Desktop to [version 4.27.2](https://docs.docker.com/desktop/release-notes/#4272) worked. Other methods are mentioned in https://github.com/docker/for-mac/issues/7006#issuecomment-2122869966.
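One commonly cited workaround from that thread is disabling Rosetta-based x86/amd64 emulation so Docker Desktop falls back to QEMU. A minimal sketch, assuming Docker Desktop stores its settings at ~/Library/Group Containers/group.com.docker/settings.json with a useVirtualizationFrameworkRosetta key (both are assumptions; verify for your Docker Desktop version, and note the same toggle is exposed in Docker Desktop's Settings UI under General):

# Sketch: turn off Rosetta emulation; QEMU is slower but sidesteps the JIT bug.
# Settings path and key name are assumptions; check your Docker Desktop version.
osascript -e 'quit app "Docker"'
sed -i '' 's/"useVirtualizationFrameworkRosetta": *true/"useVirtualizationFrameworkRosetta": false/' \
  "$HOME/Library/Group Containers/group.com.docker/settings.json"
open -a Docker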

Stacktrace, metrics and logs

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007ffffe0b8e1e, pid=692, tid=0x00007fffe86e6700
#
# JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-8u232-b09-1~deb9u1-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x628e1e]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /opt/spark/hs_err_pid692.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
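The full error report referenced above can be copied out of the container for inspection (a sketch; the service name spark is an assumption, so substitute whatever the docker-compose recipe names the Spark container):

# Copy the JVM crash report from the container to the host.
docker compose cp spark:/opt/spark/hs_err_pid692.log .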

Code to reproduce bug

bin/spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:1.2.0   \
  --jars openhouse-spark-runtime_2.12-*-all.jar  \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,com.linkedin.openhouse.spark.extensions.OpenhouseSparkSessionExtensions   \
  --conf spark.sql.catalog.openhouse=org.apache.iceberg.spark.SparkCatalog   \
  --conf spark.sql.catalog.openhouse.catalog-impl=com.linkedin.openhouse.spark.OpenHouseCatalog     \
  --conf spark.sql.catalog.openhouse.metrics-reporter-impl=com.linkedin.openhouse.javaclient.OpenHouseMetricsReporter    \
  --conf spark.sql.catalog.openhouse.uri=http://openhouse-tables:8080   \
  --conf spark.sql.catalog.openhouse.auth-token=$(cat /var/config/$(whoami).token) \
  --conf spark.sql.catalog.openhouse.cluster=LocalHadoopCluster
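To confirm the crash is happening under amd64 emulation, the container and host architectures can be compared (a sketch; spark is again an assumed service name):

# Inside the amd64 image, uname -m reports x86_64 even on an arm64 host,
# which confirms Rosetta/QEMU is in the execution path.
docker compose exec spark uname -m          # expect: x86_64
docker version --format '{{.Server.Arch}}'  # host daemon arch, e.g. arm64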

What component does this bug affect?

  • Table Service: This is the RESTful catalog service that stores table metadata. :services:tables
  • Jobs Service: This is the job orchestrator that submits data services for table maintenance. :services:jobs
  • Data Services: These are the jobs that perform table maintenance. apps:spark
  • Iceberg internal catalog: This is the internal Iceberg catalog for OpenHouse Catalog Service. :iceberg:openhouse
  • Spark Client Integration: This is the Apache Spark integration for OpenHouse catalog. :integration:spark
  • Documentation: This is the documentation for OpenHouse. docs
  • Local Docker: This is the local Docker environment for OpenHouse. infra/recipes/docker-compose
  • Other: Please specify the component.
