Apache Spark - A unified analytics engine for large-scale data processing. Includes APIs in Scala, Java, Python (known as PySpark), and R (SparkR).
Apache Beam - An open-source implementation of ...
Deploy big data components using Docker Compose. You can use Docker to set up a Hadoop-based big data platform in a few minutes. The Docker images include Hadoop 3+, HBase 2+, Hive 3+, Kafka 2+, Prestodb ...