Check if an Amazon S3 path exists from Apache Spark

To access the Amazon S3 service from an Apache Spark application, refer to this post. Then it is necessary to import the following classes:

import java.net.URI
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

With these classes imported, the following statement returns true if the path exists and false otherwise:

FileSystem.get(new URI("s3n://bucket"), sc.hadoopConfiguration).exists(new Path("s3n://bucket/path_to_check"))
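The one-liner above can be wrapped in a small helper, which is handy when checking several paths against the same bucket. This is a minimal sketch, not code from the post: the `s3PathExists` helper name is hypothetical, the bucket and path are the placeholders used above, and a `SparkContext` (`sc`) is assumed to be available, e.g. in `spark-shell`.

```scala
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkContext

// Hypothetical helper: returns true if the given S3 path exists.
// The FileSystem handle is resolved once per call from the path's URI,
// so the same helper works for paths in different buckets.
def s3PathExists(sc: SparkContext, path: String): Boolean = {
  val fs = FileSystem.get(new URI(path), sc.hadoopConfiguration)
  fs.exists(new Path(path))
}

// Usage (placeholder bucket/path from the post):
// s3PathExists(sc, "s3n://bucket/path_to_check")
```

Note that the post uses the `s3n://` scheme; on recent Hadoop versions the `s3a://` connector is the maintained one, but the `exists` call is the same either way.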
