Deleting an Amazon S3 path from Apache Spark

To access the Amazon S3 service from an Apache Spark application, refer to this post. Then it is necessary to import the following classes:

import java.net.URI
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

With these classes imported, the following statement deletes the given path; the second argument, `true`, makes the delete recursive, so directories are removed together with their contents:

FileSystem.get(new URI("s3n://bucket"), sc.hadoopConfiguration).delete(new Path("s3n://bucket/path_to_delete"), true)
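The one-liner above can be unpacked into a small helper that also checks whether the path exists before deleting it. This is a sketch, not the post's original code: it assumes a live `SparkContext` named `sc` (as in the statement above), and the bucket and path names are placeholders.

```scala
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: delete an S3 path if it exists, returning whether a delete happened.
// Assumes `sc` is an active SparkContext; bucket/path values are illustrative.
def deleteS3Path(sc: org.apache.spark.SparkContext, pathStr: String): Boolean = {
  // Resolve the FileSystem for the path's scheme (e.g. s3n://bucket)
  // using the Hadoop configuration carried by the SparkContext.
  val fs = FileSystem.get(new URI(pathStr), sc.hadoopConfiguration)
  val path = new Path(pathStr)
  // delete(path, recursive = true) removes directories and their contents.
  if (fs.exists(path)) fs.delete(path, true) else false
}

// Usage (hypothetical path):
// deleteS3Path(sc, "s3n://bucket/path_to_delete")
```

Note that `FileSystem.delete` already returns `false` rather than throwing when the path is absent, so the `exists` check is only there to make the intent explicit.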
