Light PHP application to extract and load data

A lightweight PHP application that moves data between different types of systems, running each data pipe as an independent process. A data pipe consists of an inbound channel and an outbound channel, and it is executed directly from the shell. You can use a tool such as cron to schedule periodic execution of a data pipe.
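The idea of a data pipe can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the application's PHP, and the names `run_pipe`, `read_rows` and `loaded` are hypothetical, not taken from the application itself. It only shows the shape of the concept: one inbound channel that produces records and one outbound channel that consumes them.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def run_pipe(inbound: Callable[[], Iterable[Record]],
             outbound: Callable[[Record], None]) -> int:
    """Pass every record from the inbound channel to the outbound channel.

    Returns the number of records moved.
    """
    count = 0
    for record in inbound():
        outbound(record)
        count += 1
    return count


# Hypothetical inbound channel: yields two in-memory records.
def read_rows() -> Iterator[Record]:
    yield {"id": 1, "name": "alice"}
    yield {"id": 2, "name": "bob"}


# Hypothetical outbound channel: appends each record to a list.
loaded: List[Record] = []
moved = run_pipe(read_rows, loaded.append)
```

Because each pipe is a single call with its own pair of channels, one pipe can be scheduled and run independently of the others.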

Apache Spark application to calculate the relevance of each word from a list of phrases

The following Apache Spark application, written in Scala, calculates the relevance of each word in a list of phrases, using an initial weight value for each phrase.
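The excerpt does not state the scoring formula, so the sketch below rests on an assumption: a word's relevance is the sum of the weights of the phrases it appears in, normalized so that all scores add up to 1. It uses plain Python instead of Spark and Scala to stay self-contained, and the function name `word_relevance` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def word_relevance(weighted_phrases: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Assumed scoring, not necessarily the one used by the original application.

    Each word accumulates the weight of every phrase it occurs in
    (counted once per phrase); scores are then normalized to sum to 1.
    """
    totals: Dict[str, float] = defaultdict(float)
    for phrase, weight in weighted_phrases:
        for word in set(phrase.lower().split()):
            totals[word] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: score / grand_total for word, score in totals.items()}


# Two phrases with initial weights 2.0 and 1.0.
scores = word_relevance([("big data", 2.0), ("big spark", 1.0)])
```

In this example "big" appears in both phrases and collects a weight of 3.0 out of a total of 6.0, so it receives half of the overall relevance.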

Reading and writing Amazon S3 files from Apache Spark

The S3 Native Filesystem client included with Apache Hadoop lets an Apache Spark application running on Hadoop access the Amazon S3 service. It is enough to define the S3 Access Key and the S3 Secret Access Key in the Hadoop configuration of the Spark Context.
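A minimal sketch of that configuration follows, in Python rather than Scala. The property names `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey` are the ones read by the S3 Native Filesystem client. The helper `configure_s3n_credentials`, the stand-in class `FakeHadoopConf` and the placeholder key values are assumptions made for illustration, so the block runs without a Spark installation.

```python
def configure_s3n_credentials(hadoop_conf, access_key, secret_key):
    """Set the S3 Native Filesystem credentials on a Hadoop configuration object.

    `hadoop_conf` only needs a `set(name, value)` method. With PySpark this
    would typically be `sc._jsc.hadoopConfiguration()`.
    """
    hadoop_conf.set("fs.s3n.awsAccessKeyId", access_key)
    hadoop_conf.set("fs.s3n.awsSecretAccessKey", secret_key)
    return hadoop_conf


class FakeHadoopConf:
    """Hypothetical stand-in for a Hadoop configuration, used only for this demo."""

    def __init__(self):
        self.values = {}

    def set(self, name, value):
        self.values[name] = value


# Placeholder credentials; real keys should come from a secure source.
conf = configure_s3n_credentials(FakeHadoopConf(), "MY_ACCESS_KEY", "MY_SECRET_KEY")
```

Once the credentials are set on a real Spark Context, paths with the `s3n://` scheme can be passed to the usual read and write methods, such as `sc.textFile`.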

Bash script to upload files to an Amazon S3 bucket using cURL

The following Bash script copies all files matching a specified local path pattern to an S3 directory. The script uses the cURL command-line tool to upload the files, so neither the AWS CLI nor any other dedicated tool needs to be installed.
