Amazon EMR recently introduced software releases that are referenced by a release label rather than by an AMI version. AWS Data Pipeline now supports Amazon EMR 4.x software releases. You can specify the required release in the releaseLabel field of the EmrCluster object in your pipeline. Using the EmrConfiguration object, you can specify cluster configurations, such as choosing the applications to be installed at cluster creation or setting Apache Hadoop environment variables.
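As a rough illustration of how these two objects fit together, the Python sketch below builds a pipeline definition and prints it as JSON. The object IDs, the release label "emr-4.1.0", the application list, and the heap-size value are placeholders chosen for this example. The field names other than releaseLabel (applications, configuration, classification, property, key, value) reflect our reading of the pipeline definition format and should be checked against the AWS Data Pipeline documentation before use.

```python
import json


def build_pipeline_definition(release_label="emr-4.1.0"):
    """Return a pipeline definition dict (illustrative values only)."""
    return {
        "objects": [
            {
                # Cluster that selects an EMR 4.x release by label, not AMI version.
                "id": "MyEmrCluster",  # hypothetical ID
                "name": "MyEmrCluster",
                "type": "EmrCluster",
                "releaseLabel": release_label,
                # Applications to install at cluster creation (example choices).
                "applications": ["spark", "hive"],
                "configuration": {"ref": "HadoopEnv"},
            },
            {
                # Configuration for the Hadoop environment classification.
                "id": "HadoopEnv",  # hypothetical ID
                "name": "HadoopEnv",
                "type": "EmrConfiguration",
                "classification": "hadoop-env",
                "configuration": {"ref": "HadoopEnvExport"},
            },
            {
                # Nested "export" classification that holds the variables.
                "id": "HadoopEnvExport",  # hypothetical ID
                "name": "HadoopEnvExport",
                "type": "EmrConfiguration",
                "classification": "export",
                "property": {"ref": "ProxyServerHeapSize"},
            },
            {
                # One environment variable, expressed as a key/value pair.
                "id": "ProxyServerHeapSize",  # hypothetical ID
                "name": "ProxyServerHeapSize",
                "type": "Property",
                "key": "YARN_PROXYSERVER_HEAPSIZE",
                "value": "2396",  # placeholder value
            },
        ]
    }


definition = build_pipeline_definition()
definition_json = json.dumps(definition, indent=2)

if __name__ == "__main__":
    print(definition_json)
```

The printed JSON could be saved to a file and adapted into a full pipeline definition. A working pipeline would also need objects this sketch leaves out, such as a schedule and an activity that runs on the cluster.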
To learn more, refer to the AWS Data Pipeline documentation.
Source: amazon.com