The Laniakea team has completed several big data projects in the past. This project was done for a client with specific requirements. Please reach out to us with any further queries.
Create Directory Service in AWS.
Please use the parameters below for the SimpleAD configuration.
Password – Welcome1234
You can create an access URL; however, it is not necessary.
You can also configure applications with this directory.
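The SimpleAD directory can also be created from the AWS CLI instead of the console. A sketch, assuming hypothetical VPC and subnet IDs that you would replace with your own:

```shell
# Create a SimpleAD directory named corp.emr.local with the password from above.
# vpc-xxxxxxxx and the subnet IDs are placeholders -- use two subnets in different AZs.
aws ds create-directory \
  --name corp.emr.local \
  --short-name corp \
  --password Welcome1234 \
  --size Small \
  --vpc-settings VpcId=vpc-xxxxxxxx,SubnetIds=subnet-aaaaaaaa,subnet-bbbbbbbb \
  --region us-east-1
```

The command returns the DirectoryId, which you will need when domain-joining EC2 instances later.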
Ranger Install.
Launch an EC2 instance with the configuration below.
Download the Key Pair.
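The launch can also be done from the AWS CLI. A sketch, where the AMI, subnet, and security group IDs are placeholders (the key pair name matches the one used for the EMR cluster later in this article):

```shell
# Launch a single instance for the Ranger server.
# ami-xxxxxxxx, subnet-xxxxxxxx, and sg-xxxxxxxx are placeholders for your environment.
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type t2.large \
  --key-name emr-amz \
  --subnet-id subnet-xxxxxxxx \
  --security-group-ids sg-xxxxxxxx \
  --count 1
```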
I tried to create the whole stack with the link provided in the blog; however, it didn't work. It created only the RangerServer, and most of the other tasks either failed or stayed in CREATE_IN_PROGRESS.
The screen above will not show all of the tasks as completed.
Now, log in to the RangerServer using the credentials.
Get the repository for HDP 2.4.2 via the command below.
wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
The command below is required so that yum picks up the new repository.
yum clean all
Install ranger-admin
yum install ranger-admin
Install Maven
wget http://mirrors.advancedhosters.com/apache/maven/maven-3/3.5.3/binaries/apache-maven-3.5.3-bin.tar.gz
Extract the Maven tar file into /usr/local so that it matches the M2_HOME setting below.
tar -xvf apache-maven-3.5.3-bin.tar.gz -C /usr/local
Set environment variables
export M2_HOME=/usr/local/apache-maven-3.5.3
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
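The exports above only apply to the current shell. To keep them across logins, they can be appended to the login profile, for example:

```shell
# Persist the Maven environment variables so they survive new login shells.
cat >> ~/.bash_profile <<'EOF'
export M2_HOME=/usr/local/apache-maven-3.5.3
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
EOF
```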
mvn -version
Install git
yum install git
Install GCC
yum install gcc
Install MySQL
rpm -Uvh /home/ec2-user/mysql-community-release-el6-5.noarch.rpm
yum install mysql-community-server
Start MySQL service
service mysqld start
Build Ranger admin source
cd ~/dev
git clone https://github.com/apache/incubator-ranger.git
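Cloning alone does not populate the target directory; the source has to be built with Maven first. A sketch, assuming the 0.5 release branch is named ranger-0.5 (verify the branch/tag name in the repository) and using the build invocation commonly shown in the Ranger documentation:

```shell
cd ~/dev/incubator-ranger
# Check out the 0.5 line to match the ranger-0.5.0-admin tarball used below (branch name is an assumption).
git checkout ranger-0.5
# Build everything; skipping tests keeps the build time manageable.
mvn clean compile package install assembly:assembly -DskipTests
```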
Under the target directory, there should be several tarballs, including the Ranger admin tar.
Install the Ranger policy admin using the procedure below.
cd /usr/local
tar zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-admin.tar.gz
ln -s ranger-0.5.0-admin ranger-admin
cd /usr/local/ranger-admin
Update the install.properties file with the values below.
db_root_user=root
db_root_password=root
db_host=localhost
db_name=ranger
db_user=rangeradmin
db_password=rangeradmin
audit_db_name=ranger
audit_db_user=rangerlogger
audit_db_password=rangerlogger
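If you prefer to script the edit rather than opening install.properties by hand, a sed pass like the following works. It is shown here against a throwaway copy; in practice you would point FILE at /usr/local/ranger-admin/install.properties:

```shell
# Demo on a scratch copy -- set FILE=/usr/local/ranger-admin/install.properties for real use.
FILE=./install.properties.demo
cat > "$FILE" <<'EOF'
db_root_user=
db_root_password=
db_host=
db_name=
db_user=
db_password=
EOF

# Fill in the values listed in the article.
sed -i \
  -e 's/^db_root_user=.*/db_root_user=root/' \
  -e 's/^db_root_password=.*/db_root_password=root/' \
  -e 's/^db_host=.*/db_host=localhost/' \
  -e 's/^db_name=.*/db_name=ranger/' \
  -e 's/^db_user=.*/db_user=rangeradmin/' \
  -e 's/^db_password=.*/db_password=rangeradmin/' \
  "$FILE"
```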
Run the setup script.
./setup.sh
Start ranger admin
ranger-admin start
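To confirm the admin service actually came up, you can probe the web UI, which listens on port 6080 by default:

```shell
# Expect HTTP 200 once Ranger Admin has finished starting.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:6080/
```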
Log in to MySQL and change the root password.
mysql -u root -p
Verify the login credentials by logging in to MySQL. Note that the password is single-quoted so the shell does not expand the $ character.
mysql -u root -p'$r53dfftR'
Download Hive – you can choose to install it on the ranger-admin server, or it will be installed on the EMR servers.
cd dev/
wget http://apache.claz.org/hive/hive-2.3.3/apache-hive-2.3.3-bin.tar.gz
Create an S3 bucket that will be used to store logs.
Please configure the AWS CLI on your laptop so that the commands below work.
This command will create the default roles and profiles.
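These two steps can be done from the AWS CLI. A sketch, where the bucket name matches the log URI used in the create-cluster command later in this article:

```shell
# Create the log bucket (name must match the --log-uri used when creating the cluster).
aws s3 mb s3://emr-local-log --region us-east-1

# Creates EMR_DefaultRole and EMR_EC2_DefaultRole (the default service role and instance profile).
aws emr create-default-roles
```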
Create Role – EC2
Launch a Windows EC2 instance for the SimpleAD configuration. Make sure to select the Domain Join Directory option below.
Create a Windows EC2 instance and configure it with the corp.emr.local SimpleAD. Follow the steps below to log in to this EC2 instance.
Install the AD tools on the Windows server – Programs and Features
Log in with the SimpleAD administrator password – corp\administrator
Create the user Analyst1 from the AD tools (Users)
Log in to Hue using this user
Create the EMR cluster using the command below. Make sure to go through all the options and change them based on your environment. Some of the locations below, such as s3://aws-bigdata-blog, shouldn't be changed. The InstanceProfile and service-role shouldn't be changed either; they are the defaults.
aws emr create-cluster --applications Name=Hive Name=Spark Name=Hue --tags 'Name=EMR-Security' --ec2-attributes '{"KeyName":"emr-amz","InstanceProfile":"EMR_EC2_DefaultRole","SubnetId":"subnet-0bcd2b56","EmrManagedSlaveSecurityGroup":"sg-458d2e0e","EmrManagedMasterSecurityGroup":"sg-828a29c9"}' --release-label emr-5.0.0 --log-uri 's3n://emr-local-log/emrlog/' --steps '[{"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"CreateHiveTables"},{"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/loadDataIntoHDFS.sh","us-east-1"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"LoadHDFSData"},{"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/install-hive-hdfs-ranger-plugin.sh","34.227.84.73","0.6","s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"InstallRangerPlugin"},{"Args":["spark-submit","--deploy-mode","cluster","--class","org.apache.spark.examples.SparkPi","/usr/lib/spark/examples/jars/spark-examples.jar","10"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"command-runner.jar","Properties":"","Name":"SparkStep"},{"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/install-hive-hdfs-ranger-policies.sh","34.227.84.73","s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger/inputdata"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"InstallRangerPolicies"}]' --instance-groups '[{"InstanceCount":0,"InstanceGroupType":"TASK","InstanceType":"c1.medium","Name":"Task"},{"InstanceCount":1,"InstanceGroupType":"CORE","InstanceType":"m3.2xlarge","Name":"CORE"},{"InstanceCount":1,"InstanceGroupType":"MASTER","InstanceType":"m3.2xlarge","Name":"MASTER"}]' --configurations '[{"Classification":"hue-ini","Properties":{},"Configurations":[{"Classification":"desktop","Properties":{},"Configurations":[{"Classification":"auth","Properties":{"backend":"desktop.auth.backend.LdapBackend"},"Configurations":[]},{"Classification":"ldap","Properties":{"bind_dn":"binduser","trace_level":"0","search_bind_authentication":"false","debug":"true","base_dn":"dc=corp,dc=emr,dc=local","bind_password":"Welcome1234","ignore_username_case":"true","create_users_on_login":"true","ldap_username_pattern":"uid=usertest1,cn=users,dc=corp,dc=emr,dc=local","force_username_lowercase":"true","ldap_url":"ldap://172.31.84.239","nt_domain":"corp.emr.local"},"Configurations":[{"Classification":"groups","Properties":{"group_filter":"objectclass=*","group_name_attr":"cn"},"Configurations":[]},{"Classification":"users","Properties":{"user_name_attr":"sAMAccountName","user_filter":"objectclass=*"},"Configurations":[]}]}]}]}]' --bootstrap-actions '[{"Path":"s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger/scripts/download-scripts.sh","Args":["s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger"],"Name":"Download scripts"}]' --service-role EMR_DefaultRole --name 'EMRRangerTest' --scale-down-behavior TERMINATE_AT_TASK_COMPLETION --region us-east-1
Install Atlas metadata.
yum install atlas-metadata
Create client.properties
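A minimal sketch of what client.properties might contain, assuming an Atlas server on its default port; adjust the host and port for your environment:

```properties
# Minimal client.properties -- point this at your Atlas server (localhost:21000 is the default).
atlas.rest.address=http://localhost:21000
```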