process.md

Tasks prep batch: create a new folder, in the same location, for each project.

  1. Ubuntu
     a. Install Ubuntu Server (on VirtualBox)
     b. Install Java 8
     c. Install Python 3
     d. Install sbt (optional)
     e. Set up SSH

  2. Netcat
     a. Send a message from the producer port (terminal)
     b. Receive the message on the consumer port (terminal)
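The netcat producer/consumer exchange can be mimicked in plain Python sockets; this is a hedged sketch, with the port number (9999) an arbitrary choice rather than anything the tasks specify:

```python
import socket
import threading

# Arbitrary local port, analogous to `nc -l 9999` / `nc localhost 9999`.
HOST, PORT = "127.0.0.1", 9999
ready = threading.Event()
received = []

def consumer():
    """Listen on the consumer port and record one received message."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)
        ready.set()  # tell the producer it is safe to connect
        conn, _ = srv.accept()
        with conn:
            received.append(conn.recv(1024).decode())

def producer(message):
    """Connect to the consumer port and send one message."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        cli.sendall(message.encode())

t = threading.Thread(target=consumer)
t.start()
ready.wait()
producer("hello from producer")
t.join()
print(received[0])  # hello from producer
```

The `Event` makes the producer wait until the consumer is actually listening, the same ordering the terminal exercise enforces by starting the `nc -l` side first.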

  3. Python Basic
     a. Complete map-reduce tasks in Python
     b. Store output data to CSV
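One possible shape for the Python map-reduce task is a word count; the input lines below are a stand-in for the real data set, which the checklist does not specify:

```python
import csv
from collections import Counter
from functools import reduce

# Sample input standing in for the real data set (an assumption).
lines = [
    "big data big pipelines",
    "data pipelines scale",
]

# Map: turn each line into (word, 1) pairs.
mapped = [[(word, 1) for word in line.split()] for line in lines]

# Reduce: merge all per-line pairs into one word -> count table.
def merge(acc, pairs):
    for word, n in pairs:
        acc[word] += n
    return acc

counts = reduce(merge, mapped, Counter())

# Store the output data to CSV.
with open("wordcount.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["word", "count"])
    for word, n in sorted(counts.items()):
        writer.writerow([word, n])

print(dict(counts))  # {'big': 2, 'data': 2, 'pipelines': 2, 'scale': 1}
```

The same map/merge/write structure carries over to the Scala and Java versions of the task.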

  4. Scala Basic
     a. Complete map-reduce tasks in Scala
     b. Store output data to CSV

  5. Java Basic
     a. Complete map-reduce tasks in Java
     b. Store output data to CSV

  6. Bash automation
     a. Automate tasks 1 to 4
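The checklist calls for Bash here; the equivalent orchestration can also be sketched in Python with `subprocess`. The script names below are hypothetical placeholders, not files the project defines:

```python
import subprocess

# Hypothetical wrapper scripts for tasks 1-4; adjust to the real layout.
TASK_SCRIPTS = [
    "task1_ubuntu_setup.sh",
    "task2_netcat_demo.sh",
    "task3_python_mapreduce.py",
    "task4_scala_mapreduce.sh",
]

def run_task(cmd):
    """Run one task command, returning (exit_code, stdout)."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout

# Demonstrated with a trivially safe command instead of the real scripts:
code, out = run_task("echo automation ok")
print(code, out.strip())  # 0 automation ok
```

Looping `run_task` over `TASK_SCRIPTS` and stopping on a non-zero exit code reproduces what a `set -e` Bash driver script would do.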

  7. Airflow
     a. Automate tasks 1 to 4

  8. Git
     a. Create git repo precursors
     b. Push tasks 1 to 6 to GitHub

  9. HDFS – install
     a. Install HDFS
     b. Create directories: data, tmp, and user
     c. Write installation notes – ptg

  10. Flume – install
      a. Install Flume
      b. Make sure the Flume agent is available everywhere

  11. Flume-1
      a. Flume reads from file/terminal
      b. Write to file

  12. Flume-2
      a. Flume reads from source 2
      b. Write to HDFS

  13. Kafka – install
      a. Install Kafka
      b. Make sure the Kafka binaries are available everywhere
      c. Start 3 Kafka servers

  14. Kafka – 0
      a. Fully automate Kafka server startup, based on the number of Kafka servers wanted
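The startup automation can begin by generating one broker config per server. This is a hedged sketch: the port and log-directory conventions below are assumptions, not Kafka defaults, and only `broker.id`, `listeners`, and `log.dirs` are shown out of the many `server.properties` settings a real broker needs:

```python
# Assumed convention: broker i listens on BASE_PORT + i and logs to its own dir.
BASE_PORT = 9092

def broker_config(broker_id):
    """Return a minimal per-broker server.properties fragment."""
    return "\n".join([
        f"broker.id={broker_id}",
        f"listeners=PLAINTEXT://:{BASE_PORT + broker_id}",
        f"log.dirs=/tmp/kafka-logs-{broker_id}",
    ])

def make_configs(n):
    """Return a dict of filename -> config text for n brokers."""
    return {f"server-{i}.properties": broker_config(i) for i in range(n)}

configs = make_configs(3)
for name, text in sorted(configs.items()):
    print(f"# {name}")
    print(text)
```

The startup script would then write each fragment to disk and launch `kafka-server-start.sh` once per generated file.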

  15. Kafka – 1 – py
      a. Kafka producer in Python
      b. Kafka consumer in Python
      c. Read from the Twitter API, write to terminal
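A hedged sketch of the Python producer/consumer pair, assuming the `kafka-python` package (`pip install kafka-python`); the broker address and topic name are assumptions, and the `kafka` imports are kept inside the functions so the module loads even where the package is absent:

```python
BROKER = "localhost:9092"  # assumed broker address
TOPIC = "tweets"           # assumed topic name

def make_producer():
    from kafka import KafkaProducer  # lazy import: needs kafka-python installed
    return KafkaProducer(bootstrap_servers=BROKER)

def send_messages(producer, messages):
    """Publish each message to the topic as UTF-8 bytes."""
    for msg in messages:
        producer.send(TOPIC, value=msg.encode("utf-8"))
    producer.flush()

def consume_forever():
    """Print every message on the topic to the terminal (blocks)."""
    from kafka import KafkaConsumer  # lazy import, as above
    consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER,
                             auto_offset_reset="earliest")
    for record in consumer:
        print(record.value.decode("utf-8"))
```

Reading from the Twitter API needs credentials and a client library, so it is left out; the fetched tweets would simply be fed to `send_messages()`. The same producer/consumer shape carries over to the Scala version.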

  16. Kafka – 1 – scala
      a. Kafka producer in Scala
      b. Kafka consumer in Scala
      c. Read from the Twitter API, write to terminal

  17. Hive – Hortonworks
      a. Load employees data into Hive
      b. Full CRUD operations
      c. Hive queries

  18. MySQL – Hortonworks
      a. Load employees data into MySQL
      b. Full CRUD operations
      c. MySQL queries

  19. Sqoop – Hortonworks
      a. Sqoop CSV to MySQL
      b. Sqoop MySQL to Hive
      c. Sqoop Hive files to MySQL

  20. Hive – local
      a. Install Hive
      b. Load employees data into Hive
      c. Full CRUD operations
      d. Hive queries

  21. HBase – Cloudera
      a. Load employees data into HBase
      b. Full CRUD operations
      c. HBase queries

  22. HBase – local
      a. Install HBase
      b. Load employees data into HBase
      c. Full CRUD operations
      d. HBase queries

  23. Spark (Hortonworks) – Scala
      a. Read/write text file
      b. Read/write CSV file
      c. Read/write JSON file
      d. Map-reduce to Hive

  24. Spark (Hortonworks) – Python
      a. Read/write text file
      b. Read/write CSV file
      c. Read/write JSON file
      d. Map-reduce to Hive

  25. Kafka – Spark
      a. Stream data from the Kafka producer API
      b. Set up the Spark consumer

  26. Kafka – Spark – Hive
      a. Stream data from the Kafka producer API
      b. Set up the Spark consumer
      c. Write to Hive

  27. Kafka – Spark – HBase
      a. Stream data from the Kafka producer API
      b. Set up the Spark consumer
      c. Write to HBase

  28. Capstone Project
      a. Complete the requested project