High Performance Computing
- Supercomputer
- Entire cluster bought at same time
- High-end network, server, and storage hardware
- Small number of scientists have access
MPI (Message Passing Interface)
- MPI
- Library standard defined by a committee of vendors, implementers, and parallel programmers
- Used to create parallel programs based on message passing (tree)
- Popular for scientific computation performed on high-performance computing clusters
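
The message-passing style described above can be sketched without an MPI installation. The following is not MPI itself: it is a minimal illustration using Python threads and queues, where each worker only communicates by sending and receiving messages. The names worker, parallel_sum, inboxes and results are hypothetical and chosen for this sketch only.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, sum it, and send the result to the root."""
    chunk = inbox.get()                # blocking receive
    outbox.put((rank, sum(chunk)))     # send the partial result back


def parallel_sum(data, n_workers):
    """Root: send one chunk to each worker, then gather the partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])   # every n-th element
    partials = [results.get() for _ in range(n_workers)]
    for t in threads:
        t.join()
    return sum(partial for _, partial in partials)
```

In a real MPI program the workers would be separate processes, possibly on different nodes of the cluster, and the send and receive steps would go through the MPI library instead of in-memory queues.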
Amdahl's Law
- Parts of a program must be run sequentially and parts can be run in parallel.
- Speedup = 1/((1-P) + P/N), where N is the number of processing entities and P is the fraction of the program that is parallel
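
A small sketch of the speedup formula above may help when checking numbers by hand. The function name amdahl_speedup and the sample values of P and N are assumptions made for illustration.

```python
def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's Law.

    p: fraction of the program that can run in parallel (0 to 1)
    n: number of processing entities (at least 1)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


if __name__ == "__main__":
    # With 90% of the program parallel, adding processors helps less and less:
    # the speedup can never exceed 1 / (1 - 0.9) = 10.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

The sequential part (1-P) sets the ceiling: as N grows, the speedup approaches 1/(1-P) no matter how many processors are added.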
Big Data Analytics
- Volume: The amount of data companies want to analyze is growing tremendously. 40 trillion GB by 2020
- Variety: Data is often unstructured and/or user generated. Tweets, videos
- Velocity: Analysis must be fast to be useful.
Map Reduce & Hadoop
- Map Reduce (developed by Google)
  - Large-scale analytics
  - Uses commodity servers
  - Includes a distributed storage system
  - Schedules tasks
- Hadoop is an open source implementation of Map Reduce
Map Reduce Flow
- Input files -> Map -> Key-value pairs -> Reduce -> Output
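
The flow above can be sketched as a single-machine word count. This is only an illustration of the Map, key-value grouping, and Reduce steps, not the Hadoop API; the function names map_phase, shuffle, reduce_phase and word_count are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) key-value pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that share one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the whole flow: input lines -> Map -> key-value pairs -> Reduce -> output."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster, the Map calls run on many servers in parallel over different input files, and the grouping step moves each key's values to the server that runs its Reduce call.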
Storm
- Like Spark
- Hadoop is for batch processing, Storm is for stream processing
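
The batch versus stream distinction can be sketched with a running average. This uses neither Hadoop nor Storm; the function names batch_average and streaming_average are assumptions for illustration only.

```python
def batch_average(readings):
    """Batch style: the whole data set is available before processing starts."""
    return sum(readings) / len(readings)


def streaming_average(readings):
    """Stream style: produce an updated result as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


if __name__ == "__main__":
    data = [2, 4, 6]
    print(batch_average(data))              # one answer after all data is read
    print(list(streaming_average(data)))    # one answer per incoming reading
```

The batch version gives a single answer once all input has been collected, while the streaming version keeps only a small running state and always has an up-to-date answer.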
Edge Computing!!!
- Latency
- Bandwidth
Content Delivery Networks (CDN)
- Similar to the edge cloud:
  - Provides content close to users
  - Uses less core bandwidth and gives low latency
- Akamai is the major CDN company
  - Static and dynamic content
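
Why serving content close to users saves core bandwidth can be sketched with a tiny edge cache. This is a simplified model, not how Akamai or any real CDN is implemented; the class EdgeCache, its capacity parameter, and the fetch_from_origin callback are hypothetical.

```python
from collections import OrderedDict


class EdgeCache:
    """A small least-recently-used cache standing in for one edge server."""

    def __init__(self, fetch_from_origin, capacity=2):
        self.fetch_from_origin = fetch_from_origin  # call made across the core network
        self.capacity = capacity
        self.store = OrderedDict()
        self.origin_fetches = 0

    def get(self, url):
        if url in self.store:
            # Cache hit: served locally, no core bandwidth used.
            self.store.move_to_end(url)
            return self.store[url]
        # Cache miss: fetch once from the origin and keep a local copy.
        self.origin_fetches += 1
        content = self.fetch_from_origin(url)
        self.store[url] = content
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return content
```

Repeated requests for the same URL reach the origin only once, so the number of origin fetches is a rough stand-in for core bandwidth used.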
IoT (Internet of Things) needs
- Streaming data
Job
- Amazon's cloud services
- Hadoop, big data analysis
- Network programming
- Multi-threading, concurrency
- Virtual machines, containers.