Challenge 2 - Design a Data Lake for a High-Volume Trading Exchange

Table of Contents

  • Problem Statement
  • Assumptions
  • Things to consider
  • Solutions from Community

Problem Statement

You are the architect for a new Big Data based Data Lake system for a High-Volume Trading Exchange which processes approximately one million transactions per second. During peak volume it can reach up to five million transactions per second. For ease of use, let's call this system the Exchange Store.

Create the following:

  • an efficient, scalable, fault-tolerant and highly available system for the Exchange Store
  • data is sourced via the following options (see the ingestion sketch after this list):
    • real-time messages (~400k messages per second) via Kafka
    • start-of-day positions via files (~10k files in various formats such as CSV, TXT, JSON and XML)
  • reports are generated at End of Day (EOD) via the following options:
    • a Kafka topic for consumers who need to process EOD positions and trades
    • EOD feed files (~12k feed files) sent to consumers via different mechanisms (SFTP, object store, etc.)
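
As one possible illustration of the real-time ingestion path, the sketch below uses Spark Structured Streaming to land the Kafka trade feed in the lake as partitioned Parquet. The topic name, broker addresses, message schema and storage paths are hypothetical placeholders, not part of the challenge statement.

```python
# Minimal PySpark Structured Streaming sketch for the real-time ingestion path.
# All names (topic, brokers, schema, paths) are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = (SparkSession.builder
         .appName("exchange-store-ingest")
         .getOrCreate())

# Hypothetical trade message schema; the real exchange schema would differ.
trade_schema = StructType([
    StructField("trade_id", StringType()),
    StructField("symbol", StringType()),
    StructField("price", DoubleType()),
    StructField("quantity", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")  # placeholder brokers
       .option("subscribe", "trades")                                   # placeholder topic
       .option("startingOffsets", "latest")
       .load())

trades = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), trade_schema).alias("t"))
          .select("t.*"))

# Land the stream in the data lake, partitioned by symbol, with checkpointing
# so the job can recover from failures without losing or duplicating batches.
query = (trades.writeStream
         .format("parquet")
         .option("path", "s3a://exchange-store/raw/trades/")                 # placeholder path
         .option("checkpointLocation", "s3a://exchange-store/checkpoints/trades/")
         .partitionBy("symbol")
         .trigger(processingTime="30 seconds")
         .start())

query.awaitTermination()
```

At the stated volumes a single job like this would be scaled out across many Kafka partitions and executors; the sketch only shows the shape of the pipeline, not its sizing.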

Use relevant databases, technology stacks, frameworks and tools to create an efficient system which processes these huge volumes without significant delay. Also, if possible, mention why you would choose each tool over the alternatives.

Hint: some options that come to mind are Apache Spark, Apache Storm, Apache Flink, Apache Hadoop, Apache HBase, Apache Hive, Amazon EMR, Azure HDInsight, GCP Dataproc, etc.
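
To make the EOD requirement concrete, here is a hedged sketch of one possible publication step using Spark in batch mode: read the day's curated positions from the lake, write a consolidated feed file, and publish the same records to a Kafka topic for downstream consumers. The paths, topic name and date handling are illustrative assumptions, not prescribed by the challenge.

```python
# Sketch of an EOD publication job; all names and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_json, struct

spark = SparkSession.builder.appName("exchange-store-eod").getOrCreate()

business_date = "2024-01-31"  # in practice this would be parameterised per run

positions = spark.read.parquet(
    f"s3a://exchange-store/curated/positions/date={business_date}/"  # placeholder path
)

# 1. EOD feed file for SFTP / object-store distribution (one CSV here for simplicity;
#    a real job would fan out ~12k consumer-specific feeds).
(positions.coalesce(1)
 .write.mode("overwrite")
 .option("header", "true")
 .csv(f"s3a://exchange-store/feeds/eod/date={business_date}/"))

# 2. EOD Kafka topic for consumers who process positions programmatically.
(positions.select(to_json(struct(*positions.columns)).alias("value"))
 .write
 .format("kafka")
 .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder brokers
 .option("topic", "eod-positions")                    # placeholder topic
 .save())
```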

Assumptions

  • You also need to consider the maintainability and operational aspects of the deployment (observability).

Things to consider

  • You can leverage any non-cloud or cloud platform (e.g., Cloud Foundry, AWS, Azure or GCP) to overlay your deployment diagram and leverage features from these platforms.

Solutions from Community

Name | Solution | Comments
Name | Solution | This architecture uses so and so. This is a sample text.