From c526ab06eee8b909d3436ec069dec5117b8284f7 Mon Sep 17 00:00:00 2001 From: "Xavier R. Hoffmann" Date: Wed, 11 Jan 2023 13:03:06 +0100 Subject: [PATCH] Update SimulatedDataset.ipynb The subsection `Transaction generation process` appears in the section `2.1. Customer profiles generation` but it seems more logical to me that it should appear in the earlier section `Design choices`. --- .../SimulatedDataset.ipynb | 50 +++++++++---------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/Chapter_3_GettingStarted/SimulatedDataset.ipynb b/Chapter_3_GettingStarted/SimulatedDataset.ipynb index 3af320f..e2dfe62 100644 --- a/Chapter_3_GettingStarted/SimulatedDataset.ipynb +++ b/Chapter_3_GettingStarted/SimulatedDataset.ipynb @@ -45,6 +45,31 @@ " \n" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

Transaction generation process

\n", + "\n", + "The simulation will consist of five main steps:\n", + "\n", + "1. Generation of customer profiles: Every customer is different in their spending habits. This will be simulated by defining some properties for each customer. The main properties will be their geographical location, their spending frequency, and their spending amounts. The customer properties will be represented as a table, referred to as the *customer profile table*. \n", + "2. Generation of terminal profiles: Terminal properties will simply consist of a geographical location. The terminal properties will be represented as a table, referred to as the *terminal profile table*.\n", + "3. Association of customer profiles to terminals: We will assume that customers only make transactions on terminals that are within a radius of $r$ of their geographical locations. This makes the simple assumption that a customer only makes transactions on terminals that are geographically close to their location. This step will consist of adding a feature 'list_terminals' to each customer profile, that contains the set of terminals that a customer can use.\n", + "4. Generation of transactions: The simulator will loop over the set of customer profiles, and generate transactions according to their properties (spending frequencies and amounts, and available terminals). This will result in a table of transactions.\n", + "5. Generation of fraud scenarios: This last step will label the transactions as legitimate or fraudulent. This will be done by following three different fraud scenarios.\n", + "\n", + "The transaction generation process is illustrated below. \n", + "\n", + "![alt text](images/FlowDatasetGenerator.png)\n", + "

\n", + "Fig. 2. Transaction generation process. The customer and terminal profiles are used to generate
a set of transactions. The final step, which generates fraud scenarios, provides the labeled transactions table.\n", + " \n", + "\n", + "The following sections detail the implementation for each of these steps. \n", + " " + ] + }, { "cell_type": "code", "execution_count": 1, @@ -91,31 +116,6 @@ "The `generate_customer_profiles_table` function provides an implementation for generating a table of customer profiles. It takes as input the number of customers for which to generate a profile and a random state for reproducibility. It returns a DataFrame containing the properties for each customer. " ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "

Transaction generation process

\n", - "\n", - "The simulation will consist of five main steps:\n", - "\n", - "1. Generation of customer profiles: Every customer is different in their spending habits. This will be simulated by defining some properties for each customer. The main properties will be their geographical location, their spending frequency, and their spending amounts. The customer properties will be represented as a table, referred to as the *customer profile table*. \n", - "2. Generation of terminal profiles: Terminal properties will simply consist of a geographical location. The terminal properties will be represented as a table, referred to as the *terminal profile table*.\n", - "3. Association of customer profiles to terminals: We will assume that customers only make transactions on terminals that are within a radius of $r$ of their geographical locations. This makes the simple assumption that a customer only makes transactions on terminals that are geographically close to their location. This step will consist of adding a feature 'list_terminals' to each customer profile, that contains the set of terminals that a customer can use.\n", - "4. Generation of transactions: The simulator will loop over the set of customer profiles, and generate transactions according to their properties (spending frequencies and amounts, and available terminals). This will result in a table of transactions.\n", - "5. Generation of fraud scenarios: This last step will label the transactions as legitimate or genuine. This will be done by following three different fraud scenarios.\n", - "\n", - "The transaction generation process is illustrated below. \n", - "\n", - "![alt text](images/FlowDatasetGenerator.png)\n", - "

\n", - "Fig. 2. Transaction generation process. The customer and terminal profiles are used to generate
a set of transactions. The final step, which generates fraud scenarios, provides the labeled transactions table.\n", - " \n", - "\n", - "The following sections detail the implementation for each of these steps. \n", - " " - ] - }, { "cell_type": "code", "execution_count": 2,