Commit 4658ab0

add digit swap detection solution
1 parent 3a7624b commit 4658ab0

2 files changed: +59 -0 lines changed
data-box/digit_swap/README.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Consecutive digit swap

----
## Overview

This project detects consecutive digit swaps (two adjacent digits transposed) between pairs of phone numbers.
E.g.: 070 1234 5678 vs 070 2134 5678

----
## Implementation
1. Adapt this code to run in a custom script Python container (https://docs.treasuredata.com/articles/#!pd/python-custom-scripting-example).
2. Copy and paste the code into a custom script in Treasure Workflows.

----
## Considerations

The check compares characters by position, so it can detect a swap of any two consecutive characters, not only digits, e.g. in emails or usernames.
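
The point about non-phone values can be illustrated with a small sketch. It assumes a hypothetical helper named `find_adjacent_swap`, which is not part of this commit, and the sample strings are made up. It returns the index of the first swapped character, or `None` when the two strings do not differ by exactly one adjacent swap.

```python
from typing import Optional


def find_adjacent_swap(a: str, b: str) -> Optional[int]:
    """Return the index i where a[i], a[i+1] appear swapped in b, else None.

    Hypothetical helper for illustration; it is not part of digit_swap.py.
    """
    if len(a) != len(b):
        return None
    # Positions at which the two strings disagree.
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 2:
        return None
    i, j = diff
    # The two positions must be adjacent and hold each other's characters.
    if j == i + 1 and a[i] == b[j] and a[j] == b[i]:
        return i
    return None


# Made-up sample values.
print(find_adjacent_swap("jane.doe@example.com", "jnae.doe@example.com"))  # 1
print(find_adjacent_swap("07012345678", "07021345678"))  # 3
print(find_adjacent_swap("alice", "alide"))  # None
```

Returning the position rather than a boolean is only a design choice for this sketch; it makes it easier to report where the swap happened.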
18+
19+
----
20+
## Questions
21+
22+
Please feel free to reach out to [email protected] with any questions you have about using this code.

data-box/digit_swap/digit_swap.py

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
import pandas as pd


def check_consecutive_digit_swap():
    # Read every column as a string so leading zeros survive;
    # keep_default_na=False turns empty cells into '' instead of NaN.
    df = pd.read_csv('data.csv', dtype=str, keep_default_na=False)
    cnt = 0

    # Open the result file once. Reopening it in 'w' mode after writing
    # the header would truncate the file and lose the header.
    with open('res.csv', 'w') as f:
        f.write('phone1,phone2\n')

        for _, row in df.iterrows():
            phone1 = row['ph1']
            phone2 = row['ph2']

            # A swap never changes the length.
            if len(phone1) != len(phone2):
                continue

            # Find differing positions.
            differing_positions = [i for i in range(len(phone1)) if phone1[i] != phone2[i]]

            # Exactly two positions must differ, they must be consecutive,
            # and each must hold the other's character.
            if len(differing_positions) == 2:
                i, j = differing_positions
                if j == i + 1 and phone1[i] == phone2[j] and phone1[j] == phone2[i]:
                    cnt += 1
                    f.write(phone1 + ',' + phone2 + '\n')

    # Report how many swapped pairs were found.
    print(cnt)


check_consecutive_digit_swap()
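
When adapting the script for a custom script container, it may help to separate the comparison from the file handling so that each can be tested on its own. The following is a sketch under stated assumptions, not the committed implementation. It swaps pandas for the standard-library `csv` module, the names `is_adjacent_swap` and `find_swapped_pairs` are hypothetical, and the sample numbers are made up. The column names `ph1`/`ph2` and the output header `phone1,phone2` are taken from the script above.

```python
import csv
import io
from typing import TextIO


def is_adjacent_swap(a: str, b: str) -> bool:
    """True when b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b):
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diff) == 2
        and diff[1] == diff[0] + 1
        and a[diff[0]] == b[diff[1]]
        and a[diff[1]] == b[diff[0]]
    )


def find_swapped_pairs(src: TextIO, dst: TextIO) -> int:
    """Write swapped (ph1, ph2) pairs from src to dst; return how many were found."""
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["phone1", "phone2"])
    count = 0
    for row in reader:
        ph1, ph2 = row["ph1"], row["ph2"]
        if is_adjacent_swap(ph1, ph2):
            writer.writerow([ph1, ph2])
            count += 1
    return count


# In-memory demo with made-up numbers; real use would pass open file handles.
sample = io.StringIO("ph1,ph2\n07012345678,07021345678\n07012345678,07012345679\n")
out = io.StringIO()
print(find_swapped_pairs(sample, out))  # 1
print(out.getvalue())
```

Because the function takes file-like objects, the same code can be called with `open('data.csv', newline='')` and `open('res.csv', 'w', newline='')` in place of the in-memory buffers.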
