Added support for fortnight and century #987

Open
wants to merge 31 commits into
base: master
Changes from 7 commits
Commits
31 commits
ddff3eb
added support for decade in hindi
Mr-Sunglasses Sep 27, 2021
96d2e52
added support for decade in hindi
Mr-Sunglasses Sep 27, 2021
e826a06
added support for fortnignt and century
Mr-Sunglasses Sep 27, 2021
7185282
Delete pyvenv.cfg
Mr-Sunglasses Sep 27, 2021
127149b
Delete hi.py
Mr-Sunglasses Sep 27, 2021
2f1f8fc
Delete hi.yaml
Mr-Sunglasses Sep 27, 2021
7edac70
fixed accidentally deleted files
Mr-Sunglasses Sep 28, 2021
9a585be
added test for centuries
Mr-Sunglasses Sep 28, 2021
ec50655
added test for centuries
Mr-Sunglasses Sep 28, 2021
5b55b2a
fixed and added centurys support
Mr-Sunglasses Oct 4, 2021
b461617
Delete pyvenv.cfg
Gallaecio Oct 5, 2021
ceae086
fixed all the bugs now its working fine
Mr-Sunglasses Dec 5, 2021
52582cd
fixed all bugs , now this is working fine
Mr-Sunglasses Dec 5, 2021
83c593d
Removed repeated test
Mr-Sunglasses Jun 19, 2022
e9ab0d0
Removed repeated test for last fortnight
Mr-Sunglasses Jun 19, 2022
b97612e
added my env to gitignore
Mr-Sunglasses Jun 19, 2022
fcab861
Added plural for decade in hindi
Mr-Sunglasses Jun 19, 2022
93863fb
Added plural for decade in hindi
Mr-Sunglasses Jun 19, 2022
baec3b6
Merge branch 'scrapinghub:master' into added-suppor-for-for-fortnight
Mr-Sunglasses Jun 19, 2022
11c6a5f
Removed some garbage files
Mr-Sunglasses Jun 19, 2022
7649958
Added support for counting from 1 to 12 in Hindi
Mr-Sunglasses Jun 19, 2022
472cf9b
Added test for 2 centuries ago in hindi
Mr-Sunglasses Jun 19, 2022
e13c925
Add more words for (in) in hindi language
Mr-Sunglasses Jun 19, 2022
6f7630c
Added tests for the in_future in decade in hindi
Mr-Sunglasses Jun 19, 2022
d89210c
Added test for decade in future in hindi
Mr-Sunglasses Jun 20, 2022
46dcbc9
Added test for coming fortnight
Mr-Sunglasses Jun 20, 2022
10ce3a4
Added Support for in coming fortnight
Mr-Sunglasses Jun 20, 2022
a36c48b
Fixed gitignore
Mr-Sunglasses Jun 20, 2022
2738b28
Fixed bug for centuries
Mr-Sunglasses Jun 20, 2022
0943ee1
Added tests for centuries
Mr-Sunglasses Jun 20, 2022
da174a8
Merge branch 'scrapinghub:master' into added-suppor-for-for-fortnight
Mr-Sunglasses Jun 20, 2022
35 changes: 35 additions & 0 deletions dateparser/data/date_translation_data/en.py
Expand Up @@ -188,6 +188,20 @@
],
"in 1 decade": [
"next decade"
],
"1 century ago": [
"last century",
"this century"
],
"in 1 century": [
"next century"
],
"1 fortnight ago": [
"last fortnight",
"this fortnight"
],
"in 1 fortnight": [
"next fortnight"
]
},
"relative-type-regex": {
Expand Down Expand Up @@ -264,6 +278,18 @@
],
"\\1 decade ago": [
"(\\d+) decades? ago"
],
"in \\1 century": [
"in (\\d+) century?"
],
"\\1 century ago": [
"(\\d+) century? ago"
],
"in \\1 fortnight": [
"in (\\d+) fortnight?"
],
"\\1 fortnight ago": [
"(\\d+) fortnight? ago"
]
},
"locale_specific": {
Expand Down Expand Up @@ -771,6 +797,15 @@
"decade",
"decades"
],
"century": [
"century",
"centurys",
"centuries"
],
"fortnight": [
"fortnight",
"fortnights"
],
"ago": [
"ago"
],
Expand Down
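One way to read the word lists above: each canonical unit name maps to the surface forms that should be recognised for it. Below is a minimal sketch of inverting such a mapping into a lookup table. It assumes normalization is a plain word-for-word substitution, which is a simplification and not taken from dateparser's internals; `UNIT_FORMS`, `build_lookup`, and `normalize` are hypothetical names.

```python
# Hypothetical word lists mirroring the entries added in this diff.
UNIT_FORMS = {
    "century": ["century", "centurys", "centuries"],
    "fortnight": ["fortnight", "fortnights"],
    "decade": ["decade", "decades"],
}


def build_lookup(unit_forms):
    """Invert {canonical: [forms]} into {form: canonical}."""
    return {form: unit for unit, forms in unit_forms.items() for form in forms}


def normalize(text, lookup):
    """Replace every known surface form with its canonical unit name."""
    return " ".join(lookup.get(word, word) for word in text.lower().split())


LOOKUP = build_lookup(UNIT_FORMS)
print(normalize("in 3 centuries", LOOKUP))    # in 3 century
print(normalize("2 Fortnights ago", LOOKUP))  # 2 fortnight ago
```

With a table like this, supporting another spelling is a data change only: add the form to the list and the lookup picks it up.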
15 changes: 15 additions & 0 deletions dateparser/data/date_translation_data/hi.py
Expand Up @@ -162,6 +162,12 @@
],
"2 day ago": [
"परसों"
],
"1 decade ago": [
"पिछला दशक"
],
"in 1 decade": [
"अगला दशक"
]
},
"relative-type-regex": {
Expand Down Expand Up @@ -212,6 +218,12 @@
],
"in \\1 year": [
"(\\d+) वर्ष में"
],
"in \\1 decade": [
"(\\d+) दशक में"
],
"\\1 decade ago": [
"(\\d+) दशक पहले"
]
},
"locale_specific": {},
Expand All @@ -235,6 +247,9 @@
","
],
"sentence_splitter_group": 3,
"decade": [
"दशक"
],
"ago": [
"पहले",
"पूर्व"
Expand Down
8 changes: 7 additions & 1 deletion dateparser/freshness_date_parser.py
Expand Up @@ -10,7 +10,7 @@
from .timezone_parser import pop_tz_offset_from_string


- _UNITS = r'decade|year|month|week|day|hour|minute|second'
+ _UNITS = r'fortnight|century|decade|year|month|week|day|hour|minute|second'
PATTERN = re.compile(r'(\d+)\s*(%s)\b' % _UNITS, re.I | re.S | re.U)


Expand Down Expand Up @@ -148,6 +148,12 @@ def get_kwargs(self, date_string):
if 'decades' in kwargs:
kwargs['years'] = 10 * kwargs['decades'] + kwargs.get('years', 0)
del kwargs['decades']
if 'centurys' in kwargs:
kwargs['years'] = 100 * kwargs['centurys'] + kwargs.get('years', 0)
del kwargs['centurys']
if 'fortnights' in kwargs:
kwargs['days'] = 14 * kwargs['fortnights'] + kwargs.get('days', 0)
del kwargs['fortnights']
return kwargs

def get_date_data(self, date_string, settings=None):
Expand Down
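The conversion in `get_kwargs` above folds the new units into ones the date arithmetic already understands: a century becomes 100 years and a fortnight becomes 14 days. Below is a minimal standalone sketch of that step, reusing the `_UNITS` pattern from the diff. It assumes keys are built as the matched unit plus `s` (which would explain the `centurys` spelling) and that the input has already been normalized to singular unit names. The free function `get_kwargs` here is a simplified stand-in, not the method from dateparser.

```python
import re

# Unit alternation and pattern copied from the diff above.
_UNITS = r'fortnight|century|decade|year|month|week|day|hour|minute|second'
PATTERN = re.compile(r'(\d+)\s*(%s)\b' % _UNITS, re.I | re.S | re.U)


def get_kwargs(date_string):
    """Collect {unit + 's': count} pairs, then fold coarse units into finer ones."""
    kwargs = {}
    for num, unit in PATTERN.findall(date_string):
        # Assumption: the key is the matched unit plus 's', hence 'centurys'.
        kwargs[unit.lower() + 's'] = int(num)
    if 'decades' in kwargs:
        kwargs['years'] = 10 * kwargs.pop('decades') + kwargs.get('years', 0)
    if 'centurys' in kwargs:
        kwargs['years'] = 100 * kwargs.pop('centurys') + kwargs.get('years', 0)
    if 'fortnights' in kwargs:
        kwargs['days'] = 14 * kwargs.pop('fortnights') + kwargs.get('days', 0)
    return kwargs


print(get_kwargs("1 century 2 year"))  # {'years': 102}
print(get_kwargs("14 fortnight"))      # {'days': 196}
```

Because of the trailing `\b`, the pattern only matches singular unit names, so plural input such as "2 fortnights" has to be normalized before it reaches this step.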
2 changes: 1 addition & 1 deletion dateparser/languages/dictionary.py
Expand Up @@ -10,7 +10,7 @@
KNOWN_WORD_TOKENS = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday',
'saturday', 'sunday', 'january', 'february', 'march',
'april', 'may', 'june', 'july', 'august', 'september',
- 'october', 'november', 'december', 'decade', 'year',
+ 'october', 'november', 'december', 'decade', 'century', 'fortnight', 'year',
'month', 'week', 'day', 'hour', 'minute', 'second', 'ago',
'in', 'am', 'pm']

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,13 @@ september:
decade:
- decade
- decades
century:
- century
- centurys
- centuries
fortnight:
- fortnight
- fortnights
year:
- years
month:
Expand Down Expand Up @@ -47,12 +54,30 @@ relative-type:
- this decade
in 1 decade:
- next decade
1 century ago:
- last century
- this century
in 1 century:
- next century
1 fortnight ago:
- last fortnight
- this fortnight
in 1 fortnight:
- next fortnight

relative-type-regex:
in \1 decade:
- in (\d+) decades?
\1 decade ago:
- (\d+) decades? ago
in \1 century:
- in (\d+) century?
\1 century ago:
- (\d+) century? ago
in \1 fortnight:
- in (\d+) fortnight?
\1 fortnight ago:
- (\d+) fortnight? ago
Collaborator

This is not correct. The ? makes the preceding letter optional.

As written, the pattern accepts "in 1 century" but also "in 1 centur". You need to add an s before the ?.
You also need to add support for "centuries" here.

You can check whether both forms work with this:

 dateparser.parse("in 3 centurys")

and:

 dateparser.parse("in 3 centuries")

Right now both are failing.

Contributor Author

@noviluni Sure, fixing the bug.

Contributor Author

@noviluni The tests are still failing for "centuries", even though I added them to en.yaml.
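The reviewer's point about `?` can be checked in isolation with Python's `re` module. The sketch below compares the pattern from the diff with one possible corrected pattern. The corrected pattern is an illustration only, not necessarily the fix adopted in this PR, and it says nothing about why `dateparser.parse` itself was still failing.

```python
import re

# Pattern from the diff: '?' applies only to the final 'y'.
buggy = re.compile(r"in (\d+) century?")
# One possible fix (an assumption): spell out the accepted endings.
fixed = re.compile(r"in (\d+) centur(?:y|ys|ies)")

for text in ["in 1 centur", "in 1 century", "in 3 centurys", "in 3 centuries"]:
    print(f"{text!r}: buggy={bool(buggy.fullmatch(text))} "
          f"fixed={bool(fixed.fullmatch(text))}")
# 'in 1 centur': buggy=True fixed=False
# 'in 1 century': buggy=True fixed=True
# 'in 3 centurys': buggy=False fixed=True
# 'in 3 centuries': buggy=False fixed=True
```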


simplifications:
- an: '1'
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ november:
- नवम्बर
december:
- दिसम्बर
decade:
- दशक
Collaborator

We can add the plural for this as well, e.g. दशकों

Contributor Author

@gutsytechster Thanks for the suggestions. I'll implement them ASAP.

Contributor Author

@gutsytechster Done. Please take a look 😊.


year:
- साल
Expand All @@ -33,3 +35,15 @@ in:
relative-type:
2 day ago:
- परसों
1 decade ago:
- पिछला दशक
in 1 decade:
- अगला दशक

relative-type-regex:
in \1 decade:
- (\d+) दशक में
\1 decade ago:
- (\d+) दशक पहले


Collaborator

Just a nitpick: there is an extra blank line here.

2 changes: 1 addition & 1 deletion tests/test_data.py
Expand Up @@ -18,7 +18,7 @@
'name', 'date_order', 'skip', 'pertain', 'simplifications', 'no_word_spacing', 'ago',
'in', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday',
'january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september',
- 'october', 'november', 'december', 'decade', 'year', 'month', 'week', 'day', 'hour', 'minute',
+ 'october', 'november', 'december', 'decade', 'century', 'fortnight', 'year', 'month', 'week', 'day', 'hour', 'minute',
'second', 'am', 'pm', 'relative-type', 'relative-type-regex', 'sentence_splitter_group']

NECESSARY_KEYS = ['name', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday',
Expand Down
35 changes: 35 additions & 0 deletions tests/test_freshness_date_parser.py
Expand Up @@ -57,6 +57,18 @@ def test_relative_past_dates_with_time_as_period(self, date_string, ago, period)

@parameterized.expand([
# English dates
param('1 fortnight', ago={'days': 14}, period='day'),
param('last fortnight', ago={'days': 14}, period='day'),
param('14 fortnight', ago={'days': 196}, period='day'),
param('a fortnight ago', ago={'days': 14}, period='day'),
param('last fortnight', ago={'days': 14}, period='day'),
Collaborator

I believe this is repeated and the same as in line 61.

Contributor Author
Mr-Sunglasses, Jun 19, 2022

@gutsytechster Done. Please take a look 😊.

param("1 century", ago={'years': 100}, period='year'),
param("1 century 2 years", ago={'years': 102}, period='year'),
param("1 century 12 months", ago={'years': 100, 'months': 12}, period='month'),
param("1 century and 11 months", ago={'years': 100, 'months': 11}, period='month'),
param("last century", ago={'years': 100}, period='year'),
param("a century ago", ago={'years': 100}, period='year'),
param("10 century", ago={'years': 1000}, period='year'),
param("1 decade", ago={'years': 10}, period='year'),
param("1 decade 2 years", ago={'years': 12}, period='year'),
param("1 decade 12 months", ago={'years': 10, 'months': 12}, period='month'),
Expand Down Expand Up @@ -353,6 +365,8 @@ def test_relative_past_dates_with_time_as_period(self, date_string, ago, period)
param('1 वर्ष, 8 महीने, 2 सप्ताह', ago={'years': 1, 'months': 8, 'weeks': 2}, period='week'),
param('1 वर्ष 7 महीने', ago={'years': 1, 'months': 7}, period='month'),
param('आज', ago={'days': 0}, period='day'),
param('1 दशक', ago={'years': 10}, period='year'),
param('1 दशक पहले', ago={'years': 10}, period='year'),
Comment on lines +370 to +371
Collaborator
gutsytechster, Jun 16, 2022

I have added a few suggestions above to improve the Hindi test cases. Please have a look.

Contributor Author

@gutsytechster Done. Please take a look 😊.

Collaborator

I believe these test cases can be added here as well:

  • 1 दशक पूर्व
  • दो दशक पहले
  • 10 दशकों पहले


# af
param("2 uur gelede", ago={'hours': 2}, period='day'),
Expand Down Expand Up @@ -576,6 +590,18 @@ def test_relative_past_dates(self, date_string, ago, period):

@parameterized.expand([
# English dates
param('1 fortnight', ago={'days': 14}, period='day'),
param('last fortnight', ago={'days': 14}, period='day'),
param('14 fortnight', ago={'days': 196}, period='day'),
param('a fortnight ago', ago={'days': 14}, period='day'),
param('last fortnight', ago={'days': 14}, period='day'),
Collaborator

Same as above. It is repeated as well.

Contributor Author
Mr-Sunglasses, Jun 19, 2022

@gutsytechster Done. Please take a look 😊.

param("1 century", ago={'years': 100}, period='year'),
param("1 century 2 years", ago={'years': 102}, period='year'),
param("1 century 12 months", ago={'years': 100, 'months': 12}, period='month'),
param("1 century and 11 months", ago={'years': 100, 'months': 11}, period='month'),
param("last century", ago={'years': 100}, period='year'),
param("a century ago", ago={'years': 100}, period='year'),
param("10 century", ago={'years': 1000}, period='year'),
param("1 decade", ago={'years': 10}, period='year'),
param("1 decade 2 years", ago={'years': 12}, period='year'),
param("1 decade 12 months", ago={'years': 10, 'months': 12}, period='month'),
Expand Down Expand Up @@ -841,6 +867,7 @@ def test_relative_past_dates(self, date_string, ago, period):
param('1 वर्ष, 8 महीने, 2 सप्ताह', ago={'years': 1, 'months': 8, 'weeks': 2}, period='week'),
param('1 वर्ष 7 महीने', ago={'years': 1, 'months': 7}, period='month'),
param('आज', ago={'days': 0}, period='day'),
param('1 दशक पहले', ago={'years': 10}, period='year'),
Collaborator

Can we add a few more test cases for the Hindi version here? For example:

  • 1 दशक पूर्व
  • दो दशक पहले
  • 10 दशकों पहले

Contributor Author

@gutsytechster Done. Please take a look 😊.
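The suggested cases exercise three things at once: an alternative word for "ago" (पूर्व), a number word (दो, "two"), and the plural दशकों. Below is a toy sketch of why each needs its own dictionary entry. It assumes a simplified word-for-word translation step; the `HI_WORDS` table and the `translate` and `decades_ago_to_years` helpers are hypothetical and far simpler than dateparser's real language data.

```python
import re

# Hypothetical Hindi-to-canonical word table (illustration only).
HI_WORDS = {
    "दशक": "decade",
    "दशकों": "decade",   # plural form suggested in the review
    "पहले": "ago",
    "पूर्व": "ago",      # alternative word for "ago"
    "दो": "2",           # number word
}


def translate(text):
    """Swap each known Hindi word for its canonical token; keep the rest."""
    return " ".join(HI_WORDS.get(word, word) for word in text.split())


def decades_ago_to_years(text):
    """Return {'years': n} for '<n> decade ago' after translation, else None."""
    match = re.fullmatch(r"(\d+) decade ago", translate(text))
    return {"years": 10 * int(match.group(1))} if match else None


for case in ["1 दशक पूर्व", "दो दशक पहले", "10 दशकों पहले"]:
    print(case, "->", decades_ago_to_years(case))
```

Removing any one of the three extra entries makes the matching case return `None`, which is the kind of gap these test cases are meant to catch.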


# af
param("2 uur gelede", ago={'hours': 2}, period='day'),
Expand Down Expand Up @@ -1066,6 +1093,13 @@ def test_normalized_relative_dates(self, date_string, ago, period):

@parameterized.expand([
# English dates
param('in a fortnight', in_future={'days': 14}, period='day'),
param('next fortnight', in_future={'days': 14}, period='day'),
Collaborator

I am not sure, but do we support coming fortnight?

Contributor Author

@gutsytechster Done. Please take a look 😊.
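Whether a phrase such as "coming fortnight" is understood depends on whether it appears among the relative-type phrases. Below is a rough sketch of that lookup, assuming a plain phrase-to-canonical table like the `relative-type` entries in this diff. `RELATIVE_TYPE` and `canonicalize` are hypothetical names, and where exactly the PR added "coming fortnight" is not visible on this page.

```python
# Hypothetical table mirroring the relative-type entries added in this diff.
RELATIVE_TYPE = {
    "1 fortnight ago": ["last fortnight", "this fortnight"],
    "in 1 fortnight": ["next fortnight"],
}


def canonicalize(phrase, table):
    """Return the canonical expression for a known phrase, else None."""
    phrase = phrase.lower().strip()
    for canonical, phrases in table.items():
        if phrase in phrases:
            return canonical
    return None


before = canonicalize("coming fortnight", RELATIVE_TYPE)
# Assumption: supporting the phrase is a matter of listing it as a synonym.
RELATIVE_TYPE["in 1 fortnight"].append("coming fortnight")
after = canonicalize("coming fortnight", RELATIVE_TYPE)

print(before)  # None
print(after)   # in 1 fortnight
```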

param('in 1 century 2 months', in_future={'years': 100, 'months': 2}, period='month'),
param('in 10 century', in_future={'years': 1000}, period='year'),
param('in 1 century 12 years', in_future={'years': 112}, period='year'),
param('next century', in_future={'years': 100}, period='year'),
param('in a century', in_future={'years': 100}, period='year'),
param('in 1 decade 2 months', in_future={'years': 10, 'months': 2}, period='month'),
param('in 100 decades', in_future={'years': 1000}, period='year'),
param('in 1 decade 12 years', in_future={'years': 22}, period='year'),
Expand Down Expand Up @@ -1160,6 +1194,7 @@ def test_normalized_relative_dates(self, date_string, ago, period):
param('17 सेकंड बाद', in_future={'seconds': 17}, period='day'),
param('1 वर्ष, 5 महीने, 1 सप्ताह में',
in_future={'years': 1, 'months': 5, 'weeks': 1}, period='week'),
param('1 दशक में', in_future={'years': 10}, period='year'),
Collaborator

Could we also add a few more test cases for the Hindi version here? For example:

  • पांच दशक बाद
  • दश दशक पश्चात
  • 9 दशकों मे

Contributor Author

@gutsytechster Done. Please take a look 😊.


# af
param("oor 10 jaar", in_future={'years': 10}, period='year'),
Expand Down