Refactor sync #3312

Merged
merged 68 commits into from
Dec 19, 2024
68 commits
3d4c761
Add new sync implementation (WIP, untested)
eemeli May 20, 2024
9e65c9a
Address code review comments, start adding tests
eemeli Sep 9, 2024
3d6a1ec
Merge branch 'main' into sync-refactor
eemeli Sep 14, 2024
397e5f5
Add integration tests for new code; fix discovered issues
eemeli Sep 14, 2024
5778090
Update requirements
eemeli Sep 17, 2024
7f3bfcb
Satisfy lint
eemeli Sep 17, 2024
83b8fd7
Call repo commit() with pre-formatted author string rather than User …
eemeli Sep 21, 2024
3005870
Add end-to-end test
eemeli Sep 21, 2024
5b227e9
Add task wrapper & force option
eemeli Sep 21, 2024
e657223
Refactor handle_upload_content() into sync_uploaded_file()
eemeli Sep 22, 2024
5da483f
Replace get_download_content() with download_translations_zip()
eemeli Sep 30, 2024
2f0ea10
Remove old sync implementation
eemeli Oct 1, 2024
bb97199
Pretranslate added & changed resources, fix task invocations
eemeli Oct 1, 2024
3da84c0
Move & rename pontoon.sync contents around, adding pontoon.sync.core
eemeli Oct 1, 2024
92fb993
Combine upload & download functions into pontoon.sync.utils
eemeli Oct 1, 2024
410a278
Merge branch 'main' into sync-refactor
eemeli Oct 1, 2024
83049d9
Include --no-strip-extras in `uv pip compile` calls
eemeli Oct 1, 2024
f92e331
Satisfy ruff
eemeli Oct 1, 2024
fcfc5fa
File format detector cleanup, drop unused template
eemeli Oct 1, 2024
15ef5b6
Fix last_synced_revision data, drop remaining multi_locale references
eemeli Oct 1, 2024
f5e16db
More dead code removal
eemeli Oct 1, 2024
455c64a
Revert Repository.permalink_prefix help_text change
eemeli Oct 1, 2024
a5177cf
Support file renames for git repos
eemeli Oct 10, 2024
4efe373
Fix issues discovered by manual testing
eemeli Oct 10, 2024
2714f81
Improve sync logging
eemeli Oct 12, 2024
bb26ab5
Merge branch 'main' into sync-refactor
eemeli Oct 12, 2024
76dfa40
More sync fixes & logging
eemeli Oct 12, 2024
c49e95b
Update moz.l10n dependency
eemeli Oct 13, 2024
63aac31
Oops, fix file upload handling
eemeli Oct 13, 2024
fb538a1
Fix zip download, add test for it
eemeli Oct 15, 2024
8aebdd6
Drop unnecessary extras from test_download
eemeli Oct 15, 2024
646ab4b
Simplify aggregated stats for .po plurals, use SQL UPDATE queries
eemeli Oct 16, 2024
b4854c4
Reduce stats updates further, include total_strings calculation
eemeli Oct 16, 2024
d322f3d
Use simpler query for looking up entity identifiers
eemeli Oct 16, 2024
68f94f7
Use new update_stats() for `manage.py calculate_stats` command
eemeli Oct 17, 2024
240cd73
Add & remove TranslatedResource objects when locales change
eemeli Oct 17, 2024
b70451b
Sum project total_strings from translated resources, not resources
eemeli Oct 18, 2024
33c307d
Update moz.l10n to 0.5.2, log changed resources
eemeli Oct 18, 2024
263a2f1
Always sync all translated resources
eemeli Oct 18, 2024
e6d0425
Merge branch 'main' into sync-refactor
eemeli Oct 18, 2024
5eb6a9f
Merge branch 'main' into sync-refactor
eemeli Nov 26, 2024
f7171bf
Fix manual pretranslation task
eemeli Nov 26, 2024
cf4e4b8
Fix file upload, ensure that it reports at least some error on failure
eemeli Nov 27, 2024
bd649e7
Update to moz.l10n 0.5.5
eemeli Nov 27, 2024
cebc5b2
Satisfy ruff
eemeli Nov 27, 2024
a470d4a
Update to moz.l10n 0.5.6
eemeli Dec 2, 2024
37bdb89
Dedupe updates for multiple changes made to the same resource
eemeli Dec 4, 2024
ff10ffa
Apply suggested changes from code review
eemeli Dec 4, 2024
5b4f8b6
Merge branch 'main' into sync-refactor
eemeli Dec 4, 2024
d575b64
Drop dead code: Entity.reset_active_translation()
eemeli Dec 4, 2024
5d49838
Oops, it's EntityQuerySet.reset_active_translations() that is no long…
eemeli Dec 4, 2024
f166d76
Add test case for translation arriving before its source is added
eemeli Dec 5, 2024
a2ac4fb
Use shallow clones for downloads from projects using git repos
eemeli Dec 12, 2024
c244927
When downloading translations, skip missing files & use full target r…
eemeli Dec 12, 2024
ea4edf0
Merge branch 'main' into sync-refactor
eemeli Dec 12, 2024
619a22a
Fix download tests
eemeli Dec 12, 2024
1f298f4
Update to translate-toolkit 3.14.1
eemeli Dec 12, 2024
c309617
Fix total_strings counts to depend on gettext locale plurals in aggre…
eemeli Dec 13, 2024
c239d7d
Drop unused ResourceQuerySet
eemeli Dec 13, 2024
784dec2
Dismiss local git repo edits when branch is specified
eemeli Dec 16, 2024
8f26503
During update from repo, keep previously fuzzy suggestions unchanged …
eemeli Dec 16, 2024
3762da5
Rather than creating zip, "download" by redirecting to target repository
eemeli Dec 16, 2024
74e4a14
Add shortcut (read: hack) for downloading from projects with separate…
eemeli Dec 16, 2024
357f7fe
Include active fuzzy translations when writing to repo
eemeli Dec 17, 2024
008fa56
When approving matching prior translations, do not also reject them
eemeli Dec 18, 2024
ff95768
Use locale's total_strings for GET <locale>/<slug>/parts/
eemeli Dec 18, 2024
c931f71
Count translation updates before dropping approvals from the dict
eemeli Dec 18, 2024
ff2b00d
Rename add_errors() as add_failed_checks()
eemeli Dec 19, 2024
6 changes: 1 addition & 5 deletions docs/user/localizing-your-projects.rst
@@ -68,11 +68,7 @@ following required fields:
#. **Locales**: select at least one Localizable locale by clicking on it.
#. **Repository URL**: enter your repository's SSH URL of the form
``git@github.com:user/repo.git``.
#. **Download prefix or path to TOML file**: a URL prefix for downloading localized files. For
GitHub repositories, select any localized file on GitHub, click ``Raw`` and
replace locale code and the following bits in the URL with ``{locale_code}``.
If you use one, you need to select the `project configuration file`_ instead
of a localized file.
#. **Download prefix or path to TOML file**: a URL prefix for downloading localized files.
#. Click **SAVE PROJECT** at the bottom of the page.
#. After the page reloads, click **SYNC** and wait for Pontoon to import
strings. You can monitor the progress in the Sync log (``/sync/log/``).
12 changes: 6 additions & 6 deletions pontoon/administration/views.py
@@ -30,9 +30,9 @@
Translation,
)
from pontoon.base.utils import require_AJAX
from pontoon.pretranslation.tasks import pretranslate
from pontoon.pretranslation.tasks import pretranslate_task
from pontoon.sync.models import SyncLog
from pontoon.sync.tasks import sync_project
from pontoon.sync.tasks import sync_project_task


log = logging.getLogger(__name__)
@@ -431,7 +431,7 @@ def _create_or_update_translated_resources(
resource = _get_resource_for_database_project(project)

for locale in locales:
tr, created = TranslatedResource.objects.get_or_create(
tr, _ = TranslatedResource.objects.get_or_create(
locale_id=locale.pk,
resource=resource,
)
@@ -542,9 +542,9 @@ def manually_sync_project(request, slug):
"Forbidden: You don't have permission for syncing projects"
)

sync_log = SyncLog.objects.create(start_time=timezone.now())
project = Project.objects.get(slug=slug)
sync_project.delay(project.pk, sync_log.pk)
sync_log = SyncLog.objects.create(start_time=timezone.now())
sync_project_task.delay(project.pk, sync_log.pk)

return HttpResponse("ok")

@@ -558,6 +558,6 @@ def manually_pretranslate_project(request, slug):
)

project = Project.objects.get(slug=slug)
pretranslate.delay(project.pk)
pretranslate_task.delay(project.pk)

return HttpResponse("ok")
13 changes: 0 additions & 13 deletions pontoon/base/__init__.py
@@ -1,13 +0,0 @@
MOZILLA_REPOS = (
"ssh://hg.mozilla.org/users/m_owca.info/firefox-beta/",
"ssh://hg.mozilla.org/users/m_owca.info/firefox-for-android-beta/",
"ssh://hg.mozilla.org/users/m_owca.info/thunderbird-beta/",
"ssh://hg.mozilla.org/users/m_owca.info/lightning-beta/",
"ssh://hg.mozilla.org/users/m_owca.info/seamonkey-beta/",
"ssh://hg.mozilla.org/users/m_owca.info/firefox-central/",
"ssh://hg.mozilla.org/users/m_owca.info/firefox-for-android-central/",
"ssh://hg.mozilla.org/users/m_owca.info/thunderbird-central/",
"ssh://hg.mozilla.org/users/m_owca.info/lightning-central/",
"ssh://hg.mozilla.org/users/m_owca.info/seamonkey-central/",
"git@gitlab.com:seamonkey-project/seamonkey-central-l10n.git",
)
25 changes: 6 additions & 19 deletions pontoon/base/management/commands/calculate_stats.py
@@ -3,10 +3,8 @@
from django.core.management.base import BaseCommand
from django.db.models import Count

from pontoon.base.models import (
Project,
TranslatedResource,
)
from pontoon.base.models import Project
from pontoon.sync.core.stats import update_locale_stats, update_stats


log = logging.getLogger(__name__)
@@ -34,20 +32,9 @@ def handle(self, *args, **options):
"disabled", "resource_count"
)

for index, project in enumerate(projects):
log.info(
'Calculating stats for project "{project}" ({index}/{total})'.format(
index=index + 1,
total=len(projects),
project=project.name,
)
)

translated_resources = TranslatedResource.objects.filter(
resource__project=project
)

for translated_resource in translated_resources:
translated_resource.calculate_stats()
Collaborator:
I don't think we've been using this command recently (i.e. in the last couple of years). Should we remove it after a while if issues with the stats stop resurfacing?

Member Author:
Probably, as part of the aggregated-stats refactor.

log.info(f"Calculating stats for {len(projects)} projects...")
for project in projects:
update_stats(project, update_locales=False)
update_locale_stats()

log.info("Calculating stats complete for all projects.")
6 changes: 2 additions & 4 deletions pontoon/base/migrations/0018_populate_entity_context.py
@@ -3,22 +3,20 @@
from django.db import migrations
from django.db.models import F, Func, TextField, Value

from pontoon.sync import KEY_SEPARATOR


def add_entity_context(apps, schema_editor):
Entity = apps.get_model("base", "Entity")

split_key_po = Func(
F("key"),
Value(KEY_SEPARATOR),
Value("\x04"),
Value(1),
function="split_part",
output_field=TextField(),
)
split_key_xliff = Func(
F("key"),
Value(KEY_SEPARATOR),
Value("\x04"),
Value(2),
function="split_part",
output_field=TextField(),
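For readers less familiar with PostgreSQL's `split_part`, here is a rough Python model of what the two `Func` expressions in this migration select from a key. The helper `split_part` and the sample keys are illustrative assumptions, not code from this PR; only the `"\x04"` separator value comes from the diff.

```python
KEY_SEPARATOR = "\x04"  # the literal the migration now inlines instead of importing


def split_part(value: str, delimiter: str, n: int) -> str:
    """Approximate PostgreSQL split_part() for positive n.

    Fields are 1-indexed; an out-of-range n yields an empty string.
    """
    parts = value.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


# Hypothetical keys, for illustration only.
key = "menu" + KEY_SEPARATOR + "Open file"
first_field = split_part(key, KEY_SEPARATOR, 1)   # field chosen by split_key_po
second_field = split_part(key, KEY_SEPARATOR, 2)  # field chosen by split_key_xliff
no_separator = split_part("Open file", KEY_SEPARATOR, 2)
```

Inlining the literal presumably keeps this historical migration independent of later changes to `pontoon.sync`, which no longer needs to export `KEY_SEPARATOR` for it.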
10 changes: 8 additions & 2 deletions pontoon/base/models/changed_entity_locale.py
@@ -1,15 +1,21 @@
from typing import TYPE_CHECKING

from django.db import models
from django.utils import timezone


if TYPE_CHECKING:
from pontoon.base.models import Entity, Locale


class ChangedEntityLocale(models.Model):
"""
ManyToMany model for storing what locales have changed translations for a
specific entity since the last sync.
"""

entity = models.ForeignKey("Entity", models.CASCADE)
locale = models.ForeignKey("Locale", models.CASCADE)
entity: models.ForeignKey["Entity"] = models.ForeignKey("Entity", models.CASCADE)
locale: models.ForeignKey["Locale"] = models.ForeignKey("Locale", models.CASCADE)
when = models.DateTimeField(default=timezone.now)

class Meta:
168 changes: 32 additions & 136 deletions pontoon/base/models/entity.py
@@ -1,3 +1,4 @@
from collections.abc import Iterable
from functools import reduce
from operator import ior
from re import escape, findall, match
@@ -14,7 +15,6 @@
from pontoon.base.models.project import Project
from pontoon.base.models.project_locale import ProjectLocale
from pontoon.base.models.resource import Resource
from pontoon.sync import KEY_SEPARATOR


def get_word_count(string):
@@ -443,50 +443,6 @@ def prefetch_entities_data(self, locale, preferred_source_locale):

return entities

def reset_active_translations(self, locale):
"""
Reset active translation for given set of entities and locale.
"""
from pontoon.base.models.translation import Translation

translations = Translation.objects.filter(
entity__in=self,
locale=locale,
)

# First, deactivate all translations
translations.update(active=False)

# Mark all approved, pretranslated and fuzzy translations as active.
translations.filter(
Q(approved=True) | Q(pretranslated=True) | Q(fuzzy=True)
).update(active=True)

# Mark most recent unreviewed suggestions without active siblings
# for any given combination of (locale, entity, plural_form) as active.
unreviewed_pks = set()
unreviewed = translations.filter(
approved=False,
pretranslated=False,
fuzzy=False,
rejected=False,
).values_list("entity", "plural_form")

for entity, plural_form in unreviewed:
siblings = (
Translation.objects.filter(
entity=entity,
locale=locale,
plural_form=plural_form,
)
.exclude(rejected=True)
.order_by("-active", "-date")
)
if siblings and not siblings[0].active:
unreviewed_pks.add(siblings[0].pk)

translations.filter(pk__in=unreviewed_pks).update(active=True)

def get_or_create(self, defaults=None, **kwargs):
kwargs["word_count"] = get_word_count(kwargs["string"])
return super().get_or_create(defaults=defaults, **kwargs)
@@ -532,18 +488,6 @@ class Meta:
models.Index(fields=["resource", "obsolete", "string_plural"]),
]

@property
def cleaned_key(self):
"""
Get cleaned key, without the source string and Translate Toolkit
separator.
"""
key = self.key.split(KEY_SEPARATOR)[0]
if key == self.string:
key = ""

return key

def __str__(self):
return self.string

@@ -559,90 +503,35 @@ def get_stats(self, locale):
:return: a dictionary with stats for an Entity, all keys are suffixed with `_diff` to
make them easier to pass into adjust_all_stats.
"""
translations = list(
self.translation_set.filter(locale=locale).prefetch_related(
"errors",
"warnings",
)
)

approved_strings_count = len(
[
t
for t in translations
if t.approved and not (t.errors.exists() or t.warnings.exists())
]
)

pretranslated_strings_count = len(
[
t
for t in translations
if t.pretranslated and not (t.errors.exists() or t.warnings.exists())
]
)

if self.string_plural:
approved = int(approved_strings_count == locale.nplurals)
pretranslated = int(pretranslated_strings_count == locale.nplurals)

else:
approved = int(approved_strings_count > 0)
pretranslated = int(pretranslated_strings_count > 0)

if not (approved or pretranslated):
has_errors = bool(
[
t
for t in translations
if (t.approved or t.pretranslated or t.fuzzy) and t.errors.exists()
]
)
has_warnings = bool(
[
t
for t in translations
if (t.approved or t.pretranslated or t.fuzzy)
and t.warnings.exists()
]
)

errors = int(has_errors)
warnings = int(has_warnings)

else:
errors = 0
warnings = 0

unreviewed_count = len(
[
t
for t in translations
if not (t.approved or t.pretranslated or t.fuzzy or t.rejected)
]
)
approved = 0
pretranslated = 0
errors = 0
warnings = 0
unreviewed = 0

for t in self.translation_set.filter(locale=locale).prefetch_related(
"errors", "warnings"
):
if t.errors.exists():
if t.approved or t.pretranslated or t.fuzzy:
errors += 1
elif t.warnings.exists():
if t.approved or t.pretranslated or t.fuzzy:
warnings += 1
elif t.approved:
approved += 1
elif t.pretranslated:
pretranslated += 1
if not (t.approved or t.pretranslated or t.fuzzy or t.rejected):
unreviewed += 1

return {
"total_strings_diff": 0,
"approved_strings_diff": approved,
"pretranslated_strings_diff": pretranslated,
"strings_with_errors_diff": errors,
"strings_with_warnings_diff": warnings,
"unreviewed_strings_diff": unreviewed_count,
}

@classmethod
def get_stats_diff(cls, stats_before, stats_after):
"""
Return stat difference between the two states of the entity.

:arg dict stats_before: dict returned by get_stats() for the initial state.
:arg dict stats_after: dict returned by get_stats() for the current state.
:return: dictionary with differences between provided stats.
"""
return {
stat_name: stats_after[stat_name] - stats_before[stat_name]
for stat_name in stats_before
"unreviewed_strings_diff": unreviewed,
}

def has_changed(self, locale):
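The rewritten `get_stats()` loop in this hunk can be tried without a database using the sketch below. It is an approximation under stated assumptions: `FakeTranslation` and `count_stats` are hypothetical names, the boolean `has_errors` / `has_warnings` fields stand in for `t.errors.exists()` / `t.warnings.exists()`, and, because indentation is lost in this view of the diff, the final unreviewed check is assumed to sit at loop level rather than inside the `if`/`elif` chain.

```python
from dataclasses import dataclass


@dataclass
class FakeTranslation:
    """Hypothetical stand-in for a Translation row; not Pontoon's model."""

    approved: bool = False
    pretranslated: bool = False
    fuzzy: bool = False
    rejected: bool = False
    has_errors: bool = False
    has_warnings: bool = False


def count_stats(translations):
    """Classify each translation once, mirroring the if/elif chain in the diff."""
    approved = pretranslated = errors = warnings = unreviewed = 0
    for t in translations:
        counted = t.approved or t.pretranslated or t.fuzzy
        if t.has_errors:
            if counted:
                errors += 1
        elif t.has_warnings:
            if counted:
                warnings += 1
        elif t.approved:
            approved += 1
        elif t.pretranslated:
            pretranslated += 1
        # Assumption: this check runs for every translation, outside the chain.
        if not (counted or t.rejected):
            unreviewed += 1
    return {
        "total_strings_diff": 0,
        "approved_strings_diff": approved,
        "pretranslated_strings_diff": pretranslated,
        "strings_with_errors_diff": errors,
        "strings_with_warnings_diff": warnings,
        "unreviewed_strings_diff": unreviewed,
    }


sample = [
    FakeTranslation(approved=True),
    FakeTranslation(approved=True, has_errors=True),
    FakeTranslation(pretranslated=True, has_warnings=True),
    FakeTranslation(),                  # plain suggestion
    FakeTranslation(rejected=True),     # counted nowhere
    FakeTranslation(has_errors=True),   # suggestion with a failed check
]
stats = count_stats(sample)
```

Under these assumptions, a translation with errors is never counted as approved, and a suggestion with failed checks still counts as unreviewed.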
@@ -942,7 +831,9 @@ def map_entities(
):
entities_array = []

entities = entities.prefetch_entities_data(locale, preferred_source_locale)
entities: Iterable[Entity] = entities.prefetch_entities_data(
locale, preferred_source_locale
)

# If requested entity not in the current page
if requested_entity and requested_entity not in [e.pk for e in entities]:
@@ -981,13 +872,18 @@ def map_entities(
if original_plural != "":
original_plural = entity.alternative_originals[-1].string

key_separator = "\x04"
cleaned_key = entity.key.split(key_separator)[0]
if cleaned_key == entity.string:
cleaned_key = ""

entities_array.append(
{
"pk": entity.pk,
"original": original,
"original_plural": original_plural,
"machinery_original": entity.string,
"key": entity.cleaned_key,
"key": cleaned_key,
"context": entity.context,
"path": entity.resource.path,
"project": entity.resource.project.serialize(),
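The key cleanup that `map_entities()` now performs inline (replacing the removed `Entity.cleaned_key` property) can be sketched as a standalone function. The function name and the sample keys below are illustrative assumptions; the logic follows the four lines added in the hunk above.

```python
KEY_SEPARATOR = "\x04"  # same literal as `key_separator` in the diff


def cleaned_key(key: str, string: str) -> str:
    """Return the key's first separator-delimited field, or "" if it just repeats the source string."""
    cleaned = key.split(KEY_SEPARATOR)[0]
    return "" if cleaned == string else cleaned


# Hypothetical inputs, for illustration only.
with_id = cleaned_key("menu.open" + KEY_SEPARATOR + "Open", "Open")
same_as_source = cleaned_key("Open", "Open")
source_then_context = cleaned_key("Open" + KEY_SEPARATOR + "toolbar", "Open")
```

Returning an empty string when the key only repeats the source string means the serialized `key` is non-empty only when it adds information beyond the source string.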