Commit 6c3a00b

alfsb (André L F S Bacci) and André L F S Bacci authored
Revcheck library and QA tools for translation synchronization (#111)

Add QA tooling for translations to ensure they do not get out of sync.

---------

Co-authored-by: André L F S Bacci <[email protected]>
1 parent 332f933 commit 6c3a00b

24 files changed, +1853 -0 lines changed

CODEOWNERS

+11
@@ -0,0 +1,11 @@
# The following volunteers have self-identified as subject matter experts
# or interested parties over a particular area of this repository.
# While requesting a review from someone does not obligate that person to
# review a pull request, these reviewers might have valuable knowledge of
# the problem area and could aid in deciding whether a pull request is ready
# for merging.
#
# For more information, see the GitHub CODEOWNERS documentation:
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

/scripts/translation/ @alfsb

README.md

+5
@@ -149,3 +149,8 @@ and find issues with it, they are located in the `scripts/qa/` directory.
There might be some more just in `scripts/` but they need to be checked if they
are still relevant and/or given some love.

# Translation Tools

There are also various scripts to ensure the quality and synchrony of
documentation translations, located in the `scripts/translation/` directory.

scripts/translation/.gitignore

+2
@@ -0,0 +1,2 @@
# Persistent data shared between scripts
.cache/

scripts/translation/README.md

+111
@@ -0,0 +1,111 @@
# Some useful scripts for maintaining translation consistency of the manual

Some of these scripts only check the file contents or XML structure
of translated files against their equivalents in the `en/` directory.
Others try to modify the translations in place, changing the
translated files. Use with care.

Not all translations are identical, or use the same conventions.
So not all scripts will be of use for all translations. The
assumptions of each script are described in each file.

The `lib/` directory contains common code and functionality
shared across these scripts.

Before using the scripts, they need to be configured:
```
php doc-base/scripts/translation/configure.php $LANG_DIR
```

## qarvt.php

`qarvt.php` checks if all translated files have revtags in the
expected format.

## qaxml.a.php

`qaxml.a.php` checks if all updated translated files have
the same tag-attribute-value triples. Tag attributes are used
extensively in the manual for linking and XIncluding. Translated files with
missing or mistyped attributes may cause build failures, or leave
parts missing because XIncludes did not copy them.

## qaxml.e.php

`qaxml.e.php` checks if all updated translated files have
the same external entities as the original files. Unbalanced entities
may indicate mistyped or wrongly translated parts.

## qaxml.p.php

`qaxml.p.php` checks if all updated translated files have
the same processing instructions as the original files. Unbalanced
processing instructions may cause compilation errors, as they are used
by the manual's build process.

## qaxml.t.php

`qaxml.t.php` checks if all updated translated files have
the same tags as the original files. A different number of tags between
source texts and target translations may cause compilation errors.

Usage: `php qaxml.t.php [--detail] [tag[,tag]]`

`[tag[,tag]]` is a comma-separated list of tags whose contents are also
checked, as the contents of some tags are expected *not* to be translated.

`--detail` also prints the line definitions of each mismatched tag,
to facilitate bisecting.
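
As an illustration of the usage line above, the following minimal sketch assembles one `qaxml.t.php` invocation for several tags at once. It is not part of the commit: `buildTagCheckCommand` is a hypothetical helper, and the script path is simply the one used elsewhere in this README.

```php
<?php
// Hypothetical helper (not part of the commit): builds a command line that
// follows the documented usage
//     php qaxml.t.php [--detail] [tag[,tag]]
function buildTagCheckCommand( array $tags , bool $detail = false ) : string
{
    $parts = [ 'php' , 'doc-base/scripts/translation/qaxml.t.php' ];

    // optional flag, placed before the tag list as in the usage line
    if ( $detail )
        $parts[] = '--detail';

    // several tags are passed as one comma-separated argument
    if ( count( $tags ) > 0 )
        $parts[] = implode( ',' , $tags );

    // quote every argument so the result is safe to hand to a shell
    return implode( ' ' , array_map( 'escapeshellarg' , $parts ) );
}

echo buildTagCheckCommand( [ 'acronym' , 'classname' ] , true ) , "\n";
```

The exact quoting in the printed command depends on the platform, because `escapeshellarg()` quotes differently on Windows and on Unix-like systems.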

## Suggested execution

Structural checks:

```
php doc-base/scripts/translation/configure.php $LANG_DIR

php doc-base/scripts/translation/qarvt.php

php doc-base/scripts/translation/qaxml.a.php
php doc-base/scripts/translation/qaxml.e.php
php doc-base/scripts/translation/qaxml.p.php
php doc-base/scripts/translation/qaxml.t.php
```
Tags whose contents are not expected to be translated:
```
php doc-base/scripts/translation/qaxml.t.php acronym
php doc-base/scripts/translation/qaxml.t.php classname
php doc-base/scripts/translation/qaxml.t.php constant
php doc-base/scripts/translation/qaxml.t.php envar
php doc-base/scripts/translation/qaxml.t.php function
php doc-base/scripts/translation/qaxml.t.php interfacename
php doc-base/scripts/translation/qaxml.t.php parameter
php doc-base/scripts/translation/qaxml.t.php type
php doc-base/scripts/translation/qaxml.t.php classsynopsis
php doc-base/scripts/translation/qaxml.t.php constructorsynopsis
php doc-base/scripts/translation/qaxml.t.php destructorsynopsis
php doc-base/scripts/translation/qaxml.t.php fieldsynopsis
php doc-base/scripts/translation/qaxml.t.php funcsynopsis
php doc-base/scripts/translation/qaxml.t.php methodsynopsis
```
Tags whose contents are expected to be translated only rarely:
```
php doc-base/scripts/translation/qaxml.t.php code
php doc-base/scripts/translation/qaxml.t.php computeroutput
php doc-base/scripts/translation/qaxml.t.php filename
php doc-base/scripts/translation/qaxml.t.php literal
php doc-base/scripts/translation/qaxml.t.php varname
```

# Migration

## Maintainers with spaces

The regex in `RevtagParser` was narrowed to reject maintainer names that
contain spaces. This needs to be confirmed against all active translations,
or the regex must be modified to accept spaces again.
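
To make this question concrete, here is a minimal sketch comparing a pattern that rejects spaces in the maintainer field with one that accepts them. Both patterns, the sample revtags and the variable names are assumptions made for illustration; the actual expression lives in `RevtagParser`, which is not shown here.

```php
<?php
// Illustrative patterns only; they are NOT the expressions used by RevtagParser.
// $narrow: maintainer is a single run of non-space characters.
// $wide:   maintainer may contain spaces, up to the "Status:" field.
$narrow = '/<!--\s*EN-Revision:\s*(\S+)\s+Maintainer:\s*(\S+)\s+Status:\s*(\S+)\s*-->/';
$wide   = '/<!--\s*EN-Revision:\s*(\S+)\s+Maintainer:\s*(.+?)\s+Status:\s*(\S+)\s*-->/';

// Made-up revtags: one maintainer without spaces, one with a space.
$tags = [
    '<!-- EN-Revision: 0123abcd Maintainer: someone Status: ready -->',
    '<!-- EN-Revision: 0123abcd Maintainer: some one Status: ready -->',
];

foreach ( $tags as $tag )
{
    $matchesNarrow = preg_match( $narrow , $tag );
    $matchesWide   = preg_match( $wide , $tag , $match );
    printf( "narrow=%d wide=%d maintainer=%s\n" ,
        $matchesNarrow , $matchesWide , $match[2] ?? '-' );
}
```

In this sketch the first revtag matches both patterns, while the second matches only the wide one. A file carrying that kind of tag would be reported as having a malformed revtag under the narrow rule.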

## en/chmonly

`en/chmonly` is ignored by revcheck, but it appears to be translatable. If it
is an `en/`-only directory, this should be uncommented in `RevcheckIgnore`.

scripts/translation/configure.php

+29
@@ -0,0 +1,29 @@
<?php
/**
 * +----------------------------------------------------------------------+
 * | Copyright (c) 1997-2023 The PHP Group |
 * +----------------------------------------------------------------------+
 * | This source file is subject to version 3.01 of the PHP license, |
 * | that is bundled with this package in the file LICENSE, and is |
 * | available through the world-wide-web at the following url: |
 * | https://www.php.net/license/3_01.txt. |
 * | If you did not receive a copy of the PHP license and are unable to |
 * | obtain it through the world-wide-web, please send a note to |
 * | [email protected], so we can mail you a copy immediately. |
 * +----------------------------------------------------------------------+
 * | Authors: André L F S Bacci <ae php.net> |
 * +----------------------------------------------------------------------+
 * | Description: Generate cached data for revcheck and QA tools. |
 * +----------------------------------------------------------------------+
 */

require_once __DIR__ . '/lib/all.php';

if ( count( $argv ) < 2 || in_array( '--help' , $argv ) || in_array( '-h' , $argv ) )
{
    fwrite( STDERR , "Usage: {$argv[0]} [lang_dir]\n\n" );
    fwrite( STDERR , "See https://github.com/php/doc-base/tree/master/scripts/translation#readme for more info.\n" );
    return;
}

new RevcheckRun( 'en' , $argv[1] , true );

scripts/translation/lib/CacheFile.php

+57
@@ -0,0 +1,57 @@
<?php
/**
 * +----------------------------------------------------------------------+
 * | Copyright (c) 1997-2023 The PHP Group |
 * +----------------------------------------------------------------------+
 * | This source file is subject to version 3.01 of the PHP license, |
 * | that is bundled with this package in the file LICENSE, and is |
 * | available through the world-wide-web at the following url: |
 * | https://www.php.net/license/3_01.txt. |
 * | If you did not receive a copy of the PHP license and are unable to |
 * | obtain it through the world-wide-web, please send a note to |
 * | [email protected], so we can mail you a copy immediately. |
 * +----------------------------------------------------------------------+
 * | Authors: André L F S Bacci <ae php.net> |
 * +----------------------------------------------------------------------+
 * | Description: Class to handle data persistence. |
 * +----------------------------------------------------------------------+
 */

require_once __DIR__ . '/all.php';

class CacheFile
{
    const CACHE_DIR = __DIR__ . '/../.cache';

    private string $filename;

    function __construct( string $file )
    {
        $this->filename = CacheFile::prepareFilename( $file , true );
    }

    public function load( mixed $init = null )
    {
        if ( file_exists( $this->filename ) == false )
            return $init;
        $data = file_get_contents( $this->filename );
        return unserialize( gzdecode( $data ) );
    }

    public function save( $data )
    {
        $contents = gzencode( serialize( $data ) );
        file_put_contents( $this->filename , $contents );
    }

    public static function prepareFilename( string $file , bool $createCacheDirs = false )
    {
        if ( str_starts_with( $file , '/' ) )
            return $file;
        $outPath = self::CACHE_DIR . '/' . dirname( $file );
        $outFile = rtrim( $outPath , '/' ) . '/' . basename( $file ); // dirname is already part of $outPath
        if ( $createCacheDirs && file_exists( $outPath ) == false )
            mkdir( $outPath , 0777 , true );
        return $outFile;
    }
}
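
The persistence scheme used by `save()` and `load()` can be tried in isolation. The following is a minimal standalone sketch, assuming the zlib extension is available; the temporary file and the sample data are made up for illustration and do not come from the commit.

```php
<?php
// Standalone round trip mirroring CacheFile::save() and CacheFile::load():
// serialize + gzencode on write, gzdecode + unserialize on read.
// The file location and the sample array are illustrative assumptions.
$file = tempnam( sys_get_temp_dir() , 'cache' );
$data = [ 'en/index.xml' => [ 'head' => '6c3a00b' , 'size' => 123 ] ];

// write: PHP value -> serialized string -> gzip-compressed bytes
file_put_contents( $file , gzencode( serialize( $data ) ) );

// read: gzip-compressed bytes -> serialized string -> PHP value
$loaded = unserialize( gzdecode( file_get_contents( $file ) ) );

unlink( $file );

var_dump( $loaded === $data );
```

Because `unserialize()` can instantiate objects, this approach is reasonable for cache files the scripts wrote themselves, and less so for files from an untrusted source.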

scripts/translation/lib/CacheUtil.php

+51
@@ -0,0 +1,51 @@
<?php
/**
 * +----------------------------------------------------------------------+
 * | Copyright (c) 1997-2023 The PHP Group |
 * +----------------------------------------------------------------------+
 * | This source file is subject to version 3.01 of the PHP license, |
 * | that is bundled with this package in the file LICENSE, and is |
 * | available through the world-wide-web at the following url: |
 * | https://www.php.net/license/3_01.txt. |
 * | If you did not receive a copy of the PHP license and are unable to |
 * | obtain it through the world-wide-web, please send a note to |
 * | [email protected], so we can mail you a copy immediately. |
 * +----------------------------------------------------------------------+
 * | Authors: André L F S Bacci <ae php.net> |
 * +----------------------------------------------------------------------+
 * | Description: Common functions to load and save to cache files. |
 * +----------------------------------------------------------------------+
 */

require_once __DIR__ . '/all.php';

class CacheUtil
{
    const CACHE_DIR = __DIR__ . '/../.cache';

    public static function load( string $path , string $file )
    {
        $filename = CacheUtil::prepareFilename( $path , $file , true );
        if ( file_exists( $filename ) == false )
            return null;
        $data = file_get_contents( $filename );
        return unserialize( $data );
    }

    public static function save( string $path , string $file , $data )
    {
        $outFile = CacheUtil::prepareFilename( $path , $file , true );
        $contents = serialize( $data );
        file_put_contents( $outFile , $contents );
    }

    public static function prepareFilename( string $path , string $file , bool $createDirs = false )
    {
        $baseDir = CacheUtil::CACHE_DIR;
        $outPath = rtrim( $baseDir , '/' ) . '/' . $path;
        $outFile = rtrim( $outPath , '/' ) . '/' . $file;
        if ( $createDirs && file_exists( $outPath ) == false )
            mkdir( $outPath , 0777 , true );
        return $outFile;
    }
}
+26
@@ -0,0 +1,26 @@
<?php
/**
 * +----------------------------------------------------------------------+
 * | Copyright (c) 1997-2023 The PHP Group |
 * +----------------------------------------------------------------------+
 * | This source file is subject to version 3.01 of the PHP license, |
 * | that is bundled with this package in the file LICENSE, and is |
 * | available through the world-wide-web at the following url: |
 * | https://www.php.net/license/3_01.txt. |
 * | If you did not receive a copy of the PHP license and are unable to |
 * | obtain it through the world-wide-web, please send a note to |
 * | [email protected], so we can mail you a copy immediately. |
 * +----------------------------------------------------------------------+
 * | Authors: André L F S Bacci <ae php.net> |
 * +----------------------------------------------------------------------+
 * | Description: Parse `git diff` to complement file state. |
 * +----------------------------------------------------------------------+
 */

require_once __DIR__ . '/all.php';

class GitDiffParser
{
    public static function parseNumstatInto( string $dir , RevcheckFileInfo $file )
    {}
}
+93
@@ -0,0 +1,93 @@
<?php
/**
 * +----------------------------------------------------------------------+
 * | Copyright (c) 1997-2023 The PHP Group |
 * +----------------------------------------------------------------------+
 * | This source file is subject to version 3.01 of the PHP license, |
 * | that is bundled with this package in the file LICENSE, and is |
 * | available through the world-wide-web at the following url: |
 * | https://www.php.net/license/3_01.txt. |
 * | If you did not receive a copy of the PHP license and are unable to |
 * | obtain it through the world-wide-web, please send a note to |
 * | [email protected], so we can mail you a copy immediately. |
 * +----------------------------------------------------------------------+
 * | Authors: André L F S Bacci <ae php.net> |
 * +----------------------------------------------------------------------+
 * | Description: Parse `git log` to complement file state. |
 * +----------------------------------------------------------------------+
 */

require_once __DIR__ . '/all.php';

class GitLogParser
{
    static function parseInto( string $lang , RevcheckFileList & $list )
    {
        $cwd = getcwd();
        chdir( $lang );
        $fp = popen( "git log --name-only" , "r" );
        $hash = "";
        $date = "";
        $skip = false;
        while ( ( $line = fgets( $fp ) ) !== false )
        {
            // new commit block
            if ( substr( $line , 0 , 7 ) == "commit " )
            {
                $hash = trim( substr( $line , 7 ) );
                $date = "";
                $skip = false;
                continue;
            }
            // datetime of commit
            if ( strpos( $line , 'Date:' ) === 0 )
            {
                $line = trim( substr( $line , 5 ) );
                $date = strtotime( $line );
                continue;
            }
            // commit message (tested before other headers, as messages may contain ': ')
            if ( str_starts_with( $line , ' ' ) )
            {
                // commits with this mark are ignored
                if ( stristr( $line, '[skip-revcheck]' ) !== false )
                    $skip = true;
                continue;
            }
            // other headers
            if ( strpos( $line , ': ' ) > 0 )
                continue;
            // empty lines
            if ( trim( $line ) == "" )
                continue;
            // otherwise, a filename
            $filename = trim( $line );
            $info = $list->get( $filename );

            // untracked file (deleted, renamed)
            if ( $info == null )
                continue;

            // the head commit
            if ( $info->head == "" )
            {
                $info->head = $hash;
                $info->date = $date;
            }

            // after, only tracks non skipped commits
            if ( $skip )
                continue;

            // the diff commit
            if ( $info->diff == "" )
            {
                $info->diff = $hash;
                $info->date = $date;
            }
        }

        pclose( $fp );
        chdir( $cwd );
    }
}
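
The head/diff bookkeeping can be followed more easily on a fixed input. Below is a minimal sketch that parses the text of a `git log --name-only` run into an array instead of a `RevcheckFileList`. The function `parseGitLog`, the sample log, the hashes and the file names are all assumptions made for illustration. The sketch tests indented message lines before the generic `: ` header test, so a commit message that itself contains `: ` still has its `[skip-revcheck]` mark seen.

```php
<?php
// Hypothetical standalone variant of the parsing loop: works on a string and
// returns [ filename => [ 'head' => hash , 'diff' => hash ] ].
function parseGitLog( string $log ) : array
{
    $files = [];
    $hash = '';
    $skip = false;
    foreach ( explode( "\n" , $log ) as $line )
    {
        // new commit block
        if ( str_starts_with( $line , 'commit ' ) )
        {
            $hash = trim( substr( $line , 7 ) );
            $skip = false;
            continue;
        }
        // commit message lines are indented by git
        if ( str_starts_with( $line , ' ' ) )
        {
            if ( stripos( $line , '[skip-revcheck]' ) !== false )
                $skip = true;
            continue;
        }
        // empty lines and remaining headers (Author:, Date:, ...)
        if ( trim( $line ) == '' || strpos( $line , ': ' ) > 0 )
            continue;
        // otherwise, a file name
        $name = trim( $line );
        $files[$name] ??= [ 'head' => '' , 'diff' => '' ];
        // head: newest commit touching the file, skipped or not
        if ( $files[$name]['head'] == '' )
            $files[$name]['head'] = $hash;
        // diff: newest commit touching the file that is not skipped
        if ( ! $skip && $files[$name]['diff'] == '' )
            $files[$name]['diff'] = $hash;
    }
    return $files;
}

// Made-up log, newest commit first, as git prints it.
$log = <<<LOG
commit cccc
Author: A <a@example.com>
Date:   Mon Jan 2 10:00:00 2023 +0000

    Fix typo: wording [skip-revcheck]

index.xml
commit bbbb
Author: A <a@example.com>
Date:   Sun Jan 1 10:00:00 2023 +0000

    Real change

index.xml
intro.xml
LOG;

print_r( parseGitLog( $log ) );
```

For this sample, `index.xml` ends up with head `cccc` and diff `bbbb`, because the newest commit is marked `[skip-revcheck]`, while `intro.xml` has `bbbb` for both.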
