-
Notifications
You must be signed in to change notification settings - Fork 40
/
Copy pathchangelog.txt
76 lines (64 loc) · 4.14 KB
/
changelog.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
Release 1.7.1
- Bug fix for the "non progressive alignment" exception in the command-line tool
Release 1.7
- Major rewrite of the matching phase of the Dekker algorithm. The MatchTable approach used in versions 1.2 - 1.6 of
CollateX would match the tokens of the witness to be aligned against the tokens present on the vertices of the variant
graph one witness at a time. This approach did not scale very well and it was unable to look ahead. The new
implementation of the matching phase builds a token index of all tokens of all witnesses before the alignment phase
based on sorting the tokens. This approach is faster and returns a complete overview of all the possible matches
including 1) the number of tokens present in a block, 2) in how many witnesses the block occurs and 3) how frequent
the block occurs in the complete witness set.
- Optimizations of the alignment phase of the Dekker algorithm. A priority queue is now used instead of a multiple
value map to select the best possible match. The detection of possible overlap between possible matches now has a fast
path in case of complete overlap or no overlap between two possible blocks.
- The alignment phase of algorithm (including the transposition detection) has stayed the same. There are same quality
improvements however based on the fact that the possible matches are now based on the tokens of the complete witness
set.
Release 1.6.2
- Added servlet module for easy deployment on servlet containers such as the Apache Tomcat web-server.
Implementation is based on JAX-RS and Jersey 2.
Release 1.6.1
- JSON processing: fix regression bug due to which "tokenComparator" and "algorithm" turned into mandatory fields
Release 1.6
- new algorithm based on Greedy String Tiling
- Java 8 now required
- provide our own implementation of variant graphs
- remove optional and seldomly used integrations with Neo4j and Apache Cocoon
- turn collatex-core into a self-contained library, independent of other components
- package collatex-tools as a self-contained, shaded JAR
Release 1.5.1
- Extended the normalization in the javascript alignment table rendering to not only trim
whitespace but also lowercase the tokens.
- Update of Google Guava to v15.0.
Release 1.5
- Feature: The merging and rendering of transpositions is now switchable in the web-service.
- Feature: Punctuation is now treated as separate tokens by default in the web-service and command-line tool.
- Transposition limiter is moved from the Transposition Detector class to the DekkerAlgorithm class.
- The transposition detector is rewritten. It no longer works from left to right, but from largest
moved distance to smallest moved distance. This improves the alignment result in case of longer witnesses.
- Improved handling of competing blocks of text in the IslandConflictResolver.
- Fix: When splitting island in the IslandConflictResolver resulting islands were only
kept if there were of size two and up. Now they are kept if they are of size one and up.
Release 1.4
- Workaround for a, b / b, a transpositions and Greek witnesses, where a segment of a single word
can be longer than a segment containing lots of small words.
- Transposition limiter is moved from IslandConflictResolver to
the transposition detector class.
- Fix: PhraseMatches in the TranspositionDetector are now normalized in size. This fixes a problem where false
positives where detected in cases where there are more than 2 witnesses
and a specific case of variation between witnesses.
Release 1.3
- First release with alignment based on the MatchTable approach.
This approach improves the alignment quality greatly in case of longer witnesses.
- Major API cleanup.
- New ColateX Tools package, containing a web-service based on JAX-RS.
- New Command-line interface.
Release 1.2
- Beta release of the MatchTable approach (only released internally to beta testers).
Release 1.1
- First release to allow a custom matching function by supplying a comparator<Token> function to the aligner.
Release 1.0
- First release to align against a variant graph instead of an alignment table. The alignment table is now a
visualization of the variant graph.
Release 0.9
- First public release (beta quality).