Releases: metafacture/metafacture-core
Metafacture Core Distribution 5.3.1
Changes
Reverted
- The dependency upgrade of log4j:1.2.12 to log4j-core:2.14.1 can be problematic, so it is reverted in 36ed969 and the dependency only receives a minor update to log4j:1.2.17.
- An API break concerning the accessibility of FilenameUtil in FilenameExtractor is reverted in 7c1ea04.
Possible caveats
If you use a metamorph.xsd of your own and make use of FileMap, you also have to update your local metamorph.xsd like this:
- <attribute name="separator" type="string" use="optional" default="\t">
+ <attribute name="separator" type="string" use="optional" default="	">
This should only rarely be necessary. It is the result of a bug fix (d528ac9): in effect, the default separator that was (falsely) defined in metamorph.xsd had no effect at all.
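To make the role of the separator concrete, here is a minimal Java sketch that splits each line of a two-column text at a configurable separator. It is not Metafacture's FileMap implementation: the class, the method and the tab default are assumptions chosen to mirror the separator attribute shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Metafacture.
final class SeparatorSketch {

    // Assumption: a real tab character is the separator when none is configured.
    static final String DEFAULT_SEPARATOR = "\t";

    // Splits every line into key and value at the first separator.
    static Map<String, String> parse(String text, String separator) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int pos = line.indexOf(separator);
            if (pos < 0) {
                continue; // skip lines that do not contain the separator
            }
            map.put(line.substring(0, pos), line.substring(pos + separator.length()));
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse("de\tGerman\nen\tEnglish", DEFAULT_SEPARATOR));
    }
}
```

A line that contains the two characters backslash and t instead of a real tab is not split by this sketch, which is the kind of mismatch the schema change addresses.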
Metafacture Core Distribution 5.3.0
Consider this release deprecated.
The update of the dependency log4j:log4j:1.2.12 to org.apache.logging.log4j:log4j-core:2.14.1 does not work if you use metafacture as a library along with log4j-slf4j-impl in a version other than org.apache.logging.log4j:log4j-slf4j-impl:2.14.1, and results in:
[...] java.lang.NoClassDefFoundError: Could not initialize class org.metafacture.metamorph.Metamorph
Caused by:
java.lang.ClassNotFoundException: org.apache.logging.log4j.util.ReflectionUtil [...]
Use the upcoming Metafacture Core Distribution 5.3.1 to be on the safe side.
(The update of the log4j dependency will be part of the upcoming Metafacture Core Distribution 6.0.0.)
Changes
Bug fixes
- XML/biblio: Fix creation of Marc XML namespaces #403
- XML/biblio: Fix Namespace-prefixes of elements and attributes #377
- XML/biblio: Marc-XML-encoder: record-type written as controlfield, not as attribute of record-field #402
- XML/biblio: Improve handling of XML attributes and element values #394
- XML/biblio: Encode top-level MARC record leader as proper XML element instead of control field. #336
- XML/biblio: Make simple XML encoder value tag name configurable #379
- JSON: Fix _elseNested loses array-key in JSON #374
- Metamorph: Fix _elseNested only outputs two hierarchy levels #378
- Metamorph: Fix "setreplace" using a FileMap #381
- Metamorph: Guarantee that tests verify that no unexpected interactions occurred #339
New Features
- JSON: Make JSON encoder array marker configurable #393
- JSON: Add or enhance a function to extract JSON-Records from a JSON-API #382
- Mangling: Split up event stream into records #385
- Metamorph: Allow empty values in setreplace map #420
- Triples: Sort triples numerically #380
- YAML: Add YAML Encoder/Decoder #399
Other
- Update release and publish process #311
- Checkstyle and javadoc #389 #396
- Update and apply EditorConfig file #388
- Add initial CONTRIBUTING.md #382
- Fix insecure logging configuration #364
... and various smaller fixes and improvements (e.g. #417)
Metafacture Core Distribution 5.2.0
Changes
Bug fixes
- Fix flux.bat for running on Windows #315
- OAI-Pmh fails with SAXParseException #334
- Ignore null values in Regexp function. #337
- Metamorph tests verify that no unexpected interactions occurred (first batch). #341
- Escape feedback char in _else data. #344
- Don't handle empty '[]' as character class #348
- _elseNested: Entity with more than one subfield results in multiple entities #338
New Features
- Add HTML input support #312
- Allow to omit id in JsonToElasticsearchBulk #323
- Add original OAI-PMH opener #320
- Support flux options via setters in JsonToElasticsearchBulk #326
- Add constructor to allow Entity creation w/o Metamorph #327
- Make MARCXML namespace for record elements configurable. #331
- Allow flushing collectors to only emit when complete #332
- Pass data through metamorph else #333
- ADD contains and not-contains to metamorph; CHG extent filter to work… #317
- Allow comments in JSON files. #345
- Add option to set record IDs in ObjectToLiteral #354
- Optionally decompress concatenated streams. #358
- Add filter-null-values Flux command for NullFilter #362
- Support HTML attribute values as subfields #361
Other
Metafacture Core Distribution 5.1.0
Changes
Bug fixes
- Prevent possible hanging when using ObjectPipeDecoupler (see 6ece930)
New Features
- Fix hbz/lobid-resources#967: Added ObjectThreader which enables multithreading (see eb1cb22); also provided as a flux command (see a432554)
- Fix #296: handle not-normalized Pica+ by improving PicaDecoder (see 25b125f)
- Fix #300: provided MarcXmlEncoder (see 78aaca3)
Other
- Added an example of how to use XPointer to address only portions of inclusions of another Metamorph (thanks @cboehme, see ccdad61).
- Added an example of how to convert Pica+ to Marc-Xml (see 3407f14)
- Fix #308: Added a flux command encode-xml to reference the SimpleXmlEncoder. The original flux command was stream-to-xml and is kept for backwards compatibility
Metafacture Core Distribution 5.0.1
metafacture-core-5.0.1 Publish first bug-fix release of Metafacture 5
Metafacture Core Distribution 5.0.0
This is the first release of the Metafacture 5 line. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.
Important: This release is published with new Maven coordinates and uses a new package root. As Metafacture has been split up into domain-specific libraries there is no longer a single Maven dependency. Instead there is one for each domain-specific library. The readme explains how to find the Maven coordinates of these dependencies. The root package has been changed from org.culturegraph.mf to org.metafacture.
Additionally, metafacture-runner has been merged back into the metafacture-core repository. This makes it easier to keep it up-to-date with new releases of metafacture-core.
Updating to Metafacture 5
- If you are still using metafacture-core 3.5.0 or older you should first update to metafacture-core 4.0.0. In this release many classes were relocated. By updating first to metafacture-core 4.0.0 you avoid having to handle the relocation and the split-up of the metafacture-core library at the same time.
- Search for org.culturegraph.mf in your project and replace it with org.metafacture. This should mostly affect import statements.
- Remove the Maven dependency on metafacture-core and follow the explanation in the readme to add the new domain-specific library dependencies.
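The search-and-replace step can be sketched in Java as a literal string replacement. The helper class and the sample import line are hypothetical and only illustrate the renaming of the package root; in practice an IDE or a command-line tool would do the same across all source files.

```java
// Hypothetical migration helper for illustration only; not part of Metafacture.
final class PackageRenameSketch {

    static final String OLD_ROOT = "org.culturegraph.mf";
    static final String NEW_ROOT = "org.metafacture";

    // Replaces every literal occurrence of the old package root with the new one.
    static String migrate(String source) {
        return source.replace(OLD_ROOT, NEW_ROOT);
    }

    public static void main(String[] args) {
        // The import line is an assumed example of a pre-5.0.0 source file.
        System.out.println(migrate("import org.culturegraph.mf.metamorph.Metamorph;"));
    }
}
```

Because String.replace works on literal text rather than regular expressions, the dots in the package name need no escaping.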
Changes
Breaking changes
- Split up the monolithic library into smaller domain-specific libraries and changed the Maven group id from org.culturegraph to org.metafacture. The readme explains how to find the Maven dependencies (see 2234874)
- Changed package root from org.culturegraph.mf to org.metafacture (see 2234874)
- Merged metafacture-runner (see 7d07e91)
Bug fixes
- Fixed #267: XmlUtil.escape does not handle Unicode correctly for codepoints above U+10000 (see 95ff1ab)
- Changed JsonEncoder to not prefix pretty-print with spaces (thanks @blackwinter, see de8e7b3)
- Included metafacture-files in Metafacture Distribution (see a6dc848)
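The XmlUtil.escape fix above concerns characters that Java stores as two chars (a surrogate pair). The following sketch shows one way to escape such text by iterating over code points instead of chars. It is an illustration under that assumption, not the code of XmlUtil.escape, and the choice of decimal character references for non-ASCII characters is made up for this example.

```java
// Hypothetical sketch; not the implementation of XmlUtil.escape.
final class XmlEscapeSketch {

    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        // Step by code point so that a surrogate pair is treated as one character.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (cp > 0x7f) {
                        // Assumption: non-ASCII characters become decimal references.
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1D11E lies above U+10000 and occupies two Java chars.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(escape("a<" + clef));
    }
}
```

Iterating with charAt instead would see the two surrogate halves separately and emit two invalid references, which is the class of error the fix addresses.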
New Features
- Added decoder for JSON (thanks @blackwinter, see 8fe600e, 60b36aa)
- Added handler for CO-MARC (thanks @larsgsvensson, see fd3dd48)
- Added option to XmlEncoder for setting encoding and version attributes in the xml header (thanks @eberhardtj, see 087e20b)
Build infrastructure
- Migrated from Maven to Gradle (see 0f7a3b2)
- Automated the release process. A release is now triggered by pushing an annotated tag to the Github repository. The release is automatically published on Maven Central and the distribution files are uploaded to Github (see 04276e4, b544371, 52ab09b)
- Version numbers are generated from SCM information (see bc4a3d3)
- Added Sonarqube analysis to the CI build process (see 8a1f500, 91a4448)
Metafacture Core Distribution 5.0.0-rc2
This is the second and last release candidate for Metafacture 5. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.
Changes
Bug fixes
- Changed JsonEncoder to not prefix pretty-print with spaces (thanks @blackwinter, see de8e7b3)
- Included metafacture-files in Metafacture Distribution (see a6dc848)
Metafacture Core Distribution 5.0.0-rc1
This is the first release candidate for Metafacture 5. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.
Changes
Breaking changes
- Split up the monolithic library into smaller domain-specific libraries and changed the Maven group id from org.culturegraph to org.metafacture. The readme explains how to find the Maven dependencies (see 2234874)
- Changed package root from org.culturegraph.mf to org.metafacture (see 2234874)
- Merged metafacture-runner (see 7d07e91)
Bug fixes
- Fixed #267: XmlUtil.escape does not handle Unicode correctly for codepoints above U+10000 (see 95ff1ab)
New Features
- Added decoder for JSON (thanks @blackwinter, see 8fe600e, 60b36aa)
- Added handler for CO-MARC (thanks @larsgsvensson, see fd3dd48)
- Added option to XmlEncoder for setting encoding and version attributes in the xml header (thanks @eberhardtj, see 087e20b)
Build infrastructure
- Migrated from Maven to Gradle (see 0f7a3b2)
- Automated the release process. A release is now triggered by pushing an annotated tag to the Github repository. The release is automatically published on Maven Central and the distribution files are uploaded to Github (see 04276e4, b544371, 52ab09b)
- Version numbers are generated from SCM information (see bc4a3d3)
- Added Sonarqube analysis to the CI build process (see 8a1f500, 91a4448)
Metafacture Runner Distribution 4.0.0
This release updates the metafacture-core dependency to version 4.0.0.
This is the last release of the Metafacture Runner Distribution. Starting with Metafacture 5 the distribution will be named Metafacture Core Distribution.
Changes
- Minimum required Java version is now Java 8
- Issues with running flux.sh on MacOS have been resolved (thanks @miku, see #9, #10)
- visualizeMorphDefs.sh has been removed as support for Metamorph visualisation is no longer available in metafacture-core 4.0.0 (see 5b6dd56).
Please see the release notes for metafacture-core for a list of changes.
Metafacture Core 4.0.0
This is the last release of metafacture-core as a single Maven artifact. The release concentrates on reorganising the sources to prepare for the split-up of metafacture-core in the next release.
Important: This release contains a number of breaking changes and updating from metafacture-core-3.5.0 is not trivial. The most notable changes are the update to Java 8 and the reorganisation of the package structure which changed the qualified names of all Metafacture modules and moved many of the other classes to new packages. Please refer to the list of modules per package at the end of the release notes to find the new module locations. All other relocated and renamed classes are listed below.
Changes
- New features
- Bug fixes
- Removed features
- Moved and renamed items
- Changed behaviour
- Other improvements
New features
Flux
Metamorph
- Fix #256: Support sameEntity in none and all. Add support for the sameEntity attribute to none and all statements. The attribute does not make sense in any statements (see 93bc48d)
- The RestMap which looks up values by doing a REST request works now (thanks @philboeselager, see 704d4ad)
- The new helper class InlineMorph simplifies embedding Metamorph scripts directly in Java. It was introduced to help writing test cases for Metamorph functions and collectors (see 9b6e7f1)
Metafacture modules
- Added ForwardingStreamPipe as a base class for modules which only need to intercept some events but forward all others unmodified (see 4412623)
- Added NullFilter which replaces null values with a replacement string or discards them (see c282c74)
- Added JsonToElasticsearchBulk module to create Elasticsearch bulk import data from JSON (thanks @fsteeg and @blackwinter, see 9d85194, 056fe60)
- Added AlephMabXmlHandler for the widely used Mab-Xml derivative created by Aleph exports (thanks @dr0i, see 482af42)
- Added AseqDecoder. A decoder for aseq data (thanks @larsgsvensson, see f77b1ae)
- Added XmlElementSplitter. The module splits an xml document at a certain element (thanks @dr0i, see 5517beb)
- Added XmlFilenameWriter: The module extracts a file name from an xml document and saves the document to the extracted file. It's possible to store the file uncompressed and as bz2 (thanks @dr0i, see 40d290e)
- Added XmlTee. A tee implementation for XML event streams. Allows forwarding a stream to more than one downstream module (thanks @dr0i, see 3fb3b4d)
- PojoEncoder: Added support for populating maps in POJOs (thanks @thomasseidel, see e23bc13)
- IdChangePipe (now RecordIdChanger): Added corresponding getters for setters (see e8300a8)
- Utf8Normalizer (now UnicodeNormalizer): Made normalisation form configurable (see 314a641)
- PicaDecoder: Added record ids for level 1 & 2 records. Local system records (level 1) and holding records (level 2) do not store their record id in field 003@ $0 but in field 107F $0 or 203@ $0 (the latter may include an occurrence specification). These ids are now emitted as record ids in start-record events when level 1 or level 2 records are processed (see c955f4d, a9529dd)
- Marc21Encoder: The record identifier field (001) can now be automatically created from the record identifier of the start record event. This is configured by the setGenerateRecordId(boolean) parameter (see 6d04d69)
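As background for the configurable normalisation form mentioned in the list above, this sketch uses the JDK class java.text.Normalizer to show how the chosen form changes a string. It does not use the UnicodeNormalizer module; the wrapper class and method are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical sketch based on the JDK; not the UnicodeNormalizer module.
final class NormalizationSketch {

    // Applies the requested Unicode normalisation form to the value.
    static String normalize(String value, Normalizer.Form form) {
        return Normalizer.normalize(value, form);
    }

    public static void main(String[] args) {
        // "u" followed by U+0308 (combining diaeresis), i.e. a decomposed u-umlaut.
        String decomposed = "u" + (char) 0x0308;
        // NFC composes the pair into one char, NFD keeps it as two chars.
        System.out.println(normalize(decomposed, Normalizer.Form.NFC).length());
        System.out.println(normalize(decomposed, Normalizer.Form.NFD).length());
    }
}
```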
Other
- Added a framework for reading and writing ISO 2709:2008 records (see 3b24df5)
- ResourceUtil: Added readAll(InputStream, Charset) and readAll(Reader) to read a full stream into a string (see 9a70936)
- XmlUtil: Added escape(String) method for escaping strings for xml output (see 9fb4154)
Bug fixes
Flux
- The flux grammar supported octal escape sequences but failed to convert them into characters after parsing (see b4da5bf)
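What converting an octal escape sequence into a character means can be sketched as follows. This is not the Flux parser's code: the class is hypothetical, and the assumption that an escape consists of a backslash followed by one to three octal digits is borrowed from Java-style string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the implementation used by the Flux grammar.
final class OctalEscapeSketch {

    // Assumption: an escape is a backslash followed by one to three octal digits.
    private static final Pattern OCTAL = Pattern.compile("\\\\([0-7]{1,3})");

    // Replaces every octal escape with the character it denotes.
    static String unescape(String raw) {
        Matcher matcher = OCTAL.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            char decoded = (char) Integer.parseInt(matcher.group(1), 8);
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The raw text contains backslash, 1, 1, which is octal for a tab.
        System.out.println(unescape("tab\\11separated"));
    }
}
```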
Metamorph
- Fix #255: Metamorph emits null as entity name (see 8adafef)
- Fix #257: Do not reset entity if reset is false (see a680a28)
- Fix #265: split and switch-name-value functions emitted wrong source (see a7f6785)
- Fixed resource leak in Metamorph file-maps (see d519e8c)
Metamorph-Test
- Fix #6: Fix test names for Metamorph Test in IntelliJ. IntelliJ did not show the name of the xml files containing the tests but only the string "xml". This was caused by IntelliJ interpreting the test class name as a fully qualified Java class name and attempting to extract the class name from it. By making sure that the Metamorph Test names do not contain any dots this problem can be avoided (see 033e6d0)
Metafacture modules
- StreamUnicodeNormalizer no longer fails on null values in literals but simply forwards them (see 12e3420)
- Marc21Encoder produced invalid MARC 21 records if the record data contained unicode codepoints which required more than one byte in UTF-8 encoding (see 6d04d69)
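The Marc21Encoder problem is easier to picture with a small comparison of character count and encoded byte count. The sketch assumes, as the fix description suggests, that the lengths written into a record have to be measured in UTF-8 bytes; the class is hypothetical and not taken from Marc21Encoder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not code from Marc21Encoder.
final class Utf8LengthSketch {

    // Number of Java chars in the value.
    static int charLength(String value) {
        return value.length();
    }

    // Number of bytes the value occupies when encoded as UTF-8.
    static int byteLength(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // German name of Munich; U+00FC (u-umlaut) needs two bytes in UTF-8.
        String title = "M" + (char) 0x00FC + "nchen";
        System.out.println(charLength(title) + " chars, " + byteLength(title) + " bytes");
    }
}
```

Using the char count where the byte count is required makes every length after the first multi-byte character too small.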
Removed features
Flux
- Removed generic-xml Flux command. Use decode-xml followed by handle-generic-xml instead (see 53edfdb)
- Removed MorphVisualizer. The tool was outdated and not well maintained. There is currently no replacement for it (see 73efd63)
Metamorph
- Fix #226: Remove misspelled options from occurrence function. The occurrence function in Metamorph no longer supports lessThen and moreThen in its only attribute. Instead lessThan and moreThan must be used (see 800108e)
- The constructors of the collector helper classes expected a reference to the Metamorph object. The parameter has been removed as collectors should not access the Metamorph object (see f76f39e)
- CollectFactory, FunctionFactory and MapsFactory have been made package-private in the metamorph package. They are considered to be an internal part of Metamorph (see c805690)
Metamorph-Test
- TestConfigurationException is removed and replaced with JUnit's InitializationError (see 35dfaf5)
- MetamorphTestCase, MetamorphTestLoader and MetamorphTestRunner are made package-private as they are not required for using Metamorph-Test (see 35dfaf5)
Metafacture modules
- Removed CGEntityDecoder, CGEntityEncoder, CGTextDecoder, CGEntityReader and the helper class CGEntity. Use FormetaDecoder and FormetaEncoder instead. To convert data from cg-entity or cg-text format to Formeta use metafacture-3.5.0 which contains support for both formats (see ac1c71d)
- Removed Bzip2Opener and GzipOpener. The generic FileOpener automatically recognises and handles compressed files (see c3f24eb)
- Removed RecordBounderyRemover. The functionality provided by this module is also provided by StreamEventDiscarder (see 305c8d4)
- Removed ObjectExceptionLogger in favour of ObjectExceptionCatcher which provides the same functionality (see 919349b)
- Removed RecordBatcher. Use implementations of AbstractBatcher instead (see 52da508)
- Removed StreamFormatter. The FormetaEncoder and StreamLogger modules provide very similar functionality and should be used instead (see 2f6bc2a)
- Removed WrappingStreamPipe. Modules with nested pipelines should manage them themselves (see 52a5f21)
- Removed SimpleJsonEncoder. The JsonEncoder module provides the same functionality (see 7d5e467)
- Removed EventListSource. If used with EventList it can often be replaced with a StreamBuffer (see f77539b)
- Removed MultiOpener. There were no concrete opener implementations registered to be used by MultiOpener, so the class was completely useless (see 566022f)
- Removed CsvReader, GenericXmlReader, LidoReader, MabReader, MarcReader, MetsModsReader and PicaXmlReader. Users who use these readers should replace them by the corresponding combination of a record splitting module and a recor...