Releases: metafacture/metafacture-core

Metafacture Core Distribution 5.3.1

06 Dec 15:36
metafacture-core-5.3.1

Changes

Reverted

  • The dependency upgrade from log4j:1.2.12 to log4j-core:2.14.1 can cause problems, so it is reverted in 36ed969; the dependency only receives a minor update to log4j:1.2.17.
  • An API break concerning the accessibility of FilenameUtil in FilenameExtractor is reverted in 7c1ea04.

Possible caveats

If you use your own copy of metamorph.xsd and make use of FileMap, you also have to update your local metamorph.xsd as follows:

-      <attribute name="separator" type="string" use="optional" default="\t">
+      <attribute name="separator" type="string" use="optional" default="&#09;">

This should only rarely be necessary. It is the result of a bug fix (d528ac9): the default separator was defined incorrectly in metamorph.xsd, so in effect it never had any effect.
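
Why the two spellings differ can be checked with a plain XML parser. The following is a minimal sketch using only the JDK's DOM parser; the class and method names are made up for illustration and are not part of Metafacture. It shows that XML does not interpret backslash escapes, so "\t" is read as two characters, while the character reference &#09; is read as a real tab.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of Metafacture.
final class SeparatorDefaultDemo {

    // Parses a single-element XML snippet and returns its "default" attribute value.
    static String parseDefault(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("default");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse snippet", e);
        }
    }

    public static void main(String[] args) {
        // Old spelling: the XML text contains a backslash followed by the letter t.
        String oldValue = parseDefault("<attribute name=\"separator\" default=\"\\t\"/>");
        // New spelling: a numeric character reference for the tab character.
        String newValue = parseDefault("<attribute name=\"separator\" default=\"&#09;\"/>");

        System.out.println(oldValue.length());      // 2
        System.out.println(newValue.equals("\t"));  // true
    }
}
```

The sketch only covers how the attribute value itself is parsed. How a schema validator applies the default is outside its scope.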

Metafacture Core Distribution 5.3.0

29 Nov 15:28
metafacture-core-5.3.0
a81f2bf

Consider this release deprecated.

The update of the dependency from log4j:log4j:1.2.12 to org.apache.logging.log4j:log4j-core:2.14.1 does not work if you use Metafacture as a library together with a version of log4j-slf4j-impl other than org.apache.logging.log4j:log4j-slf4j-impl:2.14.1. In that case it results in:

[...] java.lang.NoClassDefFoundError: Could not initialize class org.metafacture.metamorph.Metamorph
Caused by:
java.lang.ClassNotFoundException: org.apache.logging.log4j.util.ReflectionUtil [...]

Use the upcoming Metafacture Core Distribution 5.3.1 to be on the safe side.
(The update of the log4j dependency will be part of the upcoming Metafacture Core Distribution 6.0.0)

Changes

Bug fixes

  • XML/biblio: Fix creation of Marc XML namespaces #403
  • XML/biblio: Fix Namespace-prefixes of elements and attributes #377
  • XML/biblio: Marc-XML-encoder: record-type written as controlfield, not as attribute of record-field #402
  • XML/biblio: Improve handling of XML attributes and element values #394
  • XML/biblio: Encode top-level MARC record leader as proper XML element instead of control field. #336
  • XML/biblio: Make simple XML encoder value tag name configurable #379
  • JSON: Fix _elseNested loses array-key in JSON #374
  • Metamorph: Fix _elseNested only outputs two hierarchy levels #378
  • Metamorph: Fix "setreplace" using a FileMap #381
  • Metamorph: Guarantee that tests should verify that no unexpected interactions occurred #339

New Features

  • JSON: Make JSON encoder array marker configurable #393
  • JSON: Add or enhance a function to extract JSON records from a JSON API #382
  • Mangling: Split up event stream into records #385
  • Metamorph: Allow empty values in setreplace map #420
  • Triples: Sort triples numerically #380
  • YAML: Add YAML Encoder/Decoder #399

Other

  • Update release and publish process #311
  • Checkstyle and javadoc #389 #396
  • Update and apply EditorConfig file #388
  • Add initial CONTRIBUTING.md #382
  • Fix insecure logging configuration #364

... and various smaller fixes and improvements (e.g. #417)

Metafacture Core Distribution 5.2.0

22 Apr 13:15
metafacture-core-5.2.0
e132278

Changes

Bug fixes

  • Fix flux.bat for running on Windows #315
  • OAI-Pmh fails with SAXParseException #334
  • Ignore null values in Regexp function. #337
  • Metamorph tests verify that no unexpected interactions occurred (first batch). #341
  • Escape feedback char in _else data. #344
  • Don't handle empty '[]' as character class #348
  • _elseNested: Entity with more than one subfield results in multiple entities #338

New Features

  • Add HTML input support #312
  • Allow to omit id in JsonToElasticsearchBulk #323
  • Add original OAI-PMH opener #320
  • Support flux options via setters in JsonToElasticsearchBulk #326
  • Add constructor to allow Entity creation w/o Metamorph #327
  • Make MARCXML namespace for record elements configurable. #331
  • Allow flushing collectors to only emit when complete #332
  • Pass data through metamorph else #333
  • ADD contains and not-contains to metamorph; CHG extent filter to work… #317
  • Allow comments in JSON files. #345
  • Add option to set record IDs in ObjectToLiteral #354
  • Optionally decompress concatenated streams. #358
  • Add filter-null-values Flux command for NullFilter #362
  • Support HTML attribute values as subfields #361

Other

  • Tweak help output for Markdown formatting #316
  • Migrate from travis to github actions to test build #352

Metafacture Core Distribution 5.1.0

19 Nov 13:09
metafacture-core-5.1.0

Changes

Bug fixes

  • Prevent possible hanging when using ObjectPipeDecoupler (see 6ece930)

New Features

Other

  • Added an example of how to use XPointer to address only portions of inclusions of another Metamorph (thanks @cboehme, see ccdad61).
  • Added an example of how to convert Pica+ to Marc-Xml (see 3407f14)
  • Fix #308: Added a flux command encode-xml to reference the SimpleXmlEncoder. The original flux command was stream-to-xml and is kept for backwards compatibility

Metafacture Core Distribution 5.0.1

19 Dec 14:10
metafacture-core-5.0.1

Publish first bug-fix release of Metafacture 5

Metafacture Core Distribution 5.0.0

09 Jan 12:33
metafacture-core-5.0.0

This is the first release of the Metafacture 5 line. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.

Important: This release is published with new Maven coordinates and uses a new package root. As Metafacture has been split-up into domain-specific libraries there is no longer a single Maven dependency. Instead there is one for each domain-specific library. The readme explains how to find the Maven coordinates of these dependencies. The root package has been changed from org.culturegraph.mf to org.metafacture.

Additionally, metafacture-runner has been merged back into the metafacture-core repository. This makes it easier to keep it up-to-date with new releases of metafacture-core.

Updating to Metafacture 5

  1. If you are still using metafacture-core 3.5.0 or older you should first update to metafacture-core 4.0.0. In this release many classes were relocated. By updating first to metafacture-core 4.0.0 you avoid having to handle the relocation and the split-up of the metafacture-core library at the same time.
  2. Search for org.culturegraph.mf in your project and replace it with org.metafacture. This should mostly affect import statements.
  3. Remove the Maven dependency on metafacture-core and follow the explanation in the readme to add the new domain-specific library dependencies.
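
Step 2 is a plain textual replacement. As a minimal sketch (the class name and the sample import line are illustrative assumptions, not a tool shipped with Metafacture), it could look like this when applied to the contents of a source file:

```java
// Hypothetical helper, not part of Metafacture.
final class PackageRootMigration {

    static final String OLD_ROOT = "org.culturegraph.mf";
    static final String NEW_ROOT = "org.metafacture";

    // Replaces every occurrence of the old package root in the given source text.
    static String migrate(String source) {
        return source.replace(OLD_ROOT, NEW_ROOT);
    }

    public static void main(String[] args) {
        // Illustrative import line; check the actual class locations in your own project.
        String before = "import org.culturegraph.mf.metamorph.Metamorph;";
        System.out.println(migrate(before));
    }
}
```

In practice the same replacement is usually done with the search-and-replace function of an IDE. The sketch does not add the new Maven dependencies from step 3.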

Changes

Breaking changes

  • Split up the monolithic library into smaller domain-specific libraries and changed the Maven group id from org.culturegraph to org.metafacture. The readme explains how to find the Maven dependencies (see 2234874)
  • Changed package root from org.culturegraph.mf to org.metafacture (see 2234874)
  • Merge metafacture-runner (see 7d07e91)

Bug fixes

  • Fixed #267: XmlUtil.escape does not handle Unicode correctly for codepoints above U+10000 (see 95ff1ab)
  • Changed JsonEncoder to not prefix pretty-print with spaces (thanks @blackwinter, see de8e7b3)
  • Included metafacture-files in Metafacture Distribution (see a6dc848)
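
The XmlUtil.escape fix in this list concerns characters outside the Basic Multilingual Plane, which Java stores as two char values (a surrogate pair). The following sketch is not the library's implementation. It is a hypothetical escaper that only illustrates the general technique of iterating over code points instead of chars, so that such a character becomes a single numeric reference. Escaping all non-ASCII characters as numeric references is an assumption made for this example.

```java
// Hypothetical sketch, not the XmlUtil implementation.
final class CodePointEscapeSketch {

    // Escapes XML special characters and writes non-ASCII code points as numeric references.
    static String escape(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // codePointAt joins a surrogate pair into one code point.
            int codePoint = input.codePointAt(i);
            i += Character.charCount(codePoint);
            switch (codePoint) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (codePoint > 0x7f) {
                        out.append("&#").append(codePoint).append(';');
                    } else {
                        out.appendCodePoint(codePoint);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef) is stored as the surrogate pair D834 DD1E.
        System.out.println(escape("a<\uD834\uDD1E")); // a&lt;&#119070;
    }
}
```

Iterating char by char would instead see two separate surrogate values and emit two references, which do not denote a valid character.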

New Features

Build infrastructure

  • Migrated from Maven to Gradle (see 0f7a3b2)
  • Automated the release process. A release is now triggered by pushing an annotated tag to the Github repository. The release is automatically published on Maven Central and the distribution files are uploaded to Github (see 04276e4, b544371, 52ab09b)
  • Version numbers are generated from SCM information (see bc4a3d3)
  • Added Sonarqube analysis to the CI build process (see 8a1f500, 91a4448)

Metafacture Core Distribution 5.0.0-rc2

04 Jan 13:59
metafacture-core-5.0.0-rc2
Pre-release

This is the second and last release candidate for Metafacture 5. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.

Changes

Bug fixes

  • Changed JsonEncoder to not prefix pretty-print with spaces (thanks @blackwinter, see de8e7b3)
  • Included metafacture-files in Metafacture Distribution (see a6dc848)

Metafacture Core Distribution 5.0.0-rc1

13 Oct 08:49
metafacture-core-5.0.0-rc1
Compare
Choose a tag to compare
Pre-release

This is the first release candidate for Metafacture 5. With Metafacture 5 the migration from a monolithic library to smaller domain-specific libraries is completed.

Changes

Breaking changes

  • Split up the monolithic library into smaller domain-specific libraries and changed the Maven group id from org.culturegraph to org.metafacture. The readme explains how to find the Maven dependencies (see 2234874)
  • Changed package root from org.culturegraph.mf to org.metafacture (see 2234874)
  • Merge metafacture-runner (see 7d07e91)

Bug fixes

  • Fixed #267: XmlUtil.escape does not handle Unicode correctly for codepoints above U+10000 (see 95ff1ab)

New Features

Build infrastructure

  • Migrated from Maven to Gradle (see 0f7a3b2)
  • Automated the release process. A release is now triggered by pushing an annotated tag to the Github repository. The release is automatically published on Maven Central and the distribution files are uploaded to Github (see 04276e4, b544371, 52ab09b)
  • Version numbers are generated from SCM information (see bc4a3d3)
  • Added Sonarqube analysis to the CI build process (see 8a1f500, 91a4448)

Metafacture Runner Distribution 4.0.0

26 Jul 16:55

This release updates the metafacture-core dependency to version 4.0.0.

This is the last release of the Metafacture Runner Distribution. Starting with Metafacture 5 the distribution will be named Metafacture Core Distribution.

Changes

  • Minimum required Java version is now Java 8
  • Issues with running flux.sh on MacOS have been resolved (thanks @miku, see #9, #10)
  • visualizeMorphDefs.sh has been removed as support for Metamorph visualisation is no longer available in metafacture-core 4.0.0 (see 5b6dd56).

Please see the release notes for metafacture-core for a list of changes.

Metafacture Core 4.0.0

09 Jan 19:10

This is the last release of metafacture-core as a single Maven artifact. The release concentrates on reorganising the sources to prepare for the split-up of metafacture-core in the next release.

Important: This release contains a number of breaking changes and updating from metafacture-core-3.5.0 is not trivial. The most notable changes are the update to Java 8 and the reorganisation of the package structure which changed the qualified names of all Metafacture modules and moved many of the other classes to new packages. Please refer to the list of modules per package at the end of the release notes to find the new module locations. All other relocated and renamed classes are listed below.

Changes

New features

Flux

  • Added the @FluxCommand annotation to all modules which can be used in Flux (see 4ff1ce5, b297e6d)

Metamorph

  • Fix #256: Support sameEntity in none and all. Add support for the sameEntity attribute to none and all statements. The attribute does not make sense in any statements (see 93bc48d)
  • The RestMap which looks up values by doing a REST request works now (thanks @philboeselager, see 704d4ad)
  • The new helper class InlineMorph simplifies embedding Metamorph scripts directly in Java. It was introduced to help writing test cases for Metamorph functions and collectors (see 9b6e7f1)

Metafacture modules

  • Added ForwardingStreamPipe as a base class for modules which only need to intercept some events but forward all others unmodified (see 4412623)
  • Added NullFilter which replaces null values with a replacement string or discards them (see c282c74)
  • Added JsonToElasticsearchBulk module to create Elasticsearch bulk import data from JSON (thanks @fsteeg and @blackwinter, see 9d85194, 056fe60)
  • Added AlephMabXmlHandler for the widely used Mab-Xml derivative created by Aleph exports (thanks @dr0i, see 482af42)
  • Added AseqDecoder. A decoder for aseq data (thanks @larsgsvensson, see f77b1ae)
  • Added XmlElementSplitter. The module splits an xml document at a certain element (thanks @dr0i, see 5517beb)
  • Added XmlFilenameWriter: The module extracts a file name from an xml document and saves the document to the extracted file. It's possible to store the file uncompressed and as bz2 (thanks @dr0i, see 40d290e)
  • Added XmlTee. A tee implementation for XML event streams. Allows forwarding a stream to more than one downstream module (thanks @dr0i, see 3fb3b4d)
  • PojoEncoder: Added support for populating maps in POJOs (thanks @thomasseidel, see e23bc13)
  • IdChangePipe (now RecordIdChanger): Added corresponding getters for setters (see e8300a8)
  • Utf8Normalizer (now UnicodeNormalizer): Made normalisation form configurable (see 314a641)
  • PicaDecoder: Added record ids for level 1 & 2 records. Local system records (level 1) and holding records (level 2) do not store their record id in field 003@ $0 but in field 107F $0 or 203@ $0 (the latter may include an occurrence specification). These ids are now emitted as record ids in start-record events when level 1 or level 2 records are processed (see c955f4d, a9529dd)
  • Marc21Encoder: The record identifier field (001) can now be automatically created from the record identifier of the start record event. This is configured by the setGenerateRecordId(boolean) parameter (see 6d04d69)

Other

  • Added a framework for reading and writing ISO 2709:2008 records (see 3b24df5)
  • ResourceUtil: Added readAll(InputStream, Charset) and readAll(Reader) to read a full stream into a string (see 9a70936)
  • XmlUtil: Added escape(String) method for escaping strings for xml output (see 9fb4154)

Bug fixes

Flux

  • The flux grammar supported octal escape sequences but failed to convert them into characters after parsing (see b4da5bf)
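
To illustrate what converting an octal escape sequence into a character means, here is a minimal sketch. It is not the Flux parser code. The class name is made up, and the rule of accepting one to three octal digits after a backslash is an assumption for this example.

```java
// Hypothetical sketch, not the Flux implementation.
final class OctalEscapeSketch {

    static boolean isOctalDigit(char c) {
        return c >= '0' && c <= '7';
    }

    // Replaces each backslash followed by one to three octal digits with the character it denotes.
    static String unescapeOctal(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length() && isOctalDigit(text.charAt(i + 1))) {
                i++; // skip the backslash
                int value = 0;
                int digits = 0;
                while (i < text.length() && digits < 3 && isOctalDigit(text.charAt(i))) {
                    value = value * 8 + (text.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.append((char) value);
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Octal 101 is decimal 65 ('A'), octal 102 is decimal 66 ('B').
        System.out.println(unescapeOctal("\\101\\102")); // AB
    }
}
```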

Metamorph

  • Fix #255: Metamorph emits null as entity name (see 8adafef)
  • Fix #257: Do not reset entity if reset is false (see a680a28)
  • Fix #265: split and switch-name-value functions emitted wrong source (see a7f6785)
  • Fixed resource leak in Metamorph file-maps (see d519e8c)

Metamorph-Test

  • Fix #6: Fix test names for Metamorph Test in Intellij. Intellij did not show the name of the xml files containing the tests but only the string "xml". This was caused by Intellij interpreting the test class name as a fully qualified java class name and attempting to extract the class name from it. By making sure that the Metamorph Test names do not contain any dots this problem can be avoided (see 033e6d0)

Metafacture modules

  • StreamUnicodeNormalizer no longer fails on null values in literals but simply forwards them (see 12e3420)
  • Marc21Encoder produced invalid MARC 21 records if the record data contained unicode codepoints which required more than one byte in UTF-8 encoding (see 6d04d69)
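
The Marc21Encoder fix relates to a general property of MARC 21 in UTF-8: lengths and offsets in the record structure are counted in bytes, not in characters. The sketch below is not the encoder's code, and whether this exact mismatch caused the bug is an assumption. It only shows how the two counts diverge once a value contains a character that needs more than one byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Marc21Encoder implementation.
final class FieldLengthSketch {

    // Number of Java char values in the string.
    static int charLength(String value) {
        return value.length();
    }

    // Number of bytes the value occupies when encoded as UTF-8.
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String value = "M\u00fcller"; // "Müller", the u-umlaut needs two bytes in UTF-8
        System.out.println(charLength(value)); // 6
        System.out.println(utf8Length(value)); // 7
    }
}
```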

Removed features

Flux

  • Removed generic-xml Flux command. Use decode-xml followed by handle-generic-xml instead (see 53edfdb)
  • Removed MorphVisualizer. The tool was outdated and not well maintained. There is currently no replacement for it (see 73efd63)

Metamorph

  • Fix #226: Remove misspelled options from occurrence function. The occurrence function in Metamorph no longer supports lessThen and moreThen in its only attribute. Instead lessThan and moreThan must be used (see 800108e)
  • The constructors of the collector helper classes expected a reference to the Metamorph object. The parameter has been removed as collectors should not access the Metamorph object (see f76f39e)
  • CollectFactory, FunctionFactory and MapsFactory have been made package-private in the metamorph package. They are considered to be an internal part of Metamorph (see c805690)

Metamorph-Test

  • TestConfigurationException is removed and replaced with JUnit's InitializationError (see 35dfaf5)
  • MetamorphTestCase, MetamorphTestLoader and MetamorphTestRunner are made package-private as they are not required for using Metamorph-Test (see 35dfaf5)

Metafacture modules

  • Removed CGEntityDecoder, CGEntityEncoder, CGTextDecoder, CGEntityReader and the helper class CGEntity. Use FormetaDecoder and FormetaEncoder instead. To convert data from cg-entity or cg-text format to Formeta, use metafacture-3.5.0, which contains support for both formats (see ac1c71d)
  • Removed Bzip2Opener and GzipOpener. The generic FileOpener automatically recognises and handles compressed files (see c3f24eb)
  • Removed RecordBounderyRemover. The functionality provided by this module is also provided by StreamEventDiscarder (see 305c8d4)
  • Removed ObjectExceptionLogger in favour of ObjectExceptionCatcher which provides the same functionality (see 919349b)
  • Removed RecordBatcher. Use implementations of AbstractBatcher instead (see 52da508)
  • Removed StreamFormatter. The FormetaEncoder and StreamLogger modules provide very similar functionality and should be used instead (see 2f6bc2a)
  • Removed WrappingStreamPipe. Modules with nested pipelines should manage them themselves (see 52a5f21)
  • Removed SimpleJsonEncoder. The JsonEncoder module provides the same functionality (see 7d5e467)
  • Removed EventListSource. If used with EventList it can often be replaced with a StreamBuffer (see f77539b)
  • Removed MultiOpener. There were no concrete opener implementations registered to be used by MultiOpener. So, the class was completely useless (see 566022f)
  • Removed CsvReader, GenericXmlReader, LidoReader, MabReader, MarcReader, MetsModsReader and PicaXmlReader. Users who use these readers should replace them by the corresponding combination of a record splitting module and a recor...