OCR-D Workflow integration #5697

Closed
wants to merge 76 commits into from
Changes from 51 commits
3a5aaa6
add config parameter
markusweigelt Apr 19, 2022
4672dd1
initial commit of ocr workflow entity
markusweigelt Apr 22, 2022
8d6809a
add entities and db object for ocr workflow
markusweigelt Apr 26, 2022
54f0d93
add ocr workflow edit
markusweigelt Apr 29, 2022
de02bb5
add bean to hibernate
markusweigelt May 11, 2022
c041fc3
select ocr workflow on project level and fix delete and editing of oc…
markusweigelt May 12, 2022
35637f0
create ocr workflow file in process dir
markusweigelt May 13, 2022
b59203a
add config parameter
markusweigelt Apr 19, 2022
df3b6f5
initial commit of ocr workflow entity
markusweigelt Apr 22, 2022
6ae0344
add entities and db object for ocr workflow
markusweigelt Apr 26, 2022
805637a
add ocr workflow edit
markusweigelt Apr 29, 2022
c737e71
add bean to hibernate
markusweigelt May 11, 2022
a227bbd
select ocr workflow on project level and fix delete and editing of oc…
markusweigelt May 12, 2022
922ae8e
add ocr workflwo tab to process edit
markusweigelt May 13, 2022
464b861
improve selection and remove process ocr workflow tab
markusweigelt Jun 29, 2022
c450ad3
add comment and remove unused file
markusweigelt Jun 29, 2022
ff53014
improve messages
markusweigelt Jun 29, 2022
18ff2d8
add config parameter
markusweigelt Apr 19, 2022
7055ef3
initial commit of ocr workflow entity
markusweigelt Apr 22, 2022
7380441
add entities and db object for ocr workflow
markusweigelt Apr 26, 2022
b57a3f2
add ocr workflow edit
markusweigelt Apr 29, 2022
319727e
add bean to hibernate
markusweigelt May 11, 2022
bfd037e
select ocr workflow on project level and fix delete and editing of oc…
markusweigelt May 12, 2022
3b23eb9
create ocr workflow file in process dir
markusweigelt May 13, 2022
305bb06
add config parameter
markusweigelt Apr 19, 2022
d3e53ca
initial commit of ocr workflow entity
markusweigelt Apr 22, 2022
980cfa0
add entities and db object for ocr workflow
markusweigelt Apr 26, 2022
0a3ca19
add ocr workflow edit
markusweigelt Apr 29, 2022
8197c62
add ocr workflwo tab to process edit
markusweigelt May 13, 2022
ac4883e
improve selection and remove process ocr workflow tab
markusweigelt Jun 29, 2022
5a22270
add comment and remove unused file
markusweigelt Jun 29, 2022
138c740
improve messages
markusweigelt Jun 29, 2022
525091e
Merge branch 'ocrd-main' of github.com:markusweigelt/kitodo-productio…
markusweigelt Jun 29, 2022
28ae981
improvements of review notes
markusweigelt Jun 30, 2022
7f78c38
Merge branch 'ocrd-main' of github.com:markusweigelt/kitodo-productio…
markusweigelt Jun 30, 2022
73f39e3
improve label
markusweigelt Jul 6, 2022
980c322
Merge master into branch
markusweigelt Jun 8, 2023
abf063f
Rename V2_109__Add_authorities_to_manage_ocr_workflow.sql to V2_124__…
markusweigelt Jun 8, 2023
8790fff
Improvements for checkstyle
markusweigelt Jun 8, 2023
7e7ab3a
Change identifier of table
markusweigelt Jun 9, 2023
2ec1b19
Revert "Change identifier of table"
markusweigelt Jun 9, 2023
64203e0
Rename table idenifier
markusweigelt Jun 9, 2023
3f76202
Add ocr workflow config to process settings
markusweigelt Jun 9, 2023
2127a97
Improve process form
markusweigelt Jun 9, 2023
3a736ef
Add ocrworkflow variable for scripts
markusweigelt Jun 12, 2023
ec84969
Fix checkstyle
markusweigelt Jun 12, 2023
b1de092
Remove copy to process dir
markusweigelt Jun 13, 2023
db660b4
Revert change of method
markusweigelt Jun 13, 2023
23944ba
Revert unnecessary changes
markusweigelt Jun 13, 2023
6686042
Revert unnecessary changes
markusweigelt Jun 13, 2023
90b63e6
Revert unnecessary changes
markusweigelt Jun 13, 2023
2cb746c
Improvements and changes of review
markusweigelt Jun 26, 2023
540ed9d
Rename edit path
markusweigelt Jun 26, 2023
0c7f4f6
Remove state of workflow
markusweigelt Jun 26, 2023
2fc9af5
Rename ocr to ocrd
markusweigelt Jun 26, 2023
5cb74c5
Rename ocr to ocrd
markusweigelt Jun 26, 2023
20c3d4e
Rename ocr to ocrd
markusweigelt Jun 26, 2023
d39a62a
Rename ocr to ocrd
markusweigelt Jun 26, 2023
17cbade
Rename ocr to ocrd
markusweigelt Jun 26, 2023
d57cd7b
Rename ocr to ocrd
markusweigelt Jun 26, 2023
10d03b9
Rename ocr to ocrd
markusweigelt Jun 26, 2023
651d710
Rename ocr to ocrd
markusweigelt Jun 27, 2023
01d275d
Rename OCR-D Workflow to OCR Profile
markusweigelt Jun 28, 2023
347f6ee
Rename OCR-D Workflow to OCR Profile
markusweigelt Jun 28, 2023
8e389d4
Rename OCR-D Workflow to OCR Profile
markusweigelt Jun 28, 2023
67e5ecc
Improve checkstyle
markusweigelt Jun 28, 2023
453ab22
Fixes regarding renaming to OCR profiles
markusweigelt Jun 28, 2023
ecbae9e
Change first letter of converter reference
markusweigelt Jun 29, 2023
ffc58d1
Rename variable ocrprofile to ocrprofilefilename
markusweigelt Jun 29, 2023
e76a770
Improve checkstyle
markusweigelt Jun 29, 2023
aa4ac13
Add named parameter
markusweigelt Jun 29, 2023
42480ff
Improve authorities and support relativ path to ocr profile
markusweigelt Jun 30, 2023
93f6aa9
Improve java doc
markusweigelt Jun 30, 2023
08c746f
Allow symlinks in file path
markusweigelt Jul 3, 2023
587aa3e
Add file separator
markusweigelt Jul 3, 2023
5a90f0d
Add project dms export path
markusweigelt Jul 4, 2023
@@ -0,0 +1,115 @@
/*
* (c) Kitodo. Key to digital objects e. V. <[email protected]>
*
* This file is part of the Kitodo project.
*
* It is licensed under GNU General Public License version 3 or later.
*
* For the full copyright and license information, please read the
* GPL3-License.txt file that was distributed with this source code.
*/

package org.kitodo.data.database.beans;

import java.util.Objects;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.ForeignKey;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.Table;

@Entity
@Table(name = "ocrworkflow")
public class OCRWorkflow extends BaseBean {
Collaborator

The term workflow is already in use in Production with a different meaning. Maybe we can find a not-yet-used name for this class, like OCR configuration (OcrConfig).

That ambiguity is well known and oft discussed. So far, I have never seen a good counter proposal. OCR configuration gets closest indeed – I could live with that.

Within ocrd_kitodo documentation, though, we decided to use workflow and always disambiguate between (organisational) Kitodo Workflow and (technical) OCR Workflow.

Collaborator Author, @markusweigelt, Jun 27, 2023

I find the term OCR Configuration misleading. It reminds me of the configuration of an OCR server or of a more general configuration. What about OCR Processing, OCR Recipe, or simply OCR Profile? OCR Profile sounds good to me! It may contain a workflow field and can be linked to a later profile implementation in the monitor.

In addition, we would have to document clearly that this term refers to the workflows in the OCR-D environment, but I think that is the case with all terms.

Agreed, configuration may not be so good after all. Yes, OCR profile would also be acceptable, if properly explained/documented, as indeed would have to be done in any case.


@Column(name = "title")
private String title;

@Column(name = "file")
private String file;

@Column(name = "active")
private Boolean active = true;

@ManyToOne
@JoinColumn(name = "client_id", foreignKey = @ForeignKey(name = "FK_ocrworkflow_client_id"))
private Client client;

public String getTitle() {
return this.title;
}

public void setTitle(String title) {
this.title = title;
}

public String getFile() {
return this.file;
}

public void setFile(String file) {
this.file = file;
}

/**
* Check if ocr workflow is active.
*
* @return true or false
*/
public Boolean isActive() {
if (Objects.isNull(this.active)) {
this.active = true;
}
return this.active;
}

/**
* Set whether the ocr workflow is active.
*
* @param active as boolean
*/
public void setActive(boolean active) {
this.active = active;
}


/**
* Get client.
*
* @return Client object
*/
public Client getClient() {
return this.client;
}

/**
* Set client.
*
* @param client
* as Client object
*/
public void setClient(Client client) {
this.client = client;
}

@Override
public boolean equals(Object object) {
if (this == object) {
return true;
}

if (object instanceof OCRWorkflow) {
OCRWorkflow ocrWorkflow = (OCRWorkflow) object;
return Objects.equals(this.getId(), ocrWorkflow.getId());
}

return false;
}

@Override
public int hashCode() {
return Objects.hash(title, file, active);
}
}
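In the class above, `equals` compares only the database id, while `hashCode` hashes `title`, `file` and `active`. Two instances with the same id but different titles would then be equal yet have different hash codes, which hash-based collections do not expect. The following is a minimal sketch of one way to keep the two methods aligned. `IdBasedEntity` is a hypothetical stand-in, not a class from this PR, and whether the project prefers this convention is an assumption.

```java
import java.util.Objects;

// Hypothetical, simplified entity: equality and hash code both rely on the id only.
final class IdBasedEntity {
    private final Integer id;
    private String title;

    IdBasedEntity(Integer id, String title) {
        this.id = id;
        this.title = title;
    }

    void setTitle(String title) {
        this.title = title;
    }

    String getTitle() {
        return title;
    }

    @Override
    public boolean equals(Object object) {
        if (this == object) {
            return true;
        }
        if (object instanceof IdBasedEntity) {
            return Objects.equals(id, ((IdBasedEntity) object).id);
        }
        return false;
    }

    // Uses the same field as equals, so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}
```

With this shape, changing the title of an instance that is already stored in a `HashSet` does not change its bucket.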
@@ -62,6 +62,10 @@ public class Process extends BaseTemplateBean {
@JoinColumn(name = "docket_id", foreignKey = @ForeignKey(name = "FK_process_docket_id"))
private Docket docket;

@ManyToOne
Collaborator

Please add corresponding @OneToMany relationship in OCRWorkflow.

Collaborator Author

We don't have such a corresponding relationship in similar cases, for example Ruleset or Docket, and we don't need this relationship at the moment. If the relationship is needed, we should add it later and provide an appropriate migration script. I want to change the data structure only where it is really needed.

@solth What do you think?
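For reference, the inverse side requested in this thread could look roughly like the sketch below. It is plain Java with the JPA annotations shown only as comments, so it compiles without `javax.persistence` on the classpath. The class names `OcrWorkflowSketch` and `ProcessSketch` and the `processes` field are hypothetical and not part of this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for OCRWorkflow, showing only the inverse side.
class OcrWorkflowSketch {
    // In a real bean this field would carry roughly:
    // @OneToMany(mappedBy = "ocrWorkflow")
    private final List<ProcessSketch> processes = new ArrayList<>();

    List<ProcessSketch> getProcesses() {
        return processes;
    }
}

// Hypothetical stand-in for Process, the owning side of the relationship.
class ProcessSketch {
    // In the real bean: @ManyToOne with @JoinColumn(name = "ocr_workflow_id", ...)
    private OcrWorkflowSketch ocrWorkflow;

    OcrWorkflowSketch getOcrWorkflow() {
        return ocrWorkflow;
    }

    // Keeps both sides of the relationship in sync.
    void setOcrWorkflow(OcrWorkflowSketch newWorkflow) {
        if (ocrWorkflow != null) {
            ocrWorkflow.getProcesses().remove(this);
        }
        ocrWorkflow = newWorkflow;
        if (newWorkflow != null) {
            newWorkflow.getProcesses().add(this);
        }
    }
}
```

In JPA, a collection declared with `mappedBy` is the non-owning side and reuses the join column of the owning side, so under that assumption the sketch would rely on the existing `ocr_workflow_id` column.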

@JoinColumn(name = "ocr_workflow_id", foreignKey = @ForeignKey(name = "FK_process_ocr_workflow_id"))
private OCRWorkflow ocrWorkflow;

@ManyToOne
@JoinColumn(name = "project_id", foreignKey = @ForeignKey(name = "FK_process_project_id"))
private Project project;
@@ -289,6 +293,25 @@ public void setDocket(Docket docket) {
this.docket = docket;
}

/**
* Get ocr workflow.
*
* @return value of ocr workflow
*/
public OCRWorkflow getOcrWorkflow() {
return ocrWorkflow;
}

/**
* Set ocr workflow.
*
* @param ocrWorkflow
* as OCRWorkflow object
*/
public void setOcrWorkflow(OCRWorkflow ocrWorkflow) {
this.ocrWorkflow = ocrWorkflow;
}

/**
* Get template.
*
@@ -52,6 +52,10 @@ public class Template extends BaseTemplateBean {
@JoinColumn(name = "workflow_id", foreignKey = @ForeignKey(name = "FK_template_workflow_id"))
private Workflow workflow;

@ManyToOne
@JoinColumn(name = "ocr_workflow_id", foreignKey = @ForeignKey(name = "FK_template_ocr_workflow_id"))
private OCRWorkflow ocrWorkflow;

@OneToMany(mappedBy = "template", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Process> processes;

@@ -168,6 +172,25 @@ public void setWorkflow(Workflow workflow) {
this.workflow = workflow;
}

/**
* Get ocr workflow.
*
* @return value of ocr workflow
*/
public OCRWorkflow getOcrWorkflow() {
return ocrWorkflow;
}

/**
* Set ocr workflow.
*
* @param ocrWorkflow
* as OCRWorkflow object
*/
public void setOcrWorkflow(OCRWorkflow ocrWorkflow) {
this.ocrWorkflow = ocrWorkflow;
}

/**
* Get projects list.
*
@@ -0,0 +1,66 @@
/*
* (c) Kitodo. Key to digital objects e. V. <[email protected]>
*
* This file is part of the Kitodo project.
*
* It is licensed under GNU General Public License version 3 or later.
*
* For the full copyright and license information, please read the
* GPL3-License.txt file that was distributed with this source code.
*/

package org.kitodo.data.database.persistence;

import java.util.Collections;
import java.util.List;
import java.util.Objects;

import org.kitodo.data.database.beans.OCRWorkflow;
import org.kitodo.data.database.exceptions.DAOException;

public class OCRWorkflowDAO extends BaseDAO<OCRWorkflow> {

@Override
public OCRWorkflow getById(Integer id) throws DAOException {
OCRWorkflow ocrWorkflow = retrieveObject(OCRWorkflow.class, id);
if (Objects.isNull(ocrWorkflow)) {
throw new DAOException("Object cannot be found in database");
}
return ocrWorkflow;
}

@Override
public List<OCRWorkflow> getAll() throws DAOException {
return retrieveAllObjects(OCRWorkflow.class);
}

@Override
public List<OCRWorkflow> getAll(int offset, int size) throws DAOException {
return retrieveObjects("FROM OCRWorkflow ORDER BY id ASC", offset, size);
}

@Override
public List<OCRWorkflow> getAllNotIndexed(int offset, int size) throws DAOException {
throw new UnsupportedOperationException();
}

@Override
public void remove(Integer ocrWorkflowId) throws DAOException {
removeObject(OCRWorkflow.class, ocrWorkflowId);
}

/**
* Get available ocr workflows - available means that ocr workflow has status active and is
* assigned to client with given id.
*
* @param clientId
* id of client to which searched ocr workflows should be assigned
* @return list of available ocr workflow objects
*/
public List<OCRWorkflow> getAvailableOCRWorkflows(int clientId) {
return getByQuery(
"SELECT w FROM OCRWorkflow AS w INNER JOIN w.client AS c WITH c.id = :clientId",
Collections.singletonMap("clientId", clientId));
}

}
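The Javadoc of `getAvailableOCRWorkflows` describes "available" as active and assigned to the given client, while the query as shown joins on the client only. The sketch below illustrates the documented semantics on an in-memory list, purely for comparison. `WorkflowRow` and `AvailableWorkflows` are hypothetical names, and whether the active flag should be part of the query at all is an open assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a row of the ocrworkflow table.
final class WorkflowRow {
    final String title;
    final int clientId;
    final boolean active;

    WorkflowRow(String title, int clientId, boolean active) {
        this.title = title;
        this.clientId = clientId;
        this.active = active;
    }
}

final class AvailableWorkflows {
    private AvailableWorkflows() {
    }

    // Keeps only rows that belong to the client AND are flagged active.
    static List<WorkflowRow> forClient(List<WorkflowRow> all, int clientId) {
        List<WorkflowRow> result = new ArrayList<>();
        for (WorkflowRow row : all) {
            if (row.clientId == clientId && row.active) {
                result.add(row);
            }
        }
        return result;
    }
}
```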
@@ -0,0 +1,41 @@
--
-- (c) Kitodo. Key to digital objects e. V. <[email protected]>
--
-- This file is part of the Kitodo project.
--
-- It is licensed under GNU General Public License version 3 or later.
--
-- For the full copyright and license information, please read the
-- GPL3-License.txt file that was distributed with this source code.
--

-- Add authorities to manage ocr workflows
INSERT IGNORE INTO authority (title) VALUES ('addOCRWorkflow_clientAssignable');
INSERT IGNORE INTO authority (title) VALUES ('editOCRWorkflow_clientAssignable');
INSERT IGNORE INTO authority (title) VALUES ('deleteOCRWorkflow_clientAssignable');
INSERT IGNORE INTO authority (title) VALUES ('viewOCRWorkflow_clientAssignable');

-- Add table "ocrworkflow"
CREATE TABLE IF NOT EXISTS ocrworkflow (
id INT(10) NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
file varchar(255) NOT NULL,
active tinyint(1) DEFAULT NULL,
client_id INT(10) NOT NULL,
PRIMARY KEY(id)
) DEFAULT CHARACTER SET = utf8mb4
COLLATE utf8mb4_unicode_ci;

-- Add column related to ocr workflow to process table
ALTER TABLE process ADD ocr_workflow_id INT(11) DEFAULT NULL;

-- Add foreign key
ALTER TABLE process ADD CONSTRAINT `FK_process_ocr_workflow_id`
FOREIGN KEY (ocr_workflow_id) REFERENCES ocrworkflow(id);

-- Add column related to ocr workflow to template table
ALTER TABLE template ADD ocr_workflow_id INT(11) DEFAULT NULL;

-- Add foreign key
ALTER TABLE template ADD CONSTRAINT `FK_template_ocr_workflow_id`
FOREIGN KEY (ocr_workflow_id) REFERENCES ocrworkflow(id);
1 change: 1 addition & 0 deletions Kitodo-DataManagement/src/test/resources/hibernate.cfg.xml
@@ -67,6 +67,7 @@
<mapping class="org.kitodo.data.database.beans.LdapServer"/>
<mapping class="org.kitodo.data.database.beans.ListColumn"/>
<mapping class="org.kitodo.data.database.beans.MappingFile"/>
<mapping class="org.kitodo.data.database.beans.OCRWorkflow"/>
<mapping class="org.kitodo.data.database.beans.Process"/>
<mapping class="org.kitodo.data.database.beans.Project"/>
<mapping class="org.kitodo.data.database.beans.Property"/>
@@ -46,6 +46,12 @@ public enum ParameterCore implements ParameterInterface {
*/
DIR_RULESETS(new Parameter<UndefinedParameter>("directory.rulesets")),

/**
* Absolute path to the directory that the ocr workflow files will be
* read from. It must be terminated by a directory separator ("/").
*/
DIR_OCR_WORKFLOWS(new Parameter<UndefinedParameter>("directory.ocr.workflows")),

/**
* Absolute path to the directory that XSLT files are stored in which are used
* to transform the "XML log" (as visible from the XML button in the processes
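The `directory.ocr.workflows` parameter above is documented as an absolute path that must end with a directory separator. As a hedged illustration, the sketch below resolves a stored file name against such a directory with `java.nio.file.Path`, which works with or without the trailing separator. The class name `OcrWorkflowPaths`, the example directory, and the check that the result stays inside the base directory are assumptions for illustration, not behaviour taken from this PR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class OcrWorkflowPaths {
    private OcrWorkflowPaths() {
    }

    // Resolves fileName against baseDir and rejects results outside baseDir.
    static Path resolve(String baseDir, String fileName) {
        Path base = Paths.get(baseDir).normalize();
        Path resolved = base.resolve(fileName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("File lies outside of " + base + ": " + fileName);
        }
        return resolved;
    }
}
```

A call such as `OcrWorkflowPaths.resolve("/data/ocr_workflows/", "sub/default.sh")` would also accept a relative sub-path, which seems close to what the later commit "support relativ path to ocr profile" aims at. Note that `normalize()` does not follow symbolic links.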
@@ -1081,4 +1081,40 @@ public boolean hasAuthorityToViewDatabaseStatistics() {
public boolean hasAuthorityToRunKitodoScripts() {
return securityAccessService.hasAuthorityToRunKitodoScripts();
}

/**
* Check if the current user has the authority to add ocr workflow.
*
* @return true if the current user has the authority to add ocr workflow.
*/
public boolean hasAuthorityToAddOCRWorkflow() {
return securityAccessService.hasAuthorityToAddOCRWorkflow();
}

/**
* Check if the current user has the authority to edit ocr workflow.
*
* @return true if the current user has the authority to edit ocr workflow.
*/
public boolean hasAuthorityToEditOCRWorkflow() {
return securityAccessService.hasAuthorityToEditOCRWorkflow();
}

/**
* Check if the current user has the authority to delete ocr workflow.
*
* @return true if the current user has the authority to delete ocr workflow.
*/
public boolean hasAuthorityToDeleteOCRWorkflow() {
return securityAccessService.hasAuthorityToDeleteOCRWorkflow();
}

/**
* Check if the current user has the authority to view ocr workflow.
*
* @return true if the current user has the authority to view ocr workflow.
*/
public boolean hasAuthorityToViewOCRWorkflow() {
return securityAccessService.hasAuthorityToViewOCRWorkflow();
}
}
@@ -0,0 +1,34 @@
/*
* (c) Kitodo. Key to digital objects e. V. <[email protected]>
*
* This file is part of the Kitodo project.
*
* It is licensed under GNU General Public License version 3 or later.
*
* For the full copyright and license information, please read the
* GPL3-License.txt file that was distributed with this source code.
*/

package org.kitodo.production.converter;

import javax.faces.component.UIComponent;
import javax.faces.context.FacesContext;
import javax.faces.convert.Converter;
import javax.inject.Named;

import org.kitodo.production.services.ServiceManager;

@Named
public class OCRWorkflowConverter extends BeanConverter implements Converter {

@Override
public Object getAsObject(FacesContext context, UIComponent component, String value) {
return getAsObject(ServiceManager.getOCRWorkflowService(), value);
}

@Override
public String getAsString(FacesContext context, UIComponent component, Object value) {
return getAsString(value, "ocrWorkflow");
}

}
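The converter above delegates to `BeanConverter`, whose implementation is not part of this diff. As a rough illustration of what a JSF-style converter of this kind typically does, the sketch below maps a bean to its id as a string and back again. `ProfileBean`, `IdStringConverterSketch` and the in-memory map are hypothetical, and the actual `BeanConverter` behaviour may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bean with a database id, standing in for OCRWorkflow.
final class ProfileBean {
    final int id;
    final String title;

    ProfileBean(int id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical converter: the page carries only the id as a string.
final class IdStringConverterSketch {
    // Stands in for the service or DAO lookup by id.
    private final Map<Integer, ProfileBean> store = new HashMap<>();

    void save(ProfileBean bean) {
        store.put(bean.id, bean);
    }

    // String from the page to object; a non-numeric value throws NumberFormatException.
    ProfileBean getAsObject(String value) {
        if (value == null || value.isEmpty()) {
            return null;
        }
        return store.get(Integer.valueOf(value));
    }

    // Object to the string that is rendered into the page.
    String getAsString(ProfileBean bean) {
        return bean == null ? null : String.valueOf(bean.id);
    }
}
```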