Release 1.10.0 (#80)
* Athena Release 1.10.0 (#79)

* Version 1.10.0-SNAPSHOT

* Improved concept search (#66)

* 1. excluded brackets, braces, parentheses from search in "non-exact" mode
2. sorted results
3. results containing all keywords now rank above results containing only some of them
4. enabled partial concept code search
5. added "exact search" mode

* 1. made 'exact mode' case-sensitive
2. added tests

* 1. added fuzzy search
2. fixed tests according to fuzzy search

* copied "id" field into "query_wo_symbols"

* added possibility for a part of phrase to be exact;
updated tests

* fixed postgres version in pom
added comments
wrote README

* show vocabularies release version (#68)

* Athena-13: support vocabulary release version - REST,Email Template, Database changes

* Athena-13: support vocabulary release version - CodeReview comments

* Athena-13: support vocabulary release version - CodeReview comments

* Athena-13: support vocabulary release version - CodeReview comments

* Athena-13: support vocabulary release version - CodeReview comments

* Athena-13: support vocabulary release version - CodeReview comments

* Athena-13: support vocabulary release version - CodeReview comments

* issue-13: rename path according to standards (#72)

* issues-60 Concept search improvement
- small refactoring
- surround "whole phrase" and "term from phrase" query parts with brackets.

* issues-60 Concept search improvement
- add more unit tests
- fix minor thing with split phrase by terms

* issues-60 Concept search improvement
- fix search for exact terms in the whole phrase part

* issues-60 Concept search improvement
- fix issue with only exact phrase search
- fix unit tests

* issues-60 Concept search improvement
- simplify query logic

* issues-60 Concept search improvement
- fix unit test
- rename ConceptSearchPhraseToSolrQueryConverter -> ConceptSearchPhraseToSolrQueryService

* issues-60 Concept search improvement
- fix exact search

* issues-60 Concept search improvement
- add unit test base on the EmbeddedSolrServer

* issues-60 Concept search improvement
- read me

* issues-60 Concept search improvement
- remove example images

* Athena 65 (#75)

* ATHENA-65 sharing api was created

* ATHENA-65 revert application.properties

* ATHENA-65 resolve conflicts

* ATHENA-65 refactoring according to review

* ATHENA-65 user emails validation comment

* issue-53: vocabularies update notification emails (#71)

* Athena-53: handle vocabularies updates notifications

* Athena-53: handle vocabularies updates notifications -schedulers

* Athena-53: handle vocabularies updates notifications - params optimization

* Athena-53: handle vocabularies updates notifications - coding standards

* Athena-53: handle vocabularies updates notifications - coding standards

* issue-53: code-review first part of the comments

* issue-53: code-review second part of the comments

* issue-53: code-review third part of the comments

* issue-53: code-review third part of the comments

* issue-53: rebase on recent develop, resolving merge conflicts

* issue-53: PR comments

* issue-53: PR comments

* issues-60 Concept search improvement
- clarify readme

* issues-60 Concept search improvement
- change example for test and readme

* issues-60 Concept search improvement
- fix: search did not split word by comma, dot and other delimiter chars

* issues-60 Concept search improvement
- fix: schema for solr

* issues-77 Guarantee the same solr configuration for Test/QA/Prod env and unit test
- copy conf file for Embedded solr from main resource.

* Replace old cpt.jar with new one (#78)

* set pom.xml version to 1.10.0 for the release-1.10.0

* Update pom.xml

set version to 1.10.0-QA

* Update pom.xml

Set version to 1.10.0
acumarav authored and wivern committed Sep 27, 2019
1 parent 078fdb8 commit ad2bb5a
Showing 113 changed files with 8,946 additions and 825 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -3,3 +3,8 @@
athena.iml
target/
src/main/resources/public/

### Developer's personal properties ###
**/resources/config/application*-dev-*.properties
**/resources/config/application*-dev-*.yaml
**/resources/config/application*-dev-*.yml
117 changes: 117 additions & 0 deletions README.md
@@ -0,0 +1,117 @@
#1 Phrase search

Search provides the ability to search by phrase. By default, results are sorted according to the following criteria, in order of priority:

- full phrase match
- concepts that contain all the words from the search phrase
- a score based on two parameters: the number of searched words present in the result and the importance of each word (importance is calculated per word; words that are rarer among all documents are more important)

Example:

Search phrase: **Stroke Myocardial Infarction Gastrointestinal Bleeding**

Name | sort priority explanation |
---- | ---- |
Stroke Myocardial Infarction Gastrointestinal Bleeding| full match |
Gastrointestinal Bleeding Myocardial Infarction Stroke| all words |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction| 3 words |
Stroke Myocardial Infarction Bleeding in Back| 2 words |
Bleeding in Back Gastrointestinal Bleeding| 2 words |
Stroke Myocardial Infarction| 2 words |
Stroke Myocardial Infarction Strok| 2 words |
Stroke Myocardial Infarction Stroke Nothin| 2 words |
Stroke Myocardial Infarction Renal Dysfunction| 2 words |
Stroke Myocardial Infarction Renal Dysfunction and Nothing| 1 word |
stroke| 1 word |
Stroke| 1 word |
Strook| 1 word |


NB: the search goes through all concept fields, but the highest priority is given to CONCEPT_NAME and CONCEPT_CODE
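
The "importance" of a word described above can be illustrated with classic inverse document frequency. The following is only a minimal sketch under that assumption: the class `WordImportance` and its method `idf` are hypothetical names, and the scoring that the search engine actually applies is not specified in this README and may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration, not part of the Athena code base. */
final class WordImportance {

    /**
     * Inverse document frequency: ln(N / df), where N is the number of
     * concept names and df is how many of them contain the word.
     * Returns 0 when the word does not occur in any name.
     */
    static double idf(String word, List<String> names) {
        String needle = word.toLowerCase(Locale.ROOT);
        long df = names.stream()
                .filter(name -> words(name).contains(needle))
                .count();
        return df == 0 ? 0.0 : Math.log((double) names.size() / df);
    }

    /** Lower-cased, whitespace-separated words of a concept name. */
    private static Set<String> words(String name) {
        String normalized = name.toLowerCase(Locale.ROOT).trim();
        return new HashSet<>(Arrays.asList(normalized.split("\\s+")));
    }
}
```

With such a measure, a word that appears in few concept names contributes more to the score than a word that appears in many of them.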

#2 Exact search

Using quotation marks forces an exact-match search.

An exact search has the following properties:
- the quoted word or words must be present in the result
- matching is not case sensitive, and the number of spaces between words does not matter
- stemming is disabled (the word or words must appear exactly as written inside the quotation marks)

Example 1:

Search phrase: **"Stroke Myocardial Infarction Gastrointestinal Bleeding"**

Name |
--- |
Stroke Myocardial Infarction Gastrointestinal Bleeding |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction |

Example 2:

Search phrase: **Stroke Myocardial Infarction "Gastrointestinal Bleeding"**

Name |
--- |
Stroke Myocardial Infarction Gastrointestinal Bleeding |
Gastrointestinal Bleeding Myocardial Infarction Stroke |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction |
Bleeding in Back Gastrointestinal Bleeding |
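
Splitting a search phrase into mandatory quoted parts and ordinary words could look roughly like the sketch below. It is an illustration only: `PhraseParts` and `parse` are hypothetical names, and the real conversion of a phrase into a Solr query in this project may work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration, not part of the Athena code base. */
final class PhraseParts {
    /** Quoted segments with the quotes removed; these must match exactly. */
    final List<String> exact = new ArrayList<>();
    /** Remaining unquoted words; these take part in the ordinary phrase search. */
    final List<String> plain = new ArrayList<>();

    // A quoted segment: opening quote, any non-quote characters, closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    static PhraseParts parse(String phrase) {
        PhraseParts parts = new PhraseParts();
        Matcher m = QUOTED.matcher(phrase);
        StringBuilder rest = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Keep the text in front of the quoted segment for the plain words.
            rest.append(phrase, last, m.start()).append(' ');
            // Collapse whitespace: the number of spaces between words does not matter.
            String segment = m.group(1).trim().replaceAll("\\s+", " ");
            if (!segment.isEmpty()) {
                parts.exact.add(segment);
            }
            last = m.end();
        }
        rest.append(phrase.substring(last));
        for (String word : rest.toString().trim().split("\\s+")) {
            if (!word.isEmpty()) {
                parts.plain.add(word);
            }
        }
        return parts;
    }
}
```

For the phrase `Stroke Myocardial Infarction "Gastrointestinal Bleeding"`, this sketch yields one exact part (`Gastrointestinal Bleeding`) and three plain words (`Stroke`, `Myocardial`, `Infarction`).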

#3 Special symbols

Special symbols are handled as follows:
- These special symbols are always ignored and treated as word separators: / \ | ? ! , ; .
e.g. "Pooh.eats?honey!" equals "Pooh eats honey"
- All other special symbols are ignored only when they stand alone as a separate word: + - ( ) : ^ [ ] { } ~ * ? | & ;
e.g. "Pooh ` eats raspberries - honey" equals "Pooh eats raspberries honey", but "Pooh'eats raspberries-honey" will remain the same
- Results that contain the special characters are listed first, followed by results without them

Search phrase: **[hip]**

Name |
--- |
[hip] fracture risk |
[Hip] fracture risk |
[hip fracture risk |
hip] fracture risk |
(hip fracture risk |
(hip) fracture risk |
hip fracture risk |
hip) fracture risk |
hip} fracture risk |
hip} fracture risk |
{hip fracture risk |


A special character becomes mandatory if the word is surrounded by quotation marks.

Search phrase: **"[hip]"**

Name |
--- |
[hip] fracture risk |
[Hip] fracture risk |
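
The two symbol rules for unquoted phrases can be sketched as a small pre-processing step. This is a hedged illustration only: `SymbolNormalizer` and `terms` are hypothetical names, the character sets are copied from the lists above, and in the actual application this handling may live in the Solr schema or query builder instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration, not part of the Athena code base. */
final class SymbolNormalizer {
    // Always treated as word separators: / \ | ? ! , ; .
    private static final String SEPARATORS = "[/\\\\|?!,;.]";
    // Dropped only when the whole word consists of these symbols: + - ( ) : ^ [ ] { } ~ * &
    private static final String STANDALONE = "[+\\-():^\\[\\]{}~*&]+";

    /** Splits an unquoted phrase into the terms that remain after both rules. */
    static List<String> terms(String phrase) {
        List<String> result = new ArrayList<>();
        String spaced = phrase.replaceAll(SEPARATORS, " ");
        for (String word : spaced.trim().split("\\s+")) {
            // Keep words such as "[hip]" or "raspberries-honey" untouched.
            if (!word.isEmpty() && !word.matches(STANDALONE)) {
                result.add(word);
            }
        }
        return result;
    }
}
```

In this sketch, `Pooh.eats?honey!` becomes the three terms `Pooh`, `eats`, `honey`, while `[hip]` stays a single term with its brackets.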


#4 Approximate matching (fuzzy searching)

If the search phrase contains a typo, or a word has a similar spelling, the most similar results are still found.

Search phrase: **Strok Myocardi8 Infarctiin Gastrointestinal Bleedi**

Name |
--- |
Gastrointestinal Bleeding Myocardial Infarction Stroke|
Stroke Myocardial Infarction Gastrointestinal Bleeding|
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction|
Stroke Myocardial Infarction Strok|
Bleeding in Back Gastrointestinal Bleeding|
Stroke Myocardial Infarction Bleeding in Back|
Stroke Myocardial Infarction|
Stroke Myocardial Infarction Stroke Nothin|
Stroke Myocardial Infarction Renal Dysfunction|
Stroke Myocardial Infarction Renal Dysfunction and Nothing|
stroke|
Stroke|
Stroo |
31 changes: 26 additions & 5 deletions pom.xml
@@ -6,10 +6,9 @@

<groupId>com.odysseusinc.athena</groupId>
<artifactId>athena</artifactId>
<version>1.9.1</version>
<version>1.10.0</version>
<packaging>jar</packaging>


<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
@@ -23,7 +22,7 @@
<swagger.springmvc.version>1.0.2</swagger.springmvc.version>
<springfox.swagger2.version>2.6.1</springfox.swagger2.version>
<jjwt.version>0.7.0</jjwt.version>
<postgresql.version>9.4.1208-jdbc42-atlassian-hosted</postgresql.version>
<postgresql.version>42.2.5</postgresql.version>
<spring.hateoas.version>0.23.0.RELEASE</spring.hateoas.version>
<opensaml.version>2.6.6-patched</opensaml.version>
<solr.version>7.2.1</solr.version>
@@ -154,7 +153,7 @@
<version>${jjwt.version}</version>
</dependency>
<dependency>
<groupId>postgresql</groupId>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>${postgresql.version}</version>
<scope>runtime</scope>
@@ -252,6 +251,28 @@
<artifactId>j2e-pac4j</artifactId>
<version>${j2e-pac4j.version}</version>
</dependency>

<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-core</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-core</artifactId>
<version>3.2.0</version>
</dependency>
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20190722</version>
</dependency>

<dependency>
<groupId>org.apache.solr</groupId>
<artifactId>solr-core</artifactId>
<version>${solr.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
@@ -469,4 +490,4 @@
</repository>
</distributionManagement>

</project>
</project>
4 changes: 3 additions & 1 deletion properties/prod/application.properties
@@ -113,4 +113,6 @@ arachne.portal.remindToken=

build.number=@build.number@
build.id=@build.id@
project.version=@project.version@
project.version=@project.version@

scheduled.vocabulary.checker=0 0 19 1/1 * ?
Binary file modified src/main/docker/cpt4.jar
Binary file not shown.
30 changes: 1 addition & 29 deletions src/main/docker/cpt4_4_5/cpt.bat
@@ -1,29 +1 @@
@echo off
rem Argument counting code from http://www.testdeveloper.com/2010/09/26/how-to-count-arguments-to-a-dos-batch-file-without-using-your-fingers-and-toes
set _exitStatus=0
set _argcActual=0
set _argcExpected=2
for %%i in (%*) do set /A _argcActual+=1
if %_argcActual% NEQ %_argcExpected% (
call :_ShowUsage %0%, "Need to include login name and password for UMLS Terminology Services."
set _exitStatus=1
goto:_EOF
)

FOR /f tokens^=2-5^ delims^=.-_^" %%j IN ('java -fullversion 2^>^&1') DO SET "jver=%%j%%k"
IF %jver% GTR 18 (
java -Dumls-user=%1 -Dumls-password=%2 --add-modules=java.xml.ws -jar cpt4.jar 4
) ELSE (
java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 4
)
set _exitStatus=%ERRORLEVEL%
goto:_EOF
:_ShowUsage
echo [USAGE]: %~1 login password
if NOT "%~2" == "" (
echo %~2
)
goto:eof
:_EOF
echo The exit status is %_exitStatus%
cmd /c exit %_exitStatus%
java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 4
2 changes: 1 addition & 1 deletion src/main/docker/cpt4_4_5/cpt.sh
@@ -1,2 +1,2 @@
#!/bin/bash
java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 4
java -Dumls-user=$1 -Dumls-password=$2 -jar cpt4.jar 4
11 changes: 5 additions & 6 deletions src/main/docker/cpt4_4_5/readme.txt
@@ -3,10 +3,9 @@ CPT4 utility for CDM v4.
This utility will import the CPT4 vocabulary into concept.csv.
Internet connection is required.

Start import process from command line with: "java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 4"
or use cpt.sh or cpt.bat depending on your OS. Please replace "xxx" with UMLS username and password (https://utslogin.nlm.nih.gov/cas/login).
Start import process from command line with:
windows: cpt.bat USER PASSWORD
linux: ./cpt.sh USER PASSWORD
Use USER/PASSWORD from UMLS account: https://utslogin.nlm.nih.gov/cas/login.
Do not close or shutdown your PC until the end of import process,
it will cause damage to concept.csv file.

The number of imported records will be shown in command line,
import process finishes with: "CPT successfully updated." message.
it will cause damage to concept.csv file.
30 changes: 1 addition & 29 deletions src/main/docker/cpt4_5/cpt.bat
@@ -1,29 +1 @@
@echo off
rem Argument counting code from http://www.testdeveloper.com/2010/09/26/how-to-count-arguments-to-a-dos-batch-file-without-using-your-fingers-and-toes
set _exitStatus=0
set _argcActual=0
set _argcExpected=2
for %%i in (%*) do set /A _argcActual+=1
if %_argcActual% NEQ %_argcExpected% (
call :_ShowUsage %0%, "Need to include login name and password for UMLS Terminology Services."
set _exitStatus=1
goto:_EOF
)

FOR /f tokens^=2-5^ delims^=.-_^" %%j IN ('java -fullversion 2^>^&1') DO SET "jver=%%j%%k"
IF %jver% GTR 18 (
java -Dumls-user=%1 -Dumls-password=%2 --add-modules=java.xml.ws -jar cpt4.jar 5
) ELSE (
java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 5
)
set _exitStatus=%ERRORLEVEL%
goto:_EOF
:_ShowUsage
echo [USAGE]: %~1 login password
if NOT "%~2" == "" (
echo %~2
)
goto:eof
:_EOF
echo The exit status is %_exitStatus%
cmd /c exit %_exitStatus%
java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 5
2 changes: 1 addition & 1 deletion src/main/docker/cpt4_5/cpt.sh
@@ -1,2 +1,2 @@
#!/bin/bash
java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 5
java -Dumls-user=$1 -Dumls-password=$2 -jar cpt4.jar 5
9 changes: 4 additions & 5 deletions src/main/docker/cpt4_5/readme.txt
@@ -3,10 +3,9 @@ CPT4 utility for CDM v5.
This utility will import the CPT4 vocabulary into concept.csv.
Internet connection is required.

Start import process from command line with: "java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 5"
or use cpt.sh or cpt.bat depending on your OS. Please replace "xxx" with UMLS username and password (https://utslogin.nlm.nih.gov/cas/login).
Start import process from command line with:
windows: cpt.bat USER PASSWORD
linux: ./cpt.sh USER PASSWORD
Use USER/PASSWORD from UMLS account: https://utslogin.nlm.nih.gov/cas/login.
Do not close or shutdown your PC until the end of import process,
it will cause damage to concept.csv file.

The number of imported records will be shown in command line,
import process finishes with: "CPT successfully updated." message.