Athena Release 1.10.0 (#79)

* Version 1.10.0-SNAPSHOT
* Improved concept search (#66)
* 1. excluded brackets, braces, parentheses from search in "non-exact" mode 2. sorted results 3. now results with all keywords are upper then results with some of them 4. enabled partial concept code search 5. added "exact search" mode
* 1. made 'exact mode' case-sensitive 2. added tests
* 1. added fuzzy search 2. fixed tests according to fuzzy search
* copied "id" field into "query_wo_symbols"
* added possibility for a part of phrase to be exact; updated tests
* fixed postgres version in pom added comments wrote README
* show vocabularies release version (#68)
* Athena-13: support vocabulary release version - REST,Email Template, Database changes
* Athena-13: support vocabulary release version - CodeReview comments
* Athena-13: support vocabulary release version - CodeReview comments
* Athena-13: support vocabulary release version - CodeReview comments
* Athena-13: support vocabulary release version - CodeReview comments
* Athena-13: support vocabulary release version - CodeReview comments
* Athena-13: support vocabulary release version - CodeReview comments
* issue-13: rename path according to standards (#72)
* issues-60 Concept search improvement - small refactoring - surround "whole phrase" and "term from phrase" query parts with brackets.
* issues-60 Concept search improvement - add more unit tests - fix minor thing with split phrase by terms
* issues-60 Concept search improvement - fix search for exact terms in the whole phrase part
* issues-60 Concept search improvement - fix issue with only exact phrase search - fix unit tests
* issues-60 Concept search improvement - simplify query logic
* issues-60 Concept search improvement - fix unit test - rename ConceptSearchPhraseToSolrQueryConverter -> ConceptSearchPhraseToSolrQueryService
* issues-60 Concept search improvement - fix exact search
* issues-60 Concept search improvement - add unit test base on the EmbeddedSolrServer
* issues-60 Concept search improvement - read me
* issues-60 Concept search improvement - remove example images
* Athena 65 (#75)
* ATHENA-65 sharing api was created
* ATHENA-65 revert application.properties
* ATHENA-65 resolve conflicts
* ATHENA-65 refactoring according to review
* ATHENA-65 user emails validation comment
* issue-53: vocabularies update notification emails (#71)
* Athena-53: handle vocabularies updates notifications
* Athena-53: handle vocabularies updates notifications -schedulers
* Athena-53: handle vocabularies updates notifications - params optimization
* Athena-53: handle vocabularies updates notifications - coding standards
* Athena-53: handle vocabularies updates notifications - coding standards
* issue-53: code-review first part of the comments
* issue-53: code-review second part of the comments
* issue-53: code-review third part of the comments
* issue-53: code-review third part of the comments
* issue-53: rebase on recent develop, resolving merge conflicts
* issue-53: PR comments
* issue-53: PR comments
* issues-60 Concept search improvement - clarify readme
* issues-60 Concept search improvement - change example for test and readme
* issues-60 Concept search improvement - fix: search did not split word by comma, dot and other delimiter chars
* issues-60 Concept search improvement - fix: schema for solr
* issues-77 Guarantee the same solr configuration for Test/QA/Prod env and unit test - copy conf file for Embedded solr from main resource.
* Replace old cpt.jar with new one (#78)
* set pom.xml version to 1.10.0 for the release-1.10.0
* Update pom.xml set version to 1.10.0-QA
* Update pom.xml Set version to 1.10.0
Showing 113 changed files with 8,946 additions and 825 deletions.
@@ -0,0 +1,117 @@
#1 Phrase search

Search supports queries by phrase. By default, all results are sorted according to the following criteria:

- full phrase match
- concepts that contain all the words from the search phrase
- a score based on two parameters: the number of search words found in the result and the importance of each word (importance is calculated per word; words that are rarer across all documents are more important)

Example:

Search phrase: **Stroke Myocardial Infarction Gastrointestinal Bleeding**

Name | sort priority explanation |
---- | ---- |
Stroke Myocardial Infarction Gastrointestinal Bleeding| full match |
Gastrointestinal Bleeding Myocardial Infarction Stroke| all words |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction| 3 words |
Stroke Myocardial Infarction Bleeding in Back| 2 words |
Bleeding in Back Gastrointestinal Bleeding| 2 words |
Stroke Myocardial Infarction| 2 words |
Stroke Myocardial Infarction Strok| 2 words |
Stroke Myocardial Infarction Stroke Nothin| 2 words |
Stroke Myocardial Infarction Renal Dysfunction| 2 words |
Stroke Myocardial Infarction Renal Dysfunction and Nothing| 1 word |
stroke| 1 word |
Stroke| 1 word |
Strook| 1 word |

NB: the search goes through all concept fields, but the highest priority is given to CONCEPT_NAME and CONCEPT_CODE.

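The three sorting criteria above can be illustrated with a small, self-contained Python sketch. It is not the Athena implementation (the commit messages suggest the real ranking is produced by a Solr query); the function names `tokenize` and `rank`, the three-tier key, the reading of "full phrase match" as "the name equals the phrase", and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(phrase, names):
    """Order names: full phrase match, then all words, then weighted overlap."""
    query = tokenize(phrase)
    docs = [tokenize(name) for name in names]
    total = len(docs)

    def idf(word):
        # Hypothetical weighting: words found in fewer names weigh more.
        containing = sum(1 for doc in docs if word in doc)
        return math.log((total + 1) / (containing + 1)) + 1.0

    def sort_key(item):
        _, doc = item
        present = set(query) & set(doc)
        if doc == query:
            tier = 0  # the name is exactly the search phrase
        elif len(present) == len(set(query)):
            tier = 1  # every search word occurs somewhere in the name
        else:
            tier = 2  # only some of the search words occur
        score = sum(idf(word) for word in present)
        return (tier, -score)

    ordered = sorted(zip(names, docs), key=sort_key)
    return [name for name, _ in ordered]
```

Within a tier, names that contain more of the search words, or rarer ones, sort first. Note that this simplified key places any name containing all search words in the second tier, which is coarser than the explanations given in the example table.
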
#2 Exact search

Using quotation marks forces an exact-match search.

For an exact search, the following conditions apply:
- the quoted word or words must be present in the result
- matching is not case sensitive, and the number of spaces between words does not matter
- stemming is disabled (the word or words must be present exactly as written inside the quotation marks)

Example 1:

Search phrase: **"Stroke Myocardial Infarction Gastrointestinal Bleeding"**

Name |
--- |
Stroke Myocardial Infarction Gastrointestinal Bleeding |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction |

Example 2 (only part of the phrase is quoted):

Search phrase: **Stroke Myocardial Infarction "Gastrointestinal Bleeding"**

Name |
--- |
Stroke Myocardial Infarction Gastrointestinal Bleeding |
Gastrointestinal Bleeding Myocardial Infarction Stroke |
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction |
Bleeding in Back Gastrointestinal Bleeding |

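A rough Python sketch of the exact-search rules above: quoted parts of the query become mandatory phrases, while the remaining words stay optional. The helper names `split_query` and `matches_exact` are hypothetical, and whole-word comparison on whitespace-separated tokens is an assumption; the actual service builds a Solr query instead.

```python
import re


def split_query(query):
    """Return (exact_phrases, loose_terms) for a search string."""
    quoted = re.findall(r'"([^"]*)"', query)
    # Collapse runs of spaces so that spacing inside quotes does not matter.
    exact = [" ".join(part.split()) for part in quoted]
    exact = [part for part in exact if part]
    # Whatever is left outside the quotation marks is treated as loose terms.
    remainder = re.sub(r'"[^"]*"', " ", query).replace('"', " ")
    return exact, remainder.split()


def matches_exact(name, exact_phrases):
    """True if every quoted phrase occurs in name as whole words.

    Comparison ignores case and repeated spaces, and applies no stemming.
    """
    normalized = " " + " ".join(name.lower().split()) + " "
    return all(f" {phrase.lower()} " in normalized for phrase in exact_phrases)
```

For the query in Example 2, `split_query` yields one exact phrase (`Gastrointestinal Bleeding`) and three loose terms, so a name such as "Bleeding in Back Gastrointestinal Bleeding" still qualifies even though it contains none of the loose terms.
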
#3 Special symbols

For special symbols, the following conditions apply:
- These special symbols are always ignored and treated as word separators: / \ | ? ! , ; .
  e.g. "Pooh.eats?honey!" equals "Pooh eats honey"
- All other special symbols are ignored only when they stand alone as a separate word: + - ( ) : ^ [ ] { } ~ * ? | & ;
  e.g. "Pooh ` eats raspberries - honey" equals "Pooh eats raspberries honey", but "Pooh'eats raspberries-honey" will remain the same
- results that contain the special characters are listed first, followed by results without them

Search phrase: **[hip]**

Name |
--- |
[hip] fracture risk |
[Hip] fracture risk |
[hip fracture risk |
hip] fracture risk |
(hip fracture risk |
(hip) fracture risk |
hip fracture risk |
hip) fracture risk |
hip} fracture risk |
hip} fracture risk |
{hip fracture risk |

A special character becomes mandatory if the word is surrounded by quotation marks.

Search phrase: **"[hip]"**

Name |
--- |
[hip] fracture risk |
[Hip] fracture risk |

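The first two rules can be sketched as a small normalization step in Python. This is an illustration only: the name `normalize` is hypothetical, and dropping every token that contains no letter or digit is an assumption chosen so that the backtick example behaves as described.

```python
import re

# Symbols that always act as word separators, per the first rule.
SEPARATORS = r"[/\\|?!,;.]"


def normalize(query):
    """Apply the separator rule, then drop stand-alone symbol tokens."""
    spaced = re.sub(SEPARATORS, " ", query)
    # Assumption: a token with no letter or digit counts as a stand-alone
    # special symbol and is ignored; symbols glued to a word are kept.
    words = [word for word in spaced.split() if any(ch.isalnum() for ch in word)]
    return " ".join(words)
```

Symbols attached to a word, as in `[hip]`, survive normalization, which is consistent with results containing the brackets being listed before results without them.
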
#4 Approximate matching (fuzzy searching)

If the search phrase contains a typo, or a word has a similar spelling, the most similar results are found.

Search phrase: **Strok Myocardi8 Infarctiin Gastrointestinal Bleedi**

Name |
--- |
Gastrointestinal Bleeding Myocardial Infarction Stroke|
Stroke Myocardial Infarction Gastrointestinal Bleeding|
Stroke Myocardial Infarction Gastrointestinal Bleeding and Renal Dysfunction|
Stroke Myocardial Infarction Strok|
Bleeding in Back Gastrointestinal Bleeding|
Stroke Myocardial Infarction Bleeding in Back|
Stroke Myocardial Infarction|
Stroke Myocardial Infarction Stroke Nothin|
Stroke Myocardial Infarction Renal Dysfunction|
Stroke Myocardial Infarction Renal Dysfunction and Nothing|
stroke|
Stroke|
Stroo|
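One common way to implement approximate matching is edit distance between words. The sketch below counts how many words of the search phrase have a close match in a concept name. The names `edit_distance` and `fuzzy_hits` and the threshold of two edits are assumptions for illustration; the README does not state which similarity measure or threshold the search uses.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_hits(query, name, max_distance=2):
    """Count query words that have a close match among the words of name."""
    name_words = name.lower().split()
    return sum(
        1
        for word in query.lower().split()
        if any(edit_distance(word, other) <= max_distance for other in name_words)
    )
```

With the search phrase above, every misspelled word is within two edits of its intended spelling, so names that contain all five intended words collect the most hits.
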
Binary file not shown.
@@ -1,29 +1 @@
-@echo off
-rem Argument counting code from http://www.testdeveloper.com/2010/09/26/how-to-count-arguments-to-a-dos-batch-file-without-using-your-fingers-and-toes
-set _exitStatus=0
-set _argcActual=0
-set _argcExpected=2
-for %%i in (%*) do set /A _argcActual+=1
-if %_argcActual% NEQ %_argcExpected% (
-call :_ShowUsage %0%, "Need to include login name and password for UMLS Terminology Services."
-set _exitStatus=1
-goto:_EOF
-)
-
-FOR /f tokens^=2-5^ delims^=.-_^" %%j IN ('java -fullversion 2^>^&1') DO SET "jver=%%j%%k"
-IF %jver% GTR 18 (
-java -Dumls-user=%1 -Dumls-password=%2 --add-modules=java.xml.ws -jar cpt4.jar 4
-) ELSE (
-java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 4
-)
-set _exitStatus=%ERRORLEVEL%
-goto:_EOF
-:_ShowUsage
-echo [USAGE]: %~1 login password
-if NOT "%~2" == "" (
-echo %~2
-)
-goto:eof
-:_EOF
-echo The exit status is %_exitStatus%
-cmd /c exit %_exitStatus%
+java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 4
@@ -1,2 +1,2 @@
 #!/bin/bash
-java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 4
+java -Dumls-user=$1 -Dumls-password=$2 -jar cpt4.jar 4
@@ -1,29 +1 @@
-@echo off
-rem Argument counting code from http://www.testdeveloper.com/2010/09/26/how-to-count-arguments-to-a-dos-batch-file-without-using-your-fingers-and-toes
-set _exitStatus=0
-set _argcActual=0
-set _argcExpected=2
-for %%i in (%*) do set /A _argcActual+=1
-if %_argcActual% NEQ %_argcExpected% (
-call :_ShowUsage %0%, "Need to include login name and password for UMLS Terminology Services."
-set _exitStatus=1
-goto:_EOF
-)
-
-FOR /f tokens^=2-5^ delims^=.-_^" %%j IN ('java -fullversion 2^>^&1') DO SET "jver=%%j%%k"
-IF %jver% GTR 18 (
-java -Dumls-user=%1 -Dumls-password=%2 --add-modules=java.xml.ws -jar cpt4.jar 5
-) ELSE (
-java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 5
-)
-set _exitStatus=%ERRORLEVEL%
-goto:_EOF
-:_ShowUsage
-echo [USAGE]: %~1 login password
-if NOT "%~2" == "" (
-echo %~2
-)
-goto:eof
-:_EOF
-echo The exit status is %_exitStatus%
-cmd /c exit %_exitStatus%
+java -Dumls-user=%1 -Dumls-password=%2 -jar cpt4.jar 5
@@ -1,2 +1,2 @@
 #!/bin/bash
-java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 5
+java -Dumls-user=$1 -Dumls-password=$2 -jar cpt4.jar 5