Merged next into master for 0.6.0 release.
Fixed:
- Facets are not created for evidence uploaded through a dataset.
- Facets are empty while uploading a dataset.
- Dataset evidence collection is missing annotation/namespace URIs (#95).

Changed:
- Mongo schema redesign for evidence.facets and the evidence facet cache.
- Bumped the MongoDB requirement to 3.2.0. We now use the $slice operator for facet aggregation operations.

Added:
- Export evidence using BEL translator plugins (#44).
- Export dataset evidence using BEL translator plugins (#99).
- Mongo migration scripts for existing installations of openbel-api.
- Upgrading guide.
- 0.6.0 changelog notes.

Squashed commit of the following:

commit be2e6e1
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 15:07:24 2016 -0400

    replace method for BEL.keys_to_symbols

    additional style alignment

commit fbf5368
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 09:25:06 2016 -0400

    return 404 when translating empty evidence results

    refs #44

commit ac61baf
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 08:32:37 2016 -0400

    added storage.engine note for UPGRADING to 0.6.0

commit 3f4f700
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 08:27:14 2016 -0400

    added UPGRADING guide

commit 29f86e8
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 08:05:01 2016 -0400

    added document for 0.6.0 mongodb migration

commit 0e22354
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 06:30:26 2016 -0400

    add configuration check for MongoDB 3.2

    The check will fail to start OpenBEL API if MongoDB is < 3.2.

commit 45e5e39
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 15 06:17:57 2016 -0400

    added missing arg to render evidence collection

commit 1edb037
Author: Anthony Bargnesi <[email protected]>
Date: Mon Mar 14 14:45:43 2016 -0400

    set mongo operation timeouts to unbounded

    The operation timeout is the number of seconds that can pass between
    subsequent reads from a mongo operation. This change makes the read
    timeout unbounded in order to satisfy long evidence and facet
    creation queries.

commit 39524ca
Author: Anthony Bargnesi <[email protected]>
Date: Mon Mar 14 13:46:25 2016 -0400

    remove cached facets during dataset load

    Cached facets were removed at the end of a dataset load. Now they are
    additionally removed at the start of the load as well as at every
    increment of 10k nanopubs loaded.

commit 68c2107
Merge: de9a500 61a291d
Author: Anthony Bargnesi <[email protected]>
Date: Mon Mar 14 12:50:35 2016 -0400

    Merge branch 'next' into rewrite_references

commit 61a291d
Merge: 1b4dbb7 1bdf14e
Author: Tony Bargnesi <[email protected]>
Date: Mon Mar 14 12:20:40 2016 -0400

    Merge pull request #101 from nbargnesi/issue100

    Issue100

commit 1bdf14e
Author: Nick Bargnesi <[email protected]>
Date: Mon Mar 14 12:05:43 2016 -0400

    document auth.enabled, auth.secret

commit 0e900f6
Author: Nick Bargnesi <[email protected]>
Date: Tue Feb 2 13:56:15 2016 -0500

    include only auth enabled/secret in default config for #100

commit fbb8b06
Author: Nick Bargnesi <[email protected]>
Date: Tue Feb 2 13:55:54 2016 -0500

    simplify authenticate route to enabled/disabled

commit fe724ff
Author: Nick Bargnesi <[email protected]>
Date: Tue Feb 2 13:54:30 2016 -0500

    remove rest-client dependency

commit de9a500
Author: Anthony Bargnesi <[email protected]>
Date: Thu Mar 10 14:29:16 2016 -0500

    set mongo connection pool size to 30

    This number was chosen in order to have at most 30 long-running
    queries executing simultaneously. The 31st query would then fail
    unless a connection could be obtained within a timeout of 5 seconds.
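For illustration, a minimal sketch of the connection options described in 1edb037 and de9a500 above, assuming the legacy mongo 1.x Ruby driver API; the host, port, and database name here are hypothetical, not taken from the project:

    require 'mongo'

    # Hypothetical client setup mirroring the two commits above:
    # - pool_size: 30   -> at most 30 long-running queries at once
    # - pool_timeout: 5 -> the 31st query fails after waiting 5 seconds
    # - op_timeout: nil -> unbounded reads for long evidence/facet queries
    client = Mongo::MongoClient.new(
      'localhost', 27017,
      pool_size:    30,
      pool_timeout: 5,
      op_timeout:   nil
    )
    db = client.db('openbel')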
commit 8d46fc1
Author: Anthony Bargnesi <[email protected]>
Date: Wed Mar 9 14:54:15 2016 -0500

    do not index values of experiment_context/metadata

    Annotation values can be large amounts of text that will not fit
    into an index key of 1024 bytes. If indexing is attempted you may
    see the error:

        WiredTigerIndex::insert: key too large to index...

commit 4426582
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 8 23:01:46 2016 -0500

    flatten translator arrays so we return one, if any

commit 4d42c35
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 8 20:38:41 2016 -0500

    bump puma to 3.1.0

commit 5081567
Author: Anthony Bargnesi <[email protected]>
Date: Tue Mar 8 20:36:41 2016 -0500

    remove unnecessary local variables

commit 32c5e56
Author: Tony Bargnesi <[email protected]>
Date: Tue Mar 8 16:59:38 2016 -0500

    Update README.md

commit 53ea95f
Author: Tony Bargnesi <[email protected]>
Date: Tue Mar 8 16:51:59 2016 -0500

    Update README.md

commit 53653c0
Author: Anthony Bargnesi <[email protected]>
Date: Mon Mar 7 23:06:27 2016 -0500

    correct references when serializing evidence

    Uses the rewrite-references work in bel.rb.

commit 1b4dbb7
Author: Anthony Bargnesi <[email protected]>
Date: Tue Feb 2 16:11:02 2016 -0500

    convert /api/evidence to BEL using translators

    Factored out rendering of evidence_resource_collection to an
    evidence helper.

    refs #44

commit 3500811
Author: Anthony Bargnesi <[email protected]>
Date: Tue Feb 2 15:20:01 2016 -0500

    factored out filters validation into a helper

    Functional decomposition of filter validation for better
    understanding and maintenance; we now report multiple JSON errors
    when responding with 400.

commit 83935aa
Author: Anthony Bargnesi <[email protected]>
Date: Tue Feb 2 15:18:27 2016 -0500

    added doc for opening ::Sinatra::Helpers::Stream

    It is important to convey why methods were added to this class. The
    methods are a convenience so that RDF.rb's writers can expect to
    call them.

commit c984f8a
Author: Anthony Bargnesi <[email protected]>
Date: Tue Feb 2 15:08:44 2016 -0500

    bump version dependencies for bel-rdf-jena / rdf

    rdf bumped to 1.99.1; bel-rdf-jena bumped to 0.4.2.

commit e4eb5dd
Author: Anthony Bargnesi <[email protected]>
Date: Mon Feb 1 14:50:34 2016 -0500

    dataset serialization to all bel.rb translators

    Updated dependencies to support all bel.rb translators.

    refs #99

commit b1243d8
Author: Anthony Bargnesi <[email protected]>
Date: Tue Jan 26 15:57:16 2016 -0500

    aggregate on full-text search; avoids Mongo limits

    A full-text search filter to /api/evidence with a sort on
    bel_statement used only the text index, which means the
    bel_statement sort had to be done in memory. This reaches the 32 MB
    sort limit with only several tens of thousands of documents.

    The solution employed here was cursored aggregation, allowing disk
    use for sort stages. It was introduced as an alternative code path
    taken only when an FTS filter is included in the HTTP request.
    Although this minimized the risk of regression, there is a fair bit
    to clean up in the mongo access layer.

    closes #96
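For illustration, a sketch of the cursored, disk-backed aggregation described in b1243d8, assuming the legacy mongo 1.x Ruby driver; the collection and sort field follow the commit message, while the database name, search term, and exact pipeline are hypothetical:

    require 'mongo'

    client   = Mongo::MongoClient.new('localhost', 27017)
    evidence = client.db('openbel')['evidence']

    # Match on the full-text index, then sort on bel_statement.
    # :allowDiskUse lets the sort stage spill to disk instead of hitting
    # Mongo's 32 MB in-memory sort limit; :cursor streams results rather
    # than returning one size-limited result document.
    results = evidence.aggregate(
      [
        { '$match' => { '$text' => { '$search' => 'apoptosis' } } },
        { '$sort'  => { 'bel_statement' => 1 } }
      ],
      :allowDiskUse => true,
      :cursor       => {}
    )
    results.each { |doc| puts doc['bel_statement'] }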
commit 5d44fd0
Author: Anthony Bargnesi <[email protected]>
Date: Mon Jan 25 21:48:12 2016 -0500

    return annotation/namespace defs in BEL Script

    Removed normalization of experiment_context annotation keywords. The
    normalized names were inconsistent with references.annotations
    definitions. Integrates the next version of bel.rb (0.4.3) to get
    fixes for annotation/namespace formats.

    refs #95

commit 92f7e7e
Author: Anthony Bargnesi <[email protected]>
Date: Mon Jan 25 15:51:14 2016 -0500

    require MongoDB 3.2; closes #98

commit 0507714
Author: Anthony Bargnesi <[email protected]>
Date: Mon Jan 25 14:57:28 2016 -0500

    added 0.6.0 mongo migration helper, details follow

    The clear_evidence_facets_cache.rb mongo migration will clear out the
    new evidence facet cache storage in case searches were built before
    all documents in the "evidence" collection were migrated.

commit 7707a92
Author: Anthony Bargnesi <[email protected]>
Date: Thu Jan 14 14:16:24 2016 -0500

    fix /api/datasets/{id}/evidence for facet changes

    Facets are now computed correctly in light of the evidence facet
    changes and respect "max_values_per_facet".

commit 19eedef
Author: Anthony Bargnesi <[email protected]>
Date: Thu Jan 14 13:10:57 2016 -0500

    add scripts for Mongo data migrations in 0.6.0

    - Drops evidence_facets since it has been replaced by
      evidence_facet_cache plus individual "evidence_facet_cache_{UUID}"
      collections.
    - Updates each evidence document so the "facets" field contains JSON
      objects instead of JSON strings.

commit 21a7bc4
Author: Anthony Bargnesi <[email protected]>
Date: Thu Jan 14 13:08:32 2016 -0500

    bumped next version to 0.6.0

    Minor release looking to include:
    - New evidence facet storage in mongo.
    - Improved dataset import for large documents (occasional OOM).
    - Evidence streaming.
    - Evidence export to multiple formats.

commit bb2ac16
Author: Anthony Bargnesi <[email protected]>
Date: Wed Jan 13 16:44:47 2016 -0500

    facet cache collection creation and removal

    This design builds individual facet cache collections based on the
    filters applied to the evidence collection. Each filtered evidence
    collection gets its own "evidence_facet_cache_{UUID}" mongo
    collection. The facet values are grouped by (category, name) so it
    is trivial to cursor out the facets (the filter string still needs
    to be set, though). This alleviates the max document size issue for
    large evidence collections. A maximum of 1000 facet values can be
    added to each (category, name) pair in order to stay within the size
    limit.

    Facet cache eviction isn't great here:
    - Individual evidence changes require removal of the facet caches
      for the empty filter search as well as any overlapping
      filter/facet.
    - Creation or removal of a dataset removes all facet caches. The
      thought is that for large dataset imports it is more effective to
      regenerate the cache than to try to keep it synchronized with new
      data.

    This includes a breaking change to the evidence document schema: the
    evidence "facets" array stores the full category/name/value JSON
    objects instead of flat strings. This makes it possible to separate
    values into (category, name) groupings. We should include an upgrade
    note for this and possibly a script.
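For illustration, a hypothetical sketch of how a facet cache collection might be built per bb2ac16 and the changelog's $slice note, again assuming the legacy mongo 1.x Ruby driver; the "evidence_facet_cache_{UUID}" naming and the facets category/name/value schema come from the commit message, but the database name, pipeline stages, and use of $out are assumptions:

    require 'mongo'
    require 'securerandom'

    client   = Mongo::MongoClient.new('localhost', 27017)
    db       = client.db('openbel')
    evidence = db['evidence']

    # One facet cache collection per filtered evidence search.
    uuid        = SecureRandom.uuid
    facet_cache = db["evidence_facet_cache_#{uuid}"]

    pipeline = [
      # The redesigned schema stores facets as objects, so each
      # category/name/value triple can be unwound and grouped.
      { '$unwind' => '$facets' },
      { '$group'  => {
          '_id'    => { 'category' => '$facets.category',
                        'name'     => '$facets.name' },
          'values' => { '$push' => '$facets.value' }
      }},
      # $slice (an aggregation expression new in MongoDB 3.2) caps each
      # (category, name) group at 1000 values to stay within the
      # document size limit.
      { '$project' => {
          'values' => { '$slice' => ['$values', 1000] }
      }},
      # Materialize the grouped facets into the per-search cache.
      { '$out' => facet_cache.name }
    ]
    evidence.aggregate(pipeline, :allowDiskUse => true)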
commit f5a08a3
Merge: f038be2 a515587
Author: Anthony Bargnesi <[email protected]>
Date: Wed Jan 13 16:42:24 2016 -0500

    Merge branch 'master' into next

commit f038be2
Author: Anthony Bargnesi <[email protected]>
Date: Mon Jan 11 22:58:47 2016 -0500

    batch evidence into an array, avoiding the JRuby enumerator

    The JRuby enumerator uses a thread per next object in an enumerator,
    which proves costly. Hundreds of threads are created (tested with
    YourKit) when batch-creating evidence due to the "each_slice(500)"
    call on the enumerator. This issue is logged in JRuby:
    jruby/jruby#2577

    The solution employed was to yield each evidence directly to the
    block and batch 500 at a time into an array. This should avoid the
    OOM exception received:

        java.lang.OutOfMemoryError: unable to create new native thread

    Indeed, the thread count observed in YourKit was lower.
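For illustration, a minimal sketch of the batching approach f038be2 describes; the method and variable names are hypothetical, not the project's actual API:

    # Instead of wrapping evidence in an Enumerator and calling
    # each_slice(500) on it (which on JRuby spawns a native thread per
    # enumerator; see jruby/jruby#2577), yield each evidence into a
    # plain array and flush every 500 items.
    def each_evidence_batch(evidence_source, batch_size = 500)
      batch = []
      evidence_source.each do |evidence|
        batch << evidence
        if batch.size >= batch_size
          yield batch
          batch = []
        end
      end
      # Flush any remainder smaller than a full batch.
      yield batch unless batch.empty?
    end

    # Usage: insert each batch of 500 documents in one Mongo call.
    # each_evidence_batch(parsed_evidence) do |batch|
    #   evidence_collection.insert(batch)
    # end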