This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

DKAN Integration Branch #26

Open: wants to merge 135 commits into base: 7.x-1.x

@dafeder (Member) commented Jul 26, 2016

This is the branch of DKAN Harvest that will go into DKAN core when complete. Do not merge this branch; it currently exists for CI and review purposes only.

rhabbachi and others added 30 commits April 20, 2016 18:46
Git log report of all the changes done to dkan_harvest improvements:

Ahmed Sghaier (8):
  [7343921] Add the harvester module and configure #777
  [0b8e501] added dkan_harvest_sources
  [af2f03b] Merging dkan_harvest_source and dkan_harvest_dashboard modules
  into upstream dkan_harvest_sources, Integration and refactoring of
  upstream module
  [7d50e0e] Migration refactor to support new caching system
  [c37cc56]  As an Administrator, I want to add new harvest sources #722
  [8b8deb9]  As an Administrator, I want to be able to kick off a harvest
  process for a source at any time. #730
  [cfc05b4] civic-600 integration: adds dynamic source types via
  hook_harvest_sources_types
  [890f9af] civic-600 migration fix

Riadh Habbachi (90):
  [e523124] Simplify the source type querying in dkan_harvest
  [83ac296] Add hook php-doc header
  [8d6e4a0] Update the allowed_values_function for dkan_harvest_sources
  [b57232b] Move to a similar typical dkan module structure
  [4097c66] Move dkan_harvest_dashboard into dkan_harvest modules folder
  [c8a34b5] Extract data_json support for dkan_harvest to its own module
  [7767383] Update dkan_harvest modules info file
  [15bf86a] Initial rework for the caching mechanism
  [1122617] Encapsulate dkan_harvest source and source type into classes
  [9447429] Fix SourceType constructor validation and hook calls
  [fc25e13] Improved allowed values for type field in the harvest source
  content type
  [f7dd977] Move source location checking to the Source class
  [192050f] Rename dkan_harvest_data_json to dkan_harvest_pod
  [0f5e62e] Fix hook_harvest_sources() and hook_harvest_source_types()
  invocation
  [e6067e8] Change the SourceType property in Source class
  [6567189] Small fix for type declaration in the example module
  [f098ba5] Move the cache directory caching logic to the Source class
  [d4e9434] Update harvest example module dependency to dkan_harvest_pod
  [a022edd] Add functions to generate machine names for a source
  [9a24793] Move specific pod migration code to dkan_harvest_pod
  [1d38b58] Fixes Harvest caching code for pod source
  [c6a2749] Add source migration factory instantiation code
  [10ce648] Update dkan_migrate_base_add_modified_column call during
  migration
  [b90861b] Get the source machine_name from the array key returned by the
  source hook
  [9d40b95] Change pod source type migration class export
  [0ecc1f7] Refactor some machine generation code in Source class
  [8ff168b] Fix the migration status error
  [63a7fc0] Fixes the migration process code
  [dcc293c] Fix small bug on the pod caching callback
  [1d79612] Rename dkan_harvest source and source type classes to a less
  generic names
  [ea7cf17] Small refactoring to the POD harvest caching callback
  [b743115] Add function to query sources by machine name
  [7d5e965] Remove static migration call on harvest migrate
  [be2799c] fix: Add support for existing harvest migration
  [623c956] feat: Add more dkan_harvest drush commands
  [48b48bd] refactor: Change property name to label
  [41e46f7] feat: Add DkanHarvestSource Entity
  [10731f0] feat: Implement hook_views_data for harvest sources
  [fa05c6f] refactor: Clean source import hooks
  [d78e343] refactor: deprecate drupal cron harvest support
  [0e93bff] refactor: remove harvest dashboard prototype code
  [06185ca] refactor: more prototype code clean up
  [f64d8f8] feat: improve harvest logging callback
  [178e749] feat: Return message for harvest cache operation
  [be57320] feat: Add a HarvestMigrate class to manage harvest based imports
  [ae06a72] refactor: Rework the migration registration logic
  [24f1999] docs: Add comment to HarvestSource
  [bf9f34b] feat: Add skip hash option to drush harvest migrate
  [b08bccf] feat: Add fixes and features to HarvestMigration
  [85ff1b4] doc: More phpdoc for base dkan_workflow callbacks
  [01b81e3] feat: Add harvest rollback drush command
  [da97700] refactor: Move source lookup to HarvestSource
  [f21b645] test: phpunit tests for core dkan_harvest
  [018410f] fix: Multiple fixes for HarvestSource class
  [7482c33] fix: Multiple fixes for HarvestSourceType
  [a9fcef6] feat: Override HarvestMigration instantiation
  [60864c6] refactor: Better logging for dkan_harvest
  [008048c] refact: Split postImport missing dataset logic
  [a8cb51b] Add dkan-harvest-list drush command
  [e449518] Better verification in the HarvestSource construct
  [33857ba] Replace sources lookup from hook to db query
  [0d9ddfb] More Harvest source field in the content type
  [243c56c] Add pathauto config to harvest_source
  [f404cdd] Add Source URI field validation for harvest source
  [e009700] Add safeword module (new dkan_harvest dependency)
  [044e4ec] Make dkan_harvest_sources a hard dep to harvest
  [03a5923] Rework resource helper and creation function
  [8019456] Support for single local file uri default caching
  [3eaf803] Better @cover annotation in core harvest unit test
  [9a193fa] Add dkan_harvest_sources and HarvestSource tests
  [e70a7af] Add placeholder phpunit test for HarvestMigration
  [5ffcb29] Refactor: remove debug code
  [dda0c9b] Better Harvest source display order
  [24b0f18] Update the harvest dashboard action callbacks
  [19cd471] Drop portal administrator from dashboard perms
  [32858d6] Drop unneeded harvest list view
  [d55de27] More friendly names for harvest dashboard actions
  [eb6cd8b] Fix HarvestList cache directory initialisation
  [67252f5] Code documentation, styling and checks
  [e1cba7a] Add dependency to drush command
  [1a11df2] Better name for harvest sources status drush cmd
  [5086877] Fix drush harvest rollback command arguments check
  [979fe02] Add source migration deregistration support
  [95394cc] Add harvest cache operation information return
  [bd4e28c] Tags creation handled by the base harvest migration
  [4c6271c] Better exception handling in the harvest migration
  [0628e23] More work on the pod harvest cache/migration
  [175edee] Fix wrong machine name field name in harvest source
  [f5c3b76] Add "Last Updated" field to harvest dashboard
  [0b7eec4] Rework querying the harvest source by machine
* Add getMigrationCountFromMachineName method.
* Add views_handler_field_numeric_harvest_count views field handler.
Based on this ckan extension https://github.com/HHS/ckanext-datajson
which seems to be the only ckan harvest extension supported right now.
Error message: "Argument 1 passed to HarvestSourceType::__construct()
must be an instance of String, string given"
* Add harvest (cache and migrate) dkan-hh drush command.
Add the ability to disable the provided rules list during a harvest
migration.
* Restore support for the dkan-cache-harvested-data, dkan-harvest-run, and
 dkan-migrate-cached-data drush commands.
* Initial support for dkan_workflow status during harvest migration.
* Provide a default value for the field_license field.
* Set to create new revision by default.
* Set the default revision_uid to user 1.
The "dkan_harvest_object_id" is the cache file id/name (without the
extension). Since we are using this in the migrate map it makes sense to
use it as the dataset identifier.
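
A minimal sketch of the identifier rule described above, written in Python only for illustration (the module itself is PHP). The function name `harvest_object_id` and the example path are hypothetical and do not come from dkan_harvest; the only thing taken from the commit message is that the id is the cache file name without its extension.

```python
from pathlib import PurePosixPath


def harvest_object_id(cache_file: str) -> str:
    """Return the cache file name without its extension.

    Hypothetical helper: mirrors the rule in the commit message, not the
    module's actual PHP code.
    """
    return PurePosixPath(cache_file).stem


# Example with a made-up cache path.
object_id = harvest_object_id("cache/example_source/1234-abcd.json")
```

Because the same value keys the migrate map, reusing it as the dataset identifier keeps the cache file, the map row, and the dataset aligned under one name.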
Reduce everything to lower case to support URLs with uppercase
extensions.
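
The lower-casing step can be sketched roughly as follows, in Python for illustration only. The helper name `url_extension` and the example URL are assumptions, and this sketch lower-cases just the extracted extension; the actual commit may normalize more of the URL than that.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url_extension(url: str) -> str:
    """Return the lower-cased file extension of a URL path, without the dot.

    Hypothetical helper: the query string is ignored and an empty string
    is returned when the path has no extension.
    """
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lower().lstrip(".")


# "FILE.CSV" and "file.csv" are treated as the same format.
upper = url_extension("http://example.com/data/FILE.CSV?download=1")
lower = url_extension("http://example.com/data/file.csv")
```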
Less clumsy on the command line. Those rules would have to be set for
all the harvested sources anyway, so it makes more sense to store them
in a global Drupal variable.
We ran into issues with Behat where wrong harvest source types were
returned by "dkan_harvest_source_types_definition" on an @AfterScenario
call. The possible cause is the Behat/Dkanextension context failing to
properly support Drupal hook implementations and calls. A possible
solution is to cache the harvest source type entries and bypass the hook
invocation loop altogether.

Included in this commit are:
* Caching code added to "dkan_harvest_source_types_definition".
* Awareness of enabling/disabling modules that implement the
"harvest_source_types" hook.
* Phpunit tests.
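
The caching idea in this commit can be sketched as follows, in Python for illustration only (the module is PHP and relies on Drupal's hook system). The class `SourceTypeRegistry`, its methods, and the example type name are all hypothetical. The sketch shows only the general pattern: collect definitions from hook implementations once, serve later calls from the cache, and drop the cache when a module providing the hook is enabled or disabled.

```python
from typing import Callable, Dict, Optional

# A "hook" here is any callable returning {type_machine_name: definition}.
Hook = Callable[[], Dict[str, dict]]


class SourceTypeRegistry:
    """Hypothetical stand-in for a cached source type lookup."""

    def __init__(self) -> None:
        self._hooks: Dict[str, Hook] = {}
        self._cache: Optional[Dict[str, dict]] = None
        self.hook_invocations = 0  # counts how often hooks really run

    def enable_module(self, name: str, hook: Hook) -> None:
        self._hooks[name] = hook
        self._cache = None  # invalidate: the set of implementations changed

    def disable_module(self, name: str) -> None:
        self._hooks.pop(name, None)
        self._cache = None  # invalidate for the same reason

    def definitions(self) -> Dict[str, dict]:
        # Run the hook loop only on a cache miss.
        if self._cache is None:
            merged: Dict[str, dict] = {}
            for hook in self._hooks.values():
                self.hook_invocations += 1
                merged.update(hook())
            self._cache = merged
        return dict(self._cache)


registry = SourceTypeRegistry()
registry.enable_module("example_module", lambda: {"example_type": {"label": "Example"}})
first = registry.definitions()
second = registry.definitions()  # served from the cache, no hook call
```

Invalidating on enable and disable is what keeps the cache from returning source types that belong to a module that is no longer active.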
Those exports were done initially on USDA with a different Features
module version.
topicus and others added 26 commits August 9, 2016 13:58
* Added group import.

* Added checking of group imports on unit tests.

* Fixed group tests.
Fix node-view dataset count function
Remove 'groups were updated...' messages when harvesting datasets
@topicus mentioned this pull request Oct 3, 2016
Sol Villar and others added 3 commits October 6, 2016 18:20
List of commits done in core:
e6791363 Fix socrata file download in harvest (#1590) <Riadh Habbachi>
2d90af47 Add refresh button to harvest preview page CIVIC-4700 (#1584) <Janette Day>
d89c8042 Improve resource harvest link processing (#1434) <Riadh Habbachi>
c8bfac7a Allow navigate input source field in the source page (#1557) <Mariano Carballal>
53d44be3 Coder fixes (#1563) <Dan Feder>
a942bdba Restore harvest status set up <Janette Day>
25422d25 Move harvest error page css to theme, remove deprecated js <Janette Day>
550f9346 Move harvest css to the theme, fix harvest source uri display <Janette Day>
41dc84c6 CSS cleanup and reorganization. <Sol Villar>
f1a842c1 Fixed breadcrumb for harvest source pages. <Sol Villar>
75b0ead3 Fixed title on manage datasets view. <Sol Villar>
40fa0144 Added missing titles on harvest source pages. <Sol Villar>
4b4163e9 Removed weight settings from path breadcrumb config. <Sol Villar>
26291df9 Fixed breadcrumbs on harvest sources preview page. <Sol Villar>
dc82d6c1 Added 'Add Source' shortcut on Harvest Dashboard pages. <Sol Villar>
acd39ea9 Decrease the description field number of rows. Update doc. <Mariano Carballal>
9fef8e8c Finished cleaning up modules folder code <Dan Feder>
dcf891a7 Code review and cleanup for dkan harvest (#1507) <Mariano Carballal>
c7eda018 Renamed old ODFE fields. (#1511) <msolv>
a3fbbab1 Update documentation #CIVIC-4591 <Jacinto capote Robles>
e92d079a Fix Error when harvesting source datasets that does not have associated resources (#1470) <Riadh Habbachi>
ae7d448a Improved 'Preview' page on DKAN Harvest (#1482) <msolv>
9e97b345 Do not set the file if does not exists for metadata_source node <Riadh Habbachi>
6d0e2452 Fix static method call <Riadh Habbachi>
5f531b78 Cannot log messages on Migration constructor <Riadh Habbachi>
499400f2 Refactor error message out of the HarvestMigration::createTax() <Riadh Habbachi>
546f1a06 Update README.md <Mariano Carballal>
8f635379 Fix metadata harvest (#1430) <Riadh Habbachi>
13d0f752 Update harvester README <Dan Feder>
8431266d Update README.md <Mariano Carballal>
07a848df Remove field notes <Mariano Carballal>
aaf5ae3d Update README.md <Mariano Carballal>
6a127ca0 Replace notes with body in harvest sources #CIVIC-4572 <Mariano Carballal>
76096f9b Add compound filters (#1468) <Mariano Carballal>
0c40decc Improve Harvester docs (#1483) <Dan Feder>
0620e198 Added PHP Unit tests on DKAN Core. (#1403) <msolv>
cde6c056 Added Batch API on 'Harvest now' action. (#1472) <msolv>
dcb5af85 Fix space in multiline error message (#1474) <Riadh Habbachi>
17c956c7 Removed row from migration map table when a dataset is deleted (#1462) <msolv>
b0c9b2ee Added status field to harvest event table #CIVIC-4504 <Jacinto capote Robles>
28a3244b Warning message edit resource without dataset civic 4447 (#1459) <Jacinto>
f65b76d8 Set html format for harvested resource in the body field (#1454) <Mariano Carballal>
c3fcb461 Fix error tab first entry header misleading timing (#1450) <Riadh Habbachi>
4d0a0195 Fixed text on confirmation screen. (#1451) <msolv>
682fc4a9 Fix Harvest Source breadcrumb path (#1453) <Mariano Carballal>
aa8dcdaa Fix event id sort column <Mariano Carballal>
3b360a01 Fix field orphan output format <Mariano Carballal>
96177651 Add field_orphan to resource in the info file <Mariano Carballal>
70eb3442 Add sorting to orphan and published columns <Mariano Carballal>
a67ed3f4 Fix dkan harvest dashboard view <Mariano Carballal>
d5b8484d Log all Harvest error to the log table <Riadh Habbachi>
e54ed542 DKAN Repo Consolidation (#1387) <Janette Day>
f099b175 Harvest dkan integration (#1287) <Mariano Carballal>
8 participants