Sharing content part 1

CTI have recently been working with The Wildlife Trusts to create a platform that will be available to run all of their local trust websites. In addition to their UK-wide site, each regional Trust has its own website (e.g. Surrey Wildlife Trust and North Wales Wildlife Trust). Rather than the overarching case study, this blog post is a deeper technical dive into how we approached two of the key requirements: content sharing and cross-site searching.

Share and Share Alike

Content needed to be searchable by location: nature reserves, events, jobs, etc. As you might expect, the national site needed to know about all of the content that existed on the regional sites. Rather than expecting administrators to manually duplicate content, we used the migrate module from Drupal 8 core together with the jsonapi module to share content between the sites.

The jsonapi module has no configuration and simply requires enabling (although we did use jsonapi_extras to tweak things slightly). The exact details of Drupal 8 migrations are beyond the scope of this post but the community documentation is a good place to start.

In this instance, the main customisation requirement was to manage all of the regional domains.

Normally, a fully-qualified URL is required for migrations fetching from a URL. The national site was already keeping a list of the regional site names and domains, managed through a custom configuration page. To prevent every migration configuration file needing its own copy of the domains, we created a custom source plugin and a custom data_parser_plugin (plugin types provided by the migrate_plus module).

The relevant part of the migration configuration files looks like:

source:
 plugin: wildlife_sharing_url_title
 data_fetcher_plugin: http
 data_parser_plugin: wildlife_sharing_json_title
 path: /jsonapi/node/reserve


As you can see, the path setting doesn’t contain the domain part of the URL as it normally would. The custom wildlife_sharing_url_title plugin extended Drupal\migrate_plus\Plugin\migrate\source\Url and only had one method overridden, the constructor:

public function __construct(array $configuration, $plugin_id, $plugin_definition, \Drupal\migrate\Plugin\MigrationInterface $migration) {
  $config_factory = \Drupal::service('config.factory');
  $domains = $config_factory->get('wildlife_sharing.settings')->get('domains');
  // Append the migration's path to each configured domain.
  foreach ($domains as &$domain) {
    $domain['url'] .= $configuration['path'];
  }
  // Break the reference left behind by the foreach loop.
  unset($domain);
  $configuration['urlTitles'] = $domains;
  $configuration['urls'] = array_column($domains, 'url');
  parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
}

 

This code loads the wildlife_sharing.settings.domains configuration variable and loops through the domains, appending the path to each one before passing the array to be handled by Drupal\migrate_plus\Plugin\migrate\source\Url as normal.
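
To make the constructor's effect concrete, here is a minimal standalone sketch of the same transformation outside Drupal. The shape of the domains array (a list of entries with url and title keys) is inferred from the code above; the example domains and the function name build_source_urls() are hypothetical.

```php
<?php

/**
 * Appends a JSON:API path to each configured domain.
 *
 * Hypothetical helper mirroring the constructor logic above.
 */
function build_source_urls(array $domains, string $path): array {
  foreach ($domains as &$domain) {
    $domain['url'] .= $path;
  }
  // Break the reference so later writes to $domain cannot alter the array.
  unset($domain);

  return [
    'urlTitles' => $domains,
    'urls' => array_column($domains, 'url'),
  ];
}

// Assumed shape of the wildlife_sharing.settings domains value (example data).
$domains = [
  ['url' => 'https://surrey.example.org', 'title' => 'Surrey Wildlife Trust'],
  ['url' => 'https://northwales.example.org', 'title' => 'North Wales Wildlife Trust'],
];

$result = build_source_urls($domains, '/jsonapi/node/reserve');
print_r($result['urls']);
```

Both arrays keep the same numeric keys, so an index into urls also identifies the matching entry in urlTitles. The data parser's lookup by active URL index appears to rely on this.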

The wildlife_sharing_json_title plugin is only slightly more complex. It extends Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Json, adds a property called urlTitles, and overrides the constructor, getSourceData(), and fetchNextRow() methods.

The property stores the domain entries, with their modified URLs, that the wildlife_sharing_url_title source plugin passes in. The only change to the constructor is to set this value.


 /**
  * Array of arrays. Leaf arrays have url and title keys.
  *
  * @var array
  */
 protected $urlTitles = [];

 /**
  * {@inheritdoc}
  */
 public function __construct(array $configuration, $plugin_id, $plugin_definition) {
   parent::__construct($configuration, $plugin_id, $plugin_definition);
   $this->urlTitles = $configuration['urlTitles'];
 }


The getSourceData() method skips the domain if it points to the current site, as we don’t want a site to migrate its own content:


 /**
  * {@inheritdoc}
  */
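 protected function getSourceData($url, $item_selector) {
   // Do not get data from the current site. strpos() returns FALSE when the
   // host is absent, so compare strictly rather than relying on truthiness.
   if (strpos($url, \Drupal::request()->getHost()) !== FALSE) {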
     return [];
   }

   try {
     $source_data = parent::getSourceData($url, $item_selector);
   }
   catch (\Exception $exception) {
     return [];
   }
   return !empty($source_data) ? $source_data : [];
 }
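
The substring check above treats any URL that contains the host name as local. A stricter comparison, sketched here as an alternative rather than the code used on the project, parses the host out of the URL and compares it exactly. The function name is_current_site() and the example hosts are hypothetical.

```php
<?php

/**
 * Checks whether a source URL points at the given host.
 *
 * Hypothetical helper; not part of migrate_plus or the project code.
 */
function is_current_site(string $url, string $current_host): bool {
  $host = parse_url($url, PHP_URL_HOST);
  // parse_url() returns NULL or FALSE when no host can be extracted.
  if (!is_string($host)) {
    return FALSE;
  }
  // Host names are case-insensitive.
  return strcasecmp($host, $current_host) === 0;
}

// Same host: treated as the current site.
var_dump(is_current_site('https://surrey.example.org/jsonapi/node/reserve', 'surrey.example.org'));
// Different host: treated as a remote site.
var_dump(is_current_site('https://northwales.example.org/jsonapi/node/reserve', 'surrey.example.org'));
```

An exact match avoids skipping a remote site whose URL merely contains the local host name somewhere in its path or as part of a longer domain.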


The fetchNextRow() method looks complicated but only has a single additional line compared to the parent fetchNextRow() method it is overriding. This new line adds the URL entry for the current record to the item source (so that it can be stored on the migrated entity in the migration):

  /**
  * {@inheritdoc}
  */
 protected function fetchNextRow() {
   $current = $this->iterator->current();
   if ($current) {
     foreach ($this->fieldSelectors() as $field_name => $selector) {
       $field_data = $current;
       $field_selectors = explode('/', trim($selector, '/'));
       foreach ($field_selectors as $field_selector) {
         $field_data = $field_data[$field_selector];
       }

       $this->currentItem[$field_name] = $field_data;
     }
     // This line is the only change from the parent class. It adds the
     // URL for the current record to the item source.
     $this->currentItem['active_url'] = $this->urlTitles[$this->activeUrl];

     if (!empty($this->configuration['include_raw_data'])) {
       $this->currentItem['raw'] = $current;
     }
     $this->iterator->next();
   }
 }
}
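
The nested loop is easier to follow in isolation. This standalone sketch resolves a slash-separated selector against a decoded JSON:API item. The sample item and the function name select_field() are hypothetical, and unlike the method above it returns NULL for a missing key instead of raising a notice.

```php
<?php

/**
 * Walks a nested array following a slash-separated selector.
 *
 * Hypothetical helper illustrating the selector loop in fetchNextRow().
 */
function select_field(array $item, string $selector) {
  $data = $item;
  foreach (explode('/', trim($selector, '/')) as $key) {
    // Stop early if the path leaves the array structure.
    if (!is_array($data) || !array_key_exists($key, $data)) {
      return NULL;
    }
    $data = $data[$key];
  }
  return $data;
}

// Made-up item in the general shape of a JSON:API resource.
$item = [
  'id' => 'abc-123',
  'attributes' => [
    'title' => 'Example Reserve',
    'path' => ['alias' => '/reserves/example'],
  ],
];

echo select_field($item, '/attributes/title'), PHP_EOL;
echo select_field($item, 'attributes/path/alias'), PHP_EOL;
var_dump(select_field($item, 'attributes/missing'));
```

Leading and trailing slashes are trimmed before splitting, so /attributes/title and attributes/title resolve to the same value.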


Running these migrations on cron means that the national site now holds copies of all of the content that can be part of location searches. These copies are not editable at national level: edits must be made to the original content on the regional site, and the next cron run updates the national copy through the migrations.

 

What’s that over there?

So far, so good. But how does national-level content storage help with regional site searches? We’re fans of using each tool to its strengths and, as Drupal is good at integrating third-party solutions, we prefer Apache Solr as a search platform. Addressing Solr integration into Drupal is beyond the scope of this post, but I will cover some of the key customised parts.

The national site contains all of the searchable content, so it made sense to maintain a Solr index for location-searchable content at a national level. There was no reason for the regional sites to manage their own indexes for subsets of the data, so all sites share the Solr index that the national site maintains. Importantly, regional sites have read-only access, to prevent conflicting edits.

Following discussions with the client, we decided that, by default, search results should come from the current site only. If a user performed an active location search, however, they should see all results within that radius, regardless of which site they belonged to. These results would identify themselves as external links where relevant.

To implement this, we used hook_search_api_solr_query_alter() like so:

function wildlife_search_search_api_solr_query_alter(\Solarium\Core\Query\QueryInterface $solarium_query, \Drupal\search_api\Query\QueryInterface &$query) {

 // Only alter location index queries.
 if ($query->getIndex()->getOriginalId() != 'location') {
   return;
 }

 // Only alter queries for the location_search View.
 /** @var \Drupal\views\ViewExecutable $view */
 $view = $query->getOption('search_api_view');
 if (empty($view) || $view->id() != 'location_search') {
   return;
 }

 // If the search includes a location, show all results.
 $location = \Drupal::request()->get('location');
 if (!empty($location)) {
   return;
 }

 // If the search is location-less, only show results from the current site.
 // (Unless this is the national site).
 $domain = \Drupal::request()->getHost();
 if (_wildlife_sharing_get_site_type() != 'national') {
   $helper = $solarium_query->getHelper();
   $solarium_query->createFilterQuery('site')
     ->setQuery('ss_field_external_link:*' . $helper->escapeTerm($domain) . '*');
 }
}

The first three if() statements ensure that we only alter the relevant queries for location searches on the correct View. The final section is the important part: here we check that the site is not the national site (as those searches will always show all results).

We then use Solarium to add an additional filter to the query. Solarium is a PHP library for working with Solr queries in a more manageable way. 

The filter created checks whether the ss_field_external_link property contains the current domain (after being escaped to make it suitable for Solr). The external link field is a normal Drupal Link field; populated by the migration, this links back to the original location of the content. The ss_ prefix identifies the field as a string field and comes from the default search_api_solr module Solr configuration.
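
For illustration, the filter string the hook produces can be reproduced without Solr. The escape function below is a simplified stand-in for Solarium's escapeTerm(), not its actual implementation, and the host names are made up.

```php
<?php

/**
 * Backslash-escapes characters that are special in Solr query syntax.
 *
 * Simplified, hypothetical stand-in for Solarium's escapeTerm().
 */
function escape_term_sketch(string $term): string {
  return preg_replace('/([+\-!(){}\[\]^"~*?:\/\\\\]|&&|\|\|)/', '\\\\$1', $term);
}

/**
 * Builds the wildcard filter on the external link field for one host.
 */
function build_site_filter(string $host): string {
  return 'ss_field_external_link:*' . escape_term_sketch($host) . '*';
}

// A host with no special characters passes through unchanged.
echo build_site_filter('www.example.org'), PHP_EOL;
// The hyphen is escaped so Solr does not read it as an operator.
echo build_site_filter('north-wales.example.org'), PHP_EOL;
```

The wildcards are added after escaping, so they stay active as Solr wildcards while any special characters in the host name are treated literally.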

 

This seems like a good place to end the first part of this post. Check out part 2, Rendering The Results, for an exploration into rendering the results.

 
