Merge pull request #902 from dlcs/feature/path_template_extensions
PathTemplates handles assetPath formatting
donaldgray committed Sep 18, 2024
2 parents 70a308a + a8f87d1 commit 63574fa
Showing 6 changed files with 153 additions and 50 deletions.
54 changes: 54 additions & 0 deletions docs/path-templates.md
@@ -0,0 +1,54 @@
# PathTemplates

The default path template for requests is `/{prefix}/{customer}/{space}/{assetPath}`, where:

* `prefix` is the route path (e.g. `iiif-manifest`, `iiif-av`, `iiif-img`), including the version.
* `customer` and `space` are self-explanatory.
* `assetPath` is the asset identifier plus any elements specific to the current request, e.g. for image requests it will contain the full IIIF image request.

By default the above format is reflected in info.json responses (from Thumbs and Orchestrator).

Overrides to the default rules can be specified to support proxy servers that receive alternative URLs and rewrite them to standard DLCS URLs. The overrides are used when outputting any self-referencing URIs (e.g. the info.json `id` element).

> [!IMPORTANT]
> For the overrides below to work, the proxy must set the `x-forwarded-host` header.

```
"PathRules": {
  "Default": "/{prefix}/{customer}/{space}/{assetPath}",
  "Overrides": {
    "exclude-space.com": "/{prefix}/{customer}/extra/{assetPath}/",
    "customer-specific.io": "/{prefix}/{assetPath}",
    "i-have-ark.io": "/{prefix}/ark:{assetPath:3US}"
  }
}
```
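
As a rough illustration of how a host-keyed override could be resolved and expanded, here is a minimal sketch. The `TemplateForHost` and `Expand` helpers, the dictionary standing in for the `"PathRules"` block, and the sample customer, space and asset values are all hypothetical. This is not the DLCS implementation, and it ignores format parameters and slash clean-up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the "PathRules" config block (not the DLCS options type).
var defaultTemplate = "/{prefix}/{customer}/{space}/{assetPath}";
var overrides = new Dictionary<string, string>
{
    ["exclude-space.com"] = "/{prefix}/{customer}/extra/{assetPath}/",
    ["customer-specific.io"] = "/{prefix}/{assetPath}",
};

// Pick the override registered for the forwarded host, else fall back to the default.
string TemplateForHost(string? forwardedHost) =>
    forwardedHost != null && overrides.TryGetValue(forwardedHost, out var t) ? t : defaultTemplate;

// Plain placeholder substitution; assumes no {assetPath:FMT} in the template.
string Expand(string template, string prefix, string customer, string space, string assetPath) =>
    template
        .Replace("{prefix}", prefix)
        .Replace("{customer}", customer)
        .Replace("{space}", space)
        .Replace("{assetPath}", assetPath);

// Request arriving via a proxy that sets x-forwarded-host: customer-specific.io
Console.WriteLine(Expand(TemplateForHost("customer-specific.io"), "iiif-img", "2", "5", "my-image/info.json"));
// prints /iiif-img/my-image/info.json

// No forwarded host: the default template applies
Console.WriteLine(Expand(TemplateForHost(null), "iiif-img", "2", "5", "my-image/info.json"));
// prints /iiif-img/2/5/my-image/info.json
```

A template that omits `{customer}` or `{space}` simply drops those values from the generated URI, which is why the proxy is expected to rewrite such URLs back to the standard form.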

As a convenience, for Orchestrator only, you can specify a `"PathRules:OverridesAsJson"` appSetting that contains the overrides as a string-based config. This makes it easier to configure via environment variables etc.

## Formatters

`assetPath` supports an optional format parameter: use `{assetPath:FMT}` in place of `{assetPath}`, where `FMT` is one of the known values below. An unknown value results in an `ArgumentException`.

Supported format parameter values are:

* `3US` - replaces triple _U_nderscores with _S_lashes (e.g. assetPath `"foo___bar_baz"` -> `"foo/bar_baz"`).
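
The format parameter can be pictured as a standard .NET composite format item (`{0:3US}`) handled by a custom formatter. The sketch below shows that mechanism in isolation. `TripleUnderscoreFormat` is an illustrative class written for this example, not the type added in this commit, although it follows the same `IFormatProvider` / `ICustomFormatter` pattern.

```csharp
using System;

// "{0:3US}" passes the format string "3US" to the custom formatter
Console.WriteLine(string.Format(new TripleUnderscoreFormat(), "ark:{0:3US}", "NAAN___Name"));
// prints ark:NAAN/Name

// "{0}" has no format string, so the value is emitted unchanged
Console.WriteLine(string.Format(new TripleUnderscoreFormat(), "ark:{0}", "NAAN___Name"));
// prints ark:NAAN___Name

// Illustrative formatter (hypothetical name) supporting a single "3US" format
sealed class TripleUnderscoreFormat : IFormatProvider, ICustomFormatter
{
    // string.Format asks the provider for an ICustomFormatter; hand back this instance
    public object? GetFormat(Type? formatType) =>
        formatType == typeof(ICustomFormatter) ? this : null;

    public string Format(string? format, object? arg, IFormatProvider? formatProvider)
    {
        var value = arg?.ToString() ?? string.Empty;
        return format switch
        {
            null or "" => value,                    // no format requested
            "3US" => value.Replace("___", "/"),     // triple underscore -> slash
            _ => throw new ArgumentException($"'{format}' is not a known assetPath format", nameof(format)),
        };
    }
}
```

Only runs of exactly three underscores are replaced, so single underscores in an identifier (as in `foo___bar_baz`) survive the formatting.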

## Auth PathTemplates

There is a similar config block available for authentication under the `"Auth"` key for Orchestrator.

For auth the path replacements are simpler:
* `customer` is the customer the auth service is for
* `behaviour` is the name of the auth service.

```
"Auth": {
  "AuthPathRules": {
    "Default": "/auth/{customer}/{behaviour}",
    "Overrides": {
      "exclude-space.com": "/auth/{behaviour}"
    }
  }
},
```
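
For comparison, a minimal sketch of expanding the two auth templates above. `ExpandAuth` is a hypothetical helper, and the customer `2` and behaviour `clickthrough` are sample values chosen for illustration.

```csharp
using System;

// Hypothetical helper: substitute the two auth placeholders
string ExpandAuth(string template, string customer, string behaviour) =>
    template
        .Replace("{customer}", customer)
        .Replace("{behaviour}", behaviour);

Console.WriteLine(ExpandAuth("/auth/{customer}/{behaviour}", "2", "clickthrough"));
// prints /auth/2/clickthrough

Console.WriteLine(ExpandAuth("/auth/{behaviour}", "2", "clickthrough"));
// prints /auth/clickthrough
```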
41 changes: 40 additions & 1 deletion src/protagonist/DLCS.Core.Tests/DlcsPathHelpersTests.cs
@@ -1,4 +1,6 @@
-namespace DLCS.Core.Tests;
+using System;
+
+namespace DLCS.Core.Tests;

public class DlcsPathHelpersTests
{
@@ -70,4 +72,41 @@ public void GeneratePathFromTemplate_RemovesDoubleSlashes(string template, strin
        // Assert
        replaced.Should().Be(expected);
    }

    // Specific example here is for ARK id https://en.wikipedia.org/wiki/Archival_Resource_Key#Structure
    [Theory]
    [InlineData("https://dlcs.digirati.io/{prefix}/{version}/{customer}/{space}/path/ark:{assetPath}",
        "https://dlcs.digirati.io/images/first-space/path/ark:NAAN___Name")]
    [InlineData("https://dlcs.digirati.io/{prefix}/{version}/{customer}/{space}/path/ark:{assetPath:3US}",
        "https://dlcs.digirati.io/images/first-space/path/ark:NAAN/Name")]
    [InlineData("https://dlcs.digirati.io/{prefix}/{assetPath}/path/ark:{assetPath:3US}",
        "https://dlcs.digirati.io/images/NAAN___Name/path/ark:NAAN/Name")]
    public void GeneratePathFromTemplate_AssetPath_ObeysFormattingInstruction(string template, string expected)
    {
        // Act
        var replaced = DlcsPathHelpers.GeneratePathFromTemplate(template,
            prefix: "images",
            space: "first-space",
            assetPath: "NAAN___Name");

        // Assert
        replaced.Should().Be(expected);
    }

    [Fact]
    public void GeneratePathFromTemplate_AssetPath_Throws_IfUnknownFormattingInstruction()
    {
        // Arrange
        const string template = "https://dlcs.digirati.io/{prefix}/{version}/{customer}/{space}/path/ark:{assetPath:XY}";

        // Act
        Action action = () => DlcsPathHelpers.GeneratePathFromTemplate(template,
            prefix: "images",
            space: "first-space",
            assetPath: "NAAN___Name");

        // Assert
        action.Should().Throw<ArgumentException>()
            .WithMessage("'XY' is not a known assetPath format (Parameter 'format')");
    }
}
7 changes: 4 additions & 3 deletions src/protagonist/DLCS.Core/DlcsPathHelpers.cs
@@ -1,4 +1,5 @@
using System.Text.RegularExpressions;
using DLCS.Core.Formats;
using DLCS.Core.Types;

namespace DLCS.Core;
@@ -18,7 +19,7 @@ public static class DlcsPathHelpers
     /// <param name="version">Value to replace {version} with</param>
     /// <param name="customer">Value to replace {customer} with</param>
     /// <param name="space">Value to replace {space} with</param>
-    /// <param name="assetPath">Value to replace {assetPath} with</param>
+    /// <param name="assetPath">Value to replace {assetPath} with, optionally formatted</param>
     /// <returns>Template with string replacements made</returns>
     public static string GeneratePathFromTemplate(
         string template,
@@ -33,8 +34,8 @@ public static string GeneratePathFromTemplate(
             .Replace("{version}", version ?? string.Empty)
             .Replace("{customer}", customer ?? string.Empty)
             .Replace("{space}", space ?? string.Empty)
-            .Replace("{assetPath}", assetPath ?? string.Empty), "/");
+            .ReplaceAssetPath(assetPath ?? string.Empty), "/");

     /// <summary>
     /// Replace known slugs in DLCS auth path template.
     /// </summary>
54 changes: 54 additions & 0 deletions src/protagonist/DLCS.Core/Formats/AssetPathFormat.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
using System;
using System.Text.RegularExpressions;

namespace DLCS.Core.Formats;

/// <summary>
/// Helper function for formatting {assetPath} template value, handling replacements
/// </summary>
internal static class AssetPathFormatter
{
    // match {assetPath} or {assetPath:FMT}
    // NOTE: ".*" is greedy, so a single match runs from the first "{assetPath" to the last "}" in the template
    private static readonly Regex AssetPath = new("({assetPath:?.*})", RegexOptions.Compiled);

    public static string ReplaceAssetPath(this string template, string assetPath)
    {
        var match = AssetPath.Match(template);
        if (!match.Success) return template;

        for (var x = 0; x < match.Captures.Count; x++)
        {
            var capture = match.Captures[x].Value;
            // Turn "{assetPath:FMT}" into the composite format item "{0:FMT}" so string.Format can process it
            var forFormat = capture.Replace("assetPath", "0");
            template = template.Replace(capture, string.Format(AssetPathFormat.Instance, forFormat, assetPath));
        }

        return template;
    }
}

internal class AssetPathFormat : IFormatProvider, ICustomFormatter
{
    public static AssetPathFormat Instance { get; } = new();

    // Replace "___" with "/"
    private const string UnderscoreToSlash = "3US";

    public object? GetFormat(Type? formatType)
        => formatType == typeof(ICustomFormatter) ? this : null;

    public string Format(string? format, object? arg, IFormatProvider? formatProvider)
    {
        // No format specified (plain {assetPath}): return the value unchanged
        if (string.IsNullOrEmpty(format) || arg == null) return arg?.ToString() ?? string.Empty;

        var result = arg.ToString();
        if (string.IsNullOrEmpty(result)) return string.Empty;

        if (format == UnderscoreToSlash)
        {
            return result.Replace("___", "/");
        }

        throw new ArgumentException($"'{format}' is not a known assetPath format", nameof(format));
    }
}
@@ -12,7 +12,7 @@ namespace DLCS.Web.Response;
/// <remarks>
/// This class uses <see cref="PathTemplateOptions"/> to determine different URL patterns for different hostnames,
/// this allows e.g. "id" values on manifests to use different URL structures than the default DLCS paths.
-/// e.g. /images/{image}/ rather than default of /iiif-img/{cust}/{space}/{image}
+/// e.g. /images/{assetPath}/ rather than default of /iiif-img/{cust}/{space}/{assetPath}
/// </remarks>
public class ConfigDrivenAssetPathGenerator : IAssetPathGenerator
{
45 changes: 0 additions & 45 deletions src/protagonist/Orchestrator/readme.md
@@ -96,51 +96,6 @@ E.g., the following shows IIPImage supports v2 only and Cantaloupe supports v2 +
}
```

-### PathTemplates
-
-The default path template for requests is `/{prefix}/{customer}/{space}/{assetPath}`, where:
-
-* `prefix` is route path (e.g. `iiif-manifest`, `iiif-av`, `iiif-img`) and includes version.
-* `customer` and `space` are self explanatory
-* `assetPath` is the asset identifier plus any specific elements for the current request - e.g. for image requests it will contain the full IIIF image request.
-
-By default the above format is reflected on info.json and single-item manifests.
-
-To facilitate using proxy servers to receive alternative URLs that are then rewritten to standard DLCS URLs, overrides to the default rules can be specified. These are used when outputting any self-referencing URIs (e.g. info.json `id` element).
-
-> For the below to work the expectation is that the `x-forwarded-host` header is set in the proxy.
-```
-"PathRules": {
-  "Default": "/{prefix}/{customer}/{space}/{assetPath}",
-  "Overrides": {
-    "exclude-space.com": "/{prefix}/{customer}/extra/{assetPath}/",
-    "customer-specific.io": "/{prefix}/{assetPath}"
-  }
-}
-```
-
-As an convenience you can specify "PathRules:OverridesAsJson" appSetting that includes a string-based config. This makes it easier to configure via environment variables etc
-
-#### Auth PathTemplates
-
-There is a similar config block availabe for authentication under the `"Auth"` key.
-
-For auth the path replacements are simpler:
-* `customer` is the customer the auth service is for
-* `behaviour` is the name of the auth service.
-
-```
-"Auth": {
-  "AuthPathRules": {
-    "Default": "/auth/{customer}/{behaviour}",
-    "Overrides": {
-      "exclude-space.com": "/auth/{behaviour}"
-    }
-  }
-},
-```
-
### Versioned Requests

`DefaultIIIFImageVersion` and `DefaultIIIFPresentationVersion` specify the default IIIF Image and Presentation API's supported.