Link Suggestions/Management #348

Draft: wants to merge 94 commits into base: master

Commits
68b6ded
WIP
JohnathonKoster Sep 11, 2024
4965ca9
Merge branch 'master' into content-linking
JohnathonKoster Sep 11, 2024
3a0444e
Improve context retrieval
JohnathonKoster Sep 14, 2024
9ca1246
Be smarter about how data is updated
JohnathonKoster Sep 14, 2024
cf935ac
Use correct entry id
JohnathonKoster Sep 14, 2024
d144ab7
Be smart about detecting added/removed links
JohnathonKoster Sep 14, 2024
1f3b2af
Create InternalLinksUpdated.php
JohnathonKoster Sep 14, 2024
8964e51
Check for internal changes when updating
JohnathonKoster Sep 14, 2024
297e9d2
Add cached_uri to list of fields searched
JohnathonKoster Sep 14, 2024
6f50816
Adds support for inserting links into Bard fields 😮‍💨
JohnathonKoster Sep 14, 2024
5f29eb8
Wire some stuff up!
JohnathonKoster Sep 14, 2024
2869f04
Tidy up a bit
JohnathonKoster Sep 14, 2024
cabd1e5
Remove for now
JohnathonKoster Sep 15, 2024
f12394f
Ability to disable the text linking stuff
JohnathonKoster Sep 15, 2024
803cf5b
Dramatically improves success rate of suggestion linkability
JohnathonKoster Sep 15, 2024
0dbf070
Adds support for automatically inserting "global auto" links into con…
JohnathonKoster Sep 15, 2024
8529300
Inject current site, if available
JohnathonKoster Sep 15, 2024
6a45891
Related content tag
JohnathonKoster Sep 16, 2024
7e8b72b
Reset this for anyone who might pull it down to experiment
JohnathonKoster Sep 16, 2024
76c0f54
Add filter presets to UI
JohnathonKoster Sep 16, 2024
fc10552
Adds a quick way to mark an entry as "not related"
JohnathonKoster Sep 16, 2024
a5fc1c6
Prevent fatal errors if an entry isn't available
JohnathonKoster Sep 16, 2024
b9c9f3e
Improve smartness of locating related entries.
JohnathonKoster Sep 16, 2024
72fb551
Provide a quick way to edit related entries
JohnathonKoster Sep 16, 2024
0be9e78
Link to related content suggestions instead
JohnathonKoster Sep 16, 2024
830bd14
Consistency
JohnathonKoster Sep 16, 2024
3467b29
Consistent linking here, too.
JohnathonKoster Sep 16, 2024
8b1fe3e
Correct removed offset.
JohnathonKoster Sep 16, 2024
8151a4d
Merge branch 'master' into content-linking
JohnathonKoster Sep 17, 2024
5b8b84b
🍻
JohnathonKoster Sep 17, 2024
8e3124d
Stop the failed test spam for the day 😅
JohnathonKoster Sep 17, 2024
e511c9e
Make method easier to read
JohnathonKoster Sep 17, 2024
71e7e6b
Update ScanEntryLinks.php
JohnathonKoster Sep 17, 2024
13aeb1b
Update ServiceProvider.php
JohnathonKoster Sep 17, 2024
f6f79bb
Update index.blade.php
JohnathonKoster Sep 17, 2024
7581c34
Correct issue with wrapping
JohnathonKoster Sep 17, 2024
7512fb2
Delete hot
JohnathonKoster Sep 17, 2024
107825e
line endings
JohnathonKoster Sep 17, 2024
66521ea
Correct an issue when match appears at the end of the suggestion's ph…
JohnathonKoster Sep 17, 2024
f6d6cfa
Cleanup a bit
JohnathonKoster Sep 17, 2024
fdbaa1d
Consolidate some URL generation
JohnathonKoster Sep 21, 2024
3a9d0a3
Cleanup/refactor
JohnathonKoster Sep 21, 2024
4b388d7
Don't force-cast here to also allow "."
JohnathonKoster Sep 21, 2024
68a48d6
Group fieldtype support
JohnathonKoster Sep 21, 2024
7fe8d22
Update AbstractFieldMapper.php
JohnathonKoster Sep 21, 2024
6534a32
Refactoring/cleanup
JohnathonKoster Sep 21, 2024
04404f7
Sneaky little guy
JohnathonKoster Sep 21, 2024
7a7b94b
More cleanup
JohnathonKoster Sep 21, 2024
ccc08e1
Update ContentMapper.php
JohnathonKoster Sep 21, 2024
44b9a5b
Update ContentMapper.php
JohnathonKoster Sep 21, 2024
0c62513
Update ContentMapper.php
JohnathonKoster Sep 21, 2024
102a184
Always use fresh mappings to account for potentially stale cache
JohnathonKoster Sep 21, 2024
d08ab0d
Refactor how nested field names are retrieved
JohnathonKoster Sep 21, 2024
463ec34
More cleanup
JohnathonKoster Sep 21, 2024
42097b3
Dont return non-replaceable suggestions if we have a suggestion that …
JohnathonKoster Sep 21, 2024
74c4884
Update ServiceProvider.php
JohnathonKoster Sep 21, 2024
dcd667f
Update LinkRepository.php
JohnathonKoster Sep 21, 2024
0d11832
Add support for preventing circular links when suggesting related con…
JohnathonKoster Sep 21, 2024
b10b4be
Update seo-pro.php
JohnathonKoster Sep 21, 2024
7f0389f
"linking" sounds less confusing that "text_analysis"
JohnathonKoster Sep 22, 2024
8966de0
Consistency with other links
JohnathonKoster Sep 22, 2024
68b58eb
Move "Content" namespace to root
JohnathonKoster Sep 22, 2024
8f79404
Update ContentMapper.php
JohnathonKoster Sep 22, 2024
aaa66ab
Namespace move
JohnathonKoster Sep 22, 2024
2b6503e
More moves
JohnathonKoster Sep 22, 2024
d1d6d15
New queries namespace
JohnathonKoster Sep 22, 2024
09d6c10
Make IDE happy, formatting of long query method chain
JohnathonKoster Sep 22, 2024
dbcb040
Remove internal weighting to simplify
JohnathonKoster Sep 22, 2024
a798ee5
Update KeywordsRepository.php
JohnathonKoster Sep 22, 2024
fab5888
Some breathing room
JohnathonKoster Sep 22, 2024
5493d0a
Update ReportBuilder.php
JohnathonKoster Sep 22, 2024
544935d
Update ReportBuilder.php
JohnathonKoster Sep 22, 2024
ac6c98e
Squiggly be gone
JohnathonKoster Sep 22, 2024
4f1c834
Make related/result limits fine-tuneable
JohnathonKoster Sep 22, 2024
9539cd1
squigglies
JohnathonKoster Sep 22, 2024
653589d
Cleanup/make chunk size not hard-coded
JohnathonKoster Sep 22, 2024
a1b9228
Chunksize/return types
JohnathonKoster Sep 22, 2024
ff80104
Cleanup
JohnathonKoster Sep 22, 2024
011beeb
Reducio 🪄
JohnathonKoster Sep 22, 2024
7eba0d3
Update ResolvesSimilarItems.php
JohnathonKoster Sep 22, 2024
b80a13e
help make future diffs less nasty
JohnathonKoster Sep 22, 2024
c23c5b3
Get outa here 🥾
JohnathonKoster Sep 22, 2024
605e95d
Update ReportBuilder.php
JohnathonKoster Sep 22, 2024
3c2a3bd
Get rid of a nesting level
JohnathonKoster Sep 22, 2024
216d41e
And another one
JohnathonKoster Sep 22, 2024
584f937
clarity.
JohnathonKoster Sep 22, 2024
6d64f05
More cleanup
JohnathonKoster Sep 22, 2024
29df5a9
That feels better
JohnathonKoster Sep 22, 2024
f3013b9
new 🏠
JohnathonKoster Sep 22, 2024
d791a58
he can go live with his friends, too
JohnathonKoster Sep 22, 2024
36d487a
Update KeywordsRepository.php
JohnathonKoster Sep 22, 2024
3bd6cf5
Update LinkCrawler.php
JohnathonKoster Sep 22, 2024
13dd960
Prevent sneaky directory traversal/arbitrary code inclusion
JohnathonKoster Sep 22, 2024
8fd6f0b
Merge branch 'master' into content-linking
JohnathonKoster Nov 24, 2024
8 changes: 6 additions & 2 deletions composer.json
@@ -22,7 +22,10 @@
         }
     },
     "require": {
-        "statamic/cms": "^5.0.0"
+        "statamic/cms": "^5.0.0",
+        "donatello-za/rake-php-plus": "^1.0",
+        "openai-php/client": "^0.10.1",
+        "ext-dom": "*"
     },
     "require-dev": {
         "orchestra/testbench": "^8.0 || ^9.0",
@@ -31,7 +34,8 @@
     },
     "config": {
         "allow-plugins": {
-            "pixelfear/composer-dist-plugin": true
+            "pixelfear/composer-dist-plugin": true,
+            "php-http/discovery": true
         }
     },
     "minimum-stability": "dev",
51 changes: 51 additions & 0 deletions config/seo-pro.php
@@ -53,4 +53,55 @@
        'queue_chunk_size' => 1000,
    ],

    'jobs' => [
        'connection' => env('SEO_PRO_JOB_CONNECTION'),
        'queue' => env('SEO_PRO_JOB_QUEUE'),
    ],

    'linking' => [

        'enabled' => false,

        'openai' => [
            'api_key' => env('SEO_PRO_OPENAI_API_KEY'),
            'model' => 'text-embedding-3-small',
            'token_limit' => 8000,
        ],

        'keyword_threshold' => 65,

        'prevent_circular_links' => false,

        'internal_links' => [
            'min_desired' => 3,
            'max_desired' => 6,
        ],

        'external_links' => [
            'min_desired' => 0,
            'max_desired' => 3,
        ],

        'suggestions' => [
            'result_limit' => 10,
            'related_entry_limit' => 20,
        ],

        'rake' => [
            'phrase_min_length' => 0,
            'filter_numerics' => true,
        ],

        'drivers' => [
            'embeddings' => \Statamic\SeoPro\TextProcessing\Embeddings\OpenAiEmbeddings::class,
            'keywords' => \Statamic\SeoPro\TextProcessing\Keywords\Rake::class,
            'tokenizer' => \Statamic\SeoPro\Content\Tokenizer::class,
            'content' => \Statamic\SeoPro\Content\ContentRetriever::class,
            'link_scanner' => \Statamic\SeoPro\TextProcessing\Links\LinkCrawler::class,
        ],

        'disabled_collections' => [
        ],

    ],
];
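The `internal_links` and `external_links` bounds above read as target ranges for the number of links per entry. How the PR evaluates them is not visible in this excerpt, so the following is only a minimal sketch under that reading; `classifyLinkCount` is a hypothetical helper, not part of the PR.

```php
<?php

// Hypothetical helper, not part of the PR: classifies a link count against
// the min_desired / max_desired bounds from the 'linking' config.
function classifyLinkCount(int $count, array $bounds): string
{
    if ($count < $bounds['min_desired']) {
        return 'too_few';
    }

    if ($count > $bounds['max_desired']) {
        return 'too_many';
    }

    return 'ok';
}

// Values copied from the config diff.
$linking = [
    'internal_links' => ['min_desired' => 3, 'max_desired' => 6],
    'external_links' => ['min_desired' => 0, 'max_desired' => 3],
];

echo classifyLinkCount(2, $linking['internal_links']), PHP_EOL; // too_few
echo classifyLinkCount(4, $linking['internal_links']), PHP_EOL; // ok
echo classifyLinkCount(5, $linking['external_links']), PHP_EOL; // too_many
```

With the defaults shown, an entry with zero external links would still count as `ok`, since `external_links.min_desired` is 0.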
@@ -0,0 +1,34 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_entry_embeddings', function (Blueprint $table) {
            $table->id();
            $table->string('entry_id')->index();
            $table->string('site')->index();
            $table->string('collection')->index();
            $table->string('blueprint');
            $table->string('content_hash');
            $table->string('configuration_hash');
            $table->json('embedding');
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_entry_embeddings');
    }
};
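The `embedding` column stores each entry's vector as JSON. The PR's comparison code is not part of this excerpt; cosine similarity is used below only because it is a common way to compare text embeddings, and `cosineSimilarity` is a hypothetical name. A minimal sketch:

```php
<?php

// Assumption: cosine similarity is shown as a common way to compare embedding
// vectors; the PR's own comparison logic is not visible in this excerpt.
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    // A zero vector has no direction; treat it as unrelated to everything.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// A JSON column value decodes to a plain list of floats.
$first = json_decode('[1.0, 0.0, 1.0]', true);
$second = json_decode('[0.0, 1.0, 0.0]', true);

var_dump(cosineSimilarity($first, $second)); // orthogonal vectors: float(0)
```

Scores closer to 1 indicate vectors pointing in nearly the same direction, which is typically read as more closely related content.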
@@ -0,0 +1,51 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_entry_links', function (Blueprint $table) {
            $table->id();
            $table->string('entry_id')->index();
            $table->string('cached_title');
            $table->string('cached_uri');
            $table->string('site')->index();
            $table->string('collection')->index();
            $table->string('content_hash');
            $table->longText('analyzed_content');
            $table->json('content_mapping');
            $table->integer('external_link_count');
            $table->integer('internal_link_count');
            $table->integer('inbound_internal_link_count');

            $table->json('external_links');
            $table->json('internal_links');

            $table->json('normalized_external_links');
            $table->json('normalized_internal_links');

            $table->boolean('can_be_suggested')->default(true)->index();
            $table->boolean('include_in_reporting')->default(true)->index();

            $table->json('ignored_entries');
            $table->json('ignored_phrases');

            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_entry_links');
    }
};
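The table above keeps separate lists and counts for internal and external links, and `composer.json` now requires `ext-dom`. As a rough illustration of how such a split could be produced from rendered HTML, here is a sketch using `DOMDocument`. It is not the PR's `LinkCrawler`; `splitLinks` and the host-comparison rule are assumptions made for this example.

```php
<?php

// Hypothetical sketch: splits the <a href> targets of an HTML fragment into
// internal and external links relative to an assumed site host.
function splitLinks(string $html, string $siteHost): array
{
    $links = ['internal' => [], 'external' => []];

    if (trim($html) === '') {
        return $links;
    }

    // Collect libxml warnings internally instead of emitting PHP warnings
    // for fragments that are not complete, valid documents.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));

        // Skip empty targets and same-page fragment links.
        if ($href === '' || $href[0] === '#') {
            continue;
        }

        // Skip non-web schemes such as mailto: or tel:.
        $scheme = parse_url($href, PHP_URL_SCHEME);
        if (is_string($scheme) && ! in_array(strtolower($scheme), ['http', 'https'], true)) {
            continue;
        }

        // Relative URLs have no host and are treated as internal.
        $host = parse_url($href, PHP_URL_HOST);
        $isInternal = ! is_string($host) || strcasecmp($host, $siteHost) === 0;

        $links[$isInternal ? 'internal' : 'external'][] = $href;
    }

    return $links;
}

$html = '<p><a href="/about">About</a> <a href="https://other.test/x">Other</a></p>';
$links = splitLinks($html, 'example.com');

// Counts like these would correspond to internal_link_count / external_link_count.
printf("%d internal, %d external\n", count($links['internal']), count($links['external']));
```

Values for `inbound_internal_link_count` cannot be derived from a single entry this way; they would require scanning the other entries that link to it.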
@@ -0,0 +1,34 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_entry_keywords', function (Blueprint $table) {
            $table->id();
            $table->string('entry_id')->index();
            $table->string('site')->index();
            $table->string('collection')->index();
            $table->string('blueprint');
            $table->string('content_hash');
            $table->json('meta_keywords');
            $table->json('content_keywords'); // Keywords retrieved from content.
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_entry_keywords');
    }
};
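A `content_hash` column like the one above is typically used to skip reprocessing when an entry's content has not changed. The PR's hashing scheme is not shown here, so the sketch below assumes SHA-256 purely for illustration; `contentHash` and `needsKeywordRefresh` are hypothetical names.

```php
<?php

// Assumption: sha256 is used for illustration only; the PR's hashing scheme
// is not visible in this excerpt.
function contentHash(string $content): string
{
    return hash('sha256', $content);
}

// $storedRow mimics one seopro_entry_keywords row as an array;
// null means no row has been stored for the entry yet.
function needsKeywordRefresh(?array $storedRow, string $currentContent): bool
{
    if ($storedRow === null) {
        return true;
    }

    return ! hash_equals($storedRow['content_hash'], contentHash($currentContent));
}

$row = ['entry_id' => 'abc', 'content_hash' => contentHash('Hello world')];

var_dump(needsKeywordRefresh($row, 'Hello world'));   // bool(false)
var_dump(needsKeywordRefresh($row, 'Hello, world!')); // bool(true)
var_dump(needsKeywordRefresh(null, 'Hello world'));   // bool(true)
```

Keyword extraction would then only need to run again for entries where this check returns `true`.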
@@ -0,0 +1,35 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_site_link_settings', function (Blueprint $table) {
            $table->id();
            $table->string('site')->index();
            $table->json('ignored_phrases');
            $table->float('keyword_threshold');
            $table->integer('min_internal_links');
            $table->integer('max_internal_links');
            $table->integer('min_external_links');
            $table->integer('max_external_links');
            $table->boolean('prevent_circular_links')->default(false);
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_site_link_settings');
    }
};
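Several columns in this table mirror keys in the `linking` section of `config/seo-pro.php`, which suggests per-site overrides of the config defaults. That relationship is inferred from the names alone and is not confirmed by this excerpt. Under that assumption, resolving the effective settings for a site might look like the following sketch, where `effectiveLinkSettings` is a hypothetical helper:

```php
<?php

// Hypothetical: treats a seopro_site_link_settings row as overrides for the
// config defaults. The column-to-config pairing is an assumption based on names.
function effectiveLinkSettings(?array $siteRow, array $config): array
{
    $defaults = [
        'keyword_threshold' => $config['keyword_threshold'],
        'min_internal_links' => $config['internal_links']['min_desired'],
        'max_internal_links' => $config['internal_links']['max_desired'],
        'min_external_links' => $config['external_links']['min_desired'],
        'max_external_links' => $config['external_links']['max_desired'],
        'prevent_circular_links' => $config['prevent_circular_links'],
    ];

    if ($siteRow === null) {
        return $defaults;
    }

    // Keep only known keys from the row so columns such as id, site, and the
    // timestamps do not leak into the result.
    return array_merge($defaults, array_intersect_key($siteRow, $defaults));
}

// Defaults copied from the config diff.
$config = [
    'keyword_threshold' => 65,
    'prevent_circular_links' => false,
    'internal_links' => ['min_desired' => 3, 'max_desired' => 6],
    'external_links' => ['min_desired' => 0, 'max_desired' => 3],
];

$settings = effectiveLinkSettings(['site' => 'default', 'max_internal_links' => 10], $config);

echo $settings['min_internal_links'], ' to ', $settings['max_internal_links'], PHP_EOL; // 3 to 10
```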
@@ -0,0 +1,32 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_global_automatic_links', function (Blueprint $table) {
            $table->id();
            $table->string('site')->nullable()->index();
            $table->boolean('is_active')->index();
            $table->string('link_text');
            $table->string('entry_id')->nullable()->index();
            $table->string('link_target');
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_global_automatic_links');
    }
};
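Each row in this table pairs a `link_text` phrase with a `link_target`. The PR's insertion logic, including its Bard field handling, is not shown in this excerpt. As a simplified illustration only, the sketch below links the first whole-word occurrence of a phrase in a piece of text. It assumes the input contains no existing markup, and `insertAutomaticLink` is a hypothetical name.

```php
<?php

// Hypothetical sketch: wraps the first whole-word, case-insensitive match of
// $linkText in an anchor. Assumes $text has no existing HTML markup.
function insertAutomaticLink(string $text, string $linkText, string $linkTarget): string
{
    if ($linkText === '') {
        return $text;
    }

    $pattern = '/\b' . preg_quote($linkText, '/') . '\b/iu';

    $result = preg_replace_callback(
        $pattern,
        // Keep the matched text's original casing inside the anchor.
        fn (array $match): string => sprintf(
            '<a href="%s">%s</a>',
            htmlspecialchars($linkTarget, ENT_QUOTES),
            $match[0]
        ),
        $text,
        1 // Only link the first occurrence.
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $text;
}

echo insertAutomaticLink(
    'Read about Statamic today. Statamic rocks.',
    'statamic',
    '/statamic'
), PHP_EOL;
// Read about <a href="/statamic">Statamic</a> today. Statamic rocks.
```

Because the pattern relies on `\b`, phrases that begin or end with punctuation would need a different boundary rule.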
@@ -0,0 +1,33 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('seopro_collection_link_settings', function (Blueprint $table) {
            $table->id();
            $table->string('collection')->index();
            $table->boolean('linking_enabled')->index();

            $table->boolean('allow_linking_across_sites');
            $table->boolean('allow_linking_to_all_collections');
            $table->json('linkable_collections');
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('seopro_collection_link_settings');
    }
};
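Going by the column names, these settings appear to control which collections an entry may link to. The PR's actual rule is not visible in this excerpt, so the following is a sketch of one plausible reading; `canLinkToCollection` is a hypothetical helper and `$settings` mimics a decoded row.

```php
<?php

// Hypothetical: decides whether entries in the collection described by
// $settings may link to $targetCollection. Semantics are inferred from the
// column names only.
function canLinkToCollection(array $settings, string $targetCollection): bool
{
    if (! $settings['linking_enabled']) {
        return false;
    }

    if ($settings['allow_linking_to_all_collections']) {
        return true;
    }

    // linkable_collections is a JSON column, decoded here to a list of handles.
    return in_array($targetCollection, $settings['linkable_collections'], true);
}

$settings = [
    'collection' => 'blog',
    'linking_enabled' => true,
    'allow_linking_to_all_collections' => false,
    'linkable_collections' => ['blog', 'docs'],
];

var_dump(canLinkToCollection($settings, 'docs'));  // bool(true)
var_dump(canLinkToCollection($settings, 'pages')); // bool(false)
```

The `allow_linking_across_sites` flag would need a similar check against the source and target sites; it is left out here to keep the sketch short.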
1 change: 0 additions & 1 deletion resources/dist/build/assets/cp-56146771.css

This file was deleted.

1 change: 0 additions & 1 deletion resources/dist/build/assets/cp-7025c2cd.css

This file was deleted.
