Searching in the content of posts #558

Open
mieszkou wants to merge 4 commits into master

mieszkou
Contributor

I added the content of the posts to the index-sorted.txt file. The content of each post file is read when the cache is created in the rebuilt_cache function:

$posts_cache_sorted[] = array_merge(pathinfo($file), ['content' => preg_replace('/\s+/', '', (file_get_contents($file)))]);

And I added content to the search function get_keyword:

 $filter = $arr[1] . ' ' . $arr[2] . ' ' . $v['content'];
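
The two changes above can be sketched together roughly as follows. This is a minimal, self-contained illustration and not HTMLy's actual code: `build_index_entry()` and `entry_matches()` are hypothetical helper names, the layout of `$arr` is not shown in this thread (so the filter string is built from the file name instead), and the case-insensitive `stripos()` match is an assumption. Whitespace runs are collapsed to a single space here, which differs from the empty replacement string quoted above, so that multi-word phrases can still match.

```php
<?php
// Hypothetical helper: build one cache entry for a post file.
function build_index_entry(string $file): array
{
    $raw = file_get_contents($file);
    if ($raw === false) {
        $raw = '';
    }
    // Collapse whitespace runs to one space (assumption, see the note above).
    $content = trim(preg_replace('/\s+/', ' ', $raw));

    // pathinfo() supplies dirname, basename, extension (if any) and filename.
    return array_merge(pathinfo($file), ['content' => $content]);
}

// Hypothetical helper: case-insensitive keyword match against name plus content.
function entry_matches(array $entry, string $keyword): bool
{
    $filter = $entry['filename'] . ' ' . $entry['content'];
    return stripos($filter, $keyword) !== false;
}
```

Storing the full text of every post in one cache file keeps the search code simple, at the cost of a larger index and a slower rebuild.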

@Hjertesvikt

Thanks! It's a great addition to HTMLy!

@bttrx
Contributor

bttrx commented May 22, 2022

I did a quick test of your code on my nearly empty HTMLy installation, since I only started using HTMLy today.
It seems to work fine, but I have two questions:

  1. Why does the content of each post appear twice in index-sorted.txt?
  2. What about the keyword_count() function? You didn't add $v['content'] there. Is this ok or a left-over?

@mieszkou
Contributor Author

Why does the content of each post appear twice in index-sorted.txt?

For me, such a thing does not occur.

There is also a description field in the file with the content. If you don't enter anything in it, HTMLy will put the beginning of the content there. Maybe this is it?

What about the keyword_count() function? You didn't add $v['content'] there. Is this ok or a left-over?

I didn't notice it. 🤦‍♂️

Without it, pagination of search results does not work properly.

I will post a fix right away - thanks.
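
Why the missing field breaks pagination can be illustrated with a small sketch: the pager derives the number of pages from the count, so the count has to use the same filter as the search itself. The function names and the `title`/`content` keys below are hypothetical and not HTMLy's real API. The point is only that both paths share one predicate.

```php
<?php
// Hypothetical shared predicate so that search and count can never disagree.
function post_matches(array $post, string $keyword): bool
{
    $filter = $post['title'] . ' ' . $post['content'];
    return stripos($filter, $keyword) !== false;
}

// Return one page of matching posts (page numbers start at 1).
function search_page(array $posts, string $keyword, int $page, int $perPage): array
{
    $hits = array_values(array_filter(
        $posts,
        fn(array $p): bool => post_matches($p, $keyword)
    ));
    return array_slice($hits, ($page - 1) * $perPage, $perPage);
}

// Count all matches with the same predicate; the pager uses this total.
function count_matches(array $posts, string $keyword): int
{
    return count(array_filter(
        $posts,
        fn(array $p): bool => post_matches($p, $keyword)
    ));
}

$posts = [
    ['title' => 'First',      'content' => 'about cats'],
    ['title' => 'Second',     'content' => 'about dogs'],
    ['title' => 'Cats again', 'content' => 'nothing else'],
];
$total = count_matches($posts, 'cats');      // two posts match
$pages = (int) ceil($total / 1);             // pages needed at one post per page
```

If the count ignored `content`, it would report fewer matches than the search returns, and the later pages would never be linked.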

@bttrx
Contributor

bttrx commented May 22, 2022

Why does the content of each post appear twice in index-sorted.txt?

For me, such a thing does not occur.

There is also a description field in the file with the content. If you don't enter anything in it, HTMLy will put the beginning of the content there. Maybe this is it?

You're right. It is the meta description field. So it's okay.
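
The fallback described above, where an empty description is filled from the beginning of the content, could look roughly like the sketch below. `effective_description()` and the 150-byte limit are assumptions made for illustration. HTMLy's real implementation is not shown in this thread.

```php
<?php
// Hypothetical helper: use the explicit description if present,
// otherwise fall back to the start of the post content.
function effective_description(string $description, string $content, int $maxLength = 150): string
{
    $description = trim($description);
    if ($description !== '') {
        return $description;
    }
    // Strip markup and collapse whitespace before cutting the excerpt.
    $plain = trim(preg_replace('/\s+/', ' ', strip_tags($content)));

    // substr() counts bytes; a multibyte-safe version would use mb_substr().
    return substr($plain, 0, $maxLength);
}
```

With a fallback like this, the start of a post is stored both as its description and as part of its content, which is why the text can appear twice in the index file.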

@dirmanhana
Contributor

thanks!

sbenve added a commit to sbenve/htmly that referenced this pull request Feb 7, 2023
sbenve added a commit to sbenve/htmly that referenced this pull request Feb 7, 2023
sb0001 added a commit to sb0001/htmly that referenced this pull request Dec 8, 2023
@danpros
Owner

danpros commented Dec 27, 2023

Hello,

Adding the content of every post to the array when rebuilding the index will have a big impact on performance if there are lots of posts.

So the solution could be, for example: when we add or edit post A, it is automatically added to the search index. If a visitor or bot visits the page, the post is added to the index as well.

There would also be a separate admin page to manage this, for example to rebuild the search index for specific posts manually, check which posts have not been indexed, etc. The search index should not be in the cache folder but in the content/data folder, similar to views.json.

Edit: by separate I mean e.g. /admin/search, similar to admin/config
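
The incremental approach proposed above might be sketched as follows. Everything here is an assumption made for illustration: the file name `content/data/search.json`, the function names, and the choice to key the index by post file path are not taken from HTMLy.

```php
<?php
// Hypothetical index location, kept next to views.json as suggested above.
const SEARCH_INDEX_FILE = 'content/data/search.json';

// Load the index as an array of [post file path => searchable text].
function load_search_index(string $path = SEARCH_INDEX_FILE): array
{
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode((string) file_get_contents($path), true);
    return is_array($data) ? $data : [];
}

// Add or refresh a single post; meant to be called when a post is saved
// or when an unindexed post is first viewed.
function index_post(string $postFile, string $path = SEARCH_INDEX_FILE): void
{
    $index = load_search_index($path);
    $text = trim(preg_replace('/\s+/', ' ', (string) file_get_contents($postFile)));
    $index[$postFile] = $text;

    $dir = dirname($path);
    if (!is_dir($dir)) {
        mkdir($dir, 0775, true);
    }
    $json = json_encode($index, JSON_UNESCAPED_UNICODE | JSON_INVALID_UTF8_SUBSTITUTE);
    file_put_contents($path, $json, LOCK_EX);
}

// Posts that exist on disk but are missing from the index,
// e.g. for a "not yet indexed" list on an admin page.
function unindexed_posts(array $postFiles, string $path = SEARCH_INDEX_FILE): array
{
    $index = load_search_index($path);
    return array_values(array_filter(
        $postFiles,
        fn(string $f): bool => !isset($index[$f])
    ));
}
```

Only the post that changed is re-read on each save, so the cost of a full rebuild is avoided. A real implementation would also need to remove entries when posts are deleted or renamed.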

5 participants