Portal for YaCy Grid and Data Science Applications

The YaCy Searchlab

We are creating a YaCy portal that can be used to crawl the web and evaluate everything we find in multiple ways. It will be a Search-as-a-Service portal that is hosted online but can also be downloaded and run by anyone.

This is work in progress

What you can find here is the early stage of development.

The searchlab will make use of the existing YaCy Grid search engine technology. The public search portal will provide data science dashboards and user accounts. All elements are free software and hosted in this repository or other repositories of the github.com/yacy organization.

To follow the implementation process, have a look at the milestones M1-M6 in the issue tracker at https://github.com/yacy/searchlab/issues

To read more details about the project, visit https://searchlab.eu/en/access/about/

Searchlab and YaCy Grid Architecture

The searchlab application (this repository) serves as the front-end for the YaCy Grid ecosystem. It mainly uses the following components:

The Searchlab Application Server

We had to choose carefully the web design, the application server and the overall web technology for a typical full-stack application. We decided against a complex single-page, Node-based front-end architecture and instead created a more classical design that combines server-rendered web pages and a static site generator with modern data-driven concepts and API designs.

  • the web front-end is created using the static site generator MkDocs. Its source path is ui.
  • the template for the web front-end is based on MkDocs-Theme-Cinder-Superhero, a privacy-aware combination of the Cinder Bootstrap theme for MkDocs with a theme adaptation that turns it into a dark-mode version of Cinder using design ideas of Superhero.
  • the back-end server is written in Java and uses Undertow as web server.
  • content within the MkDocs pages can use the handlebars template engine for dynamic/server-side content management. This feature has two elements:
    • a handlebars template engine integration in the Undertow server
    • an API concept where each web page that uses handlebars requires a JSON API which provides the data for the handlebars template. Even though the handlebars template processing runs server-side, the API for the content it handles must be provided as an external API.
  • server-side includes allow the integration of server-rendered add-on content. This can be used to inject tablesaw- or plotly-generated HTML (see below).
  • as a server-internal data structure, tablesaw provides data table libraries for data science functions. This library allows the output of plotly-based time-series data (see below).
  • plotly graphs that visualize tables are generated through tablesaw.
  • to provide an Excel-like experience to users who require it, Bootstrap Table is used for extended table visualization. It offers a large variety of search and export functions.
  • to visualize workflows, we also integrated Mermaid for diagrams defined as text and code inside the MkDocs sources.

Source Code Release

The source code is released through this GitHub repository. To get it, run

git clone https://github.com/yacy/searchlab.git
git clone https://github.com/yacy/searchlab_apps.git

If you just want to download a zip file with all source, use this link: https://github.com/yacy/searchlab/archive/refs/heads/master.zip

To build the searchlab, you need the following components:

  • Python 3 and MkDocs, which can be installed with pip install mkdocs
  • Java 8 (or higher), which can be obtained e.g. from https://adoptium.net/

The application is built in two steps and can then be started:

  • first, the static web pages must be created:
cd ui
mkdocs build
  • second, the server must be compiled from the repository root (one level above ui):
./gradlew assemble
  • finally, the application can be started with
./gradlew run

The searchlab application can then be accessed at http://localhost:8400/
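The build steps above can be wrapped in a small script. The following is a minimal sketch, not part of the repository: the function names require_cmd and build_searchlab are hypothetical, and it assumes that ui and gradlew sit directly inside the cloned searchlab directory.

```shell
#!/bin/sh
# Sketch of the build steps as reusable functions.
# The helper names are hypothetical and not part of the repository.

# Succeeds only if the given command is available on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Builds the static pages and compiles the server.
# $1: path to the cloned searchlab repository
#     (assumed layout: ui/ and gradlew directly inside it)
build_searchlab() {
    repo=$1
    for cmd in mkdocs java; do
        if ! require_cmd "$cmd"; then
            echo "missing prerequisite: $cmd" >&2
            return 1
        fi
    done
    # Step 1: static web pages; the subshell keeps the caller's directory.
    (cd "$repo/ui" && mkdocs build) || return 1
    # Step 2: compile the server from the repository root.
    (cd "$repo" && ./gradlew assemble) || return 1
}
```

With these functions loaded, a build followed by a start could look like build_searchlab ./searchlab && (cd ./searchlab && ./gradlew run). Running each step in a subshell avoids the easy mistake of calling ./gradlew from inside ui.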

Docker Release

A Docker image can be built in one step. Starting in the searchlab directory, run

cd ..
docker build -t searchlab -f searchlab/Dockerfile .

The image MUST be built from the parent directory of the application folder (hence the cd ..), and the repository searchlab_apps must exist in parallel to searchlab. The resulting image is then in your local Docker image store and can be started with

docker run -d --rm -p 8400:8400 --name searchlab searchlab

The searchlab application can be accessed at http://localhost:8400/
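Because the Docker build depends on searchlab and searchlab_apps sitting side by side, it can help to verify that layout before running docker build. The helper below is only a sketch; the name check_docker_layout is hypothetical and the check covers nothing beyond the two paths mentioned above.

```shell
# Hypothetical helper: checks that searchlab (with its Dockerfile) and
# searchlab_apps exist side by side in the given parent directory.
# $1: parent directory that is used as the Docker build context
check_docker_layout() {
    parent=$1
    if [ ! -f "$parent/searchlab/Dockerfile" ]; then
        echo "missing: $parent/searchlab/Dockerfile" >&2
        return 1
    fi
    if [ ! -d "$parent/searchlab_apps" ]; then
        echo "missing: $parent/searchlab_apps" >&2
        return 1
    fi
}
```

From inside the searchlab directory this could be used as check_docker_layout .. && (cd .. && docker build -t searchlab -f searchlab/Dockerfile .).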

We also publish Docker images of the searchlab application on Docker Hub. They can be pulled and started with

docker run -d --rm -p 8400:8400 --name searchlab yacy/searchlab
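Since both docker run commands start the container in the background (-d), the server may need a moment before it answers on port 8400. The following sketch polls the URL with curl until it responds; the function name wait_for_url and its default values are assumptions made for illustration.

```shell
# Hypothetical helper: polls a URL until it answers or the attempts run out.
# $1: URL
# $2: maximum number of attempts (default 30)
# $3: seconds to wait between attempts (default 1)
wait_for_url() {
    url=$1
    attempts=${2:-30}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no answer from $url after $attempts attempts" >&2
    return 1
}
```

A typical call after starting the container would be wait_for_url http://localhost:8400/ && echo "searchlab is up".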