Question about custom crawls on startServer functionality #178

Open
pkoloveas opened this issue Apr 16, 2019 · 3 comments

pkoloveas commented Apr 16, 2019

Is there a way to define specifically configured crawls while running ACHE in server mode?
I am using the docker-compose file below, structured like the one I use with the startCrawl command, but when I look into the data-ache folder, the config folder that was created contains only the default ache.yml and no link-filters.yml file.

version: '2'
services:
  ache:
    image: vidanyu/ache
    entrypoint: sh -c 'sleep 10 && /ache/bin/ache startServer -d /data -c /config/'
    ports:
      - "8080:8080"
    volumes:
      - ./data-ache/:/data
      - ./:/config

Also, once the server is up and I can create new crawls from the interface, how can I specify that, for example, a new focused crawl uses a different ache.yml file from the deep crawl that was in the initial configuration, without restarting the Docker container/server?

aecio (Member) commented Apr 17, 2019

This is not supported yet. Currently, server mode can start only two crawl types (deep crawl and focused crawl), and both use the ache.yml file configured at start-up, with a few minimal settings (those required for deep and focused crawls) overridden. The overridden settings come from two config files bundled in the repository.

There is an ongoing pull request to support link filters in server mode via the REST API (#175). Supporting a custom ache.yml could be done in a similar way.
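
For reference, the link filters file discussed here is a YAML file of whitelist/blacklist URL patterns that links must (or must not) match to be followed. A minimal sketch is below; the exact file name and keys should be double-checked against the ACHE docs for the version in use, and the patterns are placeholders:

global:
  whitelist:
    - ".*example\\.com/articles/.*"   # placeholder: only follow article pages
  blacklist:
    - ".*/login.*"                    # placeholder: never follow login pages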

pkoloveas (Author)

I assume that if I build from source instead of using the Docker image, I can change the two files you referenced to be my default configs for deep and focused crawls, correct?

Also, is there a way to run more than one startCrawl command but have them on the same port, so I can monitor them on the same dashboard? (Right now Docker reports that the port is busy.)

aecio (Member) commented Apr 18, 2019

Correct, if you change those files and rebuild, it should work (you can rebuild the Docker image as well).
Regarding multiple startCrawls on the same port/process: it is not possible at this time.
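
For illustration, a compose file along these lines should cover both points: rebuilding the image from a local checkout of the repository (instead of pulling vidanyu/ache) and running two independent server instances side by side, each with its own dashboard. Service names, paths, and the second host port below are placeholders:

version: '2'
services:
  ache-deep:
    build: .              # build from a local checkout containing the repo's Dockerfile
    entrypoint: sh -c 'sleep 10 && /ache/bin/ache startServer -d /data -c /config/'
    ports:
      - "8080:8080"
    volumes:
      - ./data-deep/:/data
      - ./config-deep/:/config
  ache-focused:
    build: .
    entrypoint: sh -c 'sleep 10 && /ache/bin/ache startServer -d /data -c /config/'
    ports:
      - "8081:8080"       # different host port, so both dashboards can run at once
    volumes:
      - ./data-focused/:/data
      - ./config-focused/:/config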
