Run GOC parts of joint pipeline code on new machine #407

Open
2 of 6 tasks
kltm opened this issue Nov 21, 2024 · 4 comments

@kltm
Member

kltm commented Nov 21, 2024

This is essentially a continuation of #351

Next steps are to:

  • start a new repo for pipeline code for the new machine (it will be too complicated to run in the same codebase)
  • complete config and setup of new jenkins instance
  • initially, port
    • derivatives-from-goa
    • go-raw-data
    • (go-ontology-dev?)

After evaluating speed, etc., we'll need to decide whether to port everything over (returning to the original codebase, but with mods for the new machines, etc.) or to bring over only the things we want/need, piecemeal.

@kltm kltm self-assigned this Nov 21, 2024
@kltm kltm moved this to In Progress in GOC / GOA Data Exchange Nov 21, 2024
@kltm kltm changed the title from "Run GOC parts of joint pipeline code on" to "Run GOC parts of joint pipeline code on new machine" Nov 23, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 26, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 26, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 27, 2024
@kltm
Member Author

kltm commented Nov 27, 2024

geneontology/pipeline-from-goa@61f1bfc is a very dark hack, but I can proceed with testing.
I believe it's related to moby/moby#46199 (comment) and the resolv.conf on the host machine, but I want to have access before I start messing with that.
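For checking the resolv.conf angle once access is available, here is a minimal sketch. It assumes the problem is the common one where the host's resolv.conf lists a loopback nameserver (for example a local stub resolver), which a container cannot reach from its own network namespace. That is a guess about this machine, not something confirmed here, and the function name is made up for illustration.

```python
import ipaddress
from pathlib import Path


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries that point at a loopback address.

    Hypothetical helper: it only parses text in resolv.conf format and
    makes no claim about how Docker itself handles these entries.
    """
    found = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments ('#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        candidate = fields[1].split("%", 1)[0]
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue
        if addr.is_loopback:
            found.append(fields[1])
    return found


if __name__ == "__main__":
    path = Path("/etc/resolv.conf")
    if path.exists():
        hits = loopback_nameservers(path.read_text())
        if hits:
            print("loopback nameservers on host:", ", ".join(hits))
        else:
            print("no loopback nameservers found")
```

If this reports loopback entries on the host, that would be consistent with the DNS trouble inside containers, but it would not by itself prove the cause.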

kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 27, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 27, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Nov 27, 2024
@kltm
Member Author

kltm commented Dec 3, 2024

Noting some results for running the GOA derivatives on the new machine. It runs in about 4.5h vs 7-8h, so that is a nice improvement.
Disappointingly, I'm still seeing the docker error from #316:

Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: d954ad64-42e4-4129-bf88-08799eb3a17c
java.io.IOException: Failed to kill container '818ff90a3aa3c93e987526e96fc29b5e1e017a2e6110e07cca7f0320b398f38c'.
	at PluginClassLoader for docker-workflow//org.jenkinsci.plugins.docker.workflow.client.DockerClient.stop(DockerClient.java:187)

which is a little unexpected. I'm going to look into that a little.

@kltm
Member Author

kltm commented Dec 3, 2024

For the above, it seems to be pretty well described in https://issues.jenkins.io/browse/JENKINS-73567

I also figured out why "write.lock" exists in the tarball and why the version on the new machine is so much larger than the one on the old machine.

Essentially, it looks like the command dies due to "running out of disk" during the optimize step. The farther it gets, the more optimization files are written. That said, it does not affect the number of entities loaded, which was part of the mystery. It also explains why there is such high i/o when exiting: disassembling the tmpfs is pretty heavy and may well be the major contributor to the docker kill issue.

The two choices are: 1) remove the optimize step or 2) make sure that the tmpfs has enough space to optimize the very very large index that is produced.

tmpfs is reporting as
tmpfs /srv/solr/data 503.8G 0.0G 503.8G 0% tmpfs
on the new machine, so the optimize step needs somewhere north of that.
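To make the two choices a bit more concrete, here is a rough pre-flight sketch that decides whether the optimize step is worth attempting on a given mount. The function names and the index sizes are hypothetical, and the headroom factor of 2.0 is only a rule of thumb for how much extra space an index merge can need temporarily; it has not been measured on this index.

```python
import os
import shutil


def has_room_for_optimize(index_bytes, free_bytes, headroom=2.0):
    """True if free space covers the index size times a headroom factor.

    The headroom factor is an assumption, not a measured value.
    """
    return free_bytes >= index_bytes * headroom


def check_mount(mount_point, index_bytes, headroom=2.0):
    """Apply the check to the free space currently reported for a mount."""
    usage = shutil.disk_usage(mount_point)
    return has_room_for_optimize(index_bytes, usage.free, headroom)


if __name__ == "__main__":
    mount = "/srv/solr/data"        # tmpfs mount named in this thread
    assumed_index = 300 * 2**30     # hypothetical index size in bytes
    if os.path.isdir(mount):
        if check_mount(mount, assumed_index):
            print("enough space: run optimize")
        else:
            print("not enough space: skip optimize or grow the tmpfs")
```

A check like this would only gate the step; it does not settle whether dropping optimize entirely (choice 1) is the better long-term answer.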

kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 3, 2024
…bly not sustainable, placeholders for removing optimize for testing; for geneontology/pipeline#407
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 3, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 3, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 3, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 3, 2024
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Dec 4, 2024
kltm added a commit to geneontology/noctua_app_stack that referenced this issue Dec 5, 2024