A way to monitor/watch tails of output files that are being written on the remote #6666

Open
khsrali opened this issue Dec 13, 2024 · 4 comments

khsrali commented Dec 13, 2024

One way to do it would be something like
verdi calcjob outputcat <pk> <Relative_PATH> --watch
which would tail the latest lines. That might confuse users, though, because they know this command as one that prints the retrieved files, not the remote ones.

A better suggestion might be
verdi watch <pk> <Relative_PATH>
verdi watch -n <seconds> <pk> <Relative_PATH> --tail <n_lines>

@agoscinski, @GeigerJ2, @Technici4n, @giovannipizzi

I vote for the second approach.
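
To make the proposed semantics concrete, here is a minimal sketch of the polling behaviour behind `-n <seconds>` and `--tail <n_lines>`. It is only an illustration: `last_lines` and `watch` are hypothetical names, not existing AiiDA functions, and a local file stands in for the remote one (a real implementation would have to read through a transport).

```python
import time
from collections import deque
from pathlib import Path
from typing import Callable, Iterator, List, Optional


def last_lines(path: Path, n_lines: int) -> List[str]:
    """Return the last `n_lines` lines of a text file.

    Hypothetical helper: a local file stands in for the remote one.
    The file is streamed once and only the tail is kept in memory.
    """
    with path.open("r", errors="replace") as handle:
        return list(deque((line.rstrip("\n") for line in handle), maxlen=n_lines))


def watch(
    path: Path,
    n_lines: int = 10,
    interval: float = 2.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[List[str]]:
    """Yield the tail of `path` every `interval` seconds.

    `max_polls=None` polls forever; a finite value is handy for testing.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        yield last_lines(path, n_lines)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
```

A caller would loop over `watch(Path("aiida.out"), n_lines=20, interval=5)` and redraw the screen with each snapshot, stopping on Ctrl-C. The file name here is just an example.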

khsrali added the type/feature request status undecided label Dec 13, 2024

GeigerJ2 commented Dec 13, 2024

I'm strongly against the second approach, as that would introduce an entirely new verdi endpoint just for a single command. And I don't really see a reason that speaks against the first approach. Personally, until we looked into the implementation just now, I didn't even know this was limited to retrieved files only. But even if people are aware of that, isn't it great that we make the command more functional?

To avoid confusion, we could also modify the flag to, e.g., --remote-watch or something, but I'm not sure there yet. Also, if we add it to verdi calcjob outputcat, should we provide the head/tail behavior, and, if so, how? Wouldn't convolute the command too much...

The only thing we need to be aware of is opening many transport connections or keeping one alive all the time, but that probably won't be a problem.

khsrali commented Dec 13, 2024

--remote-watch sounds good.

Wouldn't convolute the command too much...

These log files can be very long, thousands of lines, so tail/head can be useful.

giovannipizzi commented

I agree on not adding a top-level command.

Indeed, outputcat at the moment means "retrieved files". We can have a clear flag, but then with the --remote option, you are really looking at a different node. If we think this makes things easier that's OK, but I fear that it can make things even more confusing.
Maybe it is better to have instead a command verdi calcjob remotedata outputcat (name to be discussed) that groups commands on the remote folder?

Also, I think we are mixing up two things: getting files from the remote (a general task that can also be run after the calculation ends; the endpoint above would work) and efficient monitoring while the job is running. For the latter we should add a flag for a "tail -f" functionality; --watch would indeed be a good name, but we can brainstorm a bit more. BTW, what does the -f option of tail stand for?
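
For reference, the -f of tail is short for "follow". One way to make following cheap over a slow link is to remember a byte offset and fetch only what was appended since the last poll. The sketch below illustrates that idea on a local file; `read_new_data` is a hypothetical helper, not an AiiDA or transport API, and a remote version would need a transport that can read from an offset (an assumption).

```python
from pathlib import Path
from typing import Tuple


def read_new_data(path: Path, offset: int) -> Tuple[str, int]:
    """Return complete lines appended since byte `offset`, plus the new offset.

    Hypothetical helper: a local file stands in for the remote one.
    A trailing partial line is left for the next poll, so a line that is
    still being written is never printed in two pieces.
    """
    size = path.stat().st_size
    if size < offset:
        # The file shrank (truncated or replaced): start over from the top.
        offset = 0
    with path.open("rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    last_newline = chunk.rfind(b"\n")
    if last_newline == -1:
        # No complete line yet; keep the offset unchanged.
        return "", offset
    complete = chunk[: last_newline + 1]
    return complete.decode("utf-8", errors="replace"), offset + len(complete)
```

A monitoring loop would start with `offset = 0` (or with the file size, to skip existing content), call `read_new_data` every few seconds, print the returned text, and carry the returned offset into the next call.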

Indeed, since files can be big and network transfer is slow, having specific --head/--tail options for this command would be nice (but not useful for the standard outputcat command, I guess? One more reason to keep them separated?). --watch could also be combined with --tail if one wants to start monitoring from just the last lines, rather than printing the whole file and then continuing to monitor it.

Final note: the --head/--tail options could (should?) take an argument to select how many first/last lines to show (if feasible to implement easily).
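
As a rough feasibility check for a configurable line count, here is a sketch of `head` and `tail` helpers that avoid reading the whole file: `head` stops after the requested lines, and `tail` reads fixed-size blocks backwards from the end. The function names and the `block_size` parameter are made up for illustration, and the code works on a local file; whether the same can be done on the remote depends on the transport supporting partial reads, which is an assumption here.

```python
import os
from itertools import islice
from pathlib import Path
from typing import List


def head(path: Path, n_lines: int = 10) -> List[str]:
    """Return the first `n_lines` lines, stopping as soon as they are read."""
    with path.open("r", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, max(n_lines, 0))]


def tail(path: Path, n_lines: int = 10, block_size: int = 4096) -> List[str]:
    """Return the last `n_lines` lines, reading backwards in blocks."""
    if n_lines <= 0:
        return []
    with path.open("rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Keep prepending blocks until more than n_lines newlines are buffered
        # (so the first, possibly partial, line can be dropped) or the start
        # of the file is reached.
        while position > 0 and data.count(b"\n") <= n_lines:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.splitlines()[-n_lines:]
    return [line.decode("utf-8", errors="replace") for line in lines]
```

For a log of thousands of lines, `tail(path, 20)` typically touches only the last block or two of the file, which is the property that matters when the transfer is slow.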

khsrali commented Dec 16, 2024

Hi @giovannipizzi,
thanks for the remarks, I'll take care of the implementation.

khsrali self-assigned this Dec 16, 2024