Skip to content

Fix/hostnames#102

Merged
daniel-thom merged 7 commits intomainfrom
fix/hostnames
Feb 4, 2026
Merged

Fix/hostnames#102
daniel-thom merged 7 commits intomainfrom
fix/hostnames

Conversation

@daniel-thom
Copy link
Collaborator

No description provided.

This is a breaking change. The torc-server used to bind to localhost by
default, which was not appropriate for HPC deployments. It now defaults
to 0.0.0.0, which is routable. Also changed the option to specify the
host to be more clear (--url -> --host).
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Torc’s server binding configuration and HPC deployment guidance, while also extending CLI/dashboard functionality and adding a new built-in HPC profile.

Changes:

  • Rename torc-server bind setting/flag from url to host, adjust defaults, and update related docs/tests.
  • Extend torc jobs update to support --runtime and --resource-requirements-id, with new integration tests.
  • Add LLNL “Dane” as a built-in HPC profile; add torc-dash behavior to save inline workflow specs.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
torc-server/src/service.rs Renames service config field and service install args from urlhost.
torc-server/src/main.rs Introduces --host CLI option and uses it for bind address construction.
torc-dash/src/main.rs Adds inline spec saving utilities and hooks into create handlers.
src/config/server.rs Renames server config field to host with serde(alias = "url") for backward compat.
src/client/commands/jobs.rs Adds jobs update --runtime and --resource-requirements-id support.
tests/test_jobs.rs Adds tests covering new jobs update behaviors.
tests/test_config.rs Updates expected server default bind value to 0.0.0.0.
src/client/hpc.rs Exposes new dane module.
src/client/hpc/dane.rs Adds the Dane cluster profile definition.
src/client/hpc/profiles.rs Registers the Dane profile in the built-in registry.
docs/src/specialized/tools/dashboard-deployment.md Updates deployment examples/host guidance and adds troubleshooting note for routable hostnames.
docs/src/specialized/hpc/hpc-deployment.md Replaces -u references with --host and updates examples/config snippets.
docs/src/specialized/admin/server-deployment.md Updates server/service examples from --url to --host.
docs/src/getting-started/quick-start-remote.md Adjusts quick-start to rely on new default bind behavior.
docs/src/getting-started/quick-start-hpc.md Adds --host example and port conflict guidance.
docs/src/core/reference/cli-cheatsheet.md Wraps cheatsheet content in a .cheatsheet container for targeted styling.
docs/custom.css Refines table styling and scopes fixed column widths to cheatsheet tables.
Cargo.toml Bumps workspace version to 0.11.0.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 508 to 514
async fn save_workflow_spec(spec_json: &str, workflow_id: Option<&str>) -> Option<String> {
// Parse the spec to extract the workflow name
let spec: serde_json::Value = match serde_json::from_str(spec_json) {
Ok(v) => v,
Err(e) => {
warn!("Failed to parse workflow spec for saving: {}", e);
return None;
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_workflow_spec parses the inline spec as JSON and always writes a .json file, but cli_create_handler/cli_create_slurm_handler accept inline specs with arbitrary file_extension (YAML/KDL/etc). This will prevent saving non-JSON specs and/or save them with the wrong extension. Consider saving the raw spec_content using the provided extension, or parsing based on file_extension rather than assuming JSON.

Copilot uses AI. Check for mistakes.
Comment on lines 551 to 560
// Try to save to the current directory (with trailing newline)
let filename = format!("{}.json", base_filename);
let content = format!("{}\n", pretty_json);
match tokio::fs::write(&filename, &content).await {
Ok(()) => {
info!("Saved workflow spec to: {}", filename);
Some(filename)
}
Err(e) => {
warn!("Failed to save workflow spec to {}: {}", filename, e);
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This writes the saved workflow spec directly into the process current working directory and will overwrite any existing file with the same name (and follow symlinks). Because this is triggered by an HTTP request, it can allow clobbering files on the torc-dash host (within process permissions). Prefer writing into a dedicated, configurable directory and use an atomic create-new strategy (fail if exists / add unique suffix) to avoid overwrites and symlink attacks.

Copilot uses AI. Check for mistakes.
Comment on lines +251 to 255
host: if cli_config.host != "0.0.0.0" {
cli_config.host
} else {
server_file_config.url.clone()
server_file_config.host.clone()
},
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI/file-config merging uses a sentinel default string ("0.0.0.0") to decide precedence. This makes it impossible to override a config-file server.host back to 0.0.0.0 via CLI, since passing --host 0.0.0.0 is indistinguishable from “not provided”. Prefer making host an Option<String> in the CLI struct (no default), then unwrap_or_else to the file/default value, or use clap’s default_value_t only at the final merged config layer.

Copilot uses AI. Check for mistakes.
Comment on lines +569 to +589
// Handle runtime update (requires updating resource requirements)
if let Some(new_runtime) = runtime {
let rr_id = job.resource_requirements_id.unwrap_or_else(|| {
eprintln!(
"Error: Cannot update runtime - job {} has no resource requirements assigned.",
id
);
eprintln!(
"Hint: First assign resource requirements with --resource-requirements-id"
);
std::process::exit(1);
});

// Get and update the resource requirements
match default_api::get_resource_requirements(config, rr_id) {
Ok(mut rr) => {
rr.runtime = new_runtime.clone();
match default_api::update_resource_requirements(config, rr_id, rr) {
Ok(_) => {
if format != "json" {
println!(
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jobs update --runtime ... updates the resource requirements first, then calls update_job. If update_job fails, the command exits with an error even though the runtime change was already applied, leaving a partial update that’s hard to reason about. Consider either (a) skipping update_job when only runtime changed, or (b) updating the job first (when RR id changes) and only updating the RR after update_job succeeds, and/or making the output clearly reflect which parts succeeded.

Copilot uses AI. Check for mistakes.
Comment on lines 505 to 508
/// Save a workflow spec to a file in the current directory.
/// The filename is derived from the workflow name in the spec.
/// Returns the path to the saved file, or None if saving failed.
async fn save_workflow_spec(spec_json: &str, workflow_id: Option<&str>) -> Option<String> {
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

torc-dash still spawns torc-server using the old --url flag (see earlier in this file: args include "--url"), but torc-server CLI now expects --host. Standalone mode will fail to start torc-server unless this is updated (or torc-server keeps --url as an alias).

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +20
Start a Torc server. By default, it binds to `0.0.0.0` so it's accessible from remote machines:

```console
torc-server run --database torc.db --url 0.0.0.0 --port 8080
torc-server run --database torc.db --port 8080
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These Quick Start instructions start torc-server bound to 0.0.0.0 without --auth-file/--require-auth, and the authentication docs state that in this mode authentication is disabled and all requests are allowed. On any network where this host is reachable, an attacker can point their own torc CLI or HTTP client at the server and create, modify, or read workflows and jobs without credentials. To avoid exposing a fully unauthenticated orchestration API, update the example to enable authentication (e.g. using --auth-file with --require-auth and, on untrusted networks, --https) or clearly scope it to localhost or an isolated network instead of 0.0.0.0.

Copilot uses AI. Check for mistakes.
Comment on lines 19 to 23
**Note**: This uses a specific hostname routable from compute nodes, which may vary across HPC
systems. Adjust as necessary or exclude `--host` to use the default.

```console
torc-server run --database torc.db --completion-check-interval-secs 5
torc-server run --database torc.db --host kl1.hsn.cm.kestrel.hpc.nrel.gov --completion-check-interval-secs 5
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Quick Start runs torc-server on an HPC login node with a routable hostname but without --auth-file/--require-auth, which per the authentication docs means authentication is disabled and all requests are allowed. On a multi-user cluster, any other user who can reach that hostname and port can call the server API (via torc or plain HTTP) to submit, inspect, or control workflows under the owner’s account. The guide should recommend starting the server with authentication enabled (and HTTPS where appropriate) or explicitly constraining it to a trusted, isolated network/IP to prevent unauthorized access.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 299 to 306
// Determine API URL
// - Standalone mode: dashboard connects to the managed server via localhost
// - Non-standalone: use the user-provided --api-url
let final_api_url = if standalone {
// In standalone mode, we need to determine the correct host for the API URL.
// If the user provided a custom --api-url, extract the host from it.
// Otherwise, use the host the dashboard is binding to (or localhost as fallback).
let api_host = if api_url != "http://localhost:8080/torc-service/v1" {
// User provided a custom API URL, extract the host
url::Url::parse(&api_url)
.ok()
.and_then(|u| u.host_str().map(|h| h.to_string()))
.unwrap_or_else(|| "localhost".to_string())
} else if host != "127.0.0.1" {
// User specified a non-default host, use it
host.clone()
} else {
"localhost".to_string()
};
format!("http://{}:{}/torc-service/v1", api_host, actual_server_port)
// In standalone mode, the dashboard connects to the server locally.
// The server_host controls external accessibility, but we always connect via localhost.
format!("http://localhost:{}/torc-service/v1", actual_server_port)
} else {
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

torc-dash starts the managed torc-server with --host {server_host}, but in standalone mode it always sets final_api_url to http://localhost:{port}/.... If server_host is a specific interface hostname/IP (not 0.0.0.0), the server will not be listening on 127.0.0.1 and the dashboard’s proxy calls will fail. Consider computing the connect host from server_host (use localhost/127.0.0.1 only when binding to 0.0.0.0/::).

Copilot uses AI. Check for mistakes.
Comment on lines +165 to +169
let server_host = if cli.server_host != "0.0.0.0" {
cli.server_host.clone()
} else {
dash_config.server_host.clone()
};
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI/file-config merge logic treats "0.0.0.0" as a sentinel for “not provided”. This makes it impossible to explicitly override a config-file server_host back to 0.0.0.0 via CLI (because clap always populates the default). Prefer Option<String> for --server-host (no default) and apply defaults only after merging, or use clap’s “value source” APIs to detect whether the arg was set.

Suggested change
let server_host = if cli.server_host != "0.0.0.0" {
cli.server_host.clone()
} else {
dash_config.server_host.clone()
};
let server_host = cli.server_host.clone();

Copilot uses AI. Check for mistakes.
Comment on lines 665 to 682
@@ -607,9 +677,8 @@ async fn cli_create_handler(
.await
} else {
// Spec is inline content - write to temp file with correct extension
let extension = req.file_extension.as_deref().unwrap_or(".json");
let unique_id = uuid::Uuid::new_v4();
let temp_path = format!("/tmp/torc_spec_{}{}", unique_id, extension);
let temp_path = format!("/tmp/torc_spec_{}{}", unique_id, file_extension);
if let Err(e) = tokio::fs::write(&temp_path, &req.spec).await {
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file_extension is taken directly from the request and concatenated into both the temp path (/tmp/torc_spec_*{file_extension}) and the saved filename. If it contains path separators (e.g. "/../../...") this can escape the intended directory or create unexpected paths. Validate/sanitize file_extension (e.g., allowlist {.json, .yaml, .yml, .kdl} and reject anything containing / or ..).

Copilot uses AI. Check for mistakes.
Comment on lines 557 to 565
if let Some(caps) = stdout
.lines()
.find(|line| line.contains("Created workflow"))
{
// Extract the number after "Created workflow"
caps.split_whitespace().last().map(|s| s.trim().to_string())
} else {
None
}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says it handles output patterns like "ID: 123", but the implementation only searches for lines containing "Created workflow". Either implement the additional pattern(s) or update the comment so it matches behavior.

Suggested change
if let Some(caps) = stdout
.lines()
.find(|line| line.contains("Created workflow"))
{
// Extract the number after "Created workflow"
caps.split_whitespace().last().map(|s| s.trim().to_string())
} else {
None
}
for line in stdout.lines() {
if line.contains("Created workflow") {
// Extract the number after "Created workflow"
if let Some(id) = line.split_whitespace().last() {
return Some(id.trim().to_string());
}
} else if let Some(pos) = line.find("ID:") {
// Extract the ID after "ID:"
let after = &line[pos + "ID:".len()..];
if let Some(id) = after.trim().split_whitespace().next() {
return Some(id.to_string());
}
}
}
None

Copilot uses AI. Check for mistakes.
Comment on lines 30 to 31
/// Hostname or IP address to bind the server to
#[arg(short = 'H', long, default_value = "0.0.0.0")]
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --host CLI option replaces the previous --url/-u bind option. This is a breaking change for existing scripts. Consider adding clap aliases (e.g., long alias url and possibly a short alias u) with help text marking them deprecated, to preserve backward compatibility.

Suggested change
/// Hostname or IP address to bind the server to
#[arg(short = 'H', long, default_value = "0.0.0.0")]
/// Hostname or IP address to bind the server to.
/// Deprecated aliases: `--url`, `-u` (use `--host` instead).
#[arg(short = 'H', long, alias = "url", short_alias = 'u', default_value = "0.0.0.0")]

Copilot uses AI. Check for mistakes.
Comment on lines +251 to 255
host: if cli_config.host != "0.0.0.0" {
cli_config.host
} else {
server_file_config.url.clone()
server_file_config.host.clone()
},
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI/file-config merge treats "0.0.0.0" as “not provided”. With clap’s default, this prevents users from explicitly overriding a config-file host back to 0.0.0.0. Using Option<String> for --host (and merging with unwrap_or_else(...)) or checking clap’s value source would avoid this ambiguity.

Copilot uses AI. Check for mistakes.
@daniel-thom daniel-thom marked this pull request as ready for review February 4, 2026 17:50
@daniel-thom daniel-thom merged commit 54f57d3 into main Feb 4, 2026
3 checks passed
@daniel-thom daniel-thom deleted the fix/hostnames branch February 4, 2026 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant