[NEW DATA EXTENSION] Kotlin HTTP4k (core, format-jackson, format-moshi, format-gson, multipart) #26

Draft
Copilot wants to merge 6 commits into main from copilot/add-http4k-data-extension


Copilot AI commented Mar 26, 2026

📝 Data Extension Information

  • Language: java
  • Extension Name(s): http4k-core.model.yml, http4k-format-jackson.model.yml, http4k-format-moshi.model.yml, http4k-format-gson.model.yml, http4k-multipart.model.yml
  • Extension Types: sourceModel, sinkModel, summaryModel, neutralModel
  • Target Library/Framework: http4k
  • Library Modules Covered: org.http4k.core (http4k-core), org.http4k.format (http4k-format-jackson, http4k-format-moshi, http4k-format-gson), org.http4k.lens (http4k-multipart)

🎯 Description

What This Data Extension Models

Comprehensive models for the http4k Kotlin HTTP toolkit, with one model file per external API artifact:

http4k-core (http4k-core.model.yml):

  • Sources: Request.query(String), Request.queries(String), Request.header(String), Request.headerValues(String), Request.bodyString(), Request.getUri() — all as remote input sources
  • Sinks: Response.body(String) for html/js-injection, Response.header(String,String) for response-splitting (Argument[0..1]) and request-forgery (Argument[1])
  • Summaries: Immutable builder pattern taint propagation (Argument[this] → ReturnValue as value, argument values → ReturnValue as taint) for Request/Response method chains
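
To illustrate why the summaries above distinguish value flow from taint flow, here is a minimal sketch of an immutable builder. `SketchResponse` is a hypothetical stand-in written for this example, not the http4k `Response` class, and it only approximates the shape of the real API.

```kotlin
// Hypothetical stand-in for an immutable message type; NOT the http4k Response class.
data class SketchResponse(
    val status: Int,
    val headers: List<Pair<String, String>> = emptyList(),
    val body: String = ""
) {
    // Returns a new instance. Existing state is carried over unchanged
    // (the Argument[this] -> ReturnValue value flow), and the arguments become
    // part of the result (the argument -> ReturnValue taint flow).
    fun header(name: String, value: String): SketchResponse =
        copy(headers = headers + (name to value))

    // Same pattern: the receiver's status and headers survive, the body is replaced.
    fun body(text: String): SketchResponse = copy(body = text)
}

fun main() {
    val base = SketchResponse(200)
    val chained = base.header("X-Trace", "abc").body("hello")

    // The receiver is never mutated; every call yields a fresh object.
    println(base.headers.isEmpty())
    println(chained.headers)
    println(chained.body)
}
```

Because each call returns a new object, an analysis that lacks these summaries would presumably lose track of tainted data after the first call in a chain.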

http4k-format-jackson (http4k-format-jackson.model.yml):

  • Sinks: ConfigurableJackson.asA() for unsafe-deserialization (String and InputStream overloads)
  • Summaries: Taint propagation through deserialization (Argument[0] → ReturnValue) and JSON string parsing via asJsonObject()

http4k-format-moshi (http4k-format-moshi.model.yml):

  • Sinks: ConfigurableMoshi.asA() for unsafe-deserialization (String and InputStream overloads)
  • Summaries: Taint propagation through deserialization (Argument[0] → ReturnValue) and JSON string parsing via asJsonObject()

http4k-format-gson (http4k-format-gson.model.yml):

  • Sinks: ConfigurableGson.asA() for unsafe-deserialization (String and InputStream overloads)
  • Summaries: Taint propagation through deserialization (Argument[0] → ReturnValue) and JSON string parsing via asJsonObject()

http4k-multipart (http4k-multipart.model.yml):

  • Sources: MultipartFormBody.field/fields/fieldValue/fieldValues/file/files for form data, MultipartFormFile.getFilename()/getContent() for uploaded file metadata and content, MultipartFormField.getValue() for field values — all as remote sources
  • Summaries: MultipartFormBody.from() taint propagation from HttpMessage to parsed body
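
Since the uploaded filename is modeled as a remote source, one way to handle it defensively is to resolve it against the upload directory and reject anything that escapes. The following is a sketch under stated assumptions: `resolveInside` is a hypothetical helper using only `java.nio.file`, it is not part of http4k, and this PR does not establish whether CodeQL recognizes it as a sanitizer.

```kotlin
import java.nio.file.InvalidPathException
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: resolves an untrusted filename inside baseDir and
// returns null when the result would fall outside baseDir (or is baseDir itself).
fun resolveInside(baseDir: Path, untrustedName: String): Path? {
    val base = baseDir.toAbsolutePath().normalize()
    val candidate = try {
        // normalize() collapses "." and ".." segments before the containment check.
        base.resolve(untrustedName).normalize()
    } catch (e: InvalidPathException) {
        // Names the platform cannot represent (e.g. containing NUL) are rejected.
        return null
    }
    // Path.startsWith compares whole path components, not string prefixes.
    return if (candidate != base && candidate.startsWith(base)) candidate else null
}

fun main() {
    val uploads = Paths.get("/uploads")
    println(resolveInside(uploads, "report.pdf"))     // a path under the upload directory
    println(resolveInside(uploads, "../etc/passwd"))  // rejected: escapes the directory
}
```

A caller would write the file only when the helper returns a non-null path and respond with an error otherwise.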

Threat Model

remote

Example Vulnerable Code

// XSS: unsanitized query param reflected in response body
fun handleGreet(request: Request): Response {
    val name = request.query("name") ?: "Guest"
    return Response(Status.OK).body("<h1>Hello, $name!</h1>")
}

// SQL Injection: query param flows into raw SQL
fun handleSearch(request: Request): Response {
    val query = request.query("q") ?: ""
    val sql = "SELECT * FROM products WHERE name LIKE '%$query%'"
    statement.executeQuery(sql)
    // ...
}

// Unsafe deserialization: untrusted JSON parsed via Jackson
fun handleJson(request: Request): Response {
    val obj = Jackson.asA<UserData>(request.bodyString())
    // ...
}

// Path traversal: multipart file upload with attacker-controlled filename
fun handleUpload(request: Request): Response {
    val form = MultipartFormBody.from(request)
    val file = form.file("upload")!!
    File("/uploads/${file.filename}").writeBytes(file.content.readBytes())
    // ...
}

Example Safe Code

// Parameterized query prevents SQL injection
fun handleSearch(request: Request): Response {
    val query = request.query("q") ?: ""
    val stmt = conn.prepareStatement("SELECT * FROM products WHERE name LIKE ?")
    stmt.setString(1, "%$query%")
    // ...
}

// Sanitized filename prevents path traversal
fun handleUpload(request: Request): Response {
    val form = MultipartFormBody.from(request)
    val file = form.file("upload")!!
    val safeName = File(file.filename).name // strip path components
    File("/uploads/$safeName").writeBytes(file.content.readBytes())
    // ...
}
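
The safe examples above cover SQL injection and path traversal only. As a counterpart to the XSS example, here is a minimal sketch of output escaping. `escapeHtml` and `greetingHtml` are hypothetical helpers written for this illustration, they do not use http4k types, and whether CodeQL treats such a helper as a sanitizer is not something this PR verifies.

```kotlin
// Hypothetical helper (not part of http4k): escapes the characters that are
// significant in HTML text and quoted attribute contexts.
fun escapeHtml(input: String): String = buildString(input.length) {
    for (ch in input) {
        when (ch) {
            '&' -> append("&amp;")
            '<' -> append("&lt;")
            '>' -> append("&gt;")
            '"' -> append("&quot;")
            '\'' -> append("&#39;")
            else -> append(ch)
        }
    }
}

// Builds the greeting markup from an optional, untrusted name.
fun greetingHtml(name: String?): String =
    "<h1>Hello, ${escapeHtml(name ?: "Guest")}!</h1>"

fun main() {
    // The script tag is rendered as text rather than interpreted as markup.
    println(greetingHtml("<script>alert(1)</script>"))
}
```

In a handler, the string returned by `greetingHtml` would be what gets passed to the response body instead of the raw interpolated parameter.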

📦 Extension Details

Extension YAML

http4k-core.model.yml:

extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      - ["org.http4k.core", "Request", True, "query", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "Request", True, "queries", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "Request", True, "header", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "Request", True, "headerValues", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "Request", True, "bodyString", "()", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "Request", True, "getUri", "()", "", "ReturnValue", "remote", "manual"]
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - ["org.http4k.core", "Response", True, "body", "(String)", "", "Argument[0]", "html-injection", "manual"]
      - ["org.http4k.core", "Response", True, "body", "(String)", "", "Argument[0]", "js-injection", "manual"]
      - ["org.http4k.core", "Response", True, "header", "(String,String)", "", "Argument[0..1]", "response-splitting", "manual"]
      - ["org.http4k.core", "Response", True, "header", "(String,String)", "", "Argument[1]", "request-forgery", "manual"]

http4k-format-jackson.model.yml:

extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - ["org.http4k.format", "ConfigurableJackson", True, "asA", "(String,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]
      - ["org.http4k.format", "ConfigurableJackson", True, "asA", "(InputStream,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]

http4k-format-moshi.model.yml:

extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - ["org.http4k.format", "ConfigurableMoshi", True, "asA", "(String,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]
      - ["org.http4k.format", "ConfigurableMoshi", True, "asA", "(InputStream,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]

http4k-format-gson.model.yml:

extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - ["org.http4k.format", "ConfigurableGson", True, "asA", "(String,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]
      - ["org.http4k.format", "ConfigurableGson", True, "asA", "(InputStream,Class)", "", "Argument[0]", "unsafe-deserialization", "manual"]

http4k-multipart.model.yml:

extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      - ["org.http4k.core", "MultipartFormBody", True, "file", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.core", "MultipartFormBody", True, "fieldValue", "(String)", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.lens", "MultipartFormFile", True, "getFilename", "()", "", "ReturnValue", "remote", "manual"]
      - ["org.http4k.lens", "MultipartFormFile", True, "getContent", "()", "", "ReturnValue", "remote", "manual"]

Access Path Explanation

| Model | Access Path | Rationale |
| --- | --- | --- |
| Request.query | ReturnValue | Query param string returned to caller; standard remote source pattern |
| Request.header | ReturnValue | HTTP header value returned to caller; headers can be attacker-controlled |
| Request.bodyString | ReturnValue | Request body as string; full request body is user-controlled |
| Request.getUri | ReturnValue | Full URI object; URI path/query is user-controlled |
| Response.header | Argument[0..1] | Both header name and value; newline injection in either enables response splitting |
| Response.header | Argument[1] | Header value for request-forgery, e.g., Location header for SSRF/redirect |
| Response.body | Argument[0] | String written to HTTP response body; XSS vector |
| ConfigurableJackson.asA | Argument[0] | Untrusted input deserialized; unsafe deserialization (CWE-502) |
| MultipartFormBody.file | ReturnValue | Uploaded file from multipart form; file uploads are a classic attack vector |
| MultipartFormFile.getFilename | ReturnValue | Attacker-controlled filename; path traversal risk |
| MultipartFormFile.getContent | ReturnValue | Attacker-controlled file content; malicious file content |

Note: The issue originally specified Argument[0] for header, but this was corrected to Argument[1] since the header value (second parameter) is what's exploitable for request-forgery. Response-splitting uses Argument[0..1] since newline injection in either name or value is dangerous.
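
To make the response-splitting rationale concrete, the sketch below shows the kind of check that addresses newline injection in either position. `requireNoLineBreaks` is a hypothetical guard written for this illustration; it is not an http4k API, and it is an assumption that application code would call something like it before setting a header.

```kotlin
// Hypothetical guard, not an http4k API: rejects header names or values that
// contain CR or LF, the characters response splitting relies on.
fun requireNoLineBreaks(name: String, value: String): Pair<String, String> {
    require(name.none { it == '\r' || it == '\n' }) { "header name contains CR/LF" }
    require(value.none { it == '\r' || it == '\n' }) { "header value contains CR/LF" }
    return name to value
}

fun main() {
    // A well-formed header passes through unchanged.
    println(requireNoLineBreaks("Location", "/home"))

    // An injected CRLF sequence in the value is rejected with an exception.
    val attempt = runCatching {
        requireNoLineBreaks("Location", "/home\r\nSet-Cookie: session=attacker")
    }
    println(attempt.isFailure)
}
```

The same check applies to the name argument, which is why the sink covers Argument[0..1] rather than the value alone.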

🧪 Testing

  • Extension YAML resolves without errors
  • Database created with sample code (codeql database create or codeql test extract)
  • Single query verified with extension applied (codeql query run --additional-packs=<model-pack-dir>)
  • Unit tests pass with extension applied (codeql test run --additional-packs=<model-pack-dir>)
  • Positive test cases (vulnerable code detected)
  • Negative test cases (safe code not flagged)

📋 Checklist

  • Extension YAML is valid and properly formatted
  • Extension placed in correct location (languages/java/custom/src/)
  • qlpack.yml includes dataExtensions configuration
  • Documentation/comments included in YAML
  • One model file per external API artifact
  • Access paths verified via API graph queries
  • No false positives in test cases

🔗 References


Note: This data extension was developed following CodeQL Models as Data best practices.



Copilot AI linked an issue Mar 26, 2026 that may be closed by this pull request
1 task
Copilot AI changed the title from "[WIP] Add data extension for http4k in Java" to "[NEW DATA EXTENSION] Kotlin HTTP4k" Mar 26, 2026
Copilot AI requested a review from felickz March 26, 2026 21:24

felickz commented Mar 30, 2026

@copilot evaluate the docs for http4k core and model any additional remote/local sources and sinks for this library https://www.http4k.org/api/http4k-core/index.html

Also add models for http4k-format-jackson / http4k-format-moshi / http4k-format-gson covering deserialization sinks/sources, since these modules parse untrusted JSON/XML.

Add http4k-multipart models for file uploads, a classic source of malicious input (path traversal, file type abuse, size bombs).


Copilot AI commented Mar 30, 2026

@copilot evaluate the docs for http4k core and model any additional remote/local sources and sinks for this library https://www.http4k.org/api/http4k-core/index.html

Also add models...

Expanded the data extensions across three model files in commits c7b9b78 and b36259f:

http4k-core (http4k.model.yml):

  • Added remote sources: Request.header(String), Request.headerValues(String), Request.bodyString(), Request.queries(String), Request.getUri()
  • Added response-splitting sink on Response.header() covering both Argument[0..1]
  • Added summary models for immutable builder pattern taint propagation through method chains

http4k-format (http4k-format.model.yml):

  • ConfigurableJackson, ConfigurableMoshi, ConfigurableGson .asA() methods modeled as unsafe-deserialization sinks
  • Taint summary models for asA() and asJsonObject() so tainted JSON input propagates to deserialized objects

http4k-multipart (http4k-multipart.model.yml):

  • MultipartFormBody.field/fields/fieldValue/fieldValues/file/files as remote sources
  • MultipartFormFile.getFilename() and .getContent() as remote sources (path traversal, malicious content)
  • MultipartFormField.getValue() as remote source
  • MultipartFormBody.from() summary for taint propagation from HttpMessage

Copilot AI changed the title from "[NEW DATA EXTENSION] Kotlin HTTP4k" to "[NEW DATA EXTENSION] Kotlin HTTP4k (core, format, multipart)" Mar 30, 2026

felickz commented Mar 31, 2026

@copilot

The models should be stored in a series of YAML data extension files, one for each external API. For example:

    jmh-core.model.yml                  # models calls to jmh-core@1.20
    rocksdbjni.model.yml                # models calls to rocksdbjni@7.7.3

…-jackson, http4k-format-moshi, http4k-format-gson, http4k-multipart

Agent-Logs-Url: https://github.com/testing-felickz/codeql-development-template/sessions/845edef4-ae1e-46d7-a82b-6fbcfeb02871

Co-authored-by: felickz <1760475+felickz@users.noreply.github.com>

Copilot AI commented Mar 31, 2026

@copilot

The models should be stored in a series of YAML data extension files, one for each external API. For example:
...

Reorganized in commit 3f4b098. The model files are now split one per external API artifact:

  • http4k-core.model.yml — core Request/Response models
  • http4k-format-jackson.model.yml — Jackson deserialization models
  • http4k-format-moshi.model.yml — Moshi deserialization models
  • http4k-format-gson.model.yml — Gson deserialization models
  • http4k-multipart.model.yml — multipart form/file upload models

Copilot AI changed the title from "[NEW DATA EXTENSION] Kotlin HTTP4k (core, format, multipart)" to "[NEW DATA EXTENSION] Kotlin HTTP4k (core, format-jackson, format-moshi, format-gson, multipart)" Mar 31, 2026
Development

Successfully merging this pull request may close these issues.

[Data Extension Create]: Kotlin HTTP4k

2 participants