Provide a functional API for importers? #108

Open
njlr opened this issue Mar 21, 2022 · 1 comment

njlr commented Mar 21, 2022

A little-used (but very useful!) feature in Npgsql is the NpgsqlBinaryImporter class, which allows more efficient bulk data loading with the Postgres COPY command.

Here is a demo using Npgsql.FSharp:

#r "nuget: Npgsql"
#r "nuget: Npgsql.FSharp"

open System
open Npgsql
open NpgsqlTypes
open Npgsql.FSharp

// Type extension so we can use the Npgsql.FSharp DU
type NpgsqlBinaryImporter with
  member this.WriteAsync(value : SqlValue) =
    match value with
    | SqlValue.Null ->
      this.WriteNullAsync()
    | SqlValue.Uuid x ->
      this.WriteAsync(x, NpgsqlDbType.Uuid)
    | SqlValue.String x ->
      this.WriteAsync(x, NpgsqlDbType.Varchar)
    | SqlValue.Int x ->
      this.WriteAsync(x, NpgsqlDbType.Integer)
    | _ ->
      failwith $"Unsupported value {value}" // TODO

// Domain object
type Book =
  {
    ID : Guid
    Title : string
    Year : int
  }

// Sample data (would typically come from a file)
let books =
  [
    { ID = Guid.Parse "74e99bf4-0b97-45c7-a078-698b96bc7421"; Title = "Consider Phlebas"; Year = 1987 }
    { ID = Guid.Parse "81023fb5-a54d-4a3f-8a8b-f36e022e6a11"; Title = "Hyperion"; Year = 1989 }
    { ID = Guid.Parse "89712977-3b2c-4fef-b902-8d20d1532084"; Title = "The Three-Body Problem"; Year = 2008 }
  ]

task {
  // Setup
  let connectionString = Environment.GetEnvironmentVariable "PG_CONNECTION_STRING"

  let db = Sql.connect connectionString

  let connection =
    db
    |> Sql.createConnection

  do! connection.OpenAsync()

  let db = Sql.existingConnection connection

  let! _ =
    db
    |> Sql.query
      """
      CREATE TABLE IF NOT EXISTS books (
        id UUID NOT NULL PRIMARY KEY,
        title TEXT NOT NULL,
        year INT NOT NULL
      )
      """
    |> Sql.executeNonQueryAsync

  // Binary import
  let writer = connection.BeginBinaryImport("COPY books (id, title, year) FROM STDIN BINARY")

  for book in books do
    let values =
      [
        Sql.uuid book.ID
        Sql.string book.Title
        Sql.int book.Year
      ]

    do! writer.StartRowAsync()

    for v in values do
      do! writer.WriteAsync(v)

  let! numberOfRowsWritten = writer.CompleteAsync()

  do! writer.CloseAsync()

  printfn $"Wrote {numberOfRowsWritten} row(s)"

  // Query back the data to check
  let! books =
    db
    |> Sql.query "SELECT * FROM books"
    |> Sql.executeAsync
      (fun read ->
        {
          ID = read.uuid "id"
          Title = read.string "title"
          Year = read.int "year"
        })

  for book in books do
    printfn $"{book}"

}
|> fun t -> t.Wait()

Perhaps this library could include a wrapper for this functionality?
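For the sake of discussion, such a wrapper might look something like this. This is only a rough sketch: the name `Sql.binaryImportAsync`, its signature, and its placement in a `Sql` module are invented for illustration and are not part of Npgsql.FSharp today. It assumes a `SqlValue`-aware `WriteAsync` overload like the type extension in the demo above.

```fsharp
#r "nuget: Npgsql"
#r "nuget: Npgsql.FSharp"

open Npgsql
open Npgsql.FSharp

// Hypothetical API sketch -- names and shape are illustrative only.
module Sql =

  /// Bulk-loads rows with COPY ... FROM STDIN BINARY.
  /// Each row is a list of SqlValue, written in column order.
  let binaryImportAsync (copyCommand : string) (rows : SqlValue list seq) (connection : NpgsqlConnection) =
    task {
      let writer = connection.BeginBinaryImport(copyCommand)

      for row in rows do
        do! writer.StartRowAsync()
        for value in row do
          // Relies on a SqlValue overload of WriteAsync, such as the
          // type extension shown in the demo above
          do! writer.WriteAsync(value)

      let! rowsWritten = writer.CompleteAsync()
      do! writer.CloseAsync()

      return rowsWritten
    }
```

Callers could then pipe a connection in, keeping the same style as the rest of the library:

```fsharp
let! rowsWritten =
  connection
  |> Sql.binaryImportAsync "COPY books (id, title, year) FROM STDIN BINARY" rows
```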

@thomasd3 (Contributor) commented:
I would like that!
