Skip to content

Luau Helper Functions Examples

Joel Natividad edited this page Feb 23, 2025 · 1 revision

qsv Luau Helper Functions Examples

qsv provides several helper functions that can be called from Luau scripts to enhance functionality. Here's a comprehensive guide to using these helpers.

To check the documentation for the helper functions, go here.

Table of Contents

Logging Functions

qsv_log(level, ...args)

Logs messages at specified levels. Messages are concatenated and logged according to QSV_LOG_LEVEL.

Parameters:

  • level: string - One of "info", "warn", "error", "debug", "trace"
  • ...args: any - Values to log (up to 255 arguments)

Examples:

BEGIN {
    qsv_log("info", "Starting processing")
}!

-- Log details for each row
qsv_log("debug", "Processing row", _IDX, "Amount:", Amount)
return Amount

Command line:

# Process data.csv with logging enabled
QSV_LOG_LEVEL=debug qsv luau map NewAmount 'file:script.lua' data.csv

Data Manipulation Functions

qsv_loadcsv(table_name, filepath, key_column)

Loads a CSV file into a Lua table with optional key column indexing.

Parameters:

  • table_name: string - Name of the Lua table to create
  • filepath: string - Path to the CSV file
  • key_column: string - Optional column to use as key (defaults to row number)

Example:

BEGIN {
    -- Load CSV with letter column as key
    qsv_loadcsv("data_table", "lookup.csv", "letter")
    
    -- Access data
    qsv_log("debug", "Value:", data_table["A"]["description"])
}!

qsv_loadjson(table_name, filepath)

Loads a JSON file into a Lua table.

Example:

BEGIN {
    qsv_loadjson("config", "settings.json")
    qsv_log("debug", "Setting:", config.max_value)
}!

Environment and System Functions

qsv_setenv(name, value)

Sets an environment variable.

Example:

BEGIN {
    qsv_setenv("MY_ENV", "my_value")
}!

qsv_getenv(name)

Returns the value of an environment variable.

Example:

local value = qsv_getenv("MY_ENV")
qsv_log("info", "MY_ENV value:", value)

File Operations

qsv_fileexists(filename)

Checks if a file exists.

Example:

if qsv_fileexists("myfile.csv") then
    qsv_log("info", "myfile.csv exists")
end

qsv_writefile(filename, data)

Writes data to a file, creating or appending as needed.

Example:

BEGIN {
    -- Create new log file
    qsv_writefile("process.log", "_NEWFILE!")
    qsv_log("info", "Started processing")
}!

-- Log each row's running total
local total = qsv_cumsum(Amount)
qsv_writefile("process.log", 
    string.format("Row %d: Amount=%s, Running Total=%s\n", 
        _IDX, Amount, total))
return total

Command line:

qsv luau map RunningTotal 'file:log_totals.lua' transactions.csv

Sample input (transactions.csv):

Date,Amount
2024-01-01,100
2024-01-02,150
2024-01-03,75

Sample output:

Date,Amount,RunningTotal
2024-01-01,100,100
2024-01-02,150,250
2024-01-03,75,325

Generated process.log:

Row 1: Amount=100, Running Total=100
Row 2: Amount=150, Running Total=250
Row 3: Amount=75, Running Total=325

qsv_writejson(table_or_value, filepath, pretty)

Writes a Lua value to a JSON file.

Example:

BEGIN {
    -- Create a table to store
    local data = {
        numbers = {13, 24, 72, 7},
        letters = {"a", "b", "c", "d"},
        nested = {
            key = "value",
            array = {1, 2, 3}
        }
    }
    -- Write both pretty and compact JSON
    qsv_writejson(data, "output.json", false)
    qsv_writejson(data, "output_pretty.json", true)
}!

return "ok"

Command line:

qsv luau map result 'file:writejson.lua' data.csv

Sample input (data.csv):

letter,number
a,13
b,24
c,72
d,7

Generated output.json:

{"numbers":[13,24,72,7],"nested":{"key":"value","array":[1,2,3]},"letters":["a","b","c","d"]}

Generated output_pretty.json:

{
  "numbers": [
    13,
    24,
    72,
    7
  ],
  "nested": {
    "key": "value",
    "array": [
      1,
      2,
      3
    ]
  },
  "letters": [
    "a",
    "b",
    "c",
    "d"
  ]
}

qsv_cmd(qsv_args)

Executes another QSV command and returns its output.

Example:

BEGIN {
    -- Get statistics of the reference dataset
    local stats = qsv_cmd("stats reference.csv")
    qsv_writejson(stats, "reference_stats.json", true)
    
    -- Index the reference file if needed
    if not qsv_fileexists("reference.csv.idx") then
        qsv_cmd("index reference.csv")
    end
}!

-- Use stats from another file for comparison
local ref_stats = qsv_cmd(string.format("stats -s %s reference.csv", Amount))
return ref_stats.stdout

Command line:

qsv luau map Stats 'file:compare_stats.lua' current.csv

Sample input (current.csv):

Category,Amount
Food,150
Transport,75
Housing,500

Sample input (reference.csv):

Category,Amount
Food,125
Transport,80
Housing,450

Sample output:

Category,Amount,Stats
Food,150,"mean: 218.33, stddev: 205.77"
Transport,75,"mean: 218.33, stddev: 205.77"
Housing,500,"mean: 218.33, stddev: 205.77"

qsv_shellcmd(command, args)

Executes allowed shell commands safely.

Example:

BEGIN {
    -- Create directory for outputs if it doesn't exist
    qsv_shellcmd("mkdir", "-p outputs")
    
    -- Get current date for logging
    local date = qsv_shellcmd("date", "+%Y-%m-%d")
    qsv_writefile("outputs/process.log", 
        string.format("Processing started on %s\n", date.stdout))
}!

-- Process each row and log to file
local wc = qsv_shellcmd("wc", "-w <<< " .. Description)
local word_count = tonumber(wc.stdout) or 0

qsv_writefile("outputs/process.log", 
    string.format("Row %d: %d words\n", _IDX, word_count))

return word_count

Command line:

qsv luau map WordCount 'file:word_count.lua' descriptions.csv

Sample input (descriptions.csv):

ID,Description
1,"Short description here"
2,"This is a longer product description with more words"
3,"Single"

Sample output:

ID,Description,WordCount
1,"Short description here",3
2,"This is a longer product description with more words",10
3,"Single",1

Generated outputs/process.log:

Processing started on 2024-03-21
Row 1: 3 words
Row 2: 10 words
Row 3: 1 words

qsv_break(message) and qsv_skip()

Control flow functions for stopping processing or skipping rows.

Example:

BEGIN {
    -- Initialize counters
    skipped = 0
    total = 0
    qsv_writefile("process.log", "_NEWFILE!")
}!

-- Skip rows with zero amount
if Amount == "0" then
    skipped = skipped + 1
    qsv_writefile("process.log", 
        string.format("Skipped row %d: Zero amount\n", _IDX))
    qsv_skip()
end

-- Break if we exceed threshold
total = total + tonumber(Amount)
if total > 1000 then
    qsv_writefile("process.log",
        string.format("Total %f exceeds threshold\n", total))
    qsv_break("Total amount exceeded 1000")
end

return Amount

Command line:

qsv luau map ProcessedAmount 'file:validate.lua' transactions.csv

Sample input (transactions.csv):

ID,Amount
1,100
2,0
3,500
4,600
5,50

Sample output:

ID,Amount,ProcessedAmount
1,100,100
3,500,500
4,600,<Processing stopped: Total amount exceeded 1000>

Generated process.log:

Skipped row 2: Zero amount
Total 1200.000000 exceeds threshold

Lookup Functions

qsv_register_lookup(lookup_name, lookup_table_uri, cache_age_secs)

Registers and loads a lookup table from a CSV file. The CSV can be local, from a URL, or from a CKAN instance.

Example 1 - Local File:

BEGIN {
    -- Load state tax rates from local CSV
    local tax_headers = qsv_register_lookup("tax_rates", "tax_rates.csv", 0)
    if not tax_headers then
        qsv_break("Failed to load tax rates")
    end
    qsv_log("info", "Loaded tax headers:", tax_headers)
}!

-- Calculate price with tax
local state_tax = tonumber(tax_rates[State]["Rate"]) or 0
local tax_amount = Price * (state_tax / 100)
return {Price + tax_amount, tax_amount}

Command line:

qsv luau map 'TotalPrice,TaxAmount' 'file:calculate_tax.lua' sales.csv

Sample input (sales.csv):

State,Price
NY,100
CA,200
TX,150

Sample input (tax_rates.csv):

State,Rate,LastUpdated
NY,8.875,2024-01-01
CA,7.25,2024-01-01
TX,6.25,2024-01-01

Sample output:

State,Price,TotalPrice,TaxAmount
NY,100,108.88,8.88
CA,200,214.50,14.50
TX,150,159.38,9.38

Example 2 - Remote Lookup Table:

BEGIN {
    -- Load currency rates from datHere's lookup repo
    local rate_headers = qsv_register_lookup(
        "forex_rates", 
        "dathere://forex-rates.csv",
        3600  -- Cache for 1 hour
    )
    if not rate_headers then
        qsv_break("Failed to load forex rates")
    end
}!

-- Convert amount to target currency
local rate = tonumber(forex_rates[TargetCurrency]["Rate"]) or 1
return Amount * rate

Command line:

qsv luau map ConvertedAmount 'file:currency_convert.lua' transactions.csv

Sample input (transactions.csv):

Date,Amount,TargetCurrency
2024-03-21,100,EUR
2024-03-21,200,GBP
2024-03-21,150,JPY

Sample output:

Date,Amount,TargetCurrency,ConvertedAmount
2024-03-21,100,EUR,92.34
2024-03-21,200,GBP,157.24
2024-03-21,150,JPY,22350

Example 3 - CKAN Data Source:

BEGIN {
    -- Load city metrics from CKAN data source
    local metrics_headers = qsv_register_lookup(
        "cityscore", 
        "ckan://CityScore Summary?",
        1000  -- Cache for 1000 seconds
    )
    if not metrics_headers then
        qsv_break("Failed to load city metrics")
    end
}!

-- Get previous month's score for the metric
local prev_month_score = cityscore[metric_name].previous_month_score
return prev_month_score

Command line:

qsv luau map PreviousScore 'file:city_metrics.lua' --ckan-api 'https://data.boston.gov/api/3/action' metrics.csv

Sample input (metrics.csv):

metric_name,target_score
POTHOLE ON-TIME %,0.6
ON-TIME PERMIT REVIEWS,0.6
BFD RESPONSE TIME,0.6
BPS ATTENDANCE,0.6

Sample output:

metric_name,target_score,PreviousScore
POTHOLE ON-TIME %,0.6,0.85
ON-TIME PERMIT REVIEWS,0.6,0.92
BFD RESPONSE TIME,0.6,0.78
BPS ATTENDANCE,0.6,0.95

Notes:

  1. Local files are loaded with cache_age_secs = 0 for immediate updates
  2. Remote files support caching to reduce network requests
  3. CKAN sources require the --ckan-api parameter
  4. Lookup tables are accessed using table[key][column] syntax
  5. Headers from the lookup CSV are returned for validation

Cumulative Functions

qsv_cumsum(value, name)

Calculates the running sum of values with optional named instances.

Parameters:

  • value: number/string - Value to add to sum
  • name: string - Optional identifier for parallel sums

Example:

-- Calculate running totals for revenue and expenses
return {
    qsv_cumsum(Revenue, "rev"),
    qsv_cumsum(Expenses, "exp")
}

Command line:

qsv luau map 'RunningRev,RunningExp' 'file:cumsum.lua' input.csv

Sample input (input.csv):

Revenue,Expenses
100,50
200,75
150,80
300,100

Sample output:

Revenue,Expenses,RunningRev,RunningExp
100,50,100,50
200,75,300,125
150,80,450,205
300,100,750,305

qsv_cumprod(value, name)

Calculates the running product of values.

Example:

-- Calculate compound growth
BEGIN {
    qsv_log("info", "Calculating compound growth rates")
}!

return qsv_cumprod(1 + GrowthRate)

Command line:

qsv luau map CompoundGrowth 'file:compound.lua' growth.csv

Sample input (growth.csv):

GrowthRate
0.05
0.03
0.04
0.02

Sample output:

GrowthRate,CompoundGrowth
0.05,1.05
0.03,1.08150
0.04,1.12476
0.02,1.14726

qsv_cummax(value, name) and qsv_cummin(value, name)

Track running maximum and minimum values.

Example:

-- Track price ranges and return both high and low
return {
    qsv_cummax(Price, "price_range"),
    qsv_cummin(Price, "price_range")
}

Command line:

qsv luau map 'HighPrice,LowPrice' 'file:pricerange.lua' prices.csv

Sample input (prices.csv):

Date,Price
2024-01-01,100
2024-01-02,95
2024-01-03,105
2024-01-04,98

Sample output:

Date,Price,HighPrice,LowPrice
2024-01-01,100,100,100
2024-01-02,95,100,95
2024-01-03,105,105,95
2024-01-04,98,105,95

qsv_lag(value, periods, default, name)

Returns the value from N periods ago.

Example:

-- Calculate price change and percentage change
BEGIN {
    qsv_log("info", "Calculating price changes")
}!

local prev_price = qsv_lag(Price, 1, "0")
local price_change = tonumber(Price) - tonumber(prev_price)
local pct_change = prev_price ~= "0" 
    and string.format("%.2f%%", (price_change / tonumber(prev_price)) * 100)
    or "N/A"

return {price_change, pct_change}

Command line:

qsv luau map 'Change,PctChange' 'file:changes.lua' prices.csv

Sample input (prices.csv):

Date,Price
2024-01-01,100
2024-01-02,105
2024-01-03,102
2024-01-04,108

Sample output:

Date,Price,Change,PctChange
2024-01-01,100,0,N/A
2024-01-02,105,5,5.00%
2024-01-03,102,-3,-2.86%
2024-01-04,108,6,5.88%

qsv_diff(value, periods, name)

Calculates difference between current and previous values.

Example:

-- Calculate daily and weekly changes
BEGIN {
    qsv_log("info", "Calculating daily and weekly changes")
}!

return {
    qsv_diff(Price),           -- 1-day change
    qsv_diff(Price, 5, "week") -- 5-day change
}

Command line:

qsv luau map 'DailyChange,WeeklyChange' 'file:changes.lua' prices.csv

Sample input (prices.csv):

Date,Price
2024-01-01,100
2024-01-02,102
2024-01-03,105
2024-01-04,103
2024-01-05,106
2024-01-06,110

Sample output:

Date,Price,DailyChange,WeeklyChange
2024-01-01,100,0,0
2024-01-02,102,2,0
2024-01-03,105,3,0
2024-01-04,103,-2,0
2024-01-05,106,3,0
2024-01-06,110,4,10

Special Variables

  • _IDX: Current row number (1-based)
  • _INDEX: Enables random access mode
  • _ROWCOUNT: Total rows (available in END or with _INDEX)
  • _LASTROW: Last row number

Example:

BEGIN {
    -- Process CSV in reverse
    _INDEX = _LASTROW
}!

-- Process row
result = Amount * 2

-- Move to previous row
_INDEX = _INDEX - 1

return result

Best Practices

  1. Use BEGIN scripts to:

    • Initialize variables
    • Load lookup tables
    • Set up processing environment
  2. Use END scripts to:

    • Calculate final results
    • Write summary files
    • Clean up resources
  3. Use qsv_log for debugging:

    BEGIN {
        qsv_log("debug", "Starting process")
    }!
    
    qsv_log("debug", "Row", _IDX, "Amount:", Amount)
  4. Handle errors gracefully:

    local success, result = pcall(function()
        return tonumber(Amount) * 1.5
    end)
    return success and result or "<ERROR>"
  5. Use meaningful names for parallel operations:

    return {
        qsv_cumsum(Revenue, "monthly_revenue"),
        qsv_cumsum(Expenses, "monthly_expenses")
    }

Notes

  1. Helper functions are available in BEGIN, MAIN, and END scripts unless noted otherwise.
  2. Log levels: TRACE < DEBUG < INFO < WARN < ERROR
  3. Random access mode requires an indexed CSV or qsv_autoindex() call.
  4. Some functions may have performance implications on large datasets.