This Raku package, "ML::SparseMatrixRecommender", provides functions for computing recommendations from a (user) profile or history using Sparse Linear Algebra (SLA). The package mirrors the Wolfram Language (WL) implementation, [AAp1]. There are also corresponding implementations in Python and R; see [AAp7, AAp2].
The package is based on a certain "standard" Information retrieval paradigm -- it utilizes Latent Semantic Indexing (LSI) functions like IDF, TF-IDF, etc. Hence, the package also has document-term matrix creation functions and LSI application functions. I included them in the package since I wanted to minimize the external package dependencies.
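As a rough illustration of the LSI weighting mentioned above, the following Python sketch applies a global IDF weight, a "None" (identity) local weight, and a Cosine normalizer to a tiny item-tag matrix stored as a dictionary of sparse rows. The function name `apply_term_weights`, the toy data, and the specific IDF formula (logarithm of the number of rows over the tag's document frequency) are assumptions made for illustration; this package may use a different variant.

```python
import math
from collections import Counter

def apply_term_weights(rows):
    """Apply IDF (global), identity (local), and Cosine (row normalizer) weights.

    `rows` maps an item id to a sparse row {tag: count}.
    Assumption: IDF is log(n_rows / document_frequency); other variants exist.
    """
    n_rows = len(rows)
    # Document frequency: the number of rows in which each tag occurs.
    doc_freq = Counter(tag for row in rows.values()
                       for tag, v in row.items() if v != 0)
    idf = {tag: math.log(n_rows / df) for tag, df in doc_freq.items()}

    weighted = {}
    for item, row in rows.items():
        # Local weight "None" keeps the raw count; the global weight scales it.
        scaled = {tag: v * idf[tag] for tag, v in row.items() if v != 0}
        norm = math.sqrt(sum(x * x for x in scaled.values()))
        # Cosine normalizer: rows get unit Euclidean length (zero rows stay zero).
        weighted[item] = {t: (x / norm if norm > 0 else 0.0)
                          for t, x in scaled.items()}
    return weighted

# Made-up rows in the "tagType:value" column naming used in this document.
rows = {
    "id1": {"passengerSex:male": 1, "passengerClass:1st": 1},
    "id2": {"passengerSex:male": 1, "passengerClass:3rd": 1},
    "id3": {"passengerSex:female": 1, "passengerClass:1st": 1},
    "id4": {"passengerSex:male": 1, "passengerClass:2nd": 1},
}
weighted = apply_term_weights(rows)
for item, row in weighted.items():
    print(item, row)
```

In this sketch, rare tags (such as "passengerClass:3rd") receive larger weights than common ones (such as "passengerSex:male"), which is the usual motivation for IDF.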
The package includes the dataset dfTitanic to make it easier to write introductory examples and unit tests.
For a more theoretical description, see the article "Mapping Sparse Matrix Recommender to Streams Blending Recommender", [AA1].
For detailed examples see the files "Basic-usage.raku" and "Classification.raku", and the Jupyter notebooks in the GitHub repository "./docs" folder.
Remark: "SMR" stands for "Sparse Matrix Recommender". Most of the operations of this Raku package mirror the operations of the software monads "MonadicSparseMatrixRecommender", "SMRMon-R", [AAp1, AAp2] and the attributes and methods of the Python package [AAp7].
Here is a diagram that encompasses the workflows this package supports (or will support):
Here is a narration of a certain workflow scenario:
- Get a dataset.
- Create contingency matrices for a given identifier column and a set of "tag type" columns.
- Examine recommender matrix statistics.
- If the assumptions about the data hold, apply LSI functions.
  - For example, the "usual trio" IDF, Frequency, Cosine.
- Do (verify) example profile recommendations.
- If satisfactory results are obtained, use the recommender as a nearest neighbors classifier.
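The scenario above can be sketched in plain Python, without any recommender package. The sketch builds an item-by-tag contingency matrix from wide-form records, reports basic matrix statistics, scores items against a profile with a sparse dot product, and uses the top recommendations as a nearest neighbors classifier. All function names and the five records are hypothetical (the records are made up, not the actual Titanic data), and keeping the labels in a separate dictionary is a simplification chosen for this illustration.

```python
from collections import Counter

def create_from_wide_form(records, item_column, tag_columns):
    """Build a sparse item-by-tag matrix as {item: {"column:value": 1.0}}."""
    return {rec[item_column]: {f"{col}:{rec[col]}": 1.0 for col in tag_columns}
            for rec in records}

def matrix_statistics(matrix):
    """Return the number of rows, number of columns, and density of the matrix."""
    columns = {tag for row in matrix.values() for tag in row}
    nnz = sum(len(row) for row in matrix.values())
    cells = max(1, len(matrix) * len(columns))  # guard against an empty matrix
    return {"rows": len(matrix), "columns": len(columns), "density": nnz / cells}

def recommend_by_profile(matrix, profile, nrecs=10):
    """Score each item by the dot product of its row with a 0/1 profile vector."""
    scores = {}
    for item, row in matrix.items():
        score = sum(row.get(tag, 0.0) for tag in profile)
        if score > 0:
            scores[item] = score
    # Highest scores first; ties are broken by item id for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:nrecs]

def classify_by_profile(matrix, labels, profile, n_top=3):
    """Let the top recommendations vote for a label, weighted by their scores."""
    votes = Counter()
    for item, score in recommend_by_profile(matrix, profile, n_top):
        votes[labels[item]] += score
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical wide-form records.
records = [
    {"id": "1", "passengerSex": "male", "passengerClass": "1st", "passengerSurvival": "died"},
    {"id": "2", "passengerSex": "female", "passengerClass": "1st", "passengerSurvival": "survived"},
    {"id": "3", "passengerSex": "female", "passengerClass": "2nd", "passengerSurvival": "survived"},
    {"id": "4", "passengerSex": "male", "passengerClass": "3rd", "passengerSurvival": "died"},
    {"id": "5", "passengerSex": "female", "passengerClass": "1st", "passengerSurvival": "survived"},
]

matrix = create_from_wide_form(records, "id", ["passengerSex", "passengerClass"])
labels = {rec["id"]: rec["passengerSurvival"] for rec in records}

stats = matrix_statistics(matrix)
recs = recommend_by_profile(matrix, ["passengerSex:male", "passengerClass:1st"], nrecs=3)
predicted = classify_by_profile(matrix, labels, ["passengerSex:female", "passengerClass:1st"])

print(stats)
print(recs)
print(predicted)
```

The term weighting step is left out here to keep the sketch short; with weights applied, the same dot product would operate on weighted rows instead of 0/1 rows.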
Here is a diagram of typical pipeline building using a ML::SparseMatrixRecommender object:
```mermaid
flowchart TD
    %% --- Top / Legend ---
    %% SMR = Sparse Matrix Recommender

    %% --- Inputs & Constructors ---
    %%subgraph IO["Input/Output"]
    IN1[/"data frame<br>or<br>a hashmap of<br>Math::SparseMatrix objects"/]
    WIDE[("Data<br>(wide form)")]
    ECHO[/Echo<br>output/]
    OUT[/dataset<br>or<br>hashmap/]
    %%end

    IN1 -.- WIDE
    IN1 --> cfwf
    WIDE --> join

    %% --- SMR object & pipeline value (context/state) ---
    subgraph MON[" "]
        SMR{{<br>SMR<br>object<br>}}
        VAL([SMR<br>pipeline value])
    end

    %% --- Pipeline container ---
    subgraph PIPE[SMR monad pipeline]
        direction LR
        unit["ML::SparseMatrixRecommender.new"]
        cfwf[create-from-wide-form]
        echo[echo-data-summary]
        twf[apply-term-weight-functions]
        rec[recommend]
        join[join-across]
        prove[prove-by-metadata]
        unit ==> cfwf ==> echo ==> twf ==> rec ==> join ==> prove
    end

    cfwf -.- |data<br>matrices<br>M|SMR
    echo -.- |data|SMR
    twf -.- |M|SMR
    echo -- echo-value --> ECHO
    join -- take-value --> OUT
    prove -- take-value --> OUT
    VAL === PIPE
    VAL -.- SMR
    SMR === PIPE
```
Remark: The monadic design allows "pipelining" of the SMR operations -- see the usage example section.
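To make the "pipelining" idea concrete, here is a minimal Python sketch of the pattern: an object carries a context (the data) and a current pipeline value, and every step returns the object itself so that calls can be chained. The class `TinyPipeline` and its `set_data` and `transform` methods are hypothetical names invented for this illustration; only `echo_value` and `take_value` are modeled on the echo-value and take-value operations shown in the diagram above.

```python
class TinyPipeline:
    """A toy chainable object in the spirit of a software monad."""

    def __init__(self):
        self.data = None    # context/state kept across steps
        self.value = None   # the current "pipeline value"
        self.log = []       # echoed messages, kept so they can be inspected

    def set_data(self, data):
        """Store the data in the context and make it the pipeline value."""
        self.data = list(data)
        self.value = self.data
        return self

    def transform(self, func):
        """Replace the pipeline value with func(value)."""
        self.value = func(self.value)
        return self

    def echo_value(self, note=""):
        """Print the pipeline value without changing it."""
        message = f"{note}{self.value}"
        self.log.append(message)
        print(message)
        return self

    def take_value(self):
        """Leave the pipeline and return the current value."""
        return self.value


result = (TinyPipeline()
          .set_data([3, 1, 2])
          .transform(sorted)
          .echo_value("sorted: ")
          .transform(lambda xs: xs[:2])
          .take_value())
```

Because every step except `take_value` returns the object, intermediate results can be echoed at any point without breaking the chain.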
To install from GitHub use the shell command:

```shell
zef install https://github.com/antononcube/Raku-ML-SparseMatrixRecommender
```

To install from the Zef ecosystem:

```shell
zef install ML::SparseMatrixRecommender
```
Here is an example of an SMR pipeline for creation of a recommender over Titanic data and recommendations for the profile "passengerSex:male" and "passengerClass:1st":
```raku
use ML::SparseMatrixRecommender;
use ML::SparseMatrixRecommender::Utilities;

my @dsTitanic = ML::SparseMatrixRecommender::Utilities::get-titanic-dataset();

my $smrObj =
        ML::SparseMatrixRecommender
        .new
        .create-from-wide-form(
                @dsTitanic,
                tag-types => Whatever,
                item-column-name => <id>)
        .apply-term-weight-functions('IDF', 'None', 'Cosine')
        .recommend-by-profile(["passengerSex:male", "passengerClass:1st"], 10, :!normalize)
        .echo-value('recommendation by profile: ');

# recommendation by profile: [10 => 2 101 => 2 102 => 2 107 => 2 11 => 2 110 => 2 111 => 2 115 => 2 116 => 2 119 => 2]
```
Remark: More examples can be found in the directory "./docs".
The Python package "SparseMatrixRecommender", [AAp7], implements a software monad for SMR workflows.
The Python package "LatentSemanticAnalyzer", [AAp8], can be used to make matrices for "SparseMatrixRecommender".
The Python package "SSparseMatrix", [AAp6], is fundamental in both "SparseMatrixRecommender" and "LatentSemanticAnalyzer". "SSparseMatrix" corresponds to the Raku package "Math::SparseMatrix", [AAp9], which is fundamental for this package.
Here is the Python "SparseMatrixRecommender" pipeline that corresponds to the Raku pipeline above:
```python
from SparseMatrixRecommender.SparseMatrixRecommender import *
from SparseMatrixRecommender.DataLoaders import *

dfTitanic = load_titanic_data_frame()

smrObj = (SparseMatrixRecommender()
          .create_from_wide_form(data = dfTitanic,
                                 item_column_name="id",
                                 columns=None,
                                 add_tag_types_to_column_names=True,
                                 tag_value_separator=":")
          .apply_term_weight_functions(global_weight_func = "IDF",
                                       local_weight_func = "None",
                                       normalizer_func = "Cosine")
          .recommend_by_profile(profile=["passengerSex:male", "passengerClass:1st"],
                                nrecs=12)
          .join_across(data=dfTitanic, on="id")
          .echo_value())
```

The package "SMRMon-R", [AAp2], implements a software monad for SMR workflows. Most "SMRMon-R" functions delegate to "SparseMatrixRecommender", [AAp3].
The package "SparseMatrixRecommenderInterfaces", [AAp4], provides functions for interactive Shiny interfaces for the recommenders made with "SparseMatrixRecommender" and/or "SMRMon-R".
The package "LSAMon-R", [AAp5], can be used to make matrices for "SparseMatrixRecommender" and/or "SMRMon-R".
Here is the "SMRMon-R" pipeline that corresponds to the Raku pipeline above:
```r
smrObj <-
  SMRMonCreate( data = dfTitanic,
                itemColumnName = "id",
                addTagTypesToColumnNamesQ = TRUE,
                sep = ":") %>%
  SMRMonApplyTermWeightFunctions(globalWeightFunction = "IDF",
                                 localWeightFunction = "None",
                                 normalizerFunction = "Cosine") %>%
  SMRMonRecommendByProfile( profile = c("passengerSex:male", "passengerClass:1st"),
                            nrecs = 12) %>%
  SMRMonJoinAcross( data = dfTitanic, by = "id") %>%
  SMRMonEchoValue
```

The Wolfram Language (WL) software monad "MonadicSparseMatrixRecommender", [AAp1], provides recommendation pipelines similar to the pipelines created with this package.
Here is a WL monadic pipeline that corresponds to the Raku pipeline above:
```mathematica
smrObj =
  SMRMonUnit[]⟹
   SMRMonCreate[dfTitanic, "id",
                "AddTagTypesToColumnNames" -> True,
                "TagValueSeparator" -> ":"]⟹
   SMRMonApplyTermWeightFunctions["IDF", "None", "Cosine"]⟹
   SMRMonRecommendByProfile[{"passengerSex:male", "passengerClass:1st"}, 12]⟹
   SMRMonJoinAcross[dfTitanic, "id"]⟹
   SMRMonEchoValue[];
```

(Compare the pipeline diagram above with the corresponding diagram using Mathematica notation.)
The project repository "Scalable Recommender Framework", [AAr1], has documents, diagrams, tests, and benchmarks of a recommender system implemented in multiple programming languages.
The Python package "SparseMatrixRecommender", [AAp7], is a decisive winner in the comparison -- see the first 10 min of the video recording [AAv1] or the benchmarks at [AAr1].
The project "Raku for Prediction", [AAr2, AAv2, AAp7], has a Domain Specific Language (DSL) grammar and interpreters that generate SMR code for the corresponding Mathematica, Python, R, and Raku packages, [AAp11].
Here is a Command Line Interface (CLI) invocation example that generates code for this package:

```shell
ToRecommenderWorkflowCode Raku 'create with dfTitanic; apply the LSI functions IDF, None, Cosine;recommend by profile 1st and male'
# my $obj = ML::SparseMatrixRecommender.new.create-from-wide-form(dfTitanic).apply-term-weight-functions(global-weight-func => "IDF", local-weight-func => "None", normalizer-func => "Cosine").recommend-by-profile(["1st", "male"])
```
Here is an example using the NLP Template Engine, [AAp12, AAr2, AAv3], (which uses LLMs to fill in static templates):
```raku
use ML::NLPTemplateEngine;

'create recommender with dfTitanic; apply the LSI functions IDF, None, Cosine;recommend by profile 1st and male'
        ==> concretize(lang => "Raku")

# my $smrObj = ML::SparseMatrixRecommender.new
# .create-from-wide-form(dfTitanic, item-column-name='id', :add-tag-types-to-column-names, tag-value-separator=':')
# .apply-term-weight-functions('IDF', 'None', 'Cosine')
# .recommend-by-profile(["male"], 12, :!normalize)
# .join-across(dfTitanic)
# .echo-value();
```
Instead of using grammars, the translation of individual commands can be done with LLMs and few-shot training examples; see "DSL::Examples", [AAp13]. Here is an example:
```raku
use DSL::Examples;
use LLM::Functions;

my &llm-pipeline-segment = llm-example-function(dsl-examples()<Raku><SMRMon>);

my $spec = q:to/END/;
new recommender;
create from @dsData;
apply LSI functions IDF, None, Cosine;
recommend by profile for passengerSex:male, and passengerClass:1st;
join across with @dsData on "id";
echo the pipeline value;
classify by profile passengerSex:female, and passengerClass:1st on the tag passengerSurvival;
echo value
END

my @commands = $spec.lines;

@commands
        .map({ .&llm-pipeline-segment })
        .map({ .subst(/:i Output \h* ':'?/, :g).trim })
        .join("\n.")

# ML::SparseMatrixRecommender.new
# .create(@dsData)
# .apply-term-weight-functions('IDF', 'None', 'Cosine')
# .recommend-by-profile(['male', '1st'])
# .join-across(@dsData, on => 'id')
# .echo-value()
# .classify-by-profile('passengerSurvival', ['passengerSex.female', 'passengerClass.1st'])
# .echo-value()
```
Two performance topics are more important than the rest:
- Recommender object creation
- Recommendations computations
See the dedicated document "Performance.md" for a detailed discussion.
[AA1] Anton Antonov, "Mapping Sparse Matrix Recommender to Streams Blending Recommender", (2017), MathematicaForPrediction at GitHub.
[AAp1] Anton Antonov, MonadicSparseMatrixRecommender, WL paclet, (2018-2024), Wolfram Language Paclet Repository.
[AAp2] Anton Antonov, SMRMon, R package, (2019-2024), R-packages at GitHub/antononcube.
[AAp3] Anton Antonov, SparseMatrixRecommender, R package, (2019-2024), R-packages at GitHub/antononcube.
[AAp4] Anton Antonov, Sparse Matrix Recommender framework interface functions, (2019), R-packages at GitHub/antononcube.
[AAp5] Anton Antonov, LSAMon, R package, (2019), R-packages at GitHub/antononcube.
[AAp6] Anton Antonov, SSparseMatrix, Python package, (2021), Python-packages at GitHub/antononcube.
[AAp7] Anton Antonov, SparseMatrixRecommender, Python package, (2021), Python-packages at GitHub/antononcube.
[AAp8] Anton Antonov, LatentSemanticAnalyzer, Python package, (2021), Python-packages at GitHub/antononcube.
[AAp9] Anton Antonov, Math::SparseMatrix, Raku package, (2024-2025), GitHub/antononcube. (At raku.land).
[AAp10] Anton Antonov, Math::SparseMatrix::Native, Raku package, (2024-2025), GitHub/antononcube. (At raku.land).
[AAp11] Anton Antonov, DSL::English::RecommenderWorkflows, Raku package, (2018-2022), GitHub/antononcube. (At raku.land).
[AAp12] Anton Antonov, ML::NLPTemplateEngine, Raku package, (2023-2025), GitHub/antononcube. (At raku.land).
[AAp13] Anton Antonov, DSL::Examples, Raku package, (2024-2025), GitHub/antononcube. (At raku.land).
[AAr1] Anton Antonov, Scalable Recommender Framework project, (2022), GitHub/antononcube.
[AAr2] Anton Antonov, "Raku for Prediction" book project, (2021-2022), GitHub/antononcube.
[AAv1] Anton Antonov, "TRC 2022 Implementation of ML algorithms in Raku", (2022), Anton A. Antonov's channel at YouTube.
[AAv2] Anton Antonov, "Raku for Prediction", (2021), The Raku Conference (TRC) at YouTube.
[AAv3] Anton Antonov, "NLP Template Engine, Part 1", (2021), Anton A. Antonov's channel at YouTube.
