Synthetic Feature Generators Documentation (#71)
* added complex synthetic feature generators

Added a suite of functions related to synthetic feature generation.

* removed .idea

* pre-commit, code review changes

pre-commit,
code review changes:
- added _feature_builder method to avoid duplicate code blocks
- added some new parameters to enable random value domains for features

* Rewrote tests with unittest instead of pytest

* removed if __name__ == '__main__' from file, small fix in cluster test

* code review fixes

renamed _feature_builder -> _configure_generate_featuer

Replace np.ndarray typing with ArrayLike from numpy typing, other typing fixes

* Added documentation for feature generation

Small demo code in DOCSMAIN as well as pdoc entry

* updated documentation with changes after merging PR

* fixed conflicting file
98MM committed Jul 17, 2024
1 parent 147f037 commit bf540c5
Showing 30 changed files with 7,072 additions and 2,656 deletions.
20 changes: 20 additions & 0 deletions docs/DOCSMAIN.md
@@ -64,3 +64,23 @@ scores = [lowest_score, medium_score, high_score]
sorted_score_indices = np.argsort(scores)
# Compare element-wise: a sum of index differences is zero for every permutation
assert np.array_equal(sorted_score_indices, np.array([0, 1, 2]))
```
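An ordering check like the one in this example is only meaningful when it compares positions element by element. The sketch below uses only the standard library and a hypothetical `argsort` helper (a stand-in for `np.argsort`, not part of OutRank) to show that a sum of index differences is zero for every permutation of `[0, 1, 2]`, whereas a strict comparison accepts only the sorted order.

```python
from itertools import permutations


def argsort(values):
    """Hypothetical stand-in for np.argsort: indices that sort `values` ascending."""
    return sorted(range(len(values)), key=values.__getitem__)


expected = [0, 1, 2]

# The sum of (expected - actual) is zero for EVERY permutation of the indices,
# because each permutation contains the same numbers in a different order.
sum_check_passes_for_all = all(
    sum(e - p for e, p in zip(expected, perm)) == 0
    for perm in permutations(expected)
)

# A strict element-wise comparison accepts only the correctly sorted order.
strict_matches = [perm for perm in permutations(expected) if list(perm) == expected]

print(sum_check_passes_for_all)  # True
print(strict_matches)            # [(0, 1, 2)]
```

With NumPy available, `np.array_equal(np.argsort(scores), [0, 1, 2])` expresses the strict variant directly.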
---
## Creating a simple dataset
```python
from outrank.algorithms.synthetic_data_generators.cc_generator import CategoricalClassification

cc = CategoricalClassification()

# Creates a simple dataset of 9 features and 10k samples; every feature has cardinality 35
X = cc.generate_data(9,
                     10000,
                     cardinality=35,
                     ensure_rep=True,
                     random_values=True,
                     low=0,
                     high=40)

# Creates target labels via clustering
y = cc.generate_labels(X, n=2, class_relation='cluster')

```
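To make the parameters in the example above more concrete, here is a minimal standard-library sketch of how a single categorical feature column could be produced. It is not the OutRank implementation: `sketch_feature` is a hypothetical helper, and it assumes that `ensure_rep` means every value of the domain appears at least once, and that `random_values` with `low`/`high` means the value domain is drawn at random from the integers in `[low, high)`.

```python
import random


def sketch_feature(n_samples, cardinality, ensure_rep=True,
                   random_values=False, low=0, high=40, seed=42):
    """Hypothetical sketch of one categorical feature column (not the library code)."""
    if cardinality > n_samples:
        raise ValueError("cardinality must not exceed n_samples")
    rng = random.Random(seed)

    # Assumed meaning of random_values: pick `cardinality` distinct integers
    # from [low, high) instead of using the plain sequence 0..cardinality-1.
    if random_values:
        domain = rng.sample(range(low, high), cardinality)
    else:
        domain = list(range(cardinality))

    if ensure_rep:
        # Assumed meaning of ensure_rep: include each value once, fill the
        # remainder with random draws, then shuffle to remove the ordering.
        column = domain + rng.choices(domain, k=n_samples - cardinality)
        rng.shuffle(column)
    else:
        column = rng.choices(domain, k=n_samples)
    return column


column = sketch_feature(10000, 35, ensure_rep=True,
                        random_values=True, low=0, high=40)
print(len(column), len(set(column)))  # 10000 35
```

Under these assumptions, `high - low` must be at least `cardinality`, which holds for the values used in the example (40 possible values for a cardinality of 35).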
33 changes: 30 additions & 3 deletions docs/outrank.html
@@ -3,7 +3,7 @@
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="generator" content="pdoc 14.1.0"/>
<meta name="generator" content="pdoc 14.5.1"/>
<title>outrank API documentation</title>

<style>/*! * Bootstrap Reboot v5.0.0 (https://getbootstrap.com/) * Copyright 2011-2021 The Bootstrap Authors * Copyright 2011-2021 Twitter, Inc. * Licensed under MIT (https://github.com/twbs/bootstrap/blob/main/LICENSE) * Forked from Normalize.css, licensed MIT (https://github.com/necolas/normalize.css/blob/master/LICENSE.md) */*,::after,::before{box-sizing:border-box}@media (prefers-reduced-motion:no-preference){:root{scroll-behavior:smooth}}body{margin:0;font-family:system-ui,-apple-system,"Segoe UI",Roboto,"Helvetica Neue",Arial,"Noto Sans","Liberation Sans",sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol","Noto Color Emoji";font-size:1rem;font-weight:400;line-height:1.5;color:#212529;background-color:#fff;-webkit-text-size-adjust:100%;-webkit-tap-highlight-color:transparent}hr{margin:1rem 0;color:inherit;background-color:currentColor;border:0;opacity:.25}hr:not([size]){height:1px}h1,h2,h3,h4,h5,h6{margin-top:0;margin-bottom:.5rem;font-weight:500;line-height:1.2}h1{font-size:calc(1.375rem + 1.5vw)}@media (min-width:1200px){h1{font-size:2.5rem}}h2{font-size:calc(1.325rem + .9vw)}@media (min-width:1200px){h2{font-size:2rem}}h3{font-size:calc(1.3rem + .6vw)}@media (min-width:1200px){h3{font-size:1.75rem}}h4{font-size:calc(1.275rem + .3vw)}@media (min-width:1200px){h4{font-size:1.5rem}}h5{font-size:1.25rem}h6{font-size:1rem}p{margin-top:0;margin-bottom:1rem}abbr[data-bs-original-title],abbr[title]{-webkit-text-decoration:underline dotted;text-decoration:underline dotted;cursor:help;-webkit-text-decoration-skip-ink:none;text-decoration-skip-ink:none}address{margin-bottom:1rem;font-style:normal;line-height:inherit}ol,ul{padding-left:2rem}dl,ol,ul{margin-top:0;margin-bottom:1rem}ol ol,ol ul,ul ol,ul ul{margin-bottom:0}dt{font-weight:700}dd{margin-bottom:.5rem;margin-left:0}blockquote{margin:0 0 
1rem}b,strong{font-weight:bolder}small{font-size:.875em}mark{padding:.2em;background-color:#fcf8e3}sub,sup{position:relative;font-size:.75em;line-height:0;vertical-align:baseline}sub{bottom:-.25em}sup{top:-.5em}a{color:#0d6efd;text-decoration:underline}a:hover{color:#0a58ca}a:not([href]):not([class]),a:not([href]):not([class]):hover{color:inherit;text-decoration:none}code,kbd,pre,samp{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;font-size:1em;direction:ltr;unicode-bidi:bidi-override}pre{display:block;margin-top:0;margin-bottom:1rem;overflow:auto;font-size:.875em}pre code{font-size:inherit;color:inherit;word-break:normal}code{font-size:.875em;color:#d63384;word-wrap:break-word}a>code{color:inherit}kbd{padding:.2rem .4rem;font-size:.875em;color:#fff;background-color:#212529;border-radius:.2rem}kbd kbd{padding:0;font-size:1em;font-weight:700}figure{margin:0 0 1rem}img,svg{vertical-align:middle}table{caption-side:bottom;border-collapse:collapse}caption{padding-top:.5rem;padding-bottom:.5rem;color:#6c757d;text-align:left}th{text-align:inherit;text-align:-webkit-match-parent}tbody,td,tfoot,th,thead,tr{border-color:inherit;border-style:solid;border-width:0}label{display:inline-block}button{border-radius:0}button:focus:not(:focus-visible){outline:0}button,input,optgroup,select,textarea{margin:0;font-family:inherit;font-size:inherit;line-height:inherit}button,select{text-transform:none}[role=button]{cursor:pointer}select{word-wrap:normal}select:disabled{opacity:1}[list]::-webkit-calendar-picker-indicator{display:none}[type=button],[type=reset],[type=submit],button{-webkit-appearance:button}[type=button]:not(:disabled),[type=reset]:not(:disabled),[type=submit]:not(:disabled),button:not(:disabled){cursor:pointer}::-moz-focus-inner{padding:0;border-style:none}textarea{resize:vertical}fieldset{min-width:0;padding:0;margin:0;border:0}legend{float:left;width:100%;padding:0;margin-bottom:.5rem;font-size:calc(1.275rem + 
.3vw);line-height:inherit}@media (min-width:1200px){legend{font-size:1.5rem}}legend+*{clear:left}::-webkit-datetime-edit-day-field,::-webkit-datetime-edit-fields-wrapper,::-webkit-datetime-edit-hour-field,::-webkit-datetime-edit-minute,::-webkit-datetime-edit-month-field,::-webkit-datetime-edit-text,::-webkit-datetime-edit-year-field{padding:0}::-webkit-inner-spin-button{height:auto}[type=search]{outline-offset:-2px;-webkit-appearance:textfield}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-color-swatch-wrapper{padding:0}::file-selector-button{font:inherit}::-webkit-file-upload-button{font:inherit;-webkit-appearance:button}output{display:inline-block}iframe{border:0}summary{display:list-item;cursor:pointer}progress{vertical-align:baseline}[hidden]{display:none!important}</style>
@@ -26,7 +26,10 @@ <h2>Contents</h2>
<li><a href="#welcome-to-outranks-documentation">Welcome to OutRank's documentation!</a></li>
<li><a href="#setup">Setup</a></li>
<li><a href="#example-use-cases">Example use cases</a></li>
<li><a href="#outrank-as-a-python-library">OutRank as a Python library</a></li>
<li><a href="#outrank-as-a-python-library">OutRank as a Python library</a>
<ul>
<li><a href="#creating-a-simple-dataset">Creating a simple dataset</a></li>
</ul></li>
</ul>


@@ -38,6 +41,7 @@ <h2>Submodules</h2>
<li><a href="outrank/core_utils.html">core_utils</a></li>
<li><a href="outrank/feature_transformations.html">feature_transformations</a></li>
<li><a href="outrank/task_generators.html">task_generators</a></li>
<li><a href="outrank/task_instance_ranking.html">task_instance_ranking</a></li>
<li><a href="outrank/task_ranking.html">task_ranking</a></li>
<li><a href="outrank/task_selftest.html">task_selftest</a></li>
<li><a href="outrank/task_summary.html">task_summary</a></li>
@@ -129,6 +133,29 @@ <h1 id="outrank-as-a-python-library">OutRank as a Python library</h1>
<span class="k">assert</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span> <span class="o">-</span> <span class="n">sorted_score_indices</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span>
</code></pre>
</div>

<hr />

<h2 id="creating-a-simple-dataset">Creating a simple dataset</h2>

<div class="pdoc-code codehilite">
<pre><span></span><code><span class="kn">from</span> <span class="nn"><a href="outrank/algorithms/synthetic_data_generators/cc_generator.html">outrank.algorithms.synthetic_data_generators.cc_generator</a></span> <span class="kn">import</span> <span class="n">CategoricalClassification</span>

<span class="n">cc</span> <span class="o">=</span> <span class="n">CategoricalClassification</span><span class="p">()</span>

<span class="c1"># Creates a simple dataset of 9 features and 10k samples; every feature has cardinality 35</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">cc</span><span class="o">.</span><span class="n">generate_data</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span>
                     <span class="mi">10000</span><span class="p">,</span>
                     <span class="n">cardinality</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span>
                     <span class="n">ensure_rep</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
                     <span class="n">random_values</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
                     <span class="n">low</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
                     <span class="n">high</span><span class="o">=</span><span class="mi">40</span><span class="p">)</span>

<span class="c1"># Creates target labels via clustering</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">cc</span><span class="o">.</span><span class="n">generate_labels</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">class_relation</span><span class="o">=</span><span class="s1">&#39;cluster&#39;</span><span class="p">)</span>
</code></pre>
</div>
</div>

<input id="mod-outrank-view-source" class="view-source-toggle-state" type="checkbox" aria-hidden="true" tabindex="-1">
@@ -326,4 +353,4 @@ <h1 id="outrank-as-a-python-library">OutRank as a Python library</h1>
}
});
</script></body>
</html>
</html>
12 changes: 6 additions & 6 deletions docs/outrank/algorithms.html
@@ -3,7 +3,7 @@
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="generator" content="pdoc 14.1.0"/>
<meta name="generator" content="pdoc 14.5.1"/>
<title>outrank.algorithms API documentation</title>

@@ -49,10 +49,10 @@ <h2>Submodules</h2>
<h1 class="modulename">
<a href="./../outrank.html">outrank</a><wbr>.algorithms </h1>





</section>
</main>
<script>
@@ -237,4 +237,4 @@ <h1 class="modulename">
}
});
</script></body>
</html>
</html>