feat: add `@stdlib/plot/table/unicode` #2407

Snehil-Shah · 2024-06-20T00:16:09Z

Towards #2067

Description

What is the purpose of this pull request?

This pull request:

adds @stdlib/plot/table/unicode

Table options:

alignment: datum's cell alignment. Default: 'right'.
borders: border characters. Default: '─ │ ─ │'.
cellPadding: cell padding. Default: 1.
columnSeparator: column separator character. Default: '│'.
corners: corner characters. Default: '┌ ┐ ┘ └'.
headerSeparator: header separator character. Default: '─'.
joints: joint characters. Default: '┼ ┬ ┤ ┴ ├'.
marginX: horizontal output margin. Default: 0.
marginY: vertical output margin. Default: 0.
maxCellWidth: maximum cell width (excluding padding). Default: FLOAT64_MAX.
maxOutputWidth: maximum output width (including margin). Default: FLOAT64_MAX.
rowSeparator: row separator character. Default: 'None'.

Table methods:

addRow(row): adds a row to data.
getData(): gets current data and headers in an object
render(): renders table
setData(data,[headers]): sets data

Related Issues

Does this pull request have any related issues?

This pull request:

subtask of [RFC]: add support for pretty printing tabular data in the REPL #2067

Questions

Any questions for reviewers of this pull request?

The current implementation allows for loose parsing. So, if the data is not exactly tabular, in some cases, it still tries to parse, adjust and make sense of the data.

These are the only two cases where this can be seen:

data = {
    'col1': [ ... ],
    'col2': [ ... ]
};
headers = [  'col1', 'col2', 'col3'  ];

Parser will automatically ignore 'col3' and value of this._headers after parsing will be [ 'col1', 'col2' ].

data = [ {  'col1': 1, 'col2': 4 }, { 'col1' : 5 } ];
headers = [  'col1', 'col2'  ];

Parser will automatically fill the missing col2 value with undefined.

Should we be raising an error instead?

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.

Read, understood, and followed the contributing guidelines.

@stdlib-js/reviewers

Signed-off-by: Snehil Shah <[email protected]>

kgryte · 2024-06-22T05:49:18Z

Re: loose parsing. I'd just raise an error. If the number of headers is off, that is probably a user bug.

When an object is missing a field, that is trickier, as it could mean missing data. However, again, I'd raise an error here. A user should arguably be explicit about what value should represent missing data (e.g., undefined, null, NaN, '', etc).
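A minimal sketch of the strict behavior suggested here, assuming the data is an array of row objects. The helper name `validateRows` and the error messages are hypothetical and not taken from the PR.

```javascript
/**
* Hypothetical helper (not part of the PR): strictly validates row objects against a list of headers.
*
* @param {Array<Object>} rows - array of row objects
* @param {Array<string>} headers - expected field names
* @throws {RangeError} each row must have exactly as many fields as there are headers
* @throws {Error} each row must define every header field
* @returns {boolean} `true` if validation succeeds
*/
function validateRows( rows, headers ) {
	var keys;
	var i;
	var j;
	for ( i = 0; i < rows.length; i++ ) {
		keys = Object.keys( rows[ i ] );

		// A mismatch in the number of fields is most likely a user bug:
		if ( keys.length !== headers.length ) {
			throw new RangeError( 'invalid argument. Row ' + i + ' has ' + keys.length + ' fields, but ' + headers.length + ' headers were provided.' );
		}
		for ( j = 0; j < headers.length; j++ ) {
			// Missing data must be spelled out by the caller (e.g., `null` or `NaN`):
			if ( !Object.prototype.hasOwnProperty.call( rows[ i ], headers[ j ] ) ) {
				throw new Error( 'invalid argument. Row ' + i + ' is missing the field `' + headers[ j ] + '`.' );
			}
		}
	}
	return true;
}
```

With this approach, the second example from the question (`{ 'col1': 5 }` with headers `col1` and `col2`) throws instead of being filled with `undefined`.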

Re: methods. I'd opt for closer parity with sparklines. Namely,

addRow => push()
getData/setData => data (accessor)

and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.

And similar to sparklines, I'd make the table an event emitter with render and change events.

Re: maxCellWidth. I wonder if it would be better to adopt the border-box box model, as in CSS. Namely, the cell width should include padding, not exclude it.
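To make the border-box idea concrete, here is a small sketch. It assumes `cellPadding` is applied on both the left and right side of a cell, which the thread does not state explicitly; `contentWidth` is a hypothetical name.

```javascript
/**
* Hypothetical helper: computes the space left for content when `maxCellWidth` includes padding (border-box).
*
* @param {number} maxCellWidth - maximum cell width, including padding
* @param {number} cellPadding - padding applied on each side (assumption)
* @returns {number} maximum content width
*/
function contentWidth( maxCellWidth, cellPadding ) {
	var w = maxCellWidth - ( 2*cellPadding );

	// Never report a negative content width:
	return ( w < 0 ) ? 0 : w;
}
```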

Re: maxOutputWidth. I'd rename to simply maxWidth, as in CSS.

Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.

One could argue that the same logic applies to cellPadding and maxCellWidth.
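One possible way to accept either form is sketched below: a string is expanded to every column, while an array must supply one alignment per column. The function name `resolveAlignment` and the error messages are illustrative assumptions.

```javascript
var ALIGNMENTS = [ 'left', 'center', 'right' ];

/**
* Hypothetical helper: normalizes an alignment option to one alignment per column.
*
* @param {(string|Array<string>)} alignment - a single alignment or one alignment per column
* @param {number} ncols - number of columns
* @throws {TypeError} must be a recognized alignment or an array of recognized alignments
* @throws {RangeError} an array must have one alignment per column
* @returns {Array<string>} alignment for each column
*/
function resolveAlignment( alignment, ncols ) {
	var out;
	var i;
	if ( typeof alignment === 'string' ) {
		if ( ALIGNMENTS.indexOf( alignment ) === -1 ) {
			throw new TypeError( 'invalid argument. Unrecognized alignment: `' + alignment + '`.' );
		}
		// Apply the same alignment to all columns:
		out = [];
		for ( i = 0; i < ncols; i++ ) {
			out.push( alignment );
		}
		return out;
	}
	if ( !Array.isArray( alignment ) ) {
		throw new TypeError( 'invalid argument. Alignment must be a string or an array of strings.' );
	}
	if ( alignment.length !== ncols ) {
		throw new RangeError( 'invalid argument. Expected ' + ncols + ' alignments, but received ' + alignment.length + '.' );
	}
	for ( i = 0; i < ncols; i++ ) {
		if ( ALIGNMENTS.indexOf( alignment[ i ] ) === -1 ) {
			throw new TypeError( 'invalid argument. Unrecognized alignment: `' + alignment[ i ] + '`.' );
		}
	}
	return alignment.slice();
}
```

The same normalization could plausibly be reused for `cellPadding` and `maxCellWidth` if they also become per-column options.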

kgryte · 2024-06-22T05:51:07Z

For the row separator default, wouldn't an empty string make more sense?

kgryte · 2024-06-22T05:52:07Z

lib/node_modules/@stdlib/plot/table/unicode/benchmark/benchmark.render.js

+
+		b.tic();
+		for ( i = 0; i < b.iterations; i++ ) {
+			str = table.setData( data(), headers() ).render();


You definitely do not want to be generating fresh data and headers for every benchmark iteration. Otherwise, you confound results.
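The point can be illustrated with a self-contained sketch in which the fixtures are built once, before the timed region. `StubTable` and `runBenchmark` are hypothetical stand-ins; the real benchmark would use the project's harness (`b.tic()`/`b.toc()`) and the actual table constructor.

```javascript
// Stand-in for the table under test; it only mimics the chained `setData().render()` call from the diff:
function StubTable() {
	this._data = [];
	this._headers = [];
}

StubTable.prototype.setData = function setData( data, headers ) {
	this._data = data;
	this._headers = headers;
	return this;
};

StubTable.prototype.render = function render() {
	var out = [ this._headers.join( ' | ' ) ];
	var i;
	for ( i = 0; i < this._data.length; i++ ) {
		out.push( this._data[ i ].join( ' | ' ) );
	}
	return out.join( '\n' );
};

function runBenchmark( iterations ) {
	var headers;
	var table;
	var start;
	var data;
	var str;
	var i;

	// Build fixtures once, outside the timed region, so only rendering is measured:
	headers = [ 'col1', 'col2' ];
	data = [ [ 1, 2 ], [ 3, 4 ] ];
	table = new StubTable();

	start = Date.now();
	for ( i = 0; i < iterations; i++ ) {
		str = table.setData( data, headers ).render();
	}
	return {
		'elapsed': Date.now() - start,
		'output': str
	};
}
```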

kgryte · 2024-06-22T05:53:16Z

lib/node_modules/@stdlib/plot/table/unicode/lib/index.js

+/**
+* Create a Unicode table.
+*
+* @module @stdlib/plot/table/unicode


Missing example code.

kgryte · 2024-06-22T05:55:47Z

lib/node_modules/@stdlib/plot/table/unicode/lib/props/row-separator/set.js

+
+// VARIABLES //
+
+var CHARACTER_LENGTH = 1;


This is incorrect. Users should be able to provide emojis, etc, which may be comprised of multiple code points. Instead, you need to check for the number of grapheme clusters.

Also, why must a row separator only be one character? Couldn't I also want a pattern? E.g., -+-+-, or something similar?
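As a rough illustration of the difference between string length and grapheme clusters, the sketch below counts clusters using the built-in `Intl.Segmenter`. This assumes a runtime which provides `Intl.Segmenter` (e.g., recent Node.js versions); the PR itself may rely on a different utility, and `numGraphemeClusters` is a name chosen for this example.

```javascript
/**
* Counts the number of grapheme clusters in a string (sketch based on `Intl.Segmenter`).
*
* @param {string} str - input string
* @returns {number} number of grapheme clusters
*/
function numGraphemeClusters( str ) {
	var segmenter = new Intl.Segmenter( undefined, {
		'granularity': 'grapheme'
	});
	var count = 0;
	var it = segmenter.segment( str )[ Symbol.iterator ]();
	while ( !it.next().done ) {
		count += 1;
	}
	return count;
}

/**
* Tests whether a string is exactly one visual character.
*
* @param {string} str - input string
* @returns {boolean} boolean indicating whether the string is a single grapheme cluster
*/
function isSingleGrapheme( str ) {
	return numGraphemeClusters( str ) === 1;
}
```

A check such as `str.length === 1` would reject an emoji with a skin tone modifier, whereas `isSingleGrapheme` treats it as one character.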

Also, why must a row separator only be one character? Couldn't I also want a pattern? E.g., -+-+-, or something similar?

I wanted to generalize the arguments. We have column separators and borders as well, and as they are vertical, having them span multiple characters makes things more complex. For instance, how do we place the corners and joints if we have a 3-character long vertical border? It does make sense for horizontal lines, but I figured it would make the API design inconsistent and messy. For instance, borders takes a top-right-bottom-left shorthand. We would have to allow only the top and bottom properties to span multiple characters (or grapheme clusters, to be accurate).

Single grapheme clusters are fine to use across the board for now. And understood regarding the difficulty in supporting multiple visual characters for vertical borders/separators. While horizontal support for multiple grapheme clusters would be straightforward to add, we can wait until a user requests such a feature.

Wait, now that I think again, maybe I was seeing it the wrong way. We can have multiple characters for a vertical line. Say if it's -*#. We just print:

-
*
#

This way, we never have a "width/thickness" of a line to be more than a single grapheme cluster so placing corners and joints becomes straightforward.
We would have to restrict joints and corners to a single grapheme cluster, though, for obvious reasons.
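The idea above could be sketched as follows. The pattern is assumed to be already split into grapheme clusters, and repeating the pattern cyclically for tables taller than the pattern is an assumption of this example, not something settled in the thread.

```javascript
/**
* Hypothetical helper: returns the character of a vertical line for a given row.
*
* @param {Array<string>} pattern - grapheme clusters making up the line pattern
* @param {number} row - zero-based row index
* @returns {string} character to print for the row
*/
function verticalChar( pattern, row ) {
	// Cycle through the pattern so that the line is always one cluster wide:
	return pattern[ row % pattern.length ];
}

/**
* Hypothetical helper: renders a vertical line spanning a number of rows.
*
* @param {Array<string>} pattern - grapheme clusters making up the line pattern
* @param {number} nrows - number of rows
* @returns {string} rendered line, one character per row
*/
function renderVertical( pattern, nrows ) {
	var out = [];
	var i;
	for ( i = 0; i < nrows; i++ ) {
		out.push( verticalChar( pattern, i ) );
	}
	return out.join( '\n' );
}
```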

That makes sense. Thanks for circling back!

Also now that we are allowing multiple character strings, I was wondering if we could change the shorthand properties ('a b c d') into an array (['a', 'b', 'c', 'd']). This would allow the user to also have spaces as part of their line characters?

That also makes sense. Same thing: either a string or an array of strings.

...meaning I shouldn't have to do ['a']; I should be able to provide both 'a' and ['a'] and have them both result in the same thing.

Sorry. This is for the shorthand properties. Yeah, just supporting an array of strings seems reasonable.
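A sketch of what the array form of the `borders` shorthand might look like, using the top-right-bottom-left order mentioned earlier in this thread. The function name `parseBorders` and the returned object shape are assumptions.

```javascript
/**
* Hypothetical helper: parses the `borders` shorthand given as an array of four strings.
*
* @param {Array<string>} value - border strings in top-right-bottom-left order
* @throws {TypeError} must be an array of four strings
* @returns {Object} border strings keyed by side
*/
function parseBorders( value ) {
	var i;
	if ( !Array.isArray( value ) || value.length !== 4 ) {
		throw new TypeError( 'invalid argument. `borders` must be an array of four strings.' );
	}
	for ( i = 0; i < value.length; i++ ) {
		if ( typeof value[ i ] !== 'string' ) {
			throw new TypeError( 'invalid argument. `borders` must be an array of four strings.' );
		}
	}
	return {
		'top': value[ 0 ],
		'right': value[ 1 ],
		'bottom': value[ 2 ],
		'left': value[ 3 ]
	};
}
```

Because elements are no longer split on whitespace, a border string may itself contain a space, e.g. `parseBorders( [ '─', ' ', '─', ' ' ] )`.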

kgryte · 2024-06-22T05:56:11Z

lib/node_modules/@stdlib/plot/table/unicode/lib/props/row-separator/set.js

+	if ( !isString( separator ) ) {
+		throw new TypeError( format( 'invalid assignment. `%s` must be a string. Value: `%s`.', 'rowSeparator', separator ) );
+	}
+	if ( separator === 'None' ) {


I wouldn't make None the sentinel.

Do you mean we should use another value to denote None (like null or undefined)?

Yes, or the empty string.

You should not use undefined as the sentinel.
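A minimal sketch of a setter that uses the empty string as the sentinel. The `state` object and both function names are hypothetical; the actual setter in the PR operates on the table instance and also validates the separator's length.

```javascript
/**
* Hypothetical setter: stores the row separator, where the empty string means "no row separator".
*
* @param {Object} state - object holding table properties (assumption)
* @param {string} separator - row separator
* @throws {TypeError} must be a string
* @returns {Object} updated state
*/
function setRowSeparator( state, separator ) {
	if ( typeof separator !== 'string' ) {
		throw new TypeError( 'invalid assignment. `rowSeparator` must be a string. Value: `' + separator + '`.' );
	}
	state.rowSeparator = separator;
	return state;
}

/**
* Hypothetical helper: tests whether rows should be separated when rendering.
*
* @param {Object} state - object holding table properties (assumption)
* @returns {boolean} boolean indicating whether to render row separators
*/
function hasRowSeparator( state ) {
	// The empty string is the sentinel for "do not render a row separator":
	return state.rowSeparator !== '';
}
```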

kgryte · 2024-06-22T05:58:41Z

lib/node_modules/@stdlib/plot/table/unicode/lib/render.js

+	* @throws {Error} output must be able to accommodate every column individually
+	* @returns {Array<number>} list of column indices
+	*/
+	function resolveWrapping() {


Rather than use a closure, I suggest either figuring out a way to move these to the parent scope or adding them as private methods on the table prototype. Otherwise, each time render is invoked, these functions have to be allocated, etc, which will hurt perf.

We can move it to the module scope and just take in the private properties as arguments? Adding them to the table prototype can be avoided (I think?), as these functions are only used when rendering and nowhere else.

That is fine, as well.
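The refactoring could look roughly like the sketch below, in which the helper is defined once at module scope and receives the state it needs as arguments. The greedy wrapping logic and the `Table` stand-in are assumptions made to keep the example runnable; they are not the PR's actual implementation.

```javascript
/**
* Module-scope helper (allocated once, not on every `render` call).
*
* Sketch only: greedily determines the column indices at which output wraps.
*
* @param {Array<number>} widths - rendered width of each column
* @param {number} maxWidth - maximum output width
* @throws {Error} output must be able to accommodate every column individually
* @returns {Array<number>} list of column indices
*/
function resolveWrapping( widths, maxWidth ) {
	var breaks = [];
	var total = 0;
	var i;
	for ( i = 0; i < widths.length; i++ ) {
		if ( widths[ i ] > maxWidth ) {
			throw new Error( 'invalid operation. Output must be able to accommodate every column individually.' );
		}
		if ( total + widths[ i ] > maxWidth ) {
			// Start a new chunk at this column:
			breaks.push( i );
			total = 0;
		}
		total += widths[ i ];
	}
	return breaks;
}

// Hypothetical stand-in for the table class:
function Table( widths, maxWidth ) {
	this._widths = widths;
	this._maxWidth = maxWidth;
}

Table.prototype.render = function render() {
	// Private properties are passed in explicitly instead of being captured by a closure:
	return resolveWrapping( this._widths, this._maxWidth );
};
```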

kgryte · 2024-06-22T06:00:27Z

lib/node_modules/@stdlib/plot/table/unicode/README.md

+
+#### UnicodeTable.prototype.alignment
+
+Alignment of datum in cell. The value must be either `'right'`, `'left'` or `'center'`.


Suggested change

Alignment of datum in cell. The value must be either `'right'`, `'left'` or `'center'`.

Alignment of datum in cell. The value must be either `'right'`, `'left'`, or `'center'`.

The project uses Oxford commas.

kgryte · 2024-06-22T06:01:16Z

lib/node_modules/@stdlib/plot/table/unicode/README.md

+data = new Float64Array( 50 );
+for ( i = 0; i < data.length; i++ ) {
+    data[ i ] = randu() * 100.0;
+}


Use @stdlib/random/array/uniform instead. That way you can avoid loops.

kgryte · 2024-06-22T06:01:59Z

lib/node_modules/@stdlib/plot/table/unicode/README.md

+headers = new Float64Array( 5 );
+for ( i = 0; i < headers.length; i++ ) {
+    headers[ i ] = randu() * 100.0;
+}


I suggest just hardcoding a list of strings here. Generating random headers is unlikely in user code.

Snehil-Shah · 2024-06-22T17:32:54Z

For the row separator default, wouldn't an empty string make more sense?

Without empty string:

┌───────┬──────┬───────┐
│  col1 │ col2 │  col3 │
├───────┼──────┼───────┤
│    45 │   33 │ hello │
│ 32.54 │ true │  null │
└───────┴──────┴───────┘

With empty string:

┌───────┬──────┬───────┐
│  col1 │ col2 │  col3 │
├───────┼──────┼───────┤
│    45 │   33 │ hello │
│       │      │       │
│ 32.54 │ true │  null │
└───────┴──────┴───────┘

Most of the prior art doesn't separate rows by default (dataframes in IPython or Jupyter, or Python's tabulate), and I think it looks better without the rows separated, too.

Snehil-Shah · 2024-06-22T17:42:06Z

Re: methods. I'd opt for closer parity with sparklines

Should I also have an argument for bufferSize that denotes the max number of rows in the table? After that, "pushing" more data would remove the "oldest" data in a cyclic manner, like we do with sparklines.

Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.

So, say the class isn't initialized with the data or headers, then we should raise an error if the user provides an array of alignments? Because in general, the alignments array should be of the same length as the number of columns?

and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.

Should we allow them to give headers if the data doesn't exist yet? (or raise an error)

kgryte · 2024-06-22T18:27:04Z

Re: bufferSize. Yes, that can make sense for streaming contexts.

Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent.

Re: headers without data. Yes, that seems reasonable and is applicable in streaming contexts, where you know the headers beforehand and are still awaiting data to be pushed.

Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.
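For the streaming case, the `bufferSize` behavior discussed above might be sketched like this. `pushRow` is a hypothetical name, and a real implementation would likely be a `push()` method on the table which also emits a change event.

```javascript
/**
* Hypothetical helper: appends a row and drops the oldest row once the buffer is full.
*
* @param {Array<Array>} rows - current rows (mutated)
* @param {Array} row - row to append
* @param {number} bufferSize - maximum number of rows to retain
* @returns {Array<Array>} updated rows
*/
function pushRow( rows, row, bufferSize ) {
	rows.push( row );
	if ( rows.length > bufferSize ) {
		// Discard the oldest row:
		rows.shift();
	}
	return rows;
}
```

Headers can be set ahead of time, and rows are then pushed as they arrive; only the most recent `bufferSize` rows are rendered.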

Snehil-Shah · 2024-06-22T18:35:36Z

Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent

Just to make sure I get you correctly, if there is no data or headers yet, and we set the alignment array for 5 columns. If the user now sets the data that depicts 9 columns, we raise an error, right?

Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.

Ah, understood, yes that makes sense.

kgryte · 2024-06-22T18:53:43Z

If the user now sets the data that depicts 9 columns, we raise an error, right?

Correct. Can raise with something like "invalid argument. Expected %d columns, but received data having %d columns.".
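A sketch of the consistency rule, assuming a hypothetical `state.ncols` field which is `null` until something (alignments, headers, or data) first conveys the number of columns. The error message follows the wording suggested above.

```javascript
/**
* Hypothetical helper: fixes the number of columns on first use and enforces it afterwards.
*
* @param {Object} state - object with an `ncols` property (`null` when not yet known)
* @param {number} ncols - number of columns implied by the value being set
* @throws {RangeError} must be consistent with the established number of columns
* @returns {Object} updated state
*/
function checkColumns( state, ncols ) {
	if ( state.ncols === null ) {
		// First value to convey a column count fixes it from this point forward:
		state.ncols = ncols;
		return state;
	}
	if ( ncols !== state.ncols ) {
		throw new RangeError( 'invalid argument. Expected ' + state.ncols + ' columns, but received data having ' + ncols + ' columns.' );
	}
	return state;
}
```

Setting an alignment array for 5 columns and then data with 9 columns corresponds to `checkColumns( state, 5 )` followed by `checkColumns( state, 9 )`, which throws.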

feat: add @stdlib/plot/table/unicode

ec04cbd

Signed-off-by: Snehil Shah <[email protected]>

kgryte added the Feature Issue or pull request for adding a new feature. label Jun 21, 2024

kgryte added the REPL Issue or pull request specific to the project REPL. label Jun 22, 2024

kgryte reviewed Jun 22, 2024
