docs
lhoestq committed Nov 2, 2021
1 parent dcaa3c0 commit 212b8ba
Showing 4 changed files with 5 additions and 3 deletions.
1 change: 1 addition & 0 deletions .circleci/deploy.sh
@@ -34,6 +34,7 @@ deploy_doc "master" master

 # Example of how to deploy a doc on a certain commit (the commit doesn't have to be on the master branch).
 # The following commit would live on huggingface.co/docs/datasets/v1.0.0
+deploy_doc "dcaa3c0" v1.15.0
 deploy_doc "ec82422" v1.14.0
 deploy_doc "10dc68c" v1.13.3
 deploy_doc "e82164f" v1.13.2
3 changes: 2 additions & 1 deletion docs/source/_static/js/custom.js
@@ -1,6 +1,6 @@
 // These two things need to be updated at each release for the version selector.
 // Last stable version
-const stableVersion = "v1.14.0"
+const stableVersion = "v1.15.0"
 // Dictionary doc folder to label
 const versionMapping = {
     "master": "master",
@@ -36,6 +36,7 @@ const versionMapping = {
     "v1.13.2": "v1.13.2",
     "v1.13.3": "v1.13.3",
     "v1.14.0": "v1.14.0",
+    "v1.15.0": "v1.15.0",
 }

 function addIcon() {
2 changes: 1 addition & 1 deletion setup.py
@@ -218,7 +218,7 @@

 setup(
     name="datasets",
-    version="1.15.0",  # expected format is one of x.y.z.dev0, or x.y.z.rc1 or x.y.z (no to dashes, yes to dots)
+    version="1.15.1.dev0",  # expected format is one of x.y.z.dev0, or x.y.z.rc1 or x.y.z (no to dashes, yes to dots)
     description="HuggingFace community-driven open-source library of datasets",
     long_description=open("README.md", "r", encoding="utf-8").read(),
     long_description_content_type="text/markdown",
2 changes: 1 addition & 1 deletion src/datasets/__init__.py
@@ -18,7 +18,7 @@
 # pylint: enable=line-too-long
 # pylint: disable=g-import-not-at-top,g-bad-import-order,wrong-import-position

-__version__ = "1.15.0"
+__version__ = "1.15.0.dev0"

 import pyarrow
 from packaging import version as _version
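
The comment in setup.py describes the expected version format as x.y.z.dev0, x.y.z.rc1, or x.y.z, with dots and no dashes. As a rough illustration only, that rule could be checked with a small regular expression. The function name `is_expected_version` is hypothetical and not part of the repository, and accepting any `rc` number (not just `rc1`) is an assumption.

```python
import re

# Hypothetical helper, not from the datasets repository.
# Assumed reading of the setup.py comment: three dot-separated integers,
# optionally followed by ".dev0" or ".rc<N>". Allowing any <N> after "rc"
# is an assumption; the comment only shows "rc1".
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+(\.dev0|\.rc\d+)?")


def is_expected_version(version: str) -> bool:
    """Return True if `version` matches the assumed x.y.z[.dev0|.rcN] format."""
    return _VERSION_RE.fullmatch(version) is not None
```

Under these assumptions, both version strings touched by this commit (`1.15.1.dev0` and `1.15.0.dev0`) would pass, while a dashed string such as `1.15.0-dev0` would not.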

1 comment on commit 212b8ba

@github-actions

Show benchmarks

PyArrow==3.0.0

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new | old | diff |
|---|---|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.086036 | 0.011353 | 0.074683 |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.005126 | 0.011008 | -0.005882 |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.042392 | 0.038508 | 0.003884 |
| read_batch_unformated after write_array2d | 0.044766 | 0.023109 | 0.021657 |
| read_batch_unformated after write_flattened_sequence | 0.396076 | 0.275898 | 0.120178 |
| read_batch_unformated after write_nested_sequence | 0.413177 | 0.323480 | 0.089697 |
| read_col_formatted_as_numpy after write_array2d | 0.092739 | 0.007986 | 0.084753 |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.005076 | 0.004328 | 0.000748 |
| read_col_formatted_as_numpy after write_nested_sequence | 0.011115 | 0.004250 | 0.006864 |
| read_col_unformated after write_array2d | 0.047168 | 0.037052 | 0.010116 |
| read_col_unformated after write_flattened_sequence | 0.375784 | 0.258489 | 0.117295 |
| read_col_unformated after write_nested_sequence | 0.420618 | 0.293841 | 0.126778 |
| read_formatted_as_numpy after write_array2d | 0.121157 | 0.128546 | -0.007389 |
| read_formatted_as_numpy after write_flattened_sequence | 0.013939 | 0.075646 | -0.061707 |
| read_formatted_as_numpy after write_nested_sequence | 0.337046 | 0.419271 | -0.082225 |
| read_unformated after write_array2d | 0.070133 | 0.043533 | 0.026600 |
| read_unformated after write_flattened_sequence | 0.410472 | 0.255139 | 0.155333 |
| read_unformated after write_nested_sequence | 0.431049 | 0.283200 | 0.147849 |
| write_array2d | 0.098481 | 0.141683 | -0.043201 |
| write_flattened_sequence | 2.140161 | 1.452155 | 0.688007 |
| write_nested_sequence | 2.282010 | 1.492716 | 0.789294 |

Benchmark: benchmark_getitem_100B.json

| metric | new | old | diff |
|---|---|---|---|
| get_batch_of_1024_random_rows | 0.250888 | 0.018006 | 0.232882 |
| get_batch_of_1024_rows | 0.580483 | 0.000490 | 0.579993 |
| get_first_row | 0.006758 | 0.000200 | 0.006558 |
| get_last_row | 0.000136 | 0.000054 | 0.000081 |

Benchmark: benchmark_indices_mapping.json

| metric | new | old | diff |
|---|---|---|---|
| select | 0.049058 | 0.037411 | 0.011647 |
| shard | 0.030439 | 0.014526 | 0.015913 |
| shuffle | 0.046092 | 0.176557 | -0.130465 |
| sort | 0.257550 | 0.737135 | -0.479585 |
| train_test_split | 0.036505 | 0.296338 | -0.259834 |

Benchmark: benchmark_iterating.json

| metric | new | old | diff |
|---|---|---|---|
| read 5000 | 0.679373 | 0.215209 | 0.464163 |
| read 50000 | 6.785279 | 2.077655 | 4.707624 |
| read_batch 50000 10 | 2.624985 | 1.504120 | 1.120865 |
| read_batch 50000 100 | 2.180524 | 1.541195 | 0.639330 |
| read_batch 50000 1000 | 2.225785 | 1.468490 | 0.757295 |
| read_formatted numpy 5000 | 0.748613 | 4.584777 | -3.836164 |
| read_formatted pandas 5000 | 7.066466 | 3.745712 | 3.320754 |
| read_formatted tensorflow 5000 | 1.649898 | 5.269862 | -3.619963 |
| read_formatted torch 5000 | 1.580774 | 4.565676 | -2.984902 |
| read_formatted_batch numpy 5000 10 | 0.108097 | 0.424275 | -0.316178 |
| read_formatted_batch numpy 5000 1000 | 0.014664 | 0.007607 | 0.007057 |
| shuffled read 5000 | 0.854501 | 0.226044 | 0.628456 |
| shuffled read 50000 | 8.111027 | 2.268929 | 5.842099 |
| shuffled read_batch 50000 10 | 3.217569 | 55.444624 | -52.227056 |
| shuffled read_batch 50000 100 | 2.570098 | 6.876477 | -4.306378 |
| shuffled read_batch 50000 1000 | 2.598144 | 2.142072 | 0.456072 |
| shuffled read_formatted numpy 5000 | 0.939936 | 4.805227 | -3.865291 |
| shuffled read_formatted_batch numpy 5000 10 | 0.186701 | 6.500664 | -6.313963 |
| shuffled read_formatted_batch numpy 5000 1000 | 0.081328 | 0.075469 | 0.005859 |

Benchmark: benchmark_map_filter.json

| metric | new | old | diff |
|---|---|---|---|
| filter | 2.109425 | 1.841788 | 0.267638 |
| map fast-tokenizer batched | 17.699637 | 8.074308 | 9.625329 |
| map identity | 44.383187 | 10.191392 | 34.191795 |
| map identity batched | 0.923507 | 0.680424 | 0.243083 |
| map no-op batched | 0.694790 | 0.534201 | 0.160589 |
| map no-op batched numpy | 0.494614 | 0.579283 | -0.084669 |
| map no-op batched pandas | 0.758903 | 0.434364 | 0.324540 |
| map no-op batched pytorch | 0.356873 | 0.540337 | -0.183465 |
| map no-op batched tensorflow | 0.381020 | 1.386936 | -1.005916 |

PyArrow==latest

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new | old | diff |
|---|---|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.081665 | 0.011353 | 0.070312 |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.005249 | 0.011008 | -0.005759 |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.038060 | 0.038508 | -0.000448 |
| read_batch_unformated after write_array2d | 0.041264 | 0.023109 | 0.018155 |
| read_batch_unformated after write_flattened_sequence | 0.389506 | 0.275898 | 0.113608 |
| read_batch_unformated after write_nested_sequence | 0.434492 | 0.323480 | 0.111012 |
| read_col_formatted_as_numpy after write_array2d | 0.192606 | 0.007986 | 0.184620 |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.005502 | 0.004328 | 0.001173 |
| read_col_formatted_as_numpy after write_nested_sequence | 0.009378 | 0.004250 | 0.005128 |
| read_col_unformated after write_array2d | 0.053254 | 0.037052 | 0.016202 |
| read_col_unformated after write_flattened_sequence | 0.399241 | 0.258489 | 0.140752 |
| read_col_unformated after write_nested_sequence | 0.458236 | 0.293841 | 0.164395 |
| read_formatted_as_numpy after write_array2d | 0.110272 | 0.128546 | -0.018274 |
| read_formatted_as_numpy after write_flattened_sequence | 0.014364 | 0.075646 | -0.061282 |
| read_formatted_as_numpy after write_nested_sequence | 0.352157 | 0.419271 | -0.067115 |
| read_unformated after write_array2d | 0.063252 | 0.043533 | 0.019720 |
| read_unformated after write_flattened_sequence | 0.396742 | 0.255139 | 0.141603 |
| read_unformated after write_nested_sequence | 0.421569 | 0.283200 | 0.138369 |
| write_array2d | 0.110900 | 0.141683 | -0.030783 |
| write_flattened_sequence | 2.166056 | 1.452155 | 0.713901 |
| write_nested_sequence | 2.242732 | 1.492716 | 0.750016 |

Benchmark: benchmark_getitem_100B.json

| metric | new | old | diff |
|---|---|---|---|
| get_batch_of_1024_random_rows | 0.258913 | 0.018006 | 0.240907 |
| get_batch_of_1024_rows | 0.578193 | 0.000490 | 0.577703 |
| get_first_row | 0.019052 | 0.000200 | 0.018852 |
| get_last_row | 0.000184 | 0.000054 | 0.000130 |

Benchmark: benchmark_indices_mapping.json

| metric | new | old | diff |
|---|---|---|---|
| select | 0.045491 | 0.037411 | 0.008080 |
| shard | 0.028703 | 0.014526 | 0.014177 |
| shuffle | 0.029667 | 0.176557 | -0.146889 |
| sort | 0.223650 | 0.737135 | -0.513485 |
| train_test_split | 0.031995 | 0.296338 | -0.264343 |

Benchmark: benchmark_iterating.json

| metric | new | old | diff |
|---|---|---|---|
| read 5000 | 0.692688 | 0.215209 | 0.477479 |
| read 50000 | 6.942054 | 2.077655 | 4.864399 |
| read_batch 50000 10 | 2.830440 | 1.504120 | 1.326320 |
| read_batch 50000 100 | 2.328502 | 1.541195 | 0.787307 |
| read_batch 50000 1000 | 2.332783 | 1.468490 | 0.864293 |
| read_formatted numpy 5000 | 0.769303 | 4.584777 | -3.815474 |
| read_formatted pandas 5000 | 6.992523 | 3.745712 | 3.246811 |
| read_formatted tensorflow 5000 | 1.654404 | 5.269862 | -3.615458 |
| read_formatted torch 5000 | 1.555262 | 4.565676 | -3.010415 |
| read_formatted_batch numpy 5000 10 | 0.091155 | 0.424275 | -0.333120 |
| read_formatted_batch numpy 5000 1000 | 0.015598 | 0.007607 | 0.007991 |
| shuffled read 5000 | 0.850882 | 0.226044 | 0.624838 |
| shuffled read 50000 | 8.572064 | 2.268929 | 6.303136 |
| shuffled read_batch 50000 10 | 3.441942 | 55.444624 | -52.002682 |
| shuffled read_batch 50000 100 | 2.731902 | 6.876477 | -4.144575 |
| shuffled read_batch 50000 1000 | 2.798595 | 2.142072 | 0.656522 |
| shuffled read_formatted numpy 5000 | 1.002551 | 4.805227 | -3.802676 |
| shuffled read_formatted_batch numpy 5000 10 | 0.181225 | 6.500664 | -6.319439 |
| shuffled read_formatted_batch numpy 5000 1000 | 0.075823 | 0.075469 | 0.000354 |

Benchmark: benchmark_map_filter.json

| metric | new | old | diff |
|---|---|---|---|
| filter | 1.991744 | 1.841788 | 0.149957 |
| map fast-tokenizer batched | 16.360135 | 8.074308 | 8.285827 |
| map identity | 44.405277 | 10.191392 | 34.213885 |
| map identity batched | 1.017227 | 0.680424 | 0.336803 |
| map no-op batched | 0.702348 | 0.534201 | 0.168147 |
| map no-op batched numpy | 0.557702 | 0.579283 | -0.021582 |
| map no-op batched pandas | 0.757207 | 0.434364 | 0.322843 |
| map no-op batched pytorch | 0.350513 | 0.540337 | -0.189825 |
| map no-op batched tensorflow | 0.349599 | 1.386936 | -1.037337 |
