diff --git a/docs/source/index.rst b/docs/source/index.rst
index e77ce7105..d9c74aeac 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -35,7 +35,7 @@ We hope k2 will have many other applications as well.
 
 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 3
    :caption: Contents:
 
    installation/index
diff --git a/docs/source/python_tutorials/ragged/basics.rst b/docs/source/python_tutorials/ragged/basics.rst
index 62d9fb760..b39195d18 100644
--- a/docs/source/python_tutorials/ragged/basics.rst
+++ b/docs/source/python_tutorials/ragged/basics.rst
@@ -1,72 +1,277 @@
 Basics
 ======
 
-A ragged tensor or ragged array in k2 can be used to store the following kinds of data:
+In this tutorial, we describe
 
-  1. A list of lists. Each sub list may contain different number of entries.
-     That is, they can have different lengths.
+  - What are ragged tensors?
 
-     .. code-block:: python
+  - What are the differences between ragged tensors and regular tensors?
+  - How to create ragged tensors?
 
-       a = [ [1, 2], [2, 3, 4], [], [1] ]
+  - Various concepts relevant to ragged tensors, including
 
-  2. A list-of-list of lists.
+    - What is ``RaggedShape``?
+    - What is ``row_splits``?
+    - What is ``row_ids``?
 
-     .. code-block:: python
+What are ragged tensors?
+------------------------
 
-       b = [ [[1., 2.5], [2, 3, 4]], [[]], [[10, 20], []] ]
+Before talking about what ragged tensors are, let's look at what non-ragged
+tensors, i.e., regular tensors, look like.
 
-  3. A list-of-list-of-list-of... lists. List can be nested in any number of levels.
+  - 2-D regular tensors
 
+    .. literalinclude:: code/basics/regular-tensors.py
+       :language: python
+       :lines: 8-20
 
-.. Note::
+    The shape of the 2-D regular tensor ``a`` is ``(3, 4)``, meaning it has 3
+    rows and 4 columns. Each row has **exactly** 4 elements.
 
-  Ragged arrays are the **core** data structures in k2, designed by us
-  **independently**. We were later told that TensorFlow was using the same
-  ideas (See `tf.ragged `_).
+  - 3-D regular tensors
 
-In k2, a ragged tensor contains two parts:
+    .. literalinclude:: code/basics/regular-tensors.py
+       :language: python
+       :lines: 24-45
 
-  - a shape, which is of type :class:`k2.RaggedShape`
-  - a value, which can be accessed as a **contiguous**
-    `PyTorch tensor `_.
+    The shape of the 3-D regular tensor ``b`` is ``(3, 3, 2)``, meaning it has
+    3 planes. Each plane has **exactly** 3 rows and each row has **exactly**
+    two entries.
 
-.. hint::
+  - N-D regular tensors (N >= 4)
 
-  The **value** is stored contiguously in memory.
+    We assume you know how to create N-D regular tensors.
 
+After looking at what non-ragged tensors look like, let's have a look at ragged
+tensors in ``k2``.
+
+  - 2-D ragged tensors
+
+    .. literalinclude:: code/basics/ragged-tensors.py
+       :language: python
+       :lines: 7-16
+
+    The 2-D ragged tensor ``c`` has 4 rows. However, unlike regular tensors,
+    each row in ``c`` can have a different number of elements. In this case,
+
+      - Row 0 has 5 entries: ``[1, 2, 3, 6, -5]``
+      - Row 1 has 2 entries: ``[0, 1]``
+      - Row 2 is empty. It has no entries.
+      - Row 3 has only 1 entry: ``[3]``
+
+    .. Hint::
+
+      In ``k2``, we say that ``c`` is a ragged tensor with **two axes**.
+
+  - 3-D ragged tensors
+
+    .. literalinclude:: code/basics/ragged-tensors.py
+       :language: python
+       :lines: 20-40
+
+    The 3-D ragged tensor ``d`` has 4 planes. Different from regular tensors,
+    different planes in a ragged tensor can have different numbers of rows.
+    Moreover, different rows within a plane can also have different numbers
+    of entries.
+
+    .. Hint::
+
+      In ``k2``, we say that ``d`` is a ragged tensor with **three axes**.
+
+  - N-D ragged tensors (N >= 4)
+
+    Having seen how to create 2-D and 3-D ragged tensors, we assume you know
+    how to create N-D ragged tensors.
+
+A ragged tensor in ``k2`` has ``N`` (``N >= 2``) axes. Unlike regular tensors,
+rows along an axis of a ragged tensor can have different numbers of elements.
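The distinction between regular and ragged tensors can be made concrete without ``k2``, using plain Python lists. This is only an analogy (``k2.RaggedTensor`` stores its elements contiguously, not as nested lists):

```python
# A plain-Python analogue of the 2-D ragged tensor ``c`` from the tutorial.
# This is NOT k2 code; it only illustrates what "ragged" means.
c = [
    [1, 2, 3, 6, -5],
    [0, 1],
    [],
    [3],
]

# Rows of a ragged tensor may have different lengths:
row_lengths = [len(row) for row in c]
print(row_lengths)  # [5, 2, 0, 1]

# A regular (non-ragged) 2-D tensor requires every row to have the same
# length, e.g. the 3x4 tensor ``a`` from regular-tensors.py:
a = [
    [1, 2, 3, 6],
    [0, 1, 5, 0],
    [3, 6, 8, 10],
]
assert all(len(row) == 4 for row in a)
```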
+
+Ragged tensors are **the most important** data structure in ``k2``. FSAs are
+represented as ragged tensors. There are also various operations defined on
+ragged tensors.
+
+At this point, we assume you know how to create ``N-D`` ragged tensors in ``k2``.
+Let us do some exercises to check what you have learned.
+
+Exercise 1
+^^^^^^^^^^
+
+.. container:: toggle
+
+  .. container:: header
+
+    .. Note::
+
+      How to create a ragged tensor with 2 axes, satisfying the following
+      constraints:
+
+        - It has 3 rows.
+        - Row 0 has elements: ``1, 10, -1``
+        - Row 1 is empty, i.e., it has no elements.
+        - Row 2 has two elements: ``-1.5, 2``
+
+      (Click ▶ to view the solution)
+
+  .. literalinclude:: code/basics/ragged-tensors.py
+     :language: python
+     :lines: 43-49
+
+Exercise 2
+^^^^^^^^^^
+
+.. container:: toggle
+
+  .. container:: header
+
+    .. Note::
+
+      How to create a ragged tensor with only 1 axis?
+
+      (Click ▶ to view the solution)
+
+  You **cannot** create a ragged tensor with only 1 axis. Ragged tensors
+  in ``k2`` have at least 2 axes.
+
+dtype and device
+^^^^^^^^^^^^^^^^
+
+Like tensors in PyTorch, ragged tensors in ``k2`` have attributes ``dtype`` and
+``device``. The following code shows that you can specify the ``dtype`` and
+``device`` while constructing ragged tensors.
+
+.. literalinclude:: code/basics/dtype-device.py
+   :language: python
+   :lines: 3-23
 
 .. container:: toggle
 
   .. container:: header
 
-    .. attention::
+    .. Note::
+
+      (Click ▶ to view the output)
+
+  .. literalinclude:: code/basics/dtype-device.py
+     :language: python
+     :lines: 25-50
+
+Concepts about ragged tensors
+-----------------------------
 
-      What is the dimension of the **value** as a torch tensor? (Click ▶ to see it)
+A ragged tensor in ``k2`` consists of two parts:
 
-  It depends on the data type of of the ragged tensor. For instance,
+  - ``shape``, which is an instance of :class:`k2.RaggedShape`
 
-    - if the data type is ``int32_t``, the **value** is accessed as a **1-D** torch tensor with dtype ``torch.int32``.
-    - if the data type is ``float``, the **value** is accessed as a **1-D** torch tensor with dtype ``torch.float32``.
-    - if the data type is ``double``, the **value** is accessed as a **1-D** torch tensor with dtype ``torch.float64``.
+    .. Caution::
 
-  If the data type is ``k2::Arc``, which has the following definition
-  `in C++ `_:
+      It is assumed that a shape within a ragged tensor in ``k2`` is a constant.
+      Once constructed, you are not expected to modify it. Otherwise, unexpected
+      things can happen; you will be SAD.
+
+  - ``values``, which is an **array** of type ``T``
+
+    .. Hint::
+
+      ``values`` is stored **contiguously** in memory; its entries all have to
+      be of the same type ``T``. ``T`` can be either a primitive type, such as
+      ``int``, ``float``, or ``double``, or a user-defined type. For instance,
+      ``values`` in FSAs contains arcs of type ``Arc``, which is defined in C++
+      `as follows `_:
 
   .. code-block:: c++
 
-    struct Arc {
-      int32_t src_state;
-      int32_t dest_state;
-      int32_t label;
-      float score;
-    };
+      struct Arc {
+        int32_t src_state;
+        int32_t dest_state;
+        int32_t label;
+        float score;
+      };
+
+Before explaining what ``shape`` and ``values`` contain, let us look at an example of
+how to use a ragged tensor to represent the following
+FSA (see :numref:`ragged_basics_simple_fsa_1`).
+
+.. _ragged_basics_simple_fsa_1:
+.. figure:: code/basics/images/simple-fsa.svg
+   :alt: A simple FSA
+   :align: center
+   :figwidth: 600px
+
+   A simple FSA that is to be represented by a ragged tensor.
+
+The FSA in :numref:`ragged_basics_simple_fsa_1` has 3 arcs and 3 states.
+
++---------+--------------------+--------------------+--------------------+--------------------+
+|         | src_state          | dest_state         | label              | score              |
++---------+--------------------+--------------------+--------------------+--------------------+
+| Arc 0   | 0                  | 1                  | 1                  | 0.1                |
++---------+--------------------+--------------------+--------------------+--------------------+
+| Arc 1   | 0                  | 1                  | 2                  | 0.2                |
++---------+--------------------+--------------------+--------------------+--------------------+
+| Arc 2   | 1                  | 2                  | -1                 | 0.3                |
++---------+--------------------+--------------------+--------------------+--------------------+
+
+When the above FSA is saved in a ragged tensor, its arcs are saved in a 1-D contiguous
+``values`` array containing ``[Arc0, Arc1, Arc2]``.
+At this point, you might ask:
+
+  - Since we can reconstruct the original FSA from the ``values`` array alone,
+    what's the point of saving it in a ragged tensor?
+
+With the ``values`` array alone, it is not possible to answer the following
+questions in ``O(1)`` time:
+
+  - How many states does the FSA have?
+  - How many arcs does each state have?
+  - Where do the arcs belonging to state 0 start in the ``values`` array?
+
+To answer the above questions, we introduce another 1-D array, called ``row_splits``.
+``row_splits[s] = p`` means that the first outgoing arc of state ``s`` starts at position
+``p`` in the ``values`` array. Equivalently, it indicates that the last outgoing
+arc of state ``s - 1`` ends at position ``p`` (exclusive) in the ``values`` array.
+
+In our example, ``row_splits`` would be ``[0, 2, 3, 3]``, meaning:
+
+  - The first outgoing arc for state 0 is at position ``row_splits[0] = 0``
+    in the ``values`` array
+  - State 0 has ``row_splits[1] - row_splits[0] = 2 - 0 = 2`` arcs
+  - The first outgoing arc for state 1 is at position ``row_splits[1] = 2``
+    in the ``values`` array
+  - State 1 has ``row_splits[2] - row_splits[1] = 3 - 2 = 1`` arc
+  - State 2 has no arcs since ``row_splits[3] - row_splits[2] = 3 - 3 = 0``
+  - The FSA has ``len(row_splits) - 1 = 3`` states.
+
+We can construct a ``RaggedShape`` from a ``row_splits`` array:
+
+.. literalinclude:: code/basics/ragged_shape_1.py
+   :language: python
+   :lines: 3-14
+
+Pay attention to the string form of the shape ``[ [x x] [x] [ ] ]``.
+Each ``x`` is a placeholder; it means we don't care about the actual content
+inside a ragged tensor.
+The above shape has 2 axes, 3 rows, and 3 elements. Row 0 has two elements as there
+are two ``x`` inside the 0th ``[]``. Row 1 has only one element, while
+row 2 has no elements at all. We can assign names to the axes. In our case,
+we say the shape has axes ``[state][arc]``.
+
+Combining the ragged shape and the ``values`` array, the above FSA can
+be represented using a ragged tensor ``[ [Arc0 Arc1] [Arc2] [ ] ]``.
+
+The following code displays the string form of the above FSA when represented
+as a ragged tensor in k2:
+
+.. literalinclude:: code/basics/single-fsa.py
+   :language: python
+   :lines: 2-14
+
+
+Shape
+^^^^^
+
+To be done.
-    the **value** is acessed as a **2-D** torch tensor with dtype ``torch.int32``.
-    The **2-D** tensor has 4 columns: the first column contains ``src_state``,
-    the second column contains ``dest_state``, the third column contains ``label``,
-    and the fourth column contains ``score`` (The float type is **reinterpreted** as
-    int type without any conversions).
 
+data
+^^^^
-    There are only 1-D and 2-D **value** tensors in k2 at present.
+
+TBD.
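The bookkeeping described above, together with the ``row_ids`` array mentioned at the beginning of this tutorial, can be sketched in plain Python. This is illustrative code only, not k2's implementation (k2 stores these arrays on CPU/GPU and answers the queries in C++/CUDA):

```python
# values = [Arc0, Arc1, Arc2] for the FSA above; row_splits partitions it
# into per-state segments.
row_splits = [0, 2, 3, 3]

num_states = len(row_splits) - 1  # 3
num_arcs = row_splits[-1]         # 3

# Number of outgoing arcs of each state, in O(1) per state:
arcs_per_state = [row_splits[s + 1] - row_splits[s] for s in range(num_states)]
print(arcs_per_state)  # [2, 1, 0]

# row_ids is the inverse view of row_splits: for each entry of ``values``
# it stores the row (here: the state) that the entry belongs to.
row_ids = [s for s in range(num_states) for _ in range(arcs_per_state[s])]
print(row_ids)  # [0, 0, 1]
```

Note that ``row_splits`` always has ``num_rows + 1`` entries, starting with 0 and ending with the total number of elements.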
diff --git a/docs/source/python_tutorials/ragged/code/basics/dtype-device.py b/docs/source/python_tutorials/ragged/code/basics/dtype-device.py
new file mode 100755
index 000000000..6124b0aa7
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/dtype-device.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python3
+
+import k2
+import torch
+
+a = k2.RaggedTensor([[1, 2], [1]])
+b = k2.RaggedTensor([[1, 2], [1]], dtype=torch.int32)
+c = k2.RaggedTensor([[1, 2], [1.5]])
+d = k2.RaggedTensor([[1, 2], [1.5]], dtype=torch.float32)
+e = k2.RaggedTensor([[1, 2], [1.5]], dtype=torch.float64)
+f = k2.RaggedTensor([[1, 2], [1]], dtype=torch.float32, device=torch.device("cuda", 0))
+g = k2.RaggedTensor([[1, 2], [1]], device="cuda:0", dtype=torch.float64)
+print(f"a:\n{a}")
+print(f"b:\n{b}")
+print(f"c:\n{c}")
+print(f"d:\n{d}")
+print(f"e:\n{e}")
+print(f"f:\n{f}")
+print(f"g:\n{g}")
+print(f"g.to_str_simple():\n{g.to_str_simple()}")
+print(f"a.dtype: {a.dtype}, g.device: {g.device}")
+print(f"a.to(g.device).device: {a.to(g.device).device}")
+print(f"a.to(g.dtype).dtype: {a.to(g.dtype).dtype}")
+"""
+a:
+RaggedTensor([[1, 2],
+              [1]], dtype=torch.int32)
+b:
+RaggedTensor([[1, 2],
+              [1]], dtype=torch.int32)
+c:
+RaggedTensor([[1, 2],
+              [1.5]], dtype=torch.float32)
+d:
+RaggedTensor([[1, 2],
+              [1.5]], dtype=torch.float32)
+e:
+RaggedTensor([[1, 2],
+              [1.5]], dtype=torch.float64)
+f:
+RaggedTensor([[1, 2],
+              [1]], device='cuda:0', dtype=torch.float32)
+g:
+RaggedTensor([[1, 2],
+              [1]], device='cuda:0', dtype=torch.float64)
+g.to_str_simple():
+RaggedTensor([[1, 2], [1]], device='cuda:0', dtype=torch.float64)
+a.dtype: torch.int32, g.device: cuda:0
+a.to(g.device).device: cuda:0
+a.to(g.dtype).dtype: torch.float64
+"""
diff --git a/docs/source/python_tutorials/ragged/code/basics/images/simple-fsa.svg b/docs/source/python_tutorials/ragged/code/basics/images/simple-fsa.svg
new file mode 100644
index 000000000..264164859
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/images/simple-fsa.svg
@@ -0,0 +1,53 @@
+(SVG markup lost in extraction; the file is a Graphviz rendering titled
+"WFSA" with states 0, 1, 2 and arcs 0->1 labeled "1/0.1", 0->1 labeled
+"2/0.2", and 1->2 labeled "-1/0.3".)
diff --git a/docs/source/python_tutorials/ragged/code/basics/ragged-tensors.py b/docs/source/python_tutorials/ragged/code/basics/ragged-tensors.py
new file mode 100755
index 000000000..2b153737d
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/ragged-tensors.py
@@ -0,0 +1,49 @@
+#!/usr/bin/env python3
+
+# Note: If you add/remove lines, please also
+# update the line numbers in python_tutorials/ragged/basics.rst
+
+# 2d
+import k2
+
+c = k2.RaggedTensor(
+    [
+        [1, 2, 3, 6, -5],
+        [0, 1],
+        [],
+        [3],
+    ]
+)
+
+
+# 3d
+d = k2.RaggedTensor(
+    [
+        [
+            [1],
+            [],
+            [3, 5, 8],
+        ],
+        [
+            [1, 2],
+        ],
+        [
+            [],
+            [],
+            [5, 9, -1, 10],
+            [],
+        ],
+        [
+            [],
+        ],
+    ]
+)
+
+# exercise 1
+e = k2.RaggedTensor(
+    [
+        [1, 10, -1],
+        [],
+        [-1.5, 2],
+    ]
+)
diff --git a/docs/source/python_tutorials/ragged/code/basics/ragged_shape_1.py b/docs/source/python_tutorials/ragged/code/basics/ragged_shape_1.py
new file mode 100755
index 000000000..d695024e3
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/ragged_shape_1.py
@@ -0,0 +1,16 @@
+#!/usr/bin/env python3
+
+import k2
+import torch
+
+shape = k2.ragged.create_ragged_shape2(
+    row_splits=torch.tensor([0, 2, 3, 3], dtype=torch.int32),
+)
+print(type(shape))
+print(shape)
+"""
+<class '_k2.ragged.RaggedShape'>
+[ [ x x ] [ x ] [ ] ]
+"""
+print("num_states:", shape.dim0)
+print("num_arcs:", shape.numel())
diff --git a/docs/source/python_tutorials/ragged/code/basics/regular-tensors.py b/docs/source/python_tutorials/ragged/code/basics/regular-tensors.py
new file mode 100755
index 000000000..1580b6c29
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/regular-tensors.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python3
+
+# Note: If you add/remove lines, please also
+# update the line numbers in python_tutorials/ragged/basics.rst
+
+
+# 2d
+import numpy as np
+
+a = np.array(
+    [
+        [1, 2, 3, 6],
+        [0, 1, 5, 0],
+        [3, 6, 8, 10],
+    ]
+)
+
+print("a.shape:", a.shape)
+# It prints
+# a.shape: (3, 4)
+
+# 3d
+
+b = np.array(
+    [
+        [
+            [1, 2],
+            [0, 1],
+            [3, 6],
+        ],
+        [
+            [5, 20],
+            [0, -1],
+            [-2, 9],
+        ],
+        [
+            [8, 7],
+            [-3, 3],
+            [2, -2],
+        ],
+    ]
+)
+print("b.shape:", b.shape)
+# It prints
+# b.shape: (3, 3, 2)
diff --git a/docs/source/python_tutorials/ragged/code/basics/single-fsa.py b/docs/source/python_tutorials/ragged/code/basics/single-fsa.py
new file mode 100755
index 000000000..fd99f44ac
--- /dev/null
+++ b/docs/source/python_tutorials/ragged/code/basics/single-fsa.py
@@ -0,0 +1,23 @@
+#!/usr/bin/env python3
+import k2
+
+s = """
+0 1 1 0.1
+0 1 2 0.2
+1 2 -1 0.3
+2
+"""
+fsa = k2.Fsa.from_str(s)
+print(fsa.arcs)
+"""
+[ [ 0 1 1 0.1 0 1 2 0.2 ] [ 1 2 -1 0.3 ] [ ] ]
+"""
+
+sym_str = """
+a 1
+b 2
+"""
+
+# fsa.labels_sym = k2.SymbolTable.from_str(sym_str)
+# fsa.draw("images/simple-fsa.svg")
+# print(k2.to_dot(fsa))
diff --git a/docs/source/python_tutorials/ragged/index.rst b/docs/source/python_tutorials/ragged/index.rst
index a3a84157f..86e5cd2d1 100644
--- a/docs/source/python_tutorials/ragged/index.rst
+++ b/docs/source/python_tutorials/ragged/index.rst
@@ -2,6 +2,12 @@
 Ragged tensor tutorials
 =======================
 
+.. Note::
+
+  Ragged tensors are the **core** data structures in k2, designed by us
+  **independently**. We were later told that TensorFlow was using the same
+  ideas (See `tf.ragged `_).
+
 .. toctree::
    :maxdepth: 2
diff --git a/k2/python/csrc/torch/v2/ragged_any.cu b/k2/python/csrc/torch/v2/ragged_any.cu
index ad26333e7..111800083 100644
--- a/k2/python/csrc/torch/v2/ragged_any.cu
+++ b/k2/python/csrc/torch/v2/ragged_any.cu
@@ -335,7 +335,7 @@ std::string RaggedAny::ToString(bool compact /*=false*/,
                                 int32_t device_id /*=-1*/) const {
   ContextPtr context = any.Context();
   if (context->GetDeviceType() != kCpu) {
-    return To("cpu").ToString(context->GetDeviceId());
+    return To("cpu").ToString(compact, context->GetDeviceId());
   }
 
   std::ostringstream os;
diff --git a/k2/python/csrc/torch/v2/ragged_shape.cu b/k2/python/csrc/torch/v2/ragged_shape.cu
index 81c3118ce..cb3bc8c13 100644
--- a/k2/python/csrc/torch/v2/ragged_shape.cu
+++ b/k2/python/csrc/torch/v2/ragged_shape.cu
@@ -232,8 +232,8 @@ void PybindRaggedShape(py::module &m) {
 
   m.def(
       "create_ragged_shape2",
-      [](torch::optional<torch::Tensor> row_splits,
-         torch::optional<torch::Tensor> row_ids,
+      [](torch::optional<torch::Tensor> row_splits = torch::nullopt,
+         torch::optional<torch::Tensor> row_ids = torch::nullopt,
          int32_t cached_tot_size = -1) -> RaggedShape {
         if (!row_splits.has_value() && !row_ids.has_value())
           K2_LOG(FATAL) << "Both row_splits and row_ids are None";
@@ -257,7 +257,7 @@ void PybindRaggedShape(py::module &m) {
             row_splits.has_value() ? &array_row_splits : nullptr,
             row_ids.has_value() ? &array_row_ids : nullptr, cached_tot_size);
       },
-      py::arg("row_splits"), py::arg("row_ids"),
+      py::arg("row_splits") = py::none(), py::arg("row_ids") = py::none(),
      py::arg("cached_tot_size") = -1, kCreateRaggedShape2Doc);
 
   m.def("random_ragged_shape", &RandomRaggedShape, "RandomRaggedShape",
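With the defaults added above, ``create_ragged_shape2`` can be called with only ``row_ids`` (or only ``row_splits``). Deriving ``row_splits`` from a sorted ``row_ids`` array amounts to a per-row count followed by a prefix sum; the plain-Python sketch below illustrates the idea (it is not k2's actual C++/CUDA implementation):

```python
def row_ids_to_row_splits(row_ids, num_rows=None):
    """Derive row_splits from a sorted, non-negative row_ids list.

    Illustrative sketch only; k2 implements this conversion in C++/CUDA.
    """
    if num_rows is None:
        num_rows = row_ids[-1] + 1 if row_ids else 0
    row_splits = [0] * (num_rows + 1)
    for r in row_ids:
        row_splits[r + 1] += 1  # count elements in each row
    for i in range(num_rows):
        row_splits[i + 1] += row_splits[i]  # prefix sum turns counts into offsets
    return row_splits


# row_ids [0, 0, 1] with a trailing empty row gives the tutorial's shape:
print(row_ids_to_row_splits([0, 0, 1], num_rows=3))  # [0, 2, 3, 3]
```

Passing ``num_rows`` explicitly is needed when the last rows are empty, since trailing empty rows leave no trace in ``row_ids``.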