Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create-ns: align the namespaces to 1Mib boundaries when using SI suffixes #2095

Merged
merged 1 commit into from
Feb 5, 2024

Conversation

maurizio-lombardi
Copy link
Contributor

Some controllers refuse to create namespaces with a size not aligned to a 1 megabyte boundary, this happens when using the -S and -C options.

$ nvme create-ns /dev/nvme0 -S 80M -C 80M -f 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x2)

Fix this problem by modifying parse_lba_num_si() to align the values to 1 Megabyte boundaries.

@ikegami-t
Copy link
Contributor

ikegami-t commented Oct 24, 2023

The -S and -C options were added for the issue #1747 proposed to accept SI units. Is this PR changes still able to keep and follow the issue requirement? By the way the -s and -c options can be used to set the binaly binary units instead as below I think. But those binary units values are not divided by the LBA size as different from the -S and -C options. Is it acceptable to use the binary units options instead of the PR changes?

$ nvme create-ns /dev/nvme0 -s 80Mi -c 80Mi -f 0

@igaw
Copy link
Collaborator

igaw commented Nov 2, 2023

I think we should have one pair of options which take value verbatim without any modifications and one set which do the right thing. The second set is for the casual user which just wants to create a new NS and doesn't want to figure all the low level things, e.g. I want a 8GB namespace.

IIRC, this is what we had in mind when we added the --nsze-si/-S and --ncap-si/-C option. This patch just brings it a bit closer to just do the right thing.
SI units

@ikegami-t
Copy link
Contributor

By the way for example is there any user to use the -S and -C options to align the values to 512 Kilobyte boundaries but not 1 Megabyte boundaries. If so in the case the options are not able to use for the case with the PR changes.

@maurizio-lombardi
Copy link
Contributor Author

By the way for example is there any user to use the -S and -C options to align the values to 512 Kilobyte boundaries but not 1 Megabyte boundaries. If so in the case the options are not able to use for the case with the PR changes.

That is true, but it's not a bit of a corner case? Is there a real need to create namespaces of < 1Mbyte size?
If yes, in that case, the user can still use the --nsze and --ncap parameters to create small namespaces

@HaroPanosyan
Copy link
Contributor

Not sure if namespace granularity (CNS 16h) is being accounted for in current implementation, but if controller supports it then granularity can be accounted for when creating namespaces (with same matching indices 0-15 as for LBA formats).

@igaw
Copy link
Collaborator

igaw commented Nov 7, 2023

namespace granularity (CNS 16h)

Not yet. For anyone looking for the documentation it is in NVM Command Set Specification (4.1.5.6 Namespace Granularity List (CNS 16h)). Sounds reasonable to use it if it's available.

@HaroPanosyan
Copy link
Contributor

namespace granularity (CNS 16h)

Not yet. For anyone looking for the documentation it is in NVM Command Set Specification (4.1.5.6 Namespace Granularity List (CNS 16h)). Sounds reasonable to use it if it's available.

Yup, CNS 01h will report if granularity is supported at bit 7 for "Controller Attributes (CTRATT)"

@ikegami-t
Copy link
Contributor

If the granularity is not support should we keep the current command options behavir? Or the controllers needed the size aligned to a 1 megabyte boundary are supported the granularity?

@igaw
Copy link
Collaborator

igaw commented Nov 10, 2023

If the granularity is not support should we keep the current command options behavir? Or the controllers needed the size aligned to a 1 megabyte boundary are supported the granularity?

I think it makes sense to align it to 1 MB boundary in absence of the granularity information. The worst case scenario is that we waste 1MB disk space. If you look at partitioning tools, it is not uncommon that some space is unused to alignment constrains.

As I said, I see these options here as the user friendly command line options to create namespaces. You should not need to know the spec at all to create a new namespace. As long we have the low level command line arguments around, there is always the possibility to configure a namespace exactly as you want it do be.

@ikegami-t
Copy link
Contributor

Agreed. Thanks for your explanation.

@igaw
Copy link
Collaborator

igaw commented Nov 17, 2023

@maurizio-lombardi do you plan to add the Granularity discovery bit?

@maurizio-lombardi
Copy link
Contributor Author

@maurizio-lombardi do you plan to add the Granularity discovery bit?

I am going to look at how to do that.

@igaw
Copy link
Collaborator

igaw commented Dec 7, 2023

@maurizio-lombardi any updates?

@maurizio-lombardi
Copy link
Contributor Author

maurizio-lombardi commented Dec 7, 2023 via email

@maurizio-lombardi
Copy link
Contributor Author

I updated the patch but I didn't have the opportunity to test it yet so it may not work as intended. I plan to test it this week

nvme.c Show resolved Hide resolved
nvme.c Show resolved Hide resolved
@maurizio-lombardi maurizio-lombardi force-pushed the create_ns_1mb_v2 branch 3 times, most recently from 046b788 to d5df782 Compare January 9, 2024 07:46
@igaw
Copy link
Collaborator

igaw commented Jan 22, 2024

I've lost the overview here. Is the patch ready to be merged (tested)? It looks it is in shape.

The only small thing is that the documentation is not mentioning that it will be align on nszegran/ncapgran or fallback on 1MB alignment bondary.

@maurizio-lombardi
Copy link
Contributor Author

I've lost the overview here. Is the patch ready to be merged (tested)? It looks it is in shape.

It is not, on one of my machines I get quite high values from the granularity list, for example 4 Gb of preferred granularity. (desc->nszegran = 4294967296 desc->ncapgran = 4294967296). I didn't expect the values to be so high and this makes some u32 variables to overflow, leading to divisions by zero and crashes.
Plus it makes it impossible to create, for example, a namespace with size 100Mb because it rounds up to a much larger value.

@maurizio-lombardi
Copy link
Contributor Author

It is not, on one of my machines I get quite high values from the granularity list.

By the way, the specification says that the list gives us a preferred granularity, but doesn't say that we are obliged to respect it

@igaw
Copy link
Collaborator

igaw commented Jan 23, 2024

Uff, this is a big alignment requirement! Does this mean we stick your initial idea to align to 1 MB ? Not really sure what to do here.

@maurizio-lombardi
Copy link
Contributor Author

Uff, this is a big alignment requirement! Does this mean we stick your initial idea to align to 1 MB ? Not really sure what to do here.

Me neither, I am tempted to align to 1 Mb unless the controller suggests that a smaller value is preferred.

@igaw
Copy link
Collaborator

igaw commented Jan 23, 2024

Maybe we should just do this and see if someone complains :)

@maurizio-lombardi
Copy link
Contributor Author

Maybe we should just do this and see if someone complains :)

Will modify the patch and go ahead then, unless @ikegami-t disagrees

@ikegami-t
Copy link
Contributor

Agreed. Thank you.

Copy link
Contributor

@ikegami-t ikegami-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically looks good but only some minor comments let me mention.

nvme.c Outdated
/* Only the first descriptor is valid */
index = 0;
} else if (index > gr_list->num_descriptors) {
/* The descriptor will contain only zeroes
Copy link
Contributor

@ikegami-t ikegami-t Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be split the line to start multiple lines comment as belwo below.

- 				/* The descriptor will contain only zeroes
+ 				/*
+ 				 * The descriptor will contain only zeroes

nvme.c Outdated Show resolved Hide resolved
nvme.c Outdated
return err;
}

align_nsze = align_ncap = 1 << 20; /* Default 1 Mb */
Copy link
Contributor

@ikegami-t ikegami-t Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value can be initialized by the variable definitions as for example. By the way 1 Mb should be 1 MiB. (The lowercase character b also should be uppercase charadter B. This is also same for the commit message summary 1Mib and description 1Mb.)

	__u32 nsid:
	__u32 align_nsze = 1 << 20; /* Default 1 MiB */
	__u32 align_ncap = align_nsze;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this one, I will change it

nvme.c Show resolved Hide resolved
nvme.c Show resolved Hide resolved
…ixes

Some controllers refuse to create namespaces with a size not aligned to
a 1 MiB boundary, this happens when using the -S and -C options.

$ nvme create-ns /dev/nvme0 -S 80M -C 80M -f 0
NVMe status: Invalid Field in Command: A reserved coded value or
an unsupported value in a defined field(0x2)

Fix this problem by modifying create_ns() parse_lba_num_si() to align
the values to 1 MiB boundaries. If granularity is supported, we will
use the specified values, if they are smaller than 1 MiB.

Signed-off-by: Maurizio Lombardi <[email protected]>
@igaw igaw merged commit 3d1243e into linux-nvme:master Feb 5, 2024
16 checks passed
@igaw
Copy link
Collaborator

igaw commented Feb 5, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants