Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create abstraction API around PERL_MAGIC_vstring #23075

Open
wants to merge 2 commits into
base: blead
Choose a base branch
from

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Mar 5, 2025

This PR creates two new API functions, sv_vstring_get and sv_vstring_set for interacting with what is currently implemented as PERL_MAGIC_vstring. By using this API callers no longer need to know about details of how that magic is implemented (or that it even is magic in the first place).

This is for similar reasons to the API added by #22971, as it gives future flexibility on how this behaviour is actually implemented.

I believe that the Storable module is supposed to be dual-life, so I added back-compat wrappers in Storable.xs to make it continue to work on older Perls before this API was added.

sv.c Outdated
{
PERL_ARGS_ASSERT_SV_VSTRING_SET;

sv_magic(sv, NULL, PERL_MAGIC_vstring, pv, len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant to happen if sv already has a vstring set?

sv_magic() will silently refuse to add the same magic again.

For normal assignment sv_force_normal_flags() removes the vstring magic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yes that's a fun question. The existing use-cases where this API was called in core or Storable.xs would know that the SV didn't already have the magic, but as a more general API it might not know that. Perhaps set should update any existing magic to change the stored values? This is going to mean a Safefree() and savepvn() call. Hrm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, thinking further: I can add some code for that case, but there's currently no code that would exercise it to test it. I guess I can add some in XS-APItest or somesuch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now with behaviour updated and some tests added.

Comment on lines +299 to +320
#ifndef sv_vstring_get
#define sv_vstring_get(sv,pvp,lenp) S_sv_vstring_get(aTHX_ sv,pvp,lenp)
static bool S_sv_vstring_get(pTHX_ SV *sv, const char **pvp, STRLEN *lenp)
{
MAGIC *mg;
if(!SvMAGICAL(sv) || !(mg = mg_find(sv, PERL_MAGIC_vstring)))
return FALSE;

*pvp = mg->mg_ptr;
*lenp = mg->mg_len;
return TRUE;
}
#endif

#ifndef sv_vstring_set
#define sv_vstring_set(sv,pv,len) \
STMT_START { \
sv_magic((sv), NULL, PERL_MAGIC_vstring, (pv), (len)); \
SvRMAGICAL_on(sv); \
} STMT_END
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This belongs in Devel::PPPort

Copy link
Contributor

@Leont Leont Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like an excellent idea that I fully support someone else doing.

Really - I have no idea how to edit or maintain PPPort. The entire process seems utterly opaque to me. If someone else wants to copypaste it over there then I can update this PR to match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a more general problem that almost no-one knows how to do that. I suspect that the real solution is that several of us learn to do so because otherwise this keeps being a recurring problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help getting this into D:P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's fine with me to merge this, and we will release a new D:P later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like overall the process for adding this to D::P is nontrivial enough, that I'd suggest it might be easier to first merge this PR as-is, and then as a separate change, move these support helpers out to there. That way it doesn't hold up this particular change and the work depending on it.

*/

void
Perl_sv_vstring_set(pTHX_ SV * const sv, const char *pv, STRLEN const len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't Perl_sv_vstring_set() sanitizing the user's input args const char *pv, STRLEN len? Why or why not, are both good answers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the question at all.

@bulk88
Copy link
Contributor

bulk88 commented Mar 6, 2025

What happens if arg SV* dest_sv passed by caller into sv_vstring_set() is already IOK_on/NOK_on/POK_on/ROK_on/SvIsCOW_on?

What if PP-level SvIVX() vs SvNVX() vs SvPVX() vs sv_vstring_set(sv,pv,len) are 4 different random [i.e. uninit mem] PP $scalar values? What should sv_vstring_set() do? something? nothing? any answer is a good answer


If the given SV has vstring magic, stores the string pointer and length into
the variables addressed by C<pvp> and C<lenp>, and returns true. If not,
returns false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a sentence saying,

stores the string pointer and length into
the variables addressed by C<pvp> and C<lenp>. The lifespan
of string pointer is until the next FREETMPS;.

or

The lifespan of string pointer is until next LEAVE;.

or

The lifespan of string pointer is of undefined storage and from
undefined allocator.   Memcpy() it ASAP if necessary.  It is not documented
 if string pointer is backed by a Newx()/Safesysmalloc() block,
somewhere into the middle of a Newx() block, or is a SVPV COW,
or is a shared string table HEK or is a temp ptr owned by MORTAL
or SAVESTACK or is a process global ptr to a C const "" string literal.

Version # 3 documents lifetime as TBD/opaque/UB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of those apply; but I've now added some words to point out the string buffer is owned by the SV itself (or at least, the magic annotation on it)

@@ -2597,18 +2620,16 @@ static int store_scalar(pTHX_ stcxt_t *cxt, SV *sv)
string:

#ifdef SvVOK
if (SvMAGICAL(sv) && (mg = mg_find(sv, 'V'))) {
if (sv_vstring_get(sv, &vstr_pv, &vstr_len)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were a serializing XS mod, and performance is important, sv_vstring_get needs to be #define SvVSTRING(sv, pv, n) with a fast shortcut on SvMAGICAL(sv) && before executing "heavy" extern exported PLT/GOT C symbol Perl_sv_vstring_get().


bool
Perl_sv_vstring_get(pTHX_ SV * const sv, const char **pvp, STRLEN *lenp)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is retval of Perl_sv_vstring_get of type bool and not of type STRLEN?

What does if(SvVOK(sv) && SvVCUR(sv) == 0) { NOOP; } mean on a PP/XS/C level?

why not move arg 3 STRLEN *lenp to retval?

Code like if(!len) { die("bad input"); } has universal meaning.

if(1) {
        sv_setpvs(TARG, "0 but true");
        XPUSHTARG;
        RETURN;
    }

is very rare in Perl/all SW dev.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is retval of Perl_sv_vstring_get of type bool and not of type STRLEN?

It matches the similar API shape I added with sv_regex_global_pos_get().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of more use of retvals where no ambiguity is possible. Should sv_regex_global_pos_get() also return its STRLEN?

@leonerd leonerd force-pushed the sv_vstring-API branch 2 times, most recently from 26f92cf to 49d4e7b Compare March 7, 2025 20:14
Comment on lines +17909 to +17915
=for apidoc sv_vstring_set

Applies vstring magic to the given SV, storing the string given by C<pv> for
C<len> bytes into it. If the SV already had vstring magic, the previous
stored string is freed and replaced.

=cut
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably point at upg_version() and new_version(), which is what you want if you're not using a pre-parsed version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this...

Storable uses this mechanism to restore the magic to match an SV it was originally provided, eg. from a vX.Y:

$ perl -MDevel::Peek -e 'Dump(v1.1.2)'
SV = PVMG(0x556625c6bd00) at 0x556625c29fe0
  REFCNT = 1
  FLAGS = (RMG,POK,IsCOW,READONLY,PROTECT,pPOK)
  IV = 0
  NV = 0
  PV = 0x556625c325e0 "\x01\x01\x02"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 0
  MAGIC = 0x556625c531d0
    MG_VIRTUAL = 0
    MG_TYPE = PERL_MAGIC_vstring(V)
    MG_LEN = 6
    MG_PTR = 0x556625c1bb60 "v1.1.2"

Note that the magic matches the PV.

I do think that the docs should indicate this is only useful when the PV and the magic are kept in sync (or someone is going to see unexpected results).

leonerd added 2 commits March 10, 2025 17:21
Adds two new API functions to abstract out applying or querying the
`PERL_MAGIC_vstring` annotation, used to create vstring values. Code can
now use these API functions without needing to be aware of the inner
details involved in how the magic currently works.
@leonerd
Copy link
Contributor Author

leonerd commented Mar 10, 2025

Having talked with @khwilliamson both on IRC and email, we are progressing with looking at how to add this to Devel::PPPort, but it seems the process is somewhat more involved than anticipated. It would be nice if that delay didn't hold up this PR, because it would ba shame for all API additions to be held hostage by the bus-factor of PPPort.

@Leont
Copy link
Contributor

Leont commented Mar 11, 2025

Having thought more about it, I'm not sure sv_vstring_set should be public API (though it may well be fine as internal API). Nor am I sure it has the right shape in the first place. In its current form, you can easily set a vstring value that doesn't match with its string value, which seems wrong.

@leonerd
Copy link
Contributor Author

leonerd commented Mar 12, 2025

Having thought more about it, I'm not sure sv_vstring_set should be public API (though it may well be fine as internal API). Nor am I sure it has the right shape in the first place. In its current form, you can easily set a vstring value that doesn't match with its string value, which seems wrong.

I'd be perfectly happy saying it shouldn't be public API. Unfortunately, Storable.xs needs to call it. So that's a bit out of the bag.

How about, instead of two separate steps of "make a plain SVt_PV", "add vstring magic to it", there was a single API function on "make a vstring, with these two stringy parts"? Effectively a dualstring var. ;)

Perhaps that's a more robust solution here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants