PHP Internals News: Episode 86: Property Accessors
Podcast
Podcaster
Beschreibung
vor 4 Jahren
PHP Internals News: Episode 86: Property Accessors
Thursday, May 27th 2021, 09:14 BST
London, UK
In this episode of "PHP Internals News" I chat with
Nikita Popov (Twitter, GitHub, Website) about the
"Property Accessors" RFC.
The RSS feed for this podcast is
https://derickrethans.nl/feed-phpinternalsnews.xml, you
can download this episode's MP3 file, and it's available
on Spotify and iTunes. There is a dedicated website:
https://phpinternals.news
Transcript
Derick Rethans 0:14
Hi I'm Derick. Welcome to PHP internals news, a
podcast dedicated to explain the latest developments
in the PHP language. This is episode 86. Today I'm
talking with Nikita Popov about his massive property
excesses RFC. Nikita, would you please introduce
yourself?
Nikita Popov 0:32
Hi Derick, I'm Nikita, and I do work on PHP core
development, on behalf of JetBrains.
Derick Rethans 0:39
This is probably the largest RFC I've seen in a
while. What in one sentence, are you proposing to add
to PHP here?
Nikita Popov 0:46
I would say it's an alternative to magic get and set,
just for one specific property instead of all of
them. That's the technical side. Maybe I should say
something about the like motivation behind it, which
is that since PHP seven four, we have type
properties, that at least for me personally with that
feature, the need to have this typical pattern of
private property for storage, plus a public getter
and setter methods, the main motivation for that has
kind of gone away, because we can now use types to
enforce any contracts on value. And now these getter
and setter methods most if you like boilerplate. So
the idea with accessors, at least my idea with
accessors is that you really shouldn't use them. You
should just have them as a backup option. You declare
a public property in your class, and then maybe
later, years later, it turns out that okay, that
property actually requires additional validation. And
right now if you have a public property, then you
don't really have a good way of introducing that.
Only way is to either break the API contract by
converting the property into getter/setter methods
where you can introduce arbitrary code, or by using
magic get/set, which is definitely possible and
persist the API contract, it's just fairly ugly.
Derick Rethans 2:09
You changes the public property that people could
read into a private one. And because it's private,
the set and get metric methods are being called.
Nikita Popov 2:18
Exactly.
Derick Rethans 2:19
This RFC is titled Property accesses, how do these
improve on the situation?
Nikita Popov 2:24
So I think there are really two fairly orthogonal
parts to this RFC. The first part is implicit
accesses that don't have any custom behaviour, and
just allow controlling the behaviour of properties a
bit more precisely. In particular, the most important
part is probably the asymmetric visibility, where you
have a property that's publicly readable, but can
only be set from within the class. So public read/
private write. I think that's a, maybe the most
common requirement. The second part is where you can
actually introduce some custom behaviour. So where
you can say that okay, the get behaviour for this
property looks like this, and the set behaviour, it
looks like this. Which is essentially exactly the
same as what magic get/set does, just for a single
property.
Derick Rethans 3:10
For example, when you then do set, or you can add
additional validation to it.
Nikita Popov 3:14
Exactly. Originally, you had a simple public
property, then you can add a setter that checks okay
this string cannot be empty.
Derick Rethans 3:23
Okay, what it's the syntax that you're proposing?
Nikita Popov 3:26
I went with these essentially the same syntax that's
being used in C#. Looks like you write public foobar,
and then you have this sort of semi colon you have a
code block. And this code block contains two
accessors, so then you have something like get, and
another code block that specifies the get behaviour,
and set, and the code block that specifies the set
behaviour and so on.
Derick Rethans 3:52
The RFC talks about implicit and explicit
implementations of these getter and setter accessors.
What is the difference between them and how does it
look different in syntax?
Nikita Popov 4:03
Yeah so the difference is, either you can write just
get semi colon, set semicolon, that's an implicit
implementation, or you actually specify a code block
with real custom behaviour. To do the implicit
implementation, you're saying that this is really a
normal property, and PHP automatically manages the
storage for you, is that you have this more fine
grained control over how it works. Namely what you
can do is you can say that you have get and private
set. But that's a property that's publicly read only
and internally writeable. You can write just get
without set, in which case it's a real read only
property both publicly and privately, or to be more
precise, it's an init once property so you can assign
to it once.
Derick Rethans 4:52
How do you keep track of the init once?
Nikita Popov 4:53
It's same mechanism as for Type Properties, where we
distinguish between an initialized and an
uninitialized property. You can assign to an
uninitialized property, but you can't assign to an
initialized one, if it's read only. The only maybe
problem there is that this mechanism, requires that
the property actually is uninitialized to start with,
which means that for accessors you don't have any
default values. To say there is no implicit default
value, no implicit null value. If you want to have a
default value the same as with type properties you
have to specify it explicitly. Specifying a default
value really only makes sense if the property is both
readable and writable. For Read Only properties, if
you specify the default then you will you can change
that.
Derick Rethans 5:37
You have basically have created a constant.
Nikita Popov 5:39
Yes, it is essentially a constant.
Derick Rethans 5:41
You mentioned already, PHP seven four introduced type
properties. How do these types interact with the
setter and getter accessors?
Nikita Popov 5:50
I would say in the obvious way. The getter is
required to return type of property, modulo the usual
weak typing conversions, and the setter also checks
before it's called whether the passed value matches
the type or not. But enforces that matches the type.
Derick Rethans 6:08
This does mean that if you provide an explicit
implementation for the set accessor, you also need to
specify the parameter name?
Nikita Popov 6:15
No, or you can specify the parameter name, and if you
don't then that's just passed in as the value
variable. It's also inspired by how C# and Swift do
it. I mean there are some possible variations here we
could always require an explicit name, some people
for that, or I also heard that some people would like
to have the name of this implicit variable match the
name of the property, instead of always being just
value.
Derick Rethans 6:41
Would you have to specify the type though?
Nikita Popov 6:43
You wouldn't have to and you're actually not allowed
to. So the accessor implementation is somewhat strict
about not allowing you to do anything that would be
redundant because otherwise, you know, there are
quite a lot of extra things you could be adding
everywhere.
Derick Rethans 6:56
That's the same way as marking a property as private.
And then the accessors as private as well. Right?
Nikita Popov 7:03
Yeah exactly. So, then that will also say: if the
property is already private you can't, again say that
the accessors also private.
Derick Rethans 7:11
I think that's the wise thing, otherwise people go
overboard with adding private and final and whatever
everywhere anyway right.
Nikita Popov 7:18
One could argue that it's really not our business and
this is a coding style question, but you know it's
better to not leave people, with the option of doing
stupid things.
Derick Rethans 7:28
I saw in the RFC that it is also possible to use
references with the get accessor. Does this
complicated implementation and the idea of this RFC,
a lot, or just a little?
Nikita Popov 7:39
I think the important context to keep in mind here is
that we already have magic get set, and the accessors
are, like, largely based on their semantics. Magic
getters already have this distinction between
returning by value and returning by reference. The by
reference return value is primarily useful for two
cases. One and this is really the important one, is
if you're working with arrays, any write operation on
an array like setting an element or appending an
element, those require that the getter returns by
reference, because PHP will actually do the
modification on the reference. Because some people
asked about that. Why can't we just like get the
array using the getter, then make the change and then
assign back using the setter. That would
theoretically work, but it would be extremely
inefficient, and the reason is that this breaks PHP's
copy on write mechanism. If the error is returned
from the getter, then we have one array inside the
property. And we have one copy of the array inside
the property, and as the return value. Then we change
the return value and the resource is now shared, we
actually have to copy the whole array, and then we
assign it back. So effectively what we do is we copy
the array, we do single element change, and then we
copy the array, we do a single element change and
then we destroy the old array. That works in theory,
but it's so inefficient that we would not want to
promote this kind of usage.
Derick Rethans 8:42
The way around is of course, is having an implicit
methods on the class to make this change to the array
itself right?
Nikita Popov 9:10
That would be another option. Problem is that you
will need a lot of methods, I mean it's not just a
matter of setting a single element or unsetting an
element, but you can also set like a deep element
where you're not modifying the outermost array but,
like, a multi dimensional array. You would actually
have to pass through that information somehow as
well. I don't think there is a simple solution to
that problem beyond the reference based solution that
we currently use.
Derick Rethans 9:34
I saw people arguing about not bothering with
references in this new implementation at all, but I
think you've now made a good case for keeping them.
Nikita Popov 9:42
Effectively not bothering with references just means
not supporting that array use case. Which might be,
maybe a reasonable limitation, especially if we like
make a distinction and supported for the implicit
accessor case where we can, you know, do internal
magic to support that and not support it in the
explicit accessors case. I mean, people were arguing
that this reduces the complexity of the proposal, but
it kind of also increases the complexity because now
we are doing something else for the accessors and
we're doing for the magic get/set, where we already
have this established mechanism. I'm not really
convinced by that.
Derick Rethans 10:20
And I also think it creates inconsistencies in the
language itself because it does something different
with an implicit or explicit accessor, as well as it
being different between the original underscore
underscore get magic method as well.
Nikita Popov 10:34
It's not a secret that I'm not a big fan of
references, and I would certainly love to get rid of
them, but it's a hard problem, and this array
modification behaviour for magic get or for get
accessors is certainly a large part of that problem,
and I just don't have a good solution for it.
Derick Rethans 10:52
I don't either. The RFC also goes into great detail
about inheritance and variance. Would you have a few
words on that?
Nikita Popov 11:00
I think mostly inheritance works like inheritance
does for methods, at least that's how it's supposed
to work. Of course there are some interactions,
because you can for example mix real properties and
accessor properties. In which case, if you have
parent accessor property, you can always replace it
with a normal simple property, because normal
properties they support all operations that accessor
properties do. What you can't do is the other way
around. If you have a parent normal property, then
you can't replace that with an accessor property. And
reason is that it does have some limitations. Not a
lot, but there are some limitations. One of them is
related to references, I mean, we're already talking
about this topic. What the by reference get allows is
taking a reference to the property, so you can do
something like a reference equals the property. What
you can't do is the other way around the property
reference equals something else. So you can't assign
a new reference into the property, that just doesn't
work on a pretty fundamental level, because it would
require an additional set handler for set by
reference. As we don't particularly love references,
adding a new mechanism to support that is not a very
popular choice.
Derick Rethans 12:20
Variance wise, I guess, the same rules apply as for
normal properties and property types?
Nikita Popov 12:27
Approximately. Properties are apparently invariant,
so you can't change the type or I mean you can change
it but it has to be an equivalent type. If you have a
read only property, with only a getter, then the
implementation makes the type covariant, which means
you can use a smaller type in the child class. This
is similar to how if you have a getter method, you
could also give it a smaller type in the child class.
The converse case, if you have a property that can
only be set, then the type is contravariant, you can
have larger type in the child class, though I should
say that properties that can only be set are somewhat
odd and really only supported for the sake of
completeness, so maybe it might be worthwhile to drop
the type specific behaviour there, because a set only
property should already be really rare, and then set
property with a contravariant inheritance that's like
a edge case of an edge case.
Derick Rethans 13:24
Would it even make sense to support set only
properties?
Nikita Popov 13:27
Not sure. So for the C#, implementation, I think they
don't support this and there is a StackOverflow
question about that, and people try to convince
their, that they should support this, that the are
really use cases. Currently the imagined use cases
are along the lines of injecting values into a class,
so using setter injection, just that now it's
property based setter injection. Okay, I'll be honest
I think it doesn't make sense.
Derick Rethans 13:55
To be fair, I don't think either. It would reduce the
length of the RFC a little bit.
Nikita Popov 14:00
A little bit, yes.
Derick Rethans 14:01
Can you say a few words about abstracts, traits,
private accessors shadowing and things like that. So
a lot of complicated words, maybe you, you can distil
that into something slightly simpler.
Nikita Popov 14:12
Well I think actually abstract properties are worth
mentioning. In particular, the fact that you can now
specify properties inside interfaces. If you have
public properties, then it makes sense to have them
really on the same level as public methods, so they
are part of the API contract, and as such should also
be supported in interfaces. Typically what the RFC
allows is, you can't specify a simple property in the
interface, but you can specify an accessor property,
which tells you which operations have to be
supported. So you can't have a property declaration
that says, it just has a get accessor, or it has get
and set. The implementation of course can always
implement more, so if the interface requires get,
then you can implement both get and set, but it has
to implement at least get, either through an accessor
offer another property. I think in most cases
implementation will just be a normal property.
Derick Rethans 15:03
Because a normal property would implement an implicit
get already anyway?
Nikita Popov 15:07
Yeah.
Derick Rethans 15:08
How do property accessors tie in, or integrate with
constructor property promotion?
Nikita Popov 15:13
They are supported and promotion with the limitation
that it's only implicit accessors. If you use
constructor promotion, then you can specify your read
only property in there, or property that is Public
Read/ private write. You cannot specify a property
with complex behaviour in there. This is mainly
because it would mean that you embed large code
blocks into the constructor signature, which is I
think, pushing the limits of shorthand syntax, a bit.
Like there is nothing fundamental that will prevent
it, it's more a question of style.
Derick Rethans 15:50
The RFC talks a little bit about how, or rather what
happens if you use foreach, var_dump, or an array
cast on properties with explicit accessor. What are
the restrictions here? Is something chasing from
normal standard properties like we currently have.
Nikita Popov 16:03
I don't think so. So here is once again the case
where we have this distinction between the implicit
accessors, which are really just normal properties
with limitations. So those show up in var_dump and
array cast, foreach, as usual. And we have explicit
accessors, which are really virtual properties, so
they don't have any storage themselves. Any storage
to use, you have to manage separately somehow. So,
these don't show up in var_dump, foreach, and so on.
Both these actual computed properties, they don't
show up because that would require us to actually
call all the accessors if you do foreach and that
seems rather dubious to me.
Derick Rethans 16:44
How this will work for internal API's that some
extensions use to access, like a list of all the
properties, for say, for a debugger.
Nikita Popov 16:51
It'll work the same way as var_dump. I mean, in the
end it's all, well it's not quite based on the same
API's, but still, the answer is the same. You only
get those properties that have some kind of backing
storage, and those are only the ones that are either
normal properties, or the ones with implicit
accessors.
Derick Rethans 17:09
That means I need to go find out a way how to be able
to read the ones with explicit accessors.
Nikita Popov 17:14
Yeah, if you want to. I don't think that the debugger
should read those by default, because that means that
doing a dump, will have side effects, which is not
ideal, but maybe you want to have an option to show
them.
Derick Rethans 17:26
That's something for me to think about, because I'm
pretty sure people are going to want to see the
contents of these properties, even in a debugger,
even though that could mean that are side effects,
which I'm not keen on.
Nikita Popov 17:36
I guess that's one of the, I would say advantages of
using this over just magic get/set, because actually
know which properties you're supposed to look at,
with for magic get/set you just don't know at all.
Derick Rethans 17:51
The RFC talks a little bit about the performance
impacts and although I saw the numbers I didn't
actually read them, when preparing for this
recording. What are the performance impacts for
implicit accessors as well as explicit ones?
Nikita Popov 18:02
Impact is basically if you use implicit accessors
that has similar performance to plain properties,
performance is a bit worse. The reason is essentially
that we have some limitations on caching. So we can't
just cache it as if it were a normal property,
because it could have asymmetric visibility. And we
reuse the same cache slots for reads and writes. I've
been thinking about maybe splitting that up but at
least for now there is a small additional performance
impact of using implicit accessors, but it's not
really significant. On the other side if you use
explicit accessors. Those are expensive, they are not
quite as expensive as using magic get/set, but they
are more expensive than using normal method calls.
Reason is basically their normal method calls, they
are very optimized, and they do not have to re enter
the virtual machine, so we just stay in the same
virtual machine loop, and we just switch to different
stack frames. For magic get/set we actually have to
like recursively call the virtual machine, because we
don't have a good point to re enter it, at least
based on our current API's. And we also have to deal
with some additional stuff, particularly the fact
that magic get/set and property accessor as both,
they have recursion guards. Normally if you recurse
methods in PHP, we don't do any checks about that.
Xdebug does, but PHP itself doesn't, so you can
infinitely recurse and PHP is fine. The only thing
that happens is that at some point you'll run out of
memory.
Derick Rethans 19:37
Or when extensions are loaded such as Xdebug, you'll
actually still get a stack overflow.
Nikita Popov 19:41
So that's something we should still be addressing, at
least the baseline behaviour that you can get to that
memory limit error. For properties will set have
recursion guards, which say that if you recursively
access a property in magic get/set, that it will not
call magic get/set again and instead, access the
property as if they didn't exist.
Derick Rethans 20:01
Instead of throwing in an error?
Nikita Popov 20:03
Yeah. For property accessors I'm actually throwing an
error on recursion, and the reason for that is if we
didn't throw an error, then this would end up
accessing dynamic property of the same name as the
accessor, which would technically work, but it's very
likely not what the programmer actually intended. So
it's going to be really inefficient because you
actually have to allocate space for the dynamic
properties and access for those. So if you wanted to
have some kind of backing storage for the property,
then you should just explicitly declare it and access
that, rather than accessing something with the same
name and implicitly creating a dynamic property.
Derick Rethans 20:41
Yeah, that sounds all very complicated.
Nikita Popov 20:44
It's cleaner to just make it an error and let the
programmer fix it, instead of PHP try to fix it for
you.
Derick Rethans 20:51
Are there any BC considerations about the
introduction of property accessors?
Nikita Popov 20:55
Not strictly, but I'm sure that it's going to break,
various assumptions for people, or at least in the
sense that, right now, most assumptions should
already be broken through magic get/set. I mean you
can always have this kind of magic behaviour. If we
have accessors this is probably going to be a lot
more common, and people will have to deal with things
like properties being publicly readable, but
privately writable much as because someone very
rarely manually implements that behaviour, but
because the language though has native support for it
and it's going to be common.
Derick Rethans 21:28
We spoke a little bit about all the different
sticking points in his RFC, for example with
references, but there's one other thing and I think
it's an argument you make somewhere on the bottom of
the RFC, that there is a separation between implicit
and explicit property accessors. I'm wondering
whether it would make sense to consider adding
whether the implicit part of this RFC first and then
maybe later look at adding explicit property
accessors.
Nikita Popov 21:54
That's really the main sticking point, and also my
own problem with the RFC. I mean, you mentioned at
the very start, that this is a very long RFC and
still a little bit incomplete so it's going to be
longer. It's a fairly complex feature that has
complex interactions with other features in PHP. The
implementation is actually, maybe less complex, then
you think, given the RFC length. The main concern I
have is that, at least for me personally the most
useful part of the RFC, are the read only properties.
The read only properties and the like Public Read,
Private Write properties. I think these two cover
like 90% of the use cases, especially because if you
have a property that is only publicly readable, then
you don't really have to be concerned about this case
where you have to, later on, add additional
validation. I mean after all the property is read
only, or you control all the sets because they're
private. There is no danger of introducing an API
break, because you have to add additional validation.
I think like the largest part of the use case of the
whole accessors proposal will be covered by these two
things, Or maybe even just one of these two things,
that's a bit of a philosophical question. There are
some people who think we should have just public read
/ private write and no proper read only properties,
because that like looks the same from the user
perspective, but still gives you more flexibility. I
think that's like the most important use case, and we
could implement that part with a lot less language
complexity. So the question is really does it make
sense to have this full accessor proposal, if we
could get the most useful part as a separate simpler
feature, and, well, I heard differing opinions on
that one. I was actually pretty surprised that their
reception of the on like a full accessors proposal
was fairly positive. I kind of expected more
pushback, especially as, this is the second proposal
on the topic, we had earlier one with, like, similar
syntax even though different details, and that one
did fail.
Derick Rethans 24:02
How long ago was that?
Nikita Popov 24:03
Oh that was quite a while actually, at least more
than five years.
Derick Rethans 24:06
I think that the mindset of developers has changed in
the last five to 10 years, like introducing this 10
years ago would never happened, or even typed
properties, right. It would never have happened.
Nikita Popov 24:17
That's true.
Derick Rethans 24:19
Do you have any idea when you're going to put us up
for a vote? Because, of course, PHP 8.1 feature
freezes coming up in not too far away from now.
Nikita Popov 24:28
Yeah, I'm not sure about that. I'm still considering
if I want to explore the simpler alternatives, first.
There was already a proposal. Another rejected
proposal for Read Only properties, probably was
called write once properties at the time. But yeah, I
kind of do think that it might make sense to try
something like that, again, before going to the full
accessors proposal or instead.
Derick Rethans 24:54
Do you have anything else to add?
Nikita Popov 24:56
What are your thoughts on this proposal, and the
question at the end?
Derick Rethans 24:59
I quite like it, but I also think that it might make
sense to split it up. I'm always quite a fan of
splitting things up in smaller bits, if that's
possible too, and still provide quite a lot of use
out of it. And that's why I was asking whether it
makes sense to split it up into the implicit part and
the explicit part of it, especially because it makes
the implementation and the logic around it quite a
bit easier for people to understand as well.
Nikita Popov 25:24
It's maybe worth mentioning that Swift also has a
similar accessor model but it is more like a
composition of various different features like read
only properties, properties with asymmetric
visibility, and then finally properties with like
fully controlled, get and set behaviour rather than
this C# model where everything is modelled using
accessors with appropriate modifiers. So there is
certainly precedent in other languages of separating
these things.
Derick Rethans 25:55
Something to ponder about, and I'm sure we'll get to
a conclusion at some point. Hopefully some of it
before PHP eight one goes and feature freeze, of
course. We've been chatting for quite a while now, I
think we should call it the end for this RFC. Thank
you for taking the time today to talk about property
accessors.
Nikita Popov 26:11
Thanks for having me, Derick.
Derick Rethans 26:12
Thank you for listening to this installment of PHP
internals news, a podcast, dedicated to demystifying
the development of the PHP language. I maintain a
Patreon account for supporters of this podcast as
well as the Xdebug debugging tool. You can sign up
for Patreon at https://drck.me/patreon. If you have
comments or suggestions, feel free to email them to
derick@phpinternals.news. Thank you for listening and
I'll see you next time.
Show Notes
RFC: Property Accessors
Credits
Music: Chipper Doodle v2 — Kevin MacLeod
(incompetech.com) — Creative Commons: By
Attribution 3.0
Weitere Episoden
vor 3 Jahren
vor 3 Jahren
In Podcasts werben
Kommentare (0)