I apologize if I'm butchering the terminology. I'm trying to understand the code in this example on how to chain a custom function onto a PySpark dataframe. I'd really like to understand exactly what it's doing, and whether it is awful practice, before I implement anything.
From the way I'm understanding the code, it:
- defines a function g with sub-functions inside of it, that returns a copy of itself
- assigns the sub-functions to g as attributes
- assigns g as a property of the DataFrame class
I don't think any of them becomes a method at any step in the process (when I do getattr, it always says "function").
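To make those three steps concrete, here is a minimal sketch of the pattern as I understand it. It uses a hypothetical stand-in class (`FakeFrame`) instead of the real PySpark DataFrame, and the names `g`, `count_rows`, and `custom` are made up for illustration, so it may not match the linked example exactly:

```python
class FakeFrame:
    """Hypothetical stand-in for the DataFrame class (an assumption)."""

    def __init__(self, rows):
        self.rows = rows


def g(self):
    # `self` is whichever instance the property is read from.
    def count_rows():
        return len(self.rows)

    # Attach the inner function to g as a plain attribute.
    g.count_rows = count_rows
    # Return the function object g itself (the same object, not a copy).
    return g


# Expose g on the class through a property.
FakeFrame.custom = property(g)

frame = FakeFrame([1, 2, 3])
result = frame.custom.count_rows()   # reading .custom runs g(frame)
kind = type(g.count_rows).__name__   # still a plain function

# The attributes live on the single shared g, so a later access from
# another instance replaces them for everyone holding a reference to g.
other = FakeFrame([1, 2])
handle = frame.custom
other.custom
shared = handle.count_rows()         # now counts rows of `other`
```

In this sketch `count_rows` stays a plain function because it is stored on the function object `g` rather than on a class, and it reaches the instance only through the closure over `self`. The last few lines suggest one reason to be cautious about the pattern: the state is shared across all instances.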
When I run a simplified version of the code (below, as best as I can manage), it seems that the attributes on the function only become available (even outside of the class) after I assign the function as a property to a class and then instantiate at least one copy of the class. I want to understand what is happening and why.
An answer [here](https://stackoverflow.com/a/17007966/19871699) indicates that this is a behavior, but doesn't really explain what/why it is. I've read this too but I'm having trouble seeing the connection to the code above.
I read here about the setattr part of the code, but the author doesn't mention exactly the use case above. This post has some use cases where people do it, but I don't understand how they apply directly to the above, unless I've missed something.
The confusing part is when the inner attributes become available.
class SampleClass():
    def __init__(self):
        pass

def my_custom_attribute(self):
    def inner_function_one():
        pass
    setattr(my_custom_attribute,"inner_function",inner_function_one)
    return my_custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
returns []
then when I do:
SampleClass.custom_attribute = property(my_custom_attribute)
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns []
but when I do:
class_instance = SampleClass()
class_instance.custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns ['inner_function']
In the code above though, if I do SampleClass.custom_attribute = my_custom_attribute instead of =property(...) the [x for x... code still returns [].
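One way to probe the difference between the two assignments is to record when the function body actually runs. The sketch below is only an experiment under my own assumptions: the names `Sample`, `attr`, `calls`, and `public_names` are hypothetical and not taken from the linked example.

```python
calls = []  # records each time the body of attr actually executes


class Sample:
    pass


def attr(self):
    calls.append("ran")
    # This attribute only exists once the body has executed.
    attr.inner = lambda: None
    return attr


def public_names(obj):
    return [name for name in dir(obj) if not name.startswith("_")]


# Plain assignment: reading the attribute on an instance produces a
# bound method object; the body of attr is not executed.
Sample.plain = attr
instance = Sample()
instance.plain
after_plain = (len(calls), public_names(attr))

# Property: assigning it to the class does not run anything either.
Sample.prop = property(attr)
after_assign = (len(calls), public_names(attr))

# Reading the property on an instance calls attr(instance), and that
# call is what performs the attribute assignment.
instance.prop
after_access = (len(calls), public_names(attr))
```

If this sketch is representative, the attributes appear when the property is read on an instance, since that read invokes the function, rather than at the moment the instance is created. With the plain assignment the body would only run if the method were explicitly called, as in `instance.plain()`.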
edit: I'm not intending to access the function itself outside of the class. I just don't understand the behavior, and don't like implementing something I don't understand.