Skip to content

Conversation

@dafnapension
Copy link
Collaborator

@dafnapension dafnapension commented Sep 20, 2025

remove register_all_artifacts from unitxt.__init__. Instantiate classes upon demand (only the needed ones) by finding the module, and then use import tools.

@yoavkatz
Copy link
Member

Hi Dafna. Indeed the requirement to register classes is a cumbersome one. If I understand right, this proposed solution, searchs for a class in the unitxt folders. Unitxt allows extension in external catalogs and code, and people use it, so the search path to find classes is not only unitxt. I'm not sure how this approach can be extended to cover this.

@dafnapension
Copy link
Collaborator Author

Thank you, @yoavkatz , this indeed is one missing piece of information for me: Where else is register_all_classes invoked, and over which modules/classes? Does the usual mode of operation of a user contain the register_all_classes that I see in src/unitxt/__init__, as well as other register_all_artifacts? Will you share an example of such with me?
Are all classes, registered in all points of registration, registered together in Artifact._class_register ?

Will a high-level registration of such points of class-registration in the project directory do as a solution for the issue you raised?

@yoavkatz
Copy link
Member

I think the way things worked before, is that once you imported a class, it registered it.

Do if I extended unitxt and did

from mymodule import MyMetric

it would register MyMetric, and I load object that include MyMetric objects.

If I did not have the import , I would get the error:

my_metric is not a recognized artifact 'type'. Make sure a the class defined this type (Probably called 'MyMetric' or similar) is defined and/or imported anywhere in the code executed."

@dafnapension
Copy link
Collaborator Author

Thank you, @yoavkatz , I need to exercise to understand. But at any rate, everything that runs today in main, also runs in the PR. The only thing that is gone from the code in this PR is the invocation of register_all_artifacts from src/unitxt/__init__ (and I also removed the definition of the method). So, even if not comprehensive, this PR at least saves that invocation (from __init__).

I apologize that I do not understand..: Is there another point in the code where register_all_artifacts is invoked? for registering classes defined by the user? If so, I at least need to return the definition of register_all_artifacts to the code, as I removed it for becoming unemployed. If no place other than src/unitxt/__init__ exists where register_all_classes is explicitly invoked, but, rather, classes are registered as they are imported, then this PR does nothing that hurts such a process.

All in all, this PR replaces the invocation of register_all_artifacts from src/unitxt/__init__ , that registers all and only the classes in src/unitxt/*.py, by an on-demand method that instantiates any such unitxt class, given the snake_case_name of the class.
Nothing else is changed. Not even the format of the catalog. (there exists another PR that does change the format of the __type__ fields in the catalog, which allows a dramatically faster instantiation of the unitxt classes, also on-demand). So all other processes continue as they are today.

If I can understand whether and where another invocation of register_all_artifacts occurs in the code, I can try replace it in analogy.

@dafnapension dafnapension force-pushed the module_name_same_catalog branch from e4db288 to 16124e7 Compare September 21, 2025 11:22
@dafnapension dafnapension force-pushed the module_name_same_catalog branch from 16124e7 to 80fba24 Compare October 12, 2025 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants