In the `ChangeListExecutor` class, the `changelist_generator` collects all the resources from a previously generated resourcelist using the `update_previous_state()` method. This was reasonable for the filesystem-centric approach of rspub-core, but in general it doesn't scale (I'm working with ~70 million resources).
I guess the only reason for doing so is to be able to perform this check, which again is reasonable when you have file system resources. What should happen instead is that your resource generator lists the changes itself and labels them as C/U/D (created/updated/deleted) without relying on py-resourcesync. You should therefore use a specific generator for "change" resources (or make a generator able to issue either resources or changes, depending on the `strategy`).
What I mean is something like:
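A minimal sketch of the idea, not the original snippet: the data source already knows which records were created, updated or deleted, so the generator only translates those labels into change entries and never loads the previous resourcelist. The names `Change`, `CHANGE_TYPES` and `generate_changes` are hypothetical and are not part of py-resourcesync; the input rows are assumed to come from something like a database query.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Change:
    """Hypothetical change record; not a py-resourcesync class."""
    uri: str
    change: str   # "created", "updated" or "deleted"
    lastmod: str  # UTC timestamp, e.g. 2020-01-01T00:00:00Z


# Short labels used in the discussion, mapped to ResourceSync change types.
CHANGE_TYPES = {"C": "created", "U": "updated", "D": "deleted"}


def generate_changes(
    rows: Iterable[Tuple[str, str, datetime]]
) -> Iterator[Change]:
    """Yield one Change per (uri, label, modified) row.

    Rows are consumed lazily, so memory use does not grow with the
    number of resources. `modified` is assumed to be timezone-aware.
    """
    for uri, label, modified in rows:
        stamp = modified.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        yield Change(uri=uri, change=CHANGE_TYPES[label], lastmod=stamp)
```

Because the generator is lazy, the executor could write each entry to the changelist as it arrives, which is what matters with tens of millions of resources.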
I left this part of the code as it was in rspub-core, so I will have to look into this in detail to understand what is happening. In the meantime, if you would like to submit a PR, please do so :-).
I'm trying to overcome this limitation keeping in mind the specific use case provided by CORE. This means that I'm working on something not really general, although it may be a good starting point. As soon as I have more time I will definitely work on a PR.
Btw, the general approach used by rspub-core for changes was:
1. Parse the old resourcelist.
2. Apply the changes that are already recorded in previous changelists, if any.
3. Compare the result with the current resources and write the differences.
As you can imagine, when you have 70 million resources, this is time, CPU and memory consuming. I discussed this with Henk back in the day, and he confirmed that this was something we needed to work on.
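The three steps above can be sketched roughly as follows. This is an illustration of the snapshot-and-diff approach, not the actual rspub-core code: `replay` and `diff` are hypothetical helpers, and each resource is assumed to be reduced to a URI plus an MD5 string.

```python
from typing import Dict, Iterable, Iterator, Tuple


def replay(
    previous: Dict[str, str],
    published: Iterable[Tuple[str, str, str]],
) -> Dict[str, str]:
    """Steps 1 and 2: start from the old resourcelist (uri -> md5) and
    apply the (uri, change, md5) entries of earlier changelists."""
    state = dict(previous)  # the whole previous state is held in memory
    for uri, change, md5 in published:
        if change == "deleted":
            state.pop(uri, None)
        else:  # "created" or "updated"
            state[uri] = md5
    return state


def diff(
    state: Dict[str, str],
    current: Dict[str, str],
) -> Iterator[Tuple[str, str]]:
    """Step 3: compare the replayed state with the current resources."""
    for uri, md5 in current.items():
        if uri not in state:
            yield uri, "created"
        elif state[uri] != md5:
            yield uri, "updated"
    for uri in state:
        if uri not in current:
            yield uri, "deleted"
```

Both dictionaries grow with the total number of resources rather than with the number of changes, which is where the cost comes from at this scale.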
What do you think? Does it sound reasonable?