Reduce Unicode CLDR artifact size #373
You are correct that TimeZones.jl only requires the In order to address this problem with the artifacts system, we'd need to generate our own permanent artifact archive for each CLDR release. The right approach is probably to host these artifacts as GitHub release assets, which follows what JLLs do. It also probably makes sense for this to live outside the TimeZones.jl repo, perhaps in a CLDRWindowsZones_jll repo, so that the tags/releases make more sense. Additionally, it may make more sense to make a CLDR_jll.jl and have just the The end result is a much smaller artifact footprint, as well as the ability to update CLDR versions without making a new TimeZones.jl release. Unfortunately, this is a bunch of work, but I do want to tackle it before or around JuliaCon 2023.
PR #439 changed the Unicode CLDR artifacts to be lazy and platform-specific. Although this doesn't reduce the download size on Windows machines, it does limit the download to Windows only. It still makes sense to make a separate data package with this information, akin to what was done in #441. I do want #441 to be out in the wild for a little while to ensure this artifact model works well.
Besides the artifact size, the many nested directories can also cause issues with the maximum file path length on Windows. In this case, using PackageCompiler to create a library with the cldr artifact led to an error like:
Enabling long paths in the Windows Registry fixed the issue. See https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
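To see which files in an unpacked artifact would hit the legacy Windows limit, a small scan along these lines may help. This is only a sketch: the function name `find_long_paths` is hypothetical, and the 260-character `MAX_PATH` value (which includes the terminating NUL) is taken from the Microsoft page linked above.

```python
import os
import tempfile

# Legacy Windows MAX_PATH: 260 characters, including the terminating NUL.
MAX_PATH = 260


def find_long_paths(root, limit=MAX_PATH):
    """Return absolute paths under `root` that would not fit within `limit`.

    Hypothetical helper for illustration; not part of TimeZones.jl or
    PackageCompiler.
    """
    long_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            # +1 accounts for the terminating NUL counted by MAX_PATH.
            if len(full) + 1 > limit:
                long_paths.append(full)
    return long_paths
```

Running `find_long_paths` on the directory where the artifact is unpacked (or on the PackageCompiler output directory) would list the offending entries, if any.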
@omus if I understand correctly windowsZones.xml is only used to fill I had a look at the tzlocal Python package and saw they do the same. This is their dict literal, which they generate from this script. The dict lookup is done here after reading the registry (rather than |
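The dict-based lookup described above can be sketched roughly as follows. This is not tzlocal's or TimeZones.jl's actual code: the function name `build_windows_to_iana` is hypothetical, and the embedded XML is a short illustrative excerpt in the shape of `windowsZones.xml`, where each `mapZone` element maps a Windows zone name (`other`) to one or more IANA names (`type`) and territory `001` marks the default mapping.

```python
import xml.etree.ElementTree as ET

# Illustrative excerpt shaped like CLDR's windowsZones.xml (not the full file).
SAMPLE_XML = """<supplementalData>
  <windowsZones>
    <mapTimezones>
      <mapZone other="Pacific Standard Time" territory="001" type="America/Los_Angeles"/>
      <mapZone other="Pacific Standard Time" territory="CA" type="America/Vancouver"/>
      <mapZone other="W. Europe Standard Time" territory="001" type="Europe/Berlin"/>
      <mapZone other="W. Europe Standard Time" territory="NL" type="Europe/Amsterdam"/>
    </mapTimezones>
  </windowsZones>
</supplementalData>"""


def build_windows_to_iana(xml_text):
    """Build a {Windows zone name: default IANA name} dict.

    Hypothetical helper: keeps only the territory "001" entries, which are
    the default mapping for each Windows zone.
    """
    root = ET.fromstring(xml_text)
    mapping = {}
    for zone in root.iter("mapZone"):
        if zone.get("territory") == "001":
            mapping[zone.get("other")] = zone.get("type")
    return mapping


WINDOWS_TO_IANA = build_windows_to_iana(SAMPLE_XML)
```

A dict generated this way could be checked in as a literal, so that only the lookup (for example `WINDOWS_TO_IANA["Pacific Standard Time"]`) happens at runtime and the XML file is not needed on the user's machine.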
On my local drive, the size of
.julia\artifacts\8d7201f06ff8061c1b5bff0f661b5297ec1d3158\cldr-release-40
is 318 MB.
If I searched the code correctly, only one 49 kB file is actually used:
https://github.com/unicode-org/cldr/blob/release-40/common/supplemental/windowsZones.xml
It is used here:
https://github.com/JuliaTime/TimeZones.jl/blob/master/src/winzone/WindowsTimeZoneIDs.jl
Could the artifact point directly to the windowsZones.xml file?