Consider moving over to parquet format #107

Open
zcattacz opened this issue Feb 18, 2025 · 1 comment

Comments

@zcattacz
Contributor

Most analysis tools, such as Apache Spark and Apache Drill, support Parquet out of the box. Data crunching is a lot faster and more efficient than with SQL Server or PostgreSQL, thanks to the format's columnar design and the sheer number of channels used in SCADA.

I use a Python script to convert .dat archives from scada5 into Parquet. The result is about 1/2 to 1/10 of the original size, so it is very compact. Analysis tools that work with Parquet also usually support partitioning by directory name, which makes the data quite easy to organize.
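
To illustrate the directory-based partitioning mentioned above, here is a minimal sketch. It is not the conversion script itself. It assumes the Hive-style `key=value` directory naming that Spark's partition discovery recognizes; the root folder `archives`, the archive name `min`, and both function names are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def partition_dir(root: str, archive: str, ts: datetime) -> PurePosixPath:
    """Build a Hive-style partition directory from key=value path segments.

    The keys (archive, year, month, day) are an assumed layout, not something
    defined by Rapid SCADA.
    """
    return (
        PurePosixPath(root)
        / f"archive={archive}"
        / f"year={ts.year:04d}"
        / f"month={ts.month:02d}"
        / f"day={ts.day:02d}"
    )


def parse_partition(path: PurePosixPath) -> dict:
    """Recover the partition keys from the directory names alone."""
    return dict(part.split("=", 1) for part in path.parts if "=" in part)


if __name__ == "__main__":
    p = partition_dir("archives", "min", datetime(2025, 2, 18))
    print(p)
    print(parse_partition(p))
```

With a layout like this, a query for one day only has to open the files under the matching directory, and the partition values come from the path rather than from the file contents.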

I think a good first step would be to support exporting archives to Parquet. After that, if you feel comfortable, moving to Parquet as the native archive format might be a good idea. Fancy tools aside, you free up 1/2 to 3/5 of the storage space, and data loading is also more efficient.

@2mik
Member

2mik commented Feb 19, 2025

Hello,
Creating a module that supports the Apache Parquet format for storing historical archives looks promising. Are you planning to implement such a module for Rapid SCADA? If so, we can add it to https://rapidscada.net/store/
