Reading specific rows from a large sas7bdat file #42
Is there a way to add functionality to read specific rows from a large sas7bdat file? The issue I'm facing is that I have large SAS files (around 10 GB) along with text files (exact, flat copies of the SAS files). Based on the text file, I can specify the subset of rows that I'm interested in (around 10% of the file). Another option would be to specify a filter while reading, for example selecting rows based on a column's values. However, I understand that this may be more challenging to implement.

Comments

Hi! Have you tried the keyword arguments for reading only a range of rows?

Hi, yes, but they would only work if the rows I want to select are in order (a contiguous block). In my case, they're spread out over the dataset.

@BERENZ All right, now I see your point. Filtering the rows of the data file with a general condition is not something that is built into the parser. However, a workaround could be to cut the file into partitions of consecutive rows that are small enough to fit into memory and then filter each partition one by one. The entire file is therefore still read into memory at some point.

Sure, this is what I actually do nowadays (split the data into chunks). I understand that making this possible would require changes to the underlying C library.

Yes. For reading the files, the iteration across rows is handled within the C library, and there is no interface for skipping rows depending on their values.
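For anyone landing on this issue, below is a minimal sketch of the chunked workaround discussed above, assuming the wanted row positions are known in advance (for example, taken from the flat text copy). It uses `pandas.read_sas` with `chunksize` purely to illustrate the pattern; the file name, chunk size, and `wanted_rows` values are hypothetical, and the same loop works with any reader that can return consecutive partitions of rows.

```python
import pandas as pd

# Hypothetical inputs: 0-based row positions to keep (e.g. parsed from the
# flat text copy of the SAS file) and a chunk size tuned to available memory.
wanted_rows = {10, 2500, 1_000_000}
chunksize = 100_000

selected = []
offset = 0  # global position of the first row in the current chunk
reader = pd.read_sas("large_file.sas7bdat", chunksize=chunksize)
for chunk in reader:
    # Translate the global row positions into positions inside this chunk.
    local = [i - offset for i in wanted_rows if offset <= i < offset + len(chunk)]
    if local:
        selected.append(chunk.iloc[sorted(local)])
    offset += len(chunk)
reader.close()

subset = pd.concat(selected, ignore_index=True) if selected else pd.DataFrame()
```

Filtering on a column's values, the second option mentioned in the issue, fits the same loop: replace the positional test with a boolean condition on each chunk, e.g. `chunk[chunk["some_column"] > 0]` (column name hypothetical). Either way, the whole file is still decompressed and read once, as noted above.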