Option to only read n rows #117

Open
lexual opened this issue Apr 22, 2015 · 3 comments

lexual commented Apr 22, 2015

If you have a large CSV, it can take quite some time to load the file in tabview (10 seconds for the file we just tested).

Would be nice to have an option to only read in n rows.

e.g.

tabview -n 1000 my_big_data.csv

This would be useful as quite often I'm more interested in just looking at the structure and common values, not the entire dataset.
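
As a rough illustration of the request, here is a minimal sketch of reading only the first n rows of a CSV in Python. It is not tabview's actual loading code; `read_first_rows` is a hypothetical helper, and the header line is counted as one of the n rows.

```python
import csv
import io
from itertools import islice


def read_first_rows(lines, n):
    """Return at most n parsed CSV rows from an iterable of lines (hypothetical helper)."""
    reader = csv.reader(lines)
    # islice stops pulling from the reader after n rows, so the rest
    # of the input is never parsed.
    return list(islice(reader, n))


sample = io.StringIO("id,name\n1,alpha\n2,beta\n3,gamma\n")
rows = read_first_rows(sample, 2)
print(rows)  # [['id', 'name'], ['1', 'alpha']]
```

With a real file, the same helper could be given an open file object (opened with `newline=""`, as the `csv` module recommends) instead of the in-memory sample.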

@firecat53
Collaborator

That sounds reasonable. Thanks for the suggestion!

@s-celles
Contributor

It could be a good idea to support a sort of 'slice' syntax: --rows 0:10 for the first 10 rows, --rows -2: for the last 2 rows. Same idea for columns with --cols. If the data were in a Pandas DataFrame, it would be very easy to support this slice concept.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'), index=list('ABCD'))
print(df)

          a         b         c         d         e
A  0.393480  0.203721  0.502450  0.734149  0.380107
B  0.337915  0.416731  0.180090  0.840988  0.029033
C  0.094730  0.183898  0.875805  0.060895  0.387969
D  0.597270  0.697400  0.078505  0.850511  0.932793

You can get a subset of this dataframe (last 2 rows, first 3 columns):

print(df.iloc[-2:,0:3])
         a         b         c
C  0.09473  0.183898  0.875805
D  0.59727  0.697400  0.078505
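
One way the proposed slice strings might be turned into Python slices is sketched below. The `--rows`/`--cols` options are only a proposal in this thread, and `parse_slice` is a hypothetical helper, not part of tabview or pandas.

```python
def parse_slice(spec):
    """Turn a string such as '0:10' or '-2:' into a Python slice (hypothetical helper)."""
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"expected start:stop or start:stop:step, got {spec!r}")
    # An empty field means "no bound", exactly as in Python's own slice syntax.
    return slice(*(int(p) if p.strip() else None for p in parts))


rows = ["r0", "r1", "r2", "r3"]
print(rows[parse_slice("0:2")])   # ['r0', 'r1']
print(rows[parse_slice("-2:")])   # ['r2', 'r3']
print(rows[parse_slice("-2:0")])  # []
```

The resulting slice objects could also be passed to `df.iloc[row_slice, col_slice]`. Note that under Python semantics `-2:0` selects nothing; `-2:` is the form that yields the last two rows.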

@wavexx
Member

wavexx commented Apr 26, 2015

On 04/26/2015 10:23 AM, scls19fr wrote:

It could be a good idea to support a sort of 'slice' syntax --rows 0:10 first 10 rows --rows -2:0 to get last 2 rows. same idea for
columns with --cols. If data was in a Pandas DataFrame it will be
very to have this slice concept.

I'm a bit against this option, since you can now just chain a
head/tail/cut command in there, and it would actually be faster.

The only argument in favor is that it's a PITA to do sometimes, due to
the delimiters/quoting/whatnot.
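
The quoting problem can be shown with a small sketch. It assumes a CSV whose quoted field contains a newline; the line-based cut below only approximates what `head -n 2` would keep, and the sample data is made up for illustration.

```python
import csv
import io
from itertools import islice

# One header plus two records; the first record has a newline inside a quoted field.
data = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'

# Line-based truncation, roughly what `head -n 2` would keep.
first_two_lines = "".join(data.splitlines(keepends=True)[:2])
print(repr(first_two_lines))  # 'id,comment\n1,"first line\n'

# Record-based truncation with a CSV parser keeps the quoted field whole.
first_two_records = list(islice(csv.reader(io.StringIO(data)), 2))
print(first_two_records)  # [['id', 'comment'], ['1', 'first line\nsecond line']]
```

The line-based cut ends inside an open quote, so whatever reads it next sees a broken record. For files without embedded newlines, chaining head/tail remains the simpler and faster route.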
