Cannot seem to load a very small file #301

Open
xiaodaigh opened this issue Aug 31, 2019 · 3 comments

Comments

@xiaodaigh

xiaodaigh commented Aug 31, 2019

I've included a runnable MWE. The file is less than 1 MB, but loading it hangs in the terminal on Julia 1.2.0 (Windows 10). The same code works fine on Julia 1.1.1.

using JuliaDB, Dagger

##############################################################
# Download the data
##############################################################

# From the Julia REPL's shell mode (press `;` first):
# wget https://raw.githubusercontent.com/xiaodaigh/JuliaDB.jl/master/ok.csv

##############################################################
# Specify the types of columns
###############################################################

fmtypes = [
    Int64,                     String,     Union{String, Missing},     Union{Float64, Missing},    Union{Float64, Missing},
    Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},    Union{String, Missing},     Union{String, Missing},
    Union{String, Missing},     Union{String, Missing},     Union{String, Missing},     Union{String, Missing},     Union{String, Missing},
    Union{String, Missing},     Union{String, Missing},     Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},
    Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},
    Union{Float64, Missing},    Union{Float64, Missing},    Union{Float64, Missing},    Union{String, Missing},     Union{Float64, Missing},
    Union{String, Missing}]


@time jll = loadtable(
    "ok.csv",
    output = "fm.jldb/",
    delim=',',
    header_exists=true,
    #filenamecol = "filename",
    #chunks = length(ifiles),
    #type_detect_rows = 20_000,
    # colnames = colnames,
    colparsers = fmtypes,
    indexcols=["Column1"]);
@jpsamaroo
Collaborator

I can reproduce this, and confirm that with a non-release Julia v1.3 build on Linux it hangs and ignores attempts to Ctrl-C.

@davidanthoff

Could you try to read it with just TextParse.jl? Just to figure out whether the problem is there, or in JuliaDB.

@jpsamaroo
Collaborator

Running just TextParse.csvread("ok.csv", ','; header_exists=true, colparsers=fmtypes) loads the file successfully in ~5 seconds, which is quite good given that it includes inference and compilation time. So this is clearly a JuliaDB issue rather than a TextParse one. Thanks for the tip, @davidanthoff!
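
For anyone who wants to repeat that isolation check, here is a minimal sketch. The helper name `parse_without_juliadb` and the tiny temporary CSV are made up for illustration and are not part of TextParse or JuliaDB. It assumes only that `TextParse.csvread` accepts `header_exists` and `colparsers` as used above and returns the columns together with the column names. To test the real file, pass "ok.csv" and the `fmtypes` vector from the MWE instead.

```julia
using TextParse

# Hypothetical helper: parse a CSV with TextParse alone, bypassing JuliaDB,
# and report how long the call took (compilation time included on first use).
function parse_without_juliadb(path, coltypes)
    t0 = time()
    cols, colnames = TextParse.csvread(path, ',';
                                       header_exists=true,
                                       colparsers=coltypes)
    return cols, colnames, time() - t0
end

# Stand-in data so the sketch runs without ok.csv (illustrative only).
path = joinpath(mktempdir(), "tiny.csv")
write(path, "id,name,score\n1,a,1.5\n2,b,2.5\n")

cols, colnames, elapsed = parse_without_juliadb(path, [Int64, String, Float64])
println("parsed ", length(cols), " columns in ", round(elapsed, digits=2), " s")
```

If this returns promptly on the real file while `loadtable` with the same arguments hangs, the problem sits in JuliaDB (or something it calls on top of parsing) rather than in TextParse.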
