importing from structure file - inconsistent "negative subscripts" error #141

gottschoa · 2016-04-20T20:34:47Z

Hi,

I am trying to run DAPC on my new Sceloporus dataset. I successfully got this to work before with some other datasets. I am using adegenet 2.0.0.

When I run the following line for my "Uma dataset", I am able to successfully import:

library(adegenet)

data <- read.structure("output_western_uma_121214_n60_h5_p75_editnames_2.str", n.ind=64, n.loc=597, onerowperind=FALSE, col.lab=1, col.pop=0, col.others=NULL, row.marknames=NULL, NA.char="-9", pop=NULL, ask=FALSE, quiet=FALSE)

When I run the same code for the "Sceloporus dataset":

data <- read.structure("output_sceloporus_032415_n43_h5_p75.str", n.ind=80, n.loc=1024, onerowperind=FALSE, col.lab=1, col.pop=0, col.others=NULL, row.marknames=NULL, NA.char="-9", pop=NULL, ask=FALSE, quiet=FALSE)

I get the following error:

Error in mat[, (ncol(mat) - p + 1):ncol(mat)] :
only 0's may be mixed with negative subscripts

I also tried this with adegenet v1.4.2 and had the exact same issue.

I attached both input (structure) files to this email. They were both formatted the same way, from pyRAD v2.1.2. If anyone can figure out why one file is giving me the error, and the other isn't, I would greatly appreciate it.

I should point out that I searched the archives. A similar question was posted about a year ago, but I didn't see it resolved:

http://lists.r-forge.r-project.org/pipermail/adegenet-forum/2014-December/001049.html

Thanks for your help! (I added .txt extension to the .str files to upload to github)

Best, Andy

output_sceloporus_032415_n43_h5_p75.str.txt
output_western_uma_121214_n60_h5_p75_editnames_2.str.txt

thibautjombart · 2016-04-21T10:15:12Z

Hi there,
Before looking into this, have you tried with the latest version of adegenet (2.0.1)?

thibautjombart · 2016-08-05T09:56:22Z

Is this issue still pending?

gottschoa · 2016-08-16T23:28:10Z

Hi Dr. Jombart,

Sorry for the delayed response. I tried with 2.0.1 and still encounter the
same issue.

Best, Andy

Andrew Gottscho, Ph.D.

MagB · 2016-11-23T14:31:30Z

Hi,

I'm having the same issue. I tried with 2.0.1 and continue to get the same error.
I'm using .str data (attached).

no_pop_map_snp_data.txt

thibautjombart · 2016-11-25T17:50:17Z

Hi there,
I am heading to a conference all of next week, so I will not be able to look into this for at least a week. If the error persists, this may be a bug. If you have time, you can try to see what is wrong using:

debug(read.structure)

before entering the command line creating the error.

zkamvar · 2017-10-10T15:48:38Z

Hi @gottschoa, this fails because adegenet can only detect 1019 loci, not 1024. If you read the structure file in as a table, only 1019 columns register as loci.

library("adegenet")
#> Loading required package: ade4
#> 
#>    /// adegenet 2.1.0 is loaded ////////////
#> 
#>    > overview: '?adegenet'
#>    > tutorials/doc/questions: 'adegenetWeb()' 
#>    > bug reports/feature requests: adegenetIssues()
tmp <- tempfile(fileext = ".str")
download.file("https://github.com/thibautjombart/adegenet/files/228778/output_sceloporus_032415_n43_h5_p75.str.txt", 
  destfile = tmp)
read.structure(tmp, n.ind = 80, n.loc = 1024, onerowperind = FALSE, col.lab = 1, 
  col.pop = 0, col.others = NULL, row.marknames = NULL, NA.char = "-9", pop = NULL, 
  ask = FALSE, quiet = FALSE)
#> 
#>  Converting data from a STRUCTURE .stru file to a genind object...
#> Error in mat[, (ncol(mat) - p + 1):ncol(mat)]: only 0's may be mixed with negative subscripts
read.structure(tmp, n.ind = 80, n.loc = 1019, onerowperind = FALSE, col.lab = 1, 
  col.pop = 0, col.others = NULL, row.marknames = NULL, NA.char = "-9", pop = NULL, 
  ask = FALSE, quiet = FALSE)
#> 
#>  Converting data from a STRUCTURE .stru file to a genind object...
#> Warning in df2genind(X = X, pop = pop, ploidy = 2, sep = sep, ncode =
#> ncode): entirely non-type marker(s) deleted
#> /// GENIND OBJECT /////////
#> 
#>  // 80 individuals; 1,017 loci; 2,047 alleles; size: 1.1 Mb
#> 
#>  // Basic content
#>    @tab:  80 x 2047 matrix of allele counts
#>    @loc.n.all: number of alleles per locus (range: 1-4)
#>    @loc.fac: locus factor for the 2047 columns of @tab
#>    @all.names: list of allele names for each locus
#>    @ploidy: ploidy of each individual  (range: 2-2)
#>    @type:  codom
#>    @call: read.structure(file = tmp, n.ind = 80, n.loc = 1019, onerowperind = FALSE, 
#>     col.lab = 1, col.pop = 0, col.others = NULL, row.marknames = NULL, 
#>     NA.char = "-9", pop = NULL, ask = FALSE, quiet = FALSE)
#> 
#>  // Optional content
#>    - empty -
sum(!sapply(read.table(tmp, sep = "\t"), is.logical))
#> [1] 1020
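
To see why an overstated locus count produces this particular message, here is a minimal base-R sketch of the indexing expression quoted in the error. The toy matrix `mat` and the helper name `last_p_columns` are hypothetical, and treating `p` as the requested number of loci is an assumption based on the error text, not a reading of the adegenet source.

```r
# Toy stand-in for the parsed STRUCTURE table: 2 rows, 4 data columns.
mat <- matrix(1:8, nrow = 2)

# Hypothetical helper mirroring the expression in the error message:
# take the last p columns of mat.
last_p_columns <- function(mat, p) {
  mat[, (ncol(mat) - p + 1):ncol(mat), drop = FALSE]
}

# p matches the columns that are really there: this works.
ok <- last_p_columns(mat, p = 4)
dim(ok)

# p is too large: the start index becomes negative (4 - 6 + 1 = -1),
# so the sequence -1:4 mixes negative and positive subscripts and R stops.
res <- tryCatch(last_p_columns(mat, p = 6), error = function(e) e)
inherits(res, "error")
conditionMessage(res)
```

The exact wording of the message can differ between R versions, which is why the sketch only checks that an error is raised.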

Session info

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.2 (2017-09-28)
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Chicago             
#>  date     2017-10-10
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                        
#>  ade4       * 1.7-8      2017-08-09 cran (@1.7-8)                 
#>  adegenet   * 2.1.0      2017-10-10 local                         
#>  ape          4.1        2017-02-14 CRAN (R 3.4.0)                
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.4.0)                
#>  backports    1.1.1      2017-09-25 CRAN (R 3.4.2)                
#>  base       * 3.4.2      2017-10-04 local                         
#>  bindr        0.1        2016-11-13 CRAN (R 3.4.0)                
#>  bindrcpp     0.2        2017-06-17 CRAN (R 3.4.0)                
#>  boot         1.3-20     2017-07-30 CRAN (R 3.4.1)                
#>  cluster      2.0.6      2017-03-16 CRAN (R 3.4.0)                
#>  coda         0.19-1     2016-12-08 CRAN (R 3.4.0)                
#>  colorspace   1.3-3      2017-08-16 R-Forge (R 3.4.1)             
#>  compiler     3.4.2      2017-10-04 local                         
#>  datasets   * 3.4.2      2017-10-04 local                         
#>  deldir       0.1-14     2017-04-22 CRAN (R 3.4.0)                
#>  devtools     1.13.3     2017-08-02 CRAN (R 3.4.1)                
#>  digest       0.6.12     2017-01-27 CRAN (R 3.4.0)                
#>  dplyr        0.7.4      2017-09-28 CRAN (R 3.4.1)                
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.4.1)                
#>  expm         0.999-2    2017-03-29 CRAN (R 3.4.0)                
#>  formatR      1.5        2017-04-25 CRAN (R 3.4.0)                
#>  gdata        2.18.0     2017-06-06 CRAN (R 3.4.0)                
#>  ggplot2      2.2.1      2016-12-30 CRAN (R 3.4.0)                
#>  glue         1.1.1      2017-06-21 CRAN (R 3.4.0)                
#>  gmodels      2.16.2     2015-07-22 CRAN (R 3.4.0)                
#>  graphics   * 3.4.2      2017-10-04 local                         
#>  grDevices  * 3.4.2      2017-10-04 local                         
#>  grid         3.4.2      2017-10-04 local                         
#>  gtable       0.2.0      2016-02-26 CRAN (R 3.4.0)                
#>  gtools       3.5.0      2015-05-29 CRAN (R 3.4.0)                
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.0)                
#>  httpuv       1.3.5      2017-07-04 CRAN (R 3.4.1)                
#>  igraph       1.1.2      2017-07-21 cran (@1.1.2)                 
#>  knitr        1.17       2017-08-10 cran (@1.17)                  
#>  lattice      0.20-35    2017-03-25 CRAN (R 3.4.0)                
#>  lazyeval     0.2.0      2016-06-12 CRAN (R 3.4.0)                
#>  LearnBayes   2.15       2014-05-29 CRAN (R 3.4.0)                
#>  magrittr     1.5        2014-11-22 CRAN (R 3.4.0)                
#>  MASS         7.3-47     2017-04-21 CRAN (R 3.4.0)                
#>  Matrix       1.2-11     2017-08-16 CRAN (R 3.4.1)                
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.4.0)                
#>  methods    * 3.4.2      2017-10-04 local                         
#>  mgcv         1.8-22     2017-09-19 CRAN (R 3.4.2)                
#>  mime         0.5        2016-07-07 CRAN (R 3.4.0)                
#>  munsell      0.4.3      2016-02-13 CRAN (R 3.4.0)                
#>  nlme         3.1-131    2017-02-06 CRAN (R 3.4.0)                
#>  parallel     3.4.2      2017-10-04 local                         
#>  permute      0.9-4      2016-09-09 CRAN (R 3.4.0)                
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.4.0)                
#>  plyr         1.8.4      2016-06-08 CRAN (R 3.4.0)                
#>  R6           2.2.2      2017-06-17 cran (@2.2.2)                 
#>  Rcpp         0.12.13.1  2017-10-10 Github (RcppCore/Rcpp@136d50f)
#>  reshape2     1.4.2      2016-10-22 CRAN (R 3.4.0)                
#>  rlang        0.1.2      2017-08-09 cran (@0.1.2)                 
#>  rmarkdown    1.6        2017-06-15 cran (@1.6)                   
#>  rprojroot    1.2        2017-01-16 CRAN (R 3.4.0)                
#>  scales       0.5.0.9000 2017-08-28 Github (hadley/scales@d767915)
#>  seqinr       3.4-5      2017-08-01 CRAN (R 3.4.1)                
#>  shiny        1.0.5      2017-08-23 cran (@1.0.5)                 
#>  sp           1.2-5      2017-06-29 CRAN (R 3.4.1)                
#>  spdep        0.6-15     2017-09-01 CRAN (R 3.4.1)                
#>  splines      3.4.2      2017-10-04 local                         
#>  stats      * 3.4.2      2017-10-04 local                         
#>  stringi      1.1.5      2017-04-07 CRAN (R 3.4.0)                
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.4.0)                
#>  tibble       1.3.4      2017-08-22 cran (@1.3.4)                 
#>  tools        3.4.2      2017-10-04 local                         
#>  utils      * 3.4.2      2017-10-04 local                         
#>  vegan        2.4-4      2017-08-24 cran (@2.4-4)                 
#>  withr        2.0.0      2017-07-28 CRAN (R 3.4.1)                
#>  xtable       1.8-2      2016-02-05 CRAN (R 3.4.0)                
#>  yaml         2.1.14     2016-11-12 CRAN (R 3.4.0)

saidwali · 2020-02-21T13:27:47Z

Hi,
I still have this problem if I run dapc. I am using the latest version, 2.1.2. It works fine if I work with imputed data, but leaving missing marker data as NA gives me this error: "Fehler in dm[, 1L:dimen, drop = FALSE] : nur Nullen dürfen mit negativen Indizes gemischt werden" (German for "Error in dm[, 1L:dimen, drop = FALSE] : only 0's may be mixed with negative subscripts").

zkamvar · 2020-02-21T16:03:24Z

Are you leaving missing data in the file as NA or as -9?

saidwali · 2020-02-21T18:46:02Z

Hi,
I found out it was not working because of some stupid mistakes.
Somehow it now also works with missing data.
I use "NA" for missing marker information, "1" for major, "2" for hetero and "3" for minor.

Some functions, such as find.clusters, give me a warning:
"Warning in find.clusters.data.frame(as.data.frame(x), ...) : NAs introduced by coercion".
dudi.pca is also not working with missing data. But I guess this is normal and I can live with that.
DAPC, scatter etc. are working fine.

zkamvar · 2020-02-21T21:27:27Z

Just to confirm: you are referring to an error with read.structure()?

The system you describe is not supported by adegenet and will give you incorrect results. Adegenet assumes that you represent each allele individually so that it can then represent those as counts in a sparse matrix.
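
For illustration only, a 1/2/3 genotype coding like the one described above could be expanded into explicit allele pairs before building a genind object. This is a base-R sketch on made-up data; the allele labels "A" and "B", the "/" separator and the helper name `code_to_alleles` are all assumptions, and which allele counts as "major" has to come from your own data.

```r
# Made-up genotype codes: 1 = major homozygote, 2 = heterozygote, 3 = minor homozygote.
geno <- data.frame(loc1 = c(1, 2, NA), loc2 = c(3, 1, 2),
                   row.names = c("ind1", "ind2", "ind3"))

# Hypothetical helper: map each code to an explicit pair of alleles.
# A missing code (NA) stays NA.
code_to_alleles <- function(x) c("A/A", "A/B", "B/B")[x]

# Apply the mapping locus by locus, keeping the individual names.
alleles <- data.frame(lapply(geno, code_to_alleles),
                      row.names = rownames(geno),
                      stringsAsFactors = FALSE)
alleles
```

A table in this shape, one column per locus with both alleles spelled out, is the kind of input that df2genind() is designed to take with sep = "/".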

kkolis · 2020-04-20T19:20:15Z

Hello, I am having a very similar problem with the dapc command, where I get the same error as saidwali when I run the code
"mmOfour <- dapc(genlit.vcf, pop.list$pop, n.pca = 20, n.da = 4)"
Error in dm[, 1L:dimen, drop = FALSE] :
only 0's may be mixed with negative subscripts

I am currently running the Adegenet package 2.1.2. I am generating the genlight file with vcfR.
string.vcf <- read.vcfR("file.vcf")
genlit.vcf <- vcfR2genlight(string.vcf)

The Adegenet find.clusters program works with the genlight file. Additionally, previously generated genlight files work when running dapc.

I have been spinning my wheels with this error code for the past week, as I am re-analyzing some data after some changes to upstream filtering processes. I have relaxed some filters so that the new vcf/genlit files have more SNPs, and more missing data (however no more than ~25%).

Any help would be appreciated!

massub · 2020-08-05T16:21:35Z

Dear @zkamvar, I am facing the same problem as described above. How can I find out how many loci are detected in the structure file? I am going round and round but cannot figure it out. Please help me find out the number of loci being read by adegenet.
Thank you so much in advance,
genotypic.data.structure.AFG landrace.stru.txt

SMoulherat · 2020-11-14T10:11:57Z

I had the same problem on a data set of A. obstetricans:
AO_gen_F <- read.structure(
  "File",
  sep = ";",
  n.ind = 474,
  n.loc = 13,
  onerowperind = TRUE,
  NA.char = "-9",
  col.lab = 1,
  col.pop = 2,
  row.marknames = 1,
  col.others = 0
)

Comparing with other .stru files I have, I saw that my working .stru files use a space separator while the non-working ones use ";". I therefore replaced ";" with spaces in the non-working file and obtained the expected results.
So the bug is in the parameter handling.
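
The replacement described above can be scripted rather than done by hand. This is a minimal sketch in base R; the file names and the toy contents are made up, and it assumes that no field itself contains a semicolon or a space.

```r
# Made-up semicolon-delimited input standing in for the real .stru file.
infile  <- tempfile(fileext = ".stru")
outfile <- tempfile(fileext = ".stru")
writeLines(c("loc1;loc2",
             "ind1;1;101;103;-9;-9",
             "ind2;1;101;101;200;204"), infile)

# Replace every semicolon with a single space and write a new file,
# leaving the original untouched.
lines <- readLines(infile)
writeLines(gsub(";", " ", lines, fixed = TRUE), outfile)

readLines(outfile)
```

The converted file can then be passed to read.structure() without the sep argument.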

Cheers

thesnakeguy · 2021-06-02T11:37:46Z

Has this been solved? I am experiencing the same thing... I also read the VCF file with vcfR and converted it with vcfR2genlight.

zkamvar · 2021-06-06T03:55:13Z

Please forgive the lateness of my reply. It's.... been a hell of a year for everyone.

Regarding errors in structure files

It's likely that whitespace characters are giving you problems. There is a difference between a tab and a space that doesn't show up in text editors by default, and this will cause problems down the line. For example, in my answer to the initial inquiry back in 2017, I showed that only 1019 loci were being detected. What I didn't explain was that there were six columns after the ID column that were completely blank because there was a series of six tabs after the ID. The truth is, there are many reasons why this could be happening. Unfortunately, the structure format is quite varied and it can be really hard to debug without knowing what you were expecting (number of loci and number of individuals).
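
One way to check for this kind of hidden whitespace before calling read.structure() is to look at the raw fields of each line. The sketch below uses only base R on a made-up two-line file; the file contents and variable names are hypothetical, and it assumes a tab-delimited layout with the label in the first column.

```r
# Made-up tab-delimited file: label, two stray empty fields, three genotype columns.
tmp <- tempfile(fileext = ".str")
writeLines(c("ind1\t\t\t1\t3\t-9",
             "ind1\t\t\t2\t3\t4"), tmp)

# Split every line on tabs; strsplit keeps the empty fields between tabs.
fields <- strsplit(readLines(tmp), "\t", fixed = TRUE)

# Every line should have the same number of fields.
n_fields <- unique(lengths(fields))

# For each column, check whether any line holds a non-empty value.
tab <- do.call(rbind, fields)
nonblank <- colSums(tab != "") > 0

n_fields            # total columns, including blank ones
sum(!nonblank)      # columns that are blank on every line
sum(nonblank) - 1   # candidate n.loc: non-blank columns minus the label column
```

If `n_fields` has more than one value, the lines do not all have the same number of fields, which is itself worth fixing before importing.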

Regarding vcfR errors

These errors don't have anything to do with the initial issue. You are getting a similar error because it's a common error message in R. The problem is that I don't have any way to reproduce the error you are getting because I don't know what the state of the data is. What I do know is that the code dm[, 1L:dimen, drop=FALSE] does not come from {adegenet}, rather it comes from MASS::predict.lda(). This comes from the Discriminant Analysis portion of the DAPC:

adegenet/R/dapc.R

Lines 78 to 92 in 78be588

    
    ## PERFORM DA ##
    ldaX <- lda(XU, grp, tol=1e-30) # tol=1e-30 is a kludge, but a safe (?) one to avoid fancy rescaling by lda.default
    lda.dim <- sum(ldaX$svd^2 > 1e-10)
    ldaX$svd <- ldaX$svd[1:lda.dim]
    ldaX$scaling <- ldaX$scaling[,1:lda.dim,drop=FALSE]
    if(is.null(n.da)){
        barplot(ldaX$svd^2, xlab="Linear Discriminants", ylab="F-statistic", main="Discriminant analysis eigenvalues", col=heat.colors(length(levels(grp))) )
        cat("Choose the number discriminant functions to retain (>=1): ")
        n.da <- as.integer(readLines(con = getOption('adegenet.testcon'), n = 1))
    }
    ##n.da <- min(n.da, length(levels(grp))-1, n.pca) # can't be more than K-1 disc. func., or more than n.pca
    n.da <- round(min(n.da, lda.dim)) # can't be more than K-1 disc. func., or more than n.pca
    predX <- predict(ldaX, dimen=n.da)

Unfortunately, this is as far as I can go without knowing what your data looks like. What might help in debugging is to not set n.da and see how many discriminant axes are available because that is the source of the error.
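
As a rough illustration of that suggestion, the number of usable discriminant axes can be inspected directly with MASS::lda() before choosing n.da. This sketch uses the built-in iris data rather than genetic data, and the variable names simply mirror those in the excerpt above; it is not a reproduction of the reported error.

```r
library(MASS)  # lda() and its predict method; MASS is a recommended package bundled with most R installations

# Three groups, so at most two discriminant axes are possible.
X <- as.matrix(iris[, 1:4])
grp <- iris$Species
ldaX <- lda(X, grp)

# Same rule as in the dapc() excerpt: keep axes with a non-negligible singular value.
lda.dim <- sum(ldaX$svd^2 > 1e-10)
lda.dim

# Cap the requested number of axes at what is actually available.
n.da <- 4
n.da <- round(min(n.da, lda.dim))
predX <- predict(ldaX, newdata = X, dimen = n.da)
dim(predX$x)
```

If `lda.dim` comes out much smaller than the n.da you requested for your own data, that mismatch is the first thing to look at.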
