Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinearAlgebra diag() returns error for 1- and 0-dimensional matrices #727

Closed
blackeneth opened this issue May 2, 2020 · 10 comments
Closed

Comments

@blackeneth
Copy link

blackeneth commented May 2, 2020

Problem Statement
The diag() function in Linear Algebra returns an error for 1- and 0-dimensional matrices.

Desired Behavior
In both cases, the input matrix can simply be returned

  • diag([5]) would return [5]
  • diag([]) would return []

Error Example: 1 dimensional case

julia> using LinearAlgebra
julia> a=[5]
1-element Array{Int64,1}:
 5
julia> diag(a)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
Stacktrace:
 [1] diag(::Array{Int64,1}) at C:\Users\julia\AppData\Local\Julia-1.4.0\share\julia\stdlib\v1.4\LinearAlgebra\src\generic.jl:425
 [2] top-level scope at none:0

Error Example: 0 dimensional case

julia> b=[]
0-element Array{Any,1}
julia> diag(b)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
Stacktrace:
 [1] diag(::Array{Any,1}) at C:\Users\julia\AppData\Local\Julia-1.4.0\share\julia\stdlib\v1.4\LinearAlgebra\src\generic.jl:425
 [2] top-level scope at none:0

What do other languages do?

language diag([1]) diag([])
Python Numpy [[1]]† error*
MATLAB [1] []
SAS JMP JSL [1] error
Octave ans=1 ans = [](0x0)
R 1 <0 x 0 matrix>

* the numpy error returned says that diagonal() requires at least a 2x2 matrix; however, it works for a 1-dimensional matrix.
† result corrected from comment below.

System information for above examples:

julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, ivybridge)
@sostock
Copy link
Contributor

sostock commented May 2, 2020

Note that both [5] and [] are not matrices (2D arrays), but vectors (1D arrays). It works for actual matrices:

julia> a = fill(5, 1, 1) # 1x1 matrix
1×1 Array{Int64,2}:
 5

julia> diag(a)
1-element Array{Int64,1}:
 5

julia> a = fill(5, 0, 0) # 0x0 matrix
0×0 Array{Int64,2}

julia> diag(a)
0-element Array{Int64,1}

@Micket
Copy link

Micket commented May 2, 2020

Similar behavior with numpy as well actually; [] turns into a 1D numpy array, so this should be [[1]] and [[]] respectively, and both behave just like matrices in julia:

>>> numpy.diag([[1]])
array([1])
>>> numpy.diag([[]])
array([], dtype=float64)

The table is also wrong for numpy.diag([1]), it does not give you [1] but [[1]], since it actually does the equivalent of julias diagm and constructs a matrix from a diagonal

>>> numpy.diag([1])
array([[1]])

@blackeneth
Copy link
Author

Note that both [5] and [] are not matrices (2D arrays), but vectors (1D arrays). It works for actual matrices:

We can contrast Linear Algebra to Julia's Implementation of linear algebra.

Is [5] a vector or an array? From a Linear Algebra perspective, it's both, or either.

The Julia Implementation can store [5] in two different data structures (vector or array). The storage data structure shouldn't break the Linear Algebra.

Furthermore, what does Julia tell me about vector [5]?

julia> a=[5]
1-element Array{Int64,1}:
 5

It informs the user that it is a 1-element Array. It doesn't say Vector, it says Array.

I've corrected the table for the Numpy result based on the other comment.

@Micket
Copy link

Micket commented May 3, 2020

Examples are still wrong, because you are not calling the same functionality; numpy.diag has 2 completely opposite modes into one function; constructing a 2D thing from a 1D thing, or extract a 1D thing from a 2D thing. Julia named these diagm and diag respectively.

All languages and libraries I'm aware of, except matlab, is consistent on this; 2D input, 1D output, even when the size in each dimension is less than 2:
Julia:

julia> diag(ones((1,1)))
1-element Array{Float64,1}:
 1.0

Numpy:

>>> diag(ones((1,1))) # or diag([[1]])
array([1.])

Mathematica:

In[4]:= Diagonal[ConstantArray[1, {1, 1}]] (* or Diagonal[{{1}}] *)
Out[4]= {1}

R

> diag(matrix(1, nrow=1))
[1] 1

The only difference is syntax for constructing 2D like things. Matlab/octave is the odd one out, as it decides to give special treatment and disallow true 1D things by padding on a second dimension. The ambiguity of the math is a bad thing.

Julia tells you that [5] is a (1) dimensional thing, and not a 1x1 dimensional thing.

Related:
https://www.youtube.com/watch?v=C2RO34b_oPM

@blackeneth
Copy link
Author

Python Numpy:

>>> a=np.arange(1).reshape(1,1)
>>> a
array([[0]])
>>> a.shape
(1, 1)
>>> a.diagonal()
array([0])
>>> np.diag(a)
array([0])

>>> b=np.array([5])
>>> b
array([5])
>>> b.shape
(1,)
>>> b.diagonal()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: diag requires an array of at least two dimensions
>>> np.diag(b)
array([[5]])

Julia tells you that [5] is a (1) dimensional thing, and not a 1x1 dimensional thing.

Julia is telling me it is storing it as a (1) dimensional thing, not a 1x1 dimensional thing.

[5] is a special case of a (1) dimensional thing and a 1x1 dimensional thing.

You're explained why Julia does it the way it does it.

But . . . how should it be done? You see how I think it should be done under "desired behaviors".

@StefanKarpinski
Copy link
Member

StefanKarpinski commented May 4, 2020

As documented, diag returns "the kth diagonal of a matrix, as a vector." That's fairly specific and justifies the current refusal to do anything for [5] or [], which are both vectors. We do as a convention, however, allow many operations that work on matrices to work on vectors by interpreting them as column matrices. Under that interpretation, it would be possible to define diag([1, 2, 3]) to return [1] since that's the diagonal of the matrix with a single column, [1, 2, 3]. That would prevent us from giving the helpful warning we currently give:

julia> diag([1, 2, 3])
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix

This is helpful since people coming from other systems may be expecting diag to construct a matrix with [1, 2, 3] as its diagonal. The question is whether allowing vectors to act like column matrices with respect to the diag function is worth not giving users a clear warning that they're using the wrong function here. Note that if we added this diag would apply to any vector, not just the cases you've picked where diag and diagm happen to have similar results.

@blackeneth
Copy link
Author

blackeneth commented May 4, 2020

The question is whether allowing vectors to act like column matrices with respect to the diag function is worth not giving users a clear warning that they're using the wrong function here.

Eh, they're not using the wrong function, they're using the wrong type. The user truly wanted the diagonal of [5].

It's cumbersome in the REPL to create a 1 dimensional matrix.

julia> a=[5]    # creates vector
1-element Array{Int64,1}:
 5
julia> a=ones(1)*5  # creates vector
1-element Array{Float64,1}:
 5.0
julia> a=fill(5,1)  # creates vector
1-element Array{Int64,1}:
 5
julia> a=Array{Int64}(undef,(1,1))  # creates array
1×1 Array{Int64,2}:
 934448784
julia> a[1,1]=5  # or a[1]=5
5
julia> a=[5]    # easy to do an oopsy and turn it into a vector 
1-element Array{Int64,1}:
 5
julia> a=ones(1,1)*5  # creates array 
1×1 Array{Float64,2}:
 5.0
julia> a=fill(5,(1,1))  # creates array
1×1 Array{Int64,2}:
 5

Just considering the user interface, it would be friendlier to a statistician typing in the REPL if diag([n]) returned [n] and diag([]) returned [].

Note that if we added this diag would apply to any vector, not just the cases you've picked where diag and diagm happen to have similar results.

I don't see why this is the case. It would just apply to 1-dimensional and 0-dimensional vectors.

However, if you don't want to modify the behavior of the function, then I would request that the error message change.

julia> a=[5]
1-element Array{Int64,1}:
 5
julia> diag(a)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix

The current error message is assuming the user is trying to construct a diagonal matrix.

If the user was just trying to get the diagonal of a 1-dimensional matrix, this error is confusing.

The user error is feeding diag() a vector and not an array. The error message should say that.

Perhaps something like:

ERROR: ArgumentError: diag input must be of an Array type. Use diagm instead if you were trying to construct a diagonal matrix

That would be a helpful error message.

@StefanKarpinski
Copy link
Member

I don't see why this is the case. It would just apply to 1-dimensional and 0-dimensional vectors.

We don't do that sort of thing. Making a function that works when a vector has certain sizes but not others is how you end up with a language that's impossible to write reliable code in.

However, if you don't want to modify the behavior of the function, then I would request that the error message change.

Yes, that would be a reasonable change as well since some users may have wanted to take the diagonal of a vector as though it were a column matrix.

@StefanKarpinski
Copy link
Member

To be clear, it's also reasonable to make diag(v) work and return v[1:end] for vectors, but what's not ok is making it only work for vectors of length zero or one. If we were to do that then we would not be able to warn users expecting diag(v) to do what diagm(v) does as we do now.

@ViralBShah
Copy link
Member

I propose closing this.

@KristofferC KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants