Incorrect reading of two openly available test datasets in .ang file format #413

Closed
Tracked by #415
hakonanes opened this issue Dec 4, 2022 · 6 comments · Fixed by #416
Labels
bug Something isn't working
Comments

@hakonanes
Member

See the issues with .ang files being read incorrectly into a CrystalMap in #411:

  • Scan units should be "um", not "nm"
  • Phase IDs of the AF96 dataset are incorrectly read
@hakonanes added the bug label Dec 4, 2022
@hakonanes added this to the v0.10.3 milestone Dec 4, 2022
@argerlt
Contributor

argerlt commented Dec 5, 2022

Here's the code snippet for loading the AF96 datasets correctly. I don't know enough about the TSL OIM software used to collect this data to know whether ALL EBSD scans from TSL can be read like this, or whether there are user choices that change the ordering of columns. I'm also not sure how best to make orix determine the correct phase data, or whether that should be left up to orix users to add.

# -*- coding: utf-8 -*-
"""
Created on Thu Nov 10 10:26:48 2022

@author: agerlt
"""

import glob
import os

import numpy as np
from diffpy.structure import Atom, Lattice, Structure
from orix import io
from orix.crystal_map import CrystalMap, PhaseList
from orix.quaternion import Rotation

# Create the output directory, or delete the files from the last run
os.makedirs("AF96", exist_ok=True)
for old in glob.glob("AF96/AF96_Large*.h5"):
    os.remove(old)

angs = glob.glob("AF96_Large*.ang")
xmaps = []

for ang in angs:
    # Load the data, one array per column
    (
        e1, e2, e3, x, y, image_quality, confidence_index, phase, indexed,
        fit_parameter,
    ) = np.loadtxt(ang, unpack=True)

    # Build the rotations and the per-point properties
    eu = np.column_stack((e1, e2, e3))
    rots = Rotation.from_euler(eu)
    properties = dict(
        iq=image_quality.astype(np.float32),
        ci=confidence_index.astype(np.float32),
        fit_parameter=fit_parameter.astype(np.float32),
    )

    # Create unit cells of the phases
    structures = [
        Structure(
            title="austenite",
            atoms=[Atom("fe", [0] * 3)],
            lattice=Lattice(0.360, 0.360, 0.360, 90, 90, 90),
        ),
        Structure(
            title="ferrite",
            atoms=[Atom("fe", [0] * 3)],
            lattice=Lattice(0.287, 0.287, 0.287, 90, 90, 90),
        ),
    ]
    phase_list = PhaseList(
        names=["austenite", "ferrite"],
        point_groups=["m-3m", "m-3m"],
        space_groups=[225, 229],
        structures=structures,
    )

    # Create a CrystalMap instance
    xmap = CrystalMap(
        rotations=rots,
        phase_id=phase.astype(np.int32),
        x=x.astype(np.float32),
        y=y.astype(np.float32),
        phase_list=phase_list,
        prop=properties,
    )
    xmaps.append(xmap)

# Save one orix .h5 file per loaded scan
print("Saving everything as orix .h5 files...")
for i, xmap in enumerate(xmaps, start=1):
    io.save("AF96/AF96_Large_{}.h5".format(i), xmap)
print("Done!")

@argerlt
Contributor

argerlt commented Dec 5, 2022

Also, attached is all the information on the collection software, taken from the following paper https://doi.org/10.1016/j.matchar.2019.109835
[Image: collection software information from the paper]

@hakonanes
Member Author

Thanks for pointing me to these test datasets, @argerlt. I've fixed the identified issues in #416, and hope to release it in a 0.10.3 patch next week.

In your code snippet above, you assume the following column names for the .ang file data:

e1, e2, e3, x, y, image_quality, confidence_index, phase, indexed, fit_parameter

I'm not sure about the ninth column, "indexed". What do you base this name on? In the file I've tested, Field of view 1_EBSD data_Raw.ang, this column contains only ones. However, in all other .ang files I've read before, un-indexed points are identified by a confidence index (CI) of -1 and a pattern fit of 180 degrees. 29 points in the mentioned .ang file have a CI of -1, i.e. they are identified as un-indexed. Thus, I believe the ninth column contains some other data. I have no idea what, so I've named the data "unknown1" in the returned CrystalMap.prop dictionary.
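The check described above can be reproduced without orix. The following is a minimal sketch, assuming the 10-column layout from the snippet, with the CI in the seventh column and the unnamed data in the ninth. The function name `summarize_ang` and the sample rows are made up for illustration and are not taken from the real dataset.

```python
def summarize_ang(lines):
    """Count points with CI == -1 and collect the distinct ninth-column values.

    Assumes 10 whitespace-separated columns per data row and header
    lines starting with "#".
    """
    n_unindexed = 0
    ninth_values = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip header and blank lines
        fields = [float(v) for v in line.split()]
        if fields[6] == -1:  # confidence index is the 7th column
            n_unindexed += 1
        ninth_values.add(fields[8])  # 9th column, called "indexed" above
    return n_unindexed, ninth_values


# Made-up rows for illustration only
sample = [
    "# HEADER: Start",
    "0.1 0.2 0.3 0.0 0.0 1500.0 0.85 1 1 0.7",
    "0.4 0.5 0.6 1.0 0.0 1200.0 -1.0 1 1 180.0",
    "0.7 0.8 0.9 2.0 0.0 1800.0 0.92 2 1 0.5",
]
n_unindexed, ninth_values = summarize_ang(sample)
print(n_unindexed, ninth_values)
```

If the ninth column were an "indexed" flag, one would expect it to differ for the rows with a CI of -1. A single distinct value across all rows, as reported for the real file, suggests it holds something else.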

@argerlt
Contributor

argerlt commented Dec 7, 2022

You're right, that was a mistake on my part. The correct name is either "SEM signal", "detector signal" or just "sem", and the value is left as 1 if no corresponding SEM data is included.

Looking inside MTEX's .ang reader found here, I believe this case lines up with their description of "version 5" (line 113):

  % we need to guess one of the following conventions
  % Euler 1 Euler 2 Euler 3 X Y IQ CI Phase SEM_signal Fit
  % Euler 1 Euler 2 Euler 3 X Y IQ CI Fit phase
  % Euler 1 Euler 2 Euler 3 X Y IQ CI Fit unknown1 unknown2 phase
  % most important is the position of the phase
  
  % for future reference:
  % the following is taken from a recent .ang file - some new files might 
  % actually state the version in the header
  %
  % # NOTES: Start
  % # Version 1: phi1, PHI, phi2, x, y, iq (x*=0.1 & y*=0.1)
  % # Version 2: phi1, PHI, phi2, x, y, iq, ci
  % # Version 3: phi1, PHI, phi2, x, y, iq, ci, phase
  % # Version 4: phi1, PHI, phi2, x, y, iq, ci, phase, sem
  % # Version 5: phi1, PHI, phi2, x, y, iq, ci, phase, sem, fit
  % # Version 6: phi1, PHI, phi2, x, y, iq, ci, phase, sem, fit, PRIAS Bottom Strip, PRIAS Center Square, PRIAS Top Strip, Custom Value
  % # Version 7: phi1, PHI, phi2, x, y, iq, ci, phase, sem, fit. PRIAS, Custom, EDS and CMV values included if valid
  % # Phase index: 0 for single phase, starting at 1 for multiphase
  % # CMV = Correlative Microscopy value
  % # EDS = cumulative counts over a specific range of energies
  % # SEM = any external detector signal but usually the secondary electron detector signal
  % # NOTES: End
  %

My two cents:
Asking around in my lab, it seems the TSL .ang file format has changed somewhat over the years, as has Oxford's. Thus, when trying to write a generic EBSD loader, it seems the best practice would be to make a list of all possible formats, then pare it down based on the number of columns, whether columns contain integer or float data, etc.

That said, creating a comprehensive "if/then/else" tree for every oddball format sounds exhausting, and so far for me, saying "if 10 columns, assume phi1, Phi, phi2, x, y, iq, ci, phase_id, detector_signal, pattern_fit" has yet to fail. Such a function might therefore only need to be included if and when a test case is found that orix mishandles.
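The "assume columns from the column count" rule can be sketched as a small lookup. This is only an illustration covering versions 1 to 5 of the notes quoted above, not the orix or MTEX implementation; the names `TSL_COLUMNS` and `guess_tsl_columns` are hypothetical. Note that the column count alone cannot separate the 9-column layouts listed in the MTEX comment (phase and SEM signal versus fit and phase), so a real reader would need more than this.

```python
# Column names keyed by column count, following versions 1-5 of the notes
_BASE = ["phi1", "Phi", "phi2", "x", "y", "iq"]
_EXTRA = ["ci", "phase_id", "detector_signal", "pattern_fit"]
TSL_COLUMNS = {6 + n: _BASE + _EXTRA[:n] for n in range(len(_EXTRA) + 1)}


def guess_tsl_columns(n_columns):
    """Return assumed column names, or raise if the count is unknown."""
    try:
        return list(TSL_COLUMNS[n_columns])
    except KeyError:
        # Fail loudly instead of silently mislabelling the data
        raise ValueError(
            f"No known column layout for {n_columns} columns"
        ) from None


print(guess_tsl_columns(10))
```

Raising on an unknown count keeps a mislabelled column from going unnoticed, which is the failure mode this issue is about.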

@hakonanes
Member Author

[...] it seems the best practice would be to make a list of all possible formats, then pare it down based on the number of columns

This is what the current reader does. See the relevant lines with the updated possible columns in #416. ASTAR, EMsoft and orix each leave a unique footprint in their .ang file header, so if none of these footprints are found, we assume the file was written by EDAX TSL. We then determine the column names based on the number of columns available. Reading EDAX TSL .ang files with 10 or 15 columns should now work. The reader will fail as in #411 if a file with another number of columns is read. But it should not fail silently (as demonstrated), so we can improve it further when that happens.
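The vendor detection described above might look roughly like the following. This is a hedged sketch, not the code in #416: the footprint strings are placeholders chosen for illustration, and `FOOTPRINTS` and `guess_vendor` are hypothetical names.

```python
# Placeholder footprints for illustration; NOT the strings orix checks for
FOOTPRINTS = {
    "astar": "ASTAR",
    "emsoft": "EMsoft",
    "orix": "orix",
}


def guess_vendor(header_lines):
    """Return the first vendor whose footprint occurs in the header.

    Falls back to "tsl" when no footprint is found, mirroring the
    assumption described above.
    """
    header = "\n".join(header_lines)
    for vendor, footprint in FOOTPRINTS.items():
        if footprint in header:
            return vendor
    return "tsl"


print(guess_vendor(["# TEM_PIXperUM 1.000000", "# x-star 0.5"]))
print(guess_vendor(["# File created from EMsoft output"]))
```

Once the vendor is known, the column names can be chosen from the number of columns, with an error raised for any count that has no known layout.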

I consider this fixed once #416 is merged.

@argerlt
Contributor

argerlt commented Dec 8, 2022

Ah, you are right.

In that case, my feedback is just "I believe unknown1 should be changed to sem". Apologies for the long walk to a short answer.

"If I had had more time I would have written a shorter letter"
- Mark Twain

@hakonanes modified the milestones: v0.10.3, v0.11.0 Feb 1, 2023