@@ -104,8 +104,16 @@ \section{Introduction}
104
104
section~\ref {sect:quantities }, while the mapping between our columns and the
105
105
VAMDC-XSAMS Data Model is given in section~\ref {sect:mapping }.
106
106
107
+ During the development of the standard, a major problem in molecular
108
+ spectroscopy turned out to be species nomenclature. The core LineTAP
109
+ table sidesteps this problem by identifying species using IUPAC standard
110
+ InChIs, a choice unpopular with many practitioners. To facilitate the
111
+ use of colloquial species designations (``ethyl alcohol''), this
112
+ specification also defines, in section~\ref {sect:speciestable }, a
113
+ \textit{species table} associating common names and sum formulas with InChIs.
114
+
107
115
When accessed using the Table Access Protocol TAP
108
- \citep {2019ivoa.spec.0927D }, the table can be queried using the
116
+ \citep {2019ivoa.spec.0927D }, the tables can be queried using the
109
117
expressive SQL-derived query language ADQL, while query results are
110
118
available in the VOTable format, easily readable by VO client
111
119
applications. Line databases accessible in this way can be registered
@@ -220,6 +228,13 @@ \subsection{Credit}
220
228
repository of line data, it should be as simple as possible for users to
221
229
give credit to the contributors of line data.
222
230
231
+ \subsection {Resolution of Molecule Designation }
232
+ \label {uc:resolution }
233
+
234
+ A researcher wants to find lines for a molecule they have long known as
235
+ ``Methyl Mercaptan'' or by a pseudo-structural formula like
236
+ \verb|CH3SHv=0|.
237
+
223
238
224
239
\subsection {Non-Use Cases }
225
240
@@ -235,6 +250,7 @@ \subsection{Non-Use Cases}
235
250
\end {itemize }
236
251
237
252
253
+
238
254
\begin {table }[hpt]
239
255
\hskip -0.05\linewidth
240
256
\begin {tabular }{p{0.43\linewidth }cp{0.5\linewidth }}
@@ -280,7 +296,7 @@ \subsection{Non-Use Cases}
280
296
\end {table }
281
297
282
298
283
- \section {Spectral Line Data }\label {sect:quantities }
299
+ \section {Spectral Lines Table }\label {sect:quantities }
284
300
285
301
Table~\ref {tab:ltcols } gives the columns that make up the LineTAP
286
302
relational model. Implementations MUST have all columns given in this
@@ -379,12 +395,53 @@ \section{Spectral Line Data}\label{sect:quantities}
379
395
380
396
\end {itemize }
381
397
398
+ \section {Species Table }\label {sect:speciestable }
399
+ \label {ref:speciestable }
400
+
401
+ The species table is used to facilitate the referencing of molecules. As
402
+ there are many summary formulas and colloquial molecule names for common
403
+ species (and more than one species may correspond to a given summary
404
+ formula and even colloquial name), the resolution of such identifiers to
405
+ InChIs is generally non-trivial.
382
406
383
- \section {Protocol }
384
- \label {sect:protocol }
385
- \subsection {Queries: LineTAP }
407
+ LineTAP's species table contains a mapping between common names and
408
+ summary formulas and InChIs. It should be populated by data providers
409
+ publishing molecule data to the best of their knowledge. It is
410
+ explicitly possible to associate multiple names with a single InChI.
411
+ There is no explicit relationship between a species table and LineTAP
412
+ tables on a given service, i.e., the presence of a species in the
413
+ species table is not a guarantee that data on it is available from any
414
+ table in the service.
415
+
416
+ In most cases, the InChIKey alone suffices to reference a molecule. The InChI
417
+ column is included in this table so that users can confirm that the
418
+ returned molecule is the one they are searching for.
419
+
420
+ \begin {table }[hpt]
421
+ \hskip -0.05\linewidth
422
+ \begin {tabular }{p{0.43\linewidth }cp{0.5\linewidth }}
423
+ \sptablerule
424
+ \textbf {Name [Unit] } \ucd {UCD}&\textbf {Type }&\textbf {Description }\\
425
+ \sptablerule
426
+ % GENERATED: python3 make-species-table.py
427
+ \texttt{inchikey} \hfil\break\ucd{} & text & \raggedright InChIKey of this species\tabularnewline
428
+ \rowsep
429
+ \texttt{inchi} \hfil\break\ucd{} & text & \raggedright InChI of this species\tabularnewline
430
+ \rowsep
431
+ \texttt{name} \hfil\break\ucd{} & text & \raggedright A common name of this species\tabularnewline
432
+ \rowsep
433
+ \texttt{formula} \hfil\break\ucd{} & text & \raggedright Chemical formula of this species in some free-ish notation\tabularnewline
434
+ \rowsep
435
+ \texttt{source\_id} \hfil\break\ucd{} & text & \raggedright VAMDC identifier of the origin of this mapping\tabularnewline
386
436
387
- \subsection {User-defined functions }
437
+ % /GENERATED
438
+ \sptablerule
439
+ \end {tabular }
440
+ \caption {The columns that make up the Species Table. }
441
+ \label {tab:spcols }
442
+ \end {table }
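
As an illustration of how a client might use this table, the following
sketch lists all species a service associates with a given sum formula.
The table name \texttt{species.main} is the one used in the sample
queries of this document and is not mandated by this specification; the
formula literal \texttt{CH4S} is an assumption as well, since the
notation used in the \texttt{formula} column is not normalised.

\begin{lstlisting}[language=SQL]
SELECT inchikey, inchi, name, source_id
FROM species.main
WHERE formula='CH4S'
\end{lstlisting}

Since several species can share a sum formula, such a query may return
more than one InChIKey; the \texttt{name} and \texttt{inchi} columns
then help users pick the species they actually mean.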
443
+
444
+ \section {ADQL User-defined functions }
388
445
\label {sect:udfs }
389
446
390
447
LineTAP services MUST implement the \texttt{ivo\_specconv} user defined
@@ -541,6 +598,24 @@ \subsubsection{Characterising a Service's Data Holdings}
541
598
GROUP BY inchi
542
599
\end {lstlisting }
543
600
601
+ \subsubsection {Searching With Trivial Molecule Names }
602
+
603
+ Searching with trivial names as discussed in use
604
+ case~\ref {uc:resolution } would often be a two-step process where clients
605
+ ask the researcher which InChI corresponds to the species they
606
+ were looking for. In simple cases, however, a single joined query can be
607
+ run, too.
608
+
609
+ % please-run-a-test
610
+ \begin{lstlisting}[language=SQL]
611
+ SELECT
612
+ *
613
+ FROM casa_lines.line_tap
614
+ JOIN species.main as s USING (inchikey)
615
+ WHERE s.name='Methylidynium'
616
+ \end{lstlisting}
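
The two-step variant might look as follows; this is a sketch only. The
schema and table names are those of the joined query above, and the
InChIKey literal in the second query is a placeholder for whichever key
the researcher selects from the result of the first query.

\begin{lstlisting}[language=SQL]
-- Step 1: list the candidate species for the name entered
SELECT DISTINCT inchikey, inchi, name
FROM species.main
WHERE name='Methyl Mercaptan'

-- Step 2: retrieve lines for the species chosen by the researcher
SELECT *
FROM casa_lines.line_tap
WHERE inchikey='<InChIKey selected in step 1>'
\end{lstlisting}

Running the two queries separately lets the client show the candidate
InChIs to the researcher before any line data is transferred, which
matters when a name matches more than one species.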
617
+
618
+
544
619
\section {Mapping from VAMDCXSAMS }
545
620
\label {sect:mapping }
546
621
@@ -665,16 +740,13 @@ \section{LineTAP and the VO Registry}
665
740
666
741
\subsection {Registering LineTAP-conforming Tables }
667
742
668
- LineTAP tables are registered using VODataService \citep {2021ivoa.spec.1102D }
743
+ LineTAP line tables are registered using VODataService \citep {2021ivoa.spec.1102D }
669
744
tablesets, where the table utype is set to
670
- $$ \hbox {\verb |ivo://ivoa.net/std/linetap#table -1.0 |}.$$
745
+ $$\hbox{\verb|ivo://ivoa.net/std/linetap#lines-1.0|}.$$
671
746
672
- The tableset is normally contained in a VODataService \xmlel {CatalogService}
673
- record with a TAP capability, and this capability normally is an auxiliary
674
- capability as per DDC \citep {2019ivoa.spec.0520D }. For one-table
675
- services a full TAPRegExt \citep {2012ivoa.spec.0827D } capability is also
676
- allowed; other resource types can be used for registration as
677
- appropriate.
747
+ The tableset is contained in a VODataService \xmlel {CatalogResource}
748
+ record with a TAP auxiliary capability
749
+ as per DDC \citep {2019ivoa.spec.0520D }.
678
750
679
751
Further capabilities, for instance for full VAMDC or legacy SLAP
680
752
services, may be given in the same record.
@@ -714,7 +786,7 @@ \subsection{Registering LineTAP-conforming Tables}
714
786
<name>toss.ivoa_lines</name>
715
787
<title>TOSS</title>
716
788
<description> The LineTAP version of...</description>
717
- <utype>ivo://ivoa.net/std/linetap#table -1.0</utype>
789
+ <utype>ivo://ivoa.net/std/linetap#lines-1.0</utype>
718
790
...
719
791
</table>
720
792
\end {lstlisting }
@@ -726,6 +798,12 @@ \subsection{Registering LineTAP-conforming Tables}
726
798
and is thus to be expected in most registrations of this type. Clients
727
799
are advised to use the resource description for full text searches.
728
800
801
+ Species tables are registered in exactly the same way, except their
802
+ utype is
803
+ $$ \hbox {\verb |ivo://ivoa.net/std/linetap#species-1.0 |}.$$
804
+ Data providers should only register line and species tables in one
805
+ resource record if the species table really has the same metadata
806
+ (description, author, source, etc.) as the line table.
729
807
730
808
\subsection {Discovering LineTAP services }
731
809
@@ -738,35 +816,34 @@ \subsection{Discovering LineTAP services}
738
816
would return TAP access URLs and the table names:
739
817
740
818
\begin{lstlisting}[language=SQL]
741
- SELECT DISTINCT table_name, access_url
819
+ SELECT table_name, access_url
742
820
FROM rr.res_table
743
821
NATURAL JOIN rr.capability
744
822
NATURAL JOIN rr.interface
+ NATURAL JOIN rr.resource
745
823
WHERE
746
- table_utype LIKE 'ivo://ivoa.net/std/linetap#table -1.%'
824
+ table_utype LIKE 'ivo://ivoa.net/std/linetap#lines-1.%'
747
825
AND standard_id LIKE 'ivo://ivoa.net/std/tap%'
748
826
AND intf_role='std'
827
+ AND res_type='vs:catalogresource'
749
828
\end{lstlisting}
750
829
751
- The \texttt {DISTINCT } in the main query is a rough filter that removes
752
- entries duplicated because their tables are registred both in the main
753
- TAP record and in an auxiliary capability.
754
-
755
830
The wildcard pattern in the utype match is to make sure minor version
756
831
increments do not prevent service discovery; by IVOA versioning rules,
757
832
all LineTAP services of major version 1 can be operated by all LineTAP
758
833
clients of version 1. We do not constrain the version of the TAP
759
834
service. Clients may want to adapt the TAP discovery pattern to match
760
835
their specific needs.
761
836
762
-
837
+ With the utype adapted, this query works analogously for species tables.
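
For instance, species tables might be discovered with a query like the
following sketch. It joins \texttt{rr.resource} explicitly because that
is the RegTAP table carrying \texttt{res\_type}; apart from that and the
utype pattern, it follows the query for line tables.

\begin{lstlisting}[language=SQL]
SELECT table_name, access_url
FROM rr.res_table
  NATURAL JOIN rr.resource
  NATURAL JOIN rr.capability
  NATURAL JOIN rr.interface
WHERE
  table_utype LIKE 'ivo://ivoa.net/std/linetap#species-1.%'
  AND standard_id LIKE 'ivo://ivoa.net/std/tap%'
  AND intf_role='std'
  AND res_type='vs:catalogresource'
\end{lstlisting}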
763
838
764
839
\appendix
765
- \section {Changes from Previous Versions }
840
+ \section {Changes from WD-2023-03-23 }
766
841
767
- No previous versions yet.
768
- % these would be subsections "Changes from v. WD-..."
769
- % Use itemize environments.
842
+ \begin {itemize }
843
+ \item Adding the species table
844
+ \item Changing the line table utype to \dots lines-1.0 (rather than
845
+ \dots table-1.0 before).
846
+ \end {itemize }
770
847
771
848
772
849
\bibliography {ivoatex/ivoabib,ivoatex/docrepo, localrefs}