INSPIRE Collaboration Author Lists: creating an author.xml file

Collaboration Author Lists

Information and specification for the author.xml file

Introduction

Providing complete author information upon submission of a document, e.g. to arXiv.org, is challenging, particularly for large collaborations with hundreds or even thousands of authors. Often only the most minimal information about individual authors is transmitted to publication and citation services. To facilitate an immediate and direct attribution of papers to authors, INSPIRE, the American Physical Society and arXiv.org have worked together to create a template for submitting information about authors in a way that will be both precise and universally understood. We encourage the use of the author.xml file schema as described in this document.

What are the advantages of using author.xml?

  • Paper processing speed
    Using an author.xml file allows INSPIRE to automatically add all authors and affiliations so they show up in the database with minimal delay and with as few errors as possible.
  • More accurate author information and attribution of credit
    The author.xml file, with all authors identified by an INSPIRE ID number, will be used by publishers such as the APS in producing their journal articles. It will also be used by other database providers covering the scientific literature. Providing accurate information about the authors results in more accurate publication lists and citation counts as well as more comprehensive search results.
  • Automatic Generation of LaTeX for the paper
    Given the XML file, an XSLT stylesheet can generate the author list for the paper in the desired LaTeX format.

How does it work?

  • You can produce a tailored output from your collaboration's author data by following the XML template.
  • Once completed, the file can be tested and validated through the validation files included in the download.
  • When your file passes the test, please include it in the .tar ball of your next submission to arXiv.
  • If you encounter any problem and need assistance, or if you are unable to provide the information required, please contact us: authors@inspirehep.net

A brief description

The author.xml file contains the following information:

  • The paper to which the author.xml corresponds.
  • The collaboration it represents.
  • The institutions participating in the collaboration.
  • The authors, with their affiliations and ID numbers.

This information is meant to automate the process of publishing a document electronically, without the need for human intervention.

How do I get the information needed in author.xml?

  • INSPIRE ID Numbers for a list of authors
    In your role as maintainer of the collaboration's author list, you have two options to acquire INSPIRE ID Numbers for authors:
    1. Send a list of names and email addresses or the xml file without INSPIRE IDs to authors@inspirehep.net and we will return this list enriched with all the INSPIRE ID Numbers for the authors. These INSPIRE IDs can then be integrated into your author database and delivered as part of your next author list.
    2. Provide the collaboration's internal ID Numbers of the respective person directly in the author.xml file. If you choose this option, please ensure these IDs are persistent and unique within your collaboration. This way, no additional information is needed in the collaboration's author database--we'll match up the internal IDs with INSPIRE ID Numbers and make sure the authors always have the correct ID.

    Motivation for #1: The INSPIRE IDs will be recognized by publishers and other databases or repositories and can also be shared with the authors, who may like to use them for searching and for papers they might write independently.
    Motivation for #2: Easiest approach for the collaboration. However, this will only work for INSPIRE and will not be shared with other parties.

    In any case, once the author.xml is submitted and checked by the INSPIRE service team, missing IDs will be assigned accordingly and created if necessary. Additional information on INSPIRE ID numbers can be found at INSPIRE ID NUMBERS.

  • INSPIRE ID Number for an individual
    The HEPNAMES database may be utilized to find the right ID for an individual. The INSPIRE ID Number will be on the author's record page. If the individual has an entry in HEPNAMES, but not an INSPIRE ID Number, a number will be assigned upon update of the record. If the individual is not in HEPNAMES, you can simply ask the individual to add a record for him/herself. An INSPIRE ID number will be assigned automatically.
  • Identification of affiliations by their Internet domain
    Internet domains provide a unique, universally-understood way to list an institution. Their hierarchical structure enables one to choose the required level of granularity, either at the institutional or departmental level. For example:
    • damtp.cam.ac.uk - for the Department of Applied Mathematics and Theoretical Physics (DAMTP)
    • cam.ac.uk - for the University of Cambridge in general

Can I test my author.xml file before submitting to arXiv?

Yes. You can validate your file against the .dtd file that is included in the download.

Downloads

.tar ball (includes the template, an example and a .dtd file for validation)
.zip file (includes the template, an example and a .dtd file for validation)
authors.xsd XML schema definition file (included in tar and zip)
author.dtd XML document type definition file (included in tar and zip)

The Template - Detailed Description

The author.xml file has been designed for collaborations with more than 10 authors. An updated author.xml file should be included with each submission to arXiv. The collaboration's XML file will contain information on each author, such as name, affiliation and INSPIRE ID Number. Descriptions of the template items are listed below the template.

  • <cal:creationDate>: REQUIRED
    Date of creation of this author.xml file.
  • <cal:publicationReference>: REQUIRED
    This element binds this author list unambiguously to a document. It can be an internal report number, an arXiv number, a collaboration's internal document number, an ISBN, a DOI, a persistent web destination or anything else that identifies the referenced document.
    If there is no immediate identifier present, the title can serve this purpose as well.
  • <cal:collaboration>: REQUIRED
    This container holds information about the collaboration.
    One or more collaborations reside within the <cal:collaborations> container.
  • <foaf:Organization>: REQUIRED
    This container holds information about an organization with which authors are affiliated. Each organization is identified by the "id" attribute.
    One or more organizations reside within the <cal:organizations> container.
    • Attribute "id": REQUIRED
      An identifier, typically a letter followed by a sequential number, starting at "a1". It denotes the author's institution in this particular author.xml file for the purpose of attaching authors to the institution.
      NOTE: For bizarre historical reasons that few people remember, XML requires that an attribute value declared as being of type ID has the same syntax as an XML name, so it cannot start with a digit.
    • Element <cal:orgDomain>: OPTIONAL
      Internet domain of the institution.
      The domain should be detailed enough to unambiguously determine the institution if there are distinct locations throughout the nation, e.g., pv.infn.it rather than just infn.it. If desired, this can go to the department/research-group level.
    • Element <foaf:name>: REQUIRED
      This element defines the name of the organization as it shall appear on the document.
    • Element <cal:orgName>: OPTIONAL
      This element also defines the name of the organization. Depending on where this name originates from, the source attribute can be used.
      The element content shall be only the name of the respective institute. Location information, if not part of the name, may be stated in the orgAddress element.
      • Attribute "source": OPTIONAL (Defaults to "INTERNAL")
        Enables one to use either the INSPIRE (a.k.a. SPIRES-ICN) form of the institution's name or your own INTERNAL form.
    • Element <cal:orgStatus>: OPTIONAL
      Status of the organization within the collaboration. Typically this would be either "member" or "nonmember."
      • Attribute "collaborationid": OPTIONAL
        Enables you to specify exactly which collaboration this organization is attached to. The collaboration is represented through its ID (e.g. "c1").
        This element may be repeated if necessary.
    • Element <cal:orgAddress>: OPTIONAL
      Full postal address of the institution as it would be written on a letter head.
    • Element <cal:group>: OPTIONAL
      See group discussion below.
  • <foaf:Person>: REQUIRED
    This container holds information about the author.
    One or more authors reside within the <cal:authors> container.
    • Element <foaf:name>: OPTIONAL
      Author's complete name, written in full, e.g. "Johannes Diderik van der Waals".
    • Element <foaf:givenName>: OPTIONAL
      All first/given names of an author in roman letters, e.g. "Johannes Diderik". You may leave this out in the rare case that a person does not possess a first name.
    • Element <foaf:familyName>: REQUIRED
      All sur/family names of an author in roman letters, e.g. "van der Waals".
    • Element <cal:authorNameNative>: OPTIONAL
      Name of author as written in his or her native language e.g., "Ле́в Дави́дович Ланда́у" or "張晨光".
    • Element <cal:authorSuffix>: OPTIONAL
      Suffix information for a name, e.g. "Jr.", "Sr.", "III".
    • Element <cal:authorStatus>: OPTIONAL
      This element describes the vital status of an author. If the author is deceased, please state "Deceased". Otherwise this element shall be empty.
    • Element <cal:authorNamePaper>: REQUIRED
      Name of author as it appears on the title page of the paper, e.g. "J. van der Waals".
      This element supports Roman letters only.
    • Element <cal:authorNamePaperGiven>: OPTIONAL
      Given name(s) of the author as they appear on the title page of the paper, typically initials, e.g. "J."
      This element supports Roman letters only. As with <foaf:givenName> it is "optional" only in the technical sense that someone may have one name only.
    • Element <cal:authorNamePaperFamily>: OPTIONAL
      Family name of author as it appears on the title page of the paper, e.g. "van der Waals".
      This element supports Roman letters only.
    • Element <cal:authorCollaboration>: REQUIRED
      In a multi collaboration environment, the author can be attached to a collaboration with the appropriate collaboration ID.
      If the author is a member of more than one collaboration or has more than one position, this element may be repeated.
      • Attribute "collaborationid": REQUIRED (defaults to the first collaboration)
        Enables the specification of which exact collaboration this author is attached to. The collaboration is represented through its ID (e.g. "c1").
      • Attribute "position": OPTIONAL
        This attribute specifies the position of an author within the collaboration.
        This may be "Spokesperson", "Contact person", "Speaker" or "Editor".
    • Element <cal:authorAffiliation>: OPTIONAL
      This element connects the author to his or her institution, through the organization ID attribute.
      All affiliation elements (zero or more) reside within the <cal:authorAffiliations> container.
      Several affiliations may be mentioned by using several of these elements--one line for each affiliation. Please do not use a (comma-separated) list of organization identifiers in the 'organizationid' attribute.
      In cases where multiple affiliations resemble one entity, please mark the organizations with IDs e.g. "o1a", "o1b" and "o1c" to show their relation.
      • Attribute "organizationid": REQUIRED
        Connects with one of the organizations from above. The link is established by using the respective ID of the organization here (e.g. "a1").
      • Attribute "connection": OPTIONAL (Defaults to "Affiliated with")
        This enables you to list information about the connection, such as "Affiliated with", "On leave from", "Also at" or "Visitor".
    • Element <cal:authorid>: OPTIONAL
      This element specifies an ID number that identifies an author.
      All ID elements (zero or more) reside within the <cal:authorids> container.
      • Attribute "source": REQUIRED (if there is an authorid element present)
        Specifies the origin of the number. This can be an INSPIRE ID number (source="INSPIRE"), a collaboration-internal ID (source="INTERNAL") or an ID from another author ID service (e.g. source="ORCID").
        While the use of INSPIRE ID numbers is strongly encouraged, the use of any persistent ID for an author allows the INSPIRE service team to identify the authors and attach the respective identifiers to their INSPIRE ID.
        Please consult the section "How do I get the information needed in author.xml?" for more detailed information about the handling of author IDs.
    • Element <cal:authorFunding>: OPTIONAL
      This element describes the author's funding source, such as a grant or fellowship, if necessary (e.g., Alfred P. Sloan Fellow). Otherwise this element shall be empty.
  • Groups

    Occasionally collaborations wish to group together institutions that form a consortium. The author.xml schema allows for this by treating the group as just another institution and then connecting the institutions together via the ID of this group (<cal:group with="a1" />). In addition to institutions, collaborations and even authors can be grouped together in this way, although it is primarily intended for affiliations, typically united by some sort of funding arrangement.

    The following example shows groups of collaborations and institutions. In the example there is an Italian institution group and a Canadian institution group. The Canadian physicist is affiliated with the Canadian group as a whole, while the Italian physicist is affiliated with two of the three institutions in the Italian institution group. In cases where multiple institutions are treated in some sense as a single entity, it is helpful, in addition to using <cal:group>, to give them a set of IDs that plainly shows their relationship, e.g.:
    <foaf:Organization id="a27a">
    <foaf:Organization id="a27b">
    <foaf:Organization id="a27c">
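
The element descriptions above can be made concrete with a small, hedged sketch in Python. It embeds a minimal document built from the elements described in this section and checks two of the rules stated above: organization IDs must not start with a digit, and every "organizationid" must refer to a declared organization. The root element name, the "cal" namespace URI, the date format, the child element of <cal:collaboration> and all values are illustrative assumptions; take the real names from the template in the download. The check is not a substitute for validation against the .dtd file.

```python
import re
import xml.etree.ElementTree as ET

# The foaf URI is the standard FOAF namespace. The cal URI below is a
# PLACEHOLDER (assumption); copy the real one from the downloaded template.
NS = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "cal": "http://example.org/cal-placeholder/",
}

# Hypothetical minimal document. Root element name, date format and the
# foaf:name inside cal:collaboration are assumptions, not taken from the spec.
AUTHOR_XML = """\
<collaborationauthorlist
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:cal="http://example.org/cal-placeholder/">
  <cal:creationDate>2011-04-19</cal:creationDate>
  <cal:publicationReference>Example title of the paper</cal:publicationReference>
  <cal:collaborations>
    <cal:collaboration id="c1">
      <foaf:name>Example Collaboration</foaf:name>
    </cal:collaboration>
  </cal:collaborations>
  <cal:organizations>
    <foaf:Organization id="a1">
      <cal:orgDomain>cam.ac.uk</cal:orgDomain>
      <foaf:name>University of Cambridge</foaf:name>
    </foaf:Organization>
  </cal:organizations>
  <cal:authors>
    <foaf:Person>
      <foaf:givenName>Johannes Diderik</foaf:givenName>
      <foaf:familyName>van der Waals</foaf:familyName>
      <cal:authorNamePaper>J. van der Waals</cal:authorNamePaper>
      <cal:authorCollaboration collaborationid="c1" />
      <cal:authorAffiliations>
        <cal:authorAffiliation organizationid="a1" connection="Affiliated with" />
      </cal:authorAffiliations>
      <cal:authorids>
        <cal:authorid source="INTERNAL">member-0042</cal:authorid>
      </cal:authorids>
    </foaf:Person>
  </cal:authors>
</collaborationauthorlist>
"""


def check_author_list(xml_text):
    """Return a list of human-readable problems found in the document."""
    root = ET.fromstring(xml_text)
    problems = []

    # Collect the IDs of all declared organizations.
    org_ids = [org.get("id", "") for org in root.iterfind(".//foaf:Organization", NS)]

    # Simplified check of the ID syntax rule: first character is a letter
    # or an underscore, never a digit.
    for org_id in org_ids:
        if not re.match(r"[A-Za-z_]", org_id):
            problems.append(f"organization id {org_id!r} does not start with a letter")

    # Every affiliation must point at one of the declared organizations.
    for aff in root.iterfind(".//cal:authorAffiliation", NS):
        ref = aff.get("organizationid", "")
        if ref not in org_ids:
            problems.append(f"affiliation refers to unknown organization {ref!r}")

    return problems


if __name__ == "__main__":
    print(check_author_list(AUTHOR_XML))
```

Running the sketch on the embedded document reports no problems; renaming the organization to id="1" would trigger both checks.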

Additional information concerning the name spaces "cal" and "foaf"

  • "cal" is the official name space for 'collaboration author lists' as defined in this document.
  • "foaf" stands for "friend of a friend". It is a project creating a Web of machine-readable pages describing people, the links between them and the things they create and do. The project has defined standards for building such an information system; this document uses the foaf vocabulary version 0.97 as of Jan. 1st 2010. It is further described in the foaf specification and the foaf project web page.

 

Example snippets of an author.xml file

The following examples provide a guide for formatting the author.xml file in different scenarios.

Validating your author.xml file

The author.xml file can be validated through a service such as Validome. Any of the example xml files available here should validate cleanly.

Creating TeX files from author.xml file

The author.xml file contains all the information traditionally found on a title page. It can therefore be used to construct the TeX author list using an xsl file. In fact, xsl files can be used to convert the xml file into any format. Example xsl files are given with their output style shown below:

cal_1.xsl (APS [Phys.Rev., Phys.Rev.Lett.] format)
\author{J.~Smith$^{a1}$}
\author{L.~Picard$^{a2}$}
\affiliation{$^{a1}$ Yerevan Physics Institute, Yerevan, Armenia}
\affiliation{$^{a2}$ CERN, European Organization for Nuclear Research, Geneva, Switzerland}

cal_2.xsl
\affiliation{CERN, European Organization for Nuclear Research, Geneva, Switzerland}
\affiliation{Yerevan Physics Institute, Yerevan, Armenia}
\author{J.~Smith}
\affiliation{CERN, European Organization for Nuclear Research, Geneva, Switzerland}
\author{L.~Picard}
\affiliation{Yerevan Physics Institute, Yerevan, Armenia}

cal_txt.xsl (a text file suitable for uploading to arXiv to create the author list)
J. Smith, L. Picard

These can be applied with a Unix command such as
xsltproc cal_i.xsl author.xml
(where cal_i.xsl stands for one of the stylesheets above, e.g. cal_1.xsl) to produce TeX author lists in two different, popular formats. Note that these files are fairly simple and do not handle the more complicated aspects of the xml file (such as funding notes).
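
For readers without xsltproc, the same idea can be sketched in Python. This is not the logic of the xsl files themselves, only an illustration of how the title-page information can be read from author.xml. It reproduces the plain-text list and the first (APS-style) output shown above for a trimmed, hypothetical input. The root element name and the "cal" namespace URI are assumptions, and the input omits the other required elements for brevity.

```python
import xml.etree.ElementTree as ET

# The foaf URI is the standard FOAF namespace. The cal URI is a PLACEHOLDER
# (assumption); use the one from the downloaded template for real files.
NS = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "cal": "http://example.org/cal-placeholder/",
}

# Hypothetical input, trimmed to the elements this sketch reads.
AUTHOR_XML = """\
<collaborationauthorlist
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:cal="http://example.org/cal-placeholder/">
  <cal:organizations>
    <foaf:Organization id="a1">
      <foaf:name>Yerevan Physics Institute, Yerevan, Armenia</foaf:name>
    </foaf:Organization>
    <foaf:Organization id="a2">
      <foaf:name>CERN, European Organization for Nuclear Research, Geneva, Switzerland</foaf:name>
    </foaf:Organization>
  </cal:organizations>
  <cal:authors>
    <foaf:Person>
      <foaf:familyName>Smith</foaf:familyName>
      <cal:authorNamePaper>J. Smith</cal:authorNamePaper>
      <cal:authorAffiliations>
        <cal:authorAffiliation organizationid="a1" />
      </cal:authorAffiliations>
    </foaf:Person>
    <foaf:Person>
      <foaf:familyName>Picard</foaf:familyName>
      <cal:authorNamePaper>L. Picard</cal:authorNamePaper>
      <cal:authorAffiliations>
        <cal:authorAffiliation organizationid="a2" />
      </cal:authorAffiliations>
    </foaf:Person>
  </cal:authors>
</collaborationauthorlist>
"""


def text_author_list(xml_text):
    """Comma-separated author names as they appear on the paper."""
    root = ET.fromstring(xml_text)
    names = [
        person.findtext("cal:authorNamePaper", default="", namespaces=NS)
        for person in root.iterfind(".//foaf:Person", NS)
    ]
    return ", ".join(names)


def aps_author_list(xml_text):
    """TeX author/affiliation lines, labelled with the organization IDs."""
    root = ET.fromstring(xml_text)
    lines = []
    for person in root.iterfind(".//foaf:Person", NS):
        name = person.findtext("cal:authorNamePaper", default="", namespaces=NS)
        marks = ",".join(
            aff.get("organizationid", "")
            for aff in person.iterfind(".//cal:authorAffiliation", NS)
        )
        # Tie initials to the family name with a non-breaking TeX space.
        lines.append("\\author{%s$^{%s}$}" % (name.replace(". ", ".~"), marks))
    for org in root.iterfind(".//foaf:Organization", NS):
        org_name = org.findtext("foaf:name", default="", namespaces=NS)
        lines.append("\\affiliation{$^{%s}$ %s}" % (org.get("id"), org_name))
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_author_list(AUTHOR_XML))
    print(aps_author_list(AUTHOR_XML))
```

Like the example xsl files, this sketch handles only simple cases and ignores funding notes, groups and author status.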

Last Updated: April 19th 2011