papers.baremodels module

This module defines bare versions of the regular models: these are classes whose instances do not correspond to an object in the database. They are only stored in memory. This is useful for the API, where lookups are done online, without name ambiguity resolution.

class papers.baremodels.BareAuthor(*args, **kwargs)[source]

Bases: papers.baremodels.BareObject

The base class for the author of a paper. This holds the name of the author, its position in the authors list, and its possible affiliations.

classmethod deserialize(rep)[source]

Creates an Author object out of a serialized representation.

is_known

An author is “known” when it is linked to a known researcher.

json()[source]

JSON representation of the author for dataset dumping purposes, or for the public API.

researcher

Returns the Researcher object associated with this author (if any)

serialize()[source]

JSON representation for storage in a JSON field (internal, not to be used as the output of the API)

update_name_variants_if_needed(default_confidence=0.1)[source]

Ensures that the name of an author associated with an ORCID is a name variant of the researcher with that ORCID.

class papers.baremodels.BareName(*args, **kwargs)[source]

Bases: papers.baremodels.BareObject

classmethod create(first, last)[source]

Creates an instance of the Name object without saving it. Useful for name lookups where we are not sure we want to keep the name in the model.

classmethod create_bare(first, last)[source]

Same as create, provided for uniformity among bare classes.

classmethod deserialize(rep)[source]

Reconstruct an object based on its serialized representation

first_letter()[source]

First letter of the last name, for sorting purposes

is_known

Does this name belong to at least one known researcher?

json()[source]

Returns a JSON representation of the name (for external APIs)

serialize()[source]

JSON representation for internal storage purposes

class papers.baremodels.BareOaiRecord(*args, **kwargs)[source]

Bases: papers.baremodels.BareObject

cleanup_description()[source]

Removes clutter frequently included in abstracts. Note that this does not save the object (being a method of BareOaiRecord).

full_journal_title()[source]

The full title of the journal, otherwise the title present in CrossRef’s metadata, which might be shorter.

has_publication_metadata()[source]

Does this record tell where the paper is published? If so we can use it to look up the policy in RoMEO.

json()[source]

Dumps the OAI record as a JSON object (for dataset dumping purposes)

oa_status()[source]

Policy of the publisher for this publication

publisher_or_default()[source]

Returns the publisher. If the publisher is unknown, returns an instance of DummyPublisher.

source_or_publisher()[source]

Returns the name of the source to display. If the record comes from a repository that we index directly via OAI-PMH, we return the name of the source. If the record has a publisher name, we return the publisher name.

update_priority()[source]

class papers.baremodels.BareObject(*args, **kwargs)[source]

Bases: object

A Bare object contains the skeleton for a non-bare (Django model) class. Its fields are stored in memory only and it does not correspond to a DB entry. To convert a bare object to its non-bare counterpart, for instance a BareName b into a Name, use Name.from_bare(b).

breadcrumbs()[source]

Breadcrumbs of bare objects are empty by default.

check_mandatory_fields()[source]

Raises ValueError if any field is missing. The list of mandatory fields for the class should be stored in _mandatory_fields.

classmethod from_bare(bare_obj)[source]

This creates an instance of the current class as a copy of a bare instance. Concretely, this copies all the fields contained in the bare object to an instance of the current class, which is expected to be a subclass of the bare object’s class.
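The bare/non-bare pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the `_bare_fields` attribute, the constructor signature, and the `Name` stand-in (a Django model in the real project) are assumptions made for this example. Only `_mandatory_fields`, `check_mandatory_fields` and `from_bare` are named in the documentation.

```python
class BareObject:
    """Stand-in for the bare base class: fields live in memory only."""

    # Fields that must be set (documented name).
    _mandatory_fields = []
    # Fields copied by from_bare (hypothetical name, assumed for this sketch).
    _bare_fields = []

    def __init__(self, **kwargs):
        # Every declared field becomes an attribute, defaulting to None.
        for field in self._bare_fields:
            setattr(self, field, kwargs.get(field))

    def check_mandatory_fields(self):
        # Raise ValueError as soon as one mandatory field is missing.
        for field in self._mandatory_fields:
            if getattr(self, field, None) is None:
                raise ValueError("Missing mandatory field: %s" % field)

    @classmethod
    def from_bare(cls, bare_obj):
        # Copy all fields of the bare instance into a new instance of the
        # class on which from_bare was called.
        values = {f: getattr(bare_obj, f, None) for f in bare_obj._bare_fields}
        return cls(**values)


class BareName(BareObject):
    _bare_fields = ["first", "last"]
    _mandatory_fields = ["last"]


class Name(BareName):
    """Hypothetical non-bare counterpart; no database is involved here."""


bare = BareName(first="Ross", last="Street")
name = Name.from_bare(bare)
name.check_mandatory_fields()  # passes: the mandatory field "last" is set
```

Calling `from_bare` on the subclass keeps the conversion generic: the base class does not need to know which non-bare classes exist.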

class papers.baremodels.BarePaper(*args, **kwargs)[source]

Bases: papers.baremodels.BareObject

This class is the bare analogue to Paper. Its authors are lists of BareName, and its publications and OAI records are also bare.

MAX_DISPLAYED_AUTHORS = 15

abstract

add_author(author, position=None)[source]

Adds a new author to the paper, by default at the end of the list.

Parameters: position – if provided, the author is placed at the given position.
Returns: the BareAuthor that was added (it can differ in subclasses)

add_oairecord(oairecord)[source]

Adds a new OAI record to the paper

Returns: the BareOaiRecord that was added (it can differ in subclasses)

affiliations()[source]

The list of affiliations of all authors

author_count

Number of authors.

author_names()[source]

The list of Name instances of the authors

authors

The list of authors. They are bare in BarePaper. In other implementations, this can be an arbitrary iterable of instances of BareAuthor subclasses. The authors are sorted in the order in which they appear on the paper.

bare_author_names()[source]

The list of name pairs (first,last) of the authors

check_authors()[source]

Check the sanity of authors (for now, only that the list is non-empty)

citation()[source]

A short citation-like representation of the paper. E.g. Joyal and Street, 1992

Link to search for the paper in CORE

classmethod create(title, author_names, pubdate, visible=True, affiliations=None, orcids=None)[source]

Creates a (bare) paper. To save it to the database, we need to run the clustering algorithm to resolve Researchers for the authors, using from_bare from the (non-bare) Paper subclass.

Parameters:
  • title – The title of the paper (as a string). If it is too long for the database, ValueError is raised.
  • author_names – The ordered list of author names, as Name objects.
  • pubdate – The publication date, as a Python date object.
  • visible – The visibility of the paper if it is created. If another paper exists, the visibility will be set to the maximum of the two possible visibilities.
  • affiliations – A list of (possibly None) affiliations for the authors. It has to have the same length as the list of author names.
  • orcids – Same as affiliations, but for ORCID ids.

displayed_authors()[source]

Returns the full list of authors if there are not too many of them, otherwise returns only interesting_authors.
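The selection rule can be sketched as a standalone function. This is an illustration only: it assumes that the threshold is MAX_DISPLAYED_AUTHORS and that a list of exactly that length is still shown in full, neither of which is stated in the documentation. The function signature is hypothetical (the real method takes no arguments).

```python
MAX_DISPLAYED_AUTHORS = 15  # value documented on BarePaper


def displayed_authors(authors, interesting_authors, limit=MAX_DISPLAYED_AUTHORS):
    """Return all authors when the list is short, else the interesting subset."""
    # Assumption: "too many" means strictly more than `limit` authors.
    if len(authors) <= limit:
        return list(authors)
    return list(interesting_authors)


short_list = ["A", "B", "C"]
long_list = ["author %d" % i for i in range(40)]

shown_short = displayed_authors(short_list, short_list[:1])
shown_long = displayed_authors(long_list, long_list[:2])
```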

first_publications()[source]

The list of the first three OAI records with publication metadata associated with this paper. In most cases this returns all such records, but in some cases many publications end up merged, and showing all of them to users would not be very elegant.
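A minimal sketch of this truncation, using a hypothetical `Record` stand-in for BareOaiRecord. The test inside `has_publication_metadata` is a simplification assumed for the example (it only looks at a journal title), and the function signature is not the real one.

```python
from itertools import islice


class Record:
    """Hypothetical stand-in for BareOaiRecord."""

    def __init__(self, identifier, journal_title=None):
        self.identifier = identifier
        self.journal_title = journal_title

    def has_publication_metadata(self):
        # Simplifying assumption: a journal title is enough.
        return self.journal_title is not None


def first_publications(oairecords, limit=3):
    """Keep at most `limit` records that carry publication metadata."""
    publications = (r for r in oairecords if r.has_publication_metadata())
    return list(islice(publications, limit))


records = [
    Record("a", "Journal 1"),
    Record("b"),
    Record("c", "Journal 2"),
    Record("d", "Journal 3"),
    Record("e", "Journal 4"),
]
selected = first_publications(records)
```

Using `islice` on a generator stops scanning as soon as three matching records have been found.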

classmethod from_bare(bare_obj)[source]

Creates an instance of this class from a BarePaper.

Link to search for the paper in Google Scholar

has_many_authors

When the paper has more than some arbitrary number of authors.

interesting_authors

The list of authors to display when the complete list is too long.

is_orphan()[source]

When no OAI record is associated with this paper.

json()[source]

JSON representation of the paper, for dataset dumping purposes

new_fingerprint(verbose=False)[source]

The fingerprint of the paper, taking into account the changes that may have occurred since the last computation of the fingerprint. This does not update the fingerprint field, just computes its candidate value.

oairecords

The list of OAI records associated with this paper. It can be an arbitrary iterable of instances of BareOaiRecord subclasses.

orcids()[source]

The list of ORCIDs of all authors

plain_fingerprint(verbose=False)[source]

Debugging function to display the plain fingerprint

prioritary_oai_records

OAI records from custom sources we trust (positive priority)

publications

The OAI records with publication metadata (i.e. journal title and publisher name). These records can potentially be associated with publisher policies.

publications_with_unique_publisher()[source]

Iterable of publications where subsequent publications with the same publisher are removed.
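One way to read this description is sketched below: a publication is dropped when an earlier one already had the same publisher. That reading, the `Publication` tuple and its field names are assumptions made for the example, not the module's actual implementation.

```python
from collections import namedtuple

# Hypothetical stand-in for an OAI record with publication metadata.
Publication = namedtuple("Publication", ["title", "publisher"])


def publications_with_unique_publisher(publications):
    """Yield each publication whose publisher has not been seen before."""
    seen = set()
    for publication in publications:
        if publication.publisher in seen:
            continue
        seen.add(publication.publisher)
        yield publication


pubs = [
    Publication("version 1", "Publisher A"),
    Publication("version 2", "Publisher B"),
    Publication("version 3", "Publisher A"),
]
unique = list(publications_with_unique_publisher(pubs))
```

Writing it as a generator matches the documented return type (an iterable) and preserves the original order of the publications.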

publisher()[source]

Returns the first publisher we can find for this paper, otherwise a DummyPublisher

set_researcher(position, researcher_id)[source]

Sets the researcher_id for the author at the given position (0-indexed)

slug

sorted_oai_records

OAI records sorted by decreasing order of priority (lower priority means poorer overall quality of the source).

unique_prioritary_oai_records

OAI records from sources we trust, with unique source.
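The three record views (sorted, prioritary, unique prioritary) can be sketched together. This is a hedged illustration: the `Record` tuple, its `source` and `priority` fields, and the choice to build each view on top of the previous one are assumptions, and the real attributes are properties rather than functions.

```python
from collections import namedtuple

# Hypothetical stand-in for BareOaiRecord with only the fields used here.
Record = namedtuple("Record", ["source", "priority"])


def sorted_oai_records(records):
    # Highest priority first; sorted() is stable, so ties keep their order.
    return sorted(records, key=lambda r: r.priority, reverse=True)


def prioritary_oai_records(records):
    # Trusted sources are those with a strictly positive priority.
    return [r for r in sorted_oai_records(records) if r.priority > 0]


def unique_prioritary_oai_records(records):
    # Keep only the first record seen for each trusted source.
    seen = set()
    unique = []
    for record in prioritary_oai_records(records):
        if record.source not in seen:
            seen.add(record.source)
            unique.append(record)
    return unique


records = [
    Record("repo-a", 10),
    Record("repo-c", 0),
    Record("repo-b", 5),
    Record("repo-a", 10),
]
ordered = sorted_oai_records(records)
trusted = prioritary_oai_records(records)
unique_trusted = unique_prioritary_oai_records(records)
```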

update_availability(cached_oairecords=[])[source]

Updates the BarePaper’s own pdf_url field based on its sources (BareOaiRecord).

This uses a non-trivial logic, hence it is useful to keep this result cached in the database row.

Parameters: cached_oairecords – the list of OaiRecords if we already have it from somewhere (otherwise it is fetched)

update_visible()[source]

Updates the visibility of the paper. Only papers with at least one source should be visible.

year

Year of publication of the paper.