backend.maintenance module

This module contains functions that are not normally needed to run the platform, but that can be useful during development or to clean up various things in the database.

These functions are not integrated into the rest of the platform, so running them involves starting a Django shell ("python manage.py shell"), importing this module, and calling the function manually.
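
As a minimal sketch of that workflow, the helper below imports backend.maintenance and calls one of its functions by name. The helper name call_maintenance is hypothetical and not part of the module; typing "from backend import maintenance" at the shell prompt and calling the function directly achieves the same thing.

```python
import importlib


def call_maintenance(name, *args, **kwargs):
    """Import backend.maintenance and call one of its functions by name."""
    # The import happens at call time, so Django only has to be configured
    # when the helper is used, as it is inside `python manage.py shell`.
    module = importlib.import_module("backend.maintenance")
    return getattr(module, name)(*args, **kwargs)


# Typed at the Django shell prompt, for example:
#   call_maintenance("find_collisions")
```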

backend.maintenance.cleanup_abstracts()

Runs HTML sanitizing on the abstracts. This is normally done when a paper is created, but old dumps of the database may contain unsanitized abstracts.
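
To illustrate what HTML sanitizing means here, the following sketch keeps a small whitelist of inline tags and drops everything else, including all attributes. The whitelist, the class, and the function name are assumptions made for illustration; the sanitizer actually used by the platform is not described in this page and may behave differently.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: a few inline tags that are harmless in an abstract.
ALLOWED_TAGS = {"b", "i", "em", "strong", "sub", "sup"}


class _Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are always dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so that it cannot reintroduce markup.
        self.parts.append(escape(data, quote=False))


def sanitize_html(text):
    """Return text with only whitelisted tags kept and no attributes."""
    parser = _Sanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)
```

Note that this sketch removes disallowed tags but keeps the text they enclose.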

backend.maintenance.cleanup_names(dry_run=False)

Deletes all the names that are not linked to any researcher.
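
The dry_run parameter is not documented here. A common convention, assumed in the sketch below, is that a dry run only reports what would be deleted without changing anything. The function and its in-memory data are hypothetical stand-ins, not the module's implementation.

```python
def delete_unlinked_names(names, linked_name_ids, dry_run=False):
    """Remove names that no researcher refers to and return their ids.

    names maps a name id to a name; linked_name_ids is the set of name ids
    referenced by at least one researcher. Both are illustrative stand-ins
    for database tables.
    """
    orphans = sorted(nid for nid in names if nid not in linked_name_ids)
    if not dry_run:
        for nid in orphans:
            del names[nid]
    return orphans
```

Running once with dry_run=True and inspecting the returned ids before running again with dry_run=False is the usual way to use such a flag.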

backend.maintenance.cleanup_paper_researcher_ids()

Ensures that every researcher_id stored in Papers links to an actual researcher.

backend.maintenance.cleanup_researchers()

Deletes all the researchers who have not authored any paper.

backend.maintenance.cleanup_titles()

Runs HTML sanitizing on the titles of all papers. This is normally done when a paper is created, but old dumps of the database may contain unsanitized titles.

backend.maintenance.create_publisher_aliases(erase_existing=True)

backend.maintenance.enumerate_large_qs(queryset, key=u'pk', batch_size=256)

Enumerates a large queryset (millions of rows) efficiently.
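
One common way to do this, sketched below as an assumption rather than as the module's actual implementation, is keyset pagination: fetch one batch ordered by the key, remember the last key seen, then ask only for rows with a larger key. This avoids the growing cost of large offsets. The in-memory list of dicts and the function name are hypothetical stand-ins for a queryset.

```python
def enumerate_in_batches(rows, key="pk", batch_size=256):
    """Yield every row in ascending key order, batch_size rows at a time."""
    last_key = None
    while True:
        # Stand-in for a query such as:
        #   queryset.filter(pk__gt=last_key).order_by("pk")[:batch_size]
        batch = sorted(
            (r for r in rows if last_key is None or r[key] > last_key),
            key=lambda r: r[key],
        )[:batch_size]
        if not batch:
            return
        yield from batch
        last_key = batch[-1][key]
```

The approach relies on the key being unique and orderable, which holds for a primary key.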

backend.maintenance.find_collisions()

Recomputes all the fingerprints and reports those that would be merged by recompute_fingerprints().
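
The idea of a collision can be illustrated with a toy fingerprint. In the sketch below the fingerprint is derived from the title only, which is an assumption made for illustration; the platform's real fingerprint is not described in this page. Two papers collide when their fingerprints are equal.

```python
import re
from collections import defaultdict


def toy_fingerprint(title):
    """Hypothetical fingerprint: lowercase the title and collapse every run
    of non-alphanumeric characters into a single hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def find_toy_collisions(papers):
    """Group paper ids by fingerprint and keep groups with several papers.

    papers maps a paper id to its title.
    """
    groups = defaultdict(list)
    for paper_id, title in papers.items():
        groups[toy_fingerprint(title)].append(paper_id)
    return {fp: ids for fp, ids in groups.items() if len(ids) > 1}
```

Reporting the groups first, without merging anything, makes it possible to review them before a merging step is run.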

backend.maintenance.merge_names(fro, to)

Merges the name object ‘fro’ into ‘to’.

backend.maintenance.recompute_fingerprints()

Recomputes the fingerprints of all papers, merging those that end up with the same fingerprint.

backend.maintenance.recompute_publisher_policies()

Recomputes the publisher policy according to possibly updated criteria.

backend.maintenance.refetch_containers()

Tries to assign containers to OaiRecords without containers.

backend.maintenance.refetch_publishers()

Tries to assign publishers to OaiRecords without Journals.

backend.maintenance.update_availability()

backend.maintenance.update_index_for_model(model, batch_size=256, batches_per_commit=10, firstpk=0)

Updates the search index more efficiently for large models such as Paper.

Parameters:
  • batch_size – the number of instances to retrieve for each query
  • batches_per_commit – the number of batches after which we should commit to the search engine
  • firstpk – the primary key of the instance to start with.
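
How the three parameters interact can be sketched as follows. This is a hypothetical illustration, not the module's code: FakeIndex stands in for a search backend that buffers updates until a commit, and firstpk is assumed to be inclusive.

```python
class FakeIndex:
    """Hypothetical search backend that buffers updates until commit()."""

    def __init__(self):
        self.buffered = []
        self.committed = []
        self.commits = 0

    def update(self, batch):
        self.buffered.extend(batch)

    def commit(self):
        self.committed.extend(self.buffered)
        self.buffered = []
        self.commits += 1


def reindex_in_batches(instances, index, batch_size=256,
                       batches_per_commit=10, firstpk=0):
    """Send instances to the index batch by batch; return the batch count."""
    # Assumption: instances with pk >= firstpk are processed in pk order.
    pending = sorted((i for i in instances if i["pk"] >= firstpk),
                     key=lambda i: i["pk"])
    batches_done = 0
    for start in range(0, len(pending), batch_size):
        index.update(pending[start:start + batch_size])
        batches_done += 1
        if batches_done % batches_per_commit == 0:
            index.commit()
    if batches_done % batches_per_commit != 0:
        index.commit()  # flush the final, partial group of batches
    return batches_done
```

Committing once per group of batches rather than once per batch trades some memory for fewer round trips to the search engine, and a firstpk argument lets an interrupted run resume where it stopped.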

backend.maintenance.update_paper_statuses()

Should only be run if something went wrong; the backend is supposed to update these fields by itself.