backend.extractors module

class backend.extractors.BaseExtractor(mappings)

Bases: backend.extractors.RegexExtractor

class backend.extractors.CairnExtractor(mappings)

Bases: backend.extractors.RegexExtractor

class backend.extractors.OpenAireExtractor(mappings)

Bases: backend.extractors.RegexExtractor

class backend.extractors.RegexExtractor(mappings)

Bases: backend.extractors.URLExtractor

class backend.extractors.URLExtractor

Bases: object

extract(header, metadata)

Take a record (header and metadata) and return a dict whose keys are a subset of 'pdf' and 'splash'. The values are the PDF URL and the splash URL, respectively. Either key may be absent from the result.
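
The contract of extract() can be illustrated with a minimal, self-contained sketch. Everything in it beyond the method signature and the 'pdf' / 'splash' keys is an assumption made for illustration: the stand-in URLExtractor body, the hypothetical SimpleRegexExtractor class, the shape of mappings (a list of (key, pattern) pairs), and the 'identifier' field in metadata are not taken from the module itself.

```python
import re


class URLExtractor:
    """Stand-in for backend.extractors.URLExtractor (body is assumed)."""

    def extract(self, header, metadata):
        raise NotImplementedError


class SimpleRegexExtractor(URLExtractor):
    """Hypothetical extractor that classifies URLs with regular expressions."""

    def __init__(self, mappings):
        # Assumption: mappings is a list of (key, pattern) pairs,
        # where key is either 'pdf' or 'splash'.
        self.mappings = [(key, re.compile(pattern)) for key, pattern in mappings]

    def extract(self, header, metadata):
        urls = {}
        # Assumption: candidate URLs are listed under metadata['identifier'].
        for candidate in metadata.get('identifier', []):
            for key, pattern in self.mappings:
                if pattern.search(candidate):
                    # Keep only the first URL found for each key.
                    urls.setdefault(key, candidate)
                    # A candidate is assigned to at most one key.
                    break
        return urls


extractor = SimpleRegexExtractor([
    ('pdf', r'\.pdf$'),
    ('splash', r'^https?://'),
])

result = extractor.extract(
    header={'identifier': 'oai:example.org:1'},
    metadata={'identifier': [
        'https://example.org/record/1',
        'https://example.org/record/1.pdf',
        'doi:10.0000/example',
    ]},
)
print(result)

# A record with no recognizable URL yields an empty dict.
empty = extractor.extract(header={}, metadata={'identifier': ['doi:10.0000/example']})
print(empty)
```

Because patterns are tried in order and the first match wins, the more specific 'pdf' pattern is listed before the broader 'splash' pattern in this sketch. Returning an empty dict when nothing matches is consistent with the description that the result contains only some of the two keys.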