kuha_oai_pmh_repo_handler
Kuha OAI-PMH Repo Handler application.
Serve records from Kuha Document Store through the OAI-PMH protocol.
constants.py
OAI-PMH Repo handler constants.
serve.py
Main entry point for starting OAI-PMH Repo Handler.
- exception kuha_oai_pmh_repo_handler.serve.KuhaEntryPointException[source]
Base for entry point exceptions.
- exception kuha_oai_pmh_repo_handler.serve.InvalidMetadataFormatException[source]
A loaded metadataformat is constructed wrong.
Raised, for instance, when a metadataformat implements different sets than other metadataformats in the same repository.
- exception kuha_oai_pmh_repo_handler.serve.ConflictingMetadataPrefixException[source]
Raised when non-overridable metadataformats implement the same metadataprefix.
- exception kuha_oai_pmh_repo_handler.serve.NoMetadataFormatsException[source]
Unable to load any metadataformats for OAI-PMH Repo Handler
- kuha_oai_pmh_repo_handler.serve.load_metadataformats(entry_point_group)[source]
Load metadataformat using plugin discovery via setuptools entry-points.
The following constraints apply to loaded metadataformats:
Every loaded metadataformat must have a unique mdprefix. Consults overridable (bool) attribute to check if a certain mdprefix can be overridden by another metadataformat. Raises ConflictingMetadataPrefixException, if metadataformats have same mdprefix and are non-overridable.
Every loaded metadataformat must implement the same sets; the sets attribute and the list_sets method must be the same for every loaded metadataformat. Note that overridden metadataformats can have different sets than the ones that are finally loaded, as long as the loaded ones implement the same sets. Raises InvalidMetadataFormatException if loaded metadataformats implement different sets.
Also, it is mandatory to load at least one metadataformat. Raises NoMetadataFormatsException if no metadataformats are loaded.
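The constraints above can be illustrated with a minimal sketch. It is not the handler's implementation: FakeFormat, select_formats and the three local exception classes are hypothetical stand-ins, and only the attribute names mdprefix, overridable and sets come from the description. The sketch also assumes that a later format replaces an overridable one with the same prefix, which the description does not spell out.

```python
class ConflictingPrefix(Exception):
    """Stand-in for ConflictingMetadataPrefixException."""


class InvalidFormat(Exception):
    """Stand-in for InvalidMetadataFormatException."""


class NoFormats(Exception):
    """Stand-in for NoMetadataFormatsException."""


class FakeFormat:
    """Hypothetical metadataformat carrying only the checked attributes."""

    def __init__(self, mdprefix, sets, overridable=False):
        self.mdprefix = mdprefix
        self.sets = sets
        self.overridable = overridable


def select_formats(candidates):
    """Apply the three constraints to an iterable of candidate formats."""
    selected = {}
    for fmt in candidates:
        current = selected.get(fmt.mdprefix)
        if current is None or current.overridable:
            # First format for this prefix, or an overridable one is replaced.
            selected[fmt.mdprefix] = fmt
        elif fmt.overridable:
            # An overridable candidate never displaces a non-overridable one.
            continue
        else:
            raise ConflictingPrefix(fmt.mdprefix)
    if not selected:
        raise NoFormats('no metadataformats loaded')
    loaded = list(selected.values())
    for fmt in loaded[1:]:
        # Only the finally loaded formats need to agree on sets.
        if fmt.sets != loaded[0].sets:
            raise InvalidFormat(fmt.mdprefix)
    return loaded
```

Because the set comparison runs only over the formats that survive prefix resolution, an overridden format with different sets does not trigger an error, which matches the note in the second constraint.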
controller.py
OAI-PMH Repo Handler controller
Connects backend components together, mainly metadataformats, XML templates and the oai.Protocol. Is responsible for catching oai.errors and routing them to the correct template.
- class kuha_oai_pmh_repo_handler.controller.Controller(respond_with_requested_url, mdformats, stylesheet_url=None)[source]
Controls processing of OAI requests
Holds information about the expected behaviour and routes requests to correct handlers using defined metadataformats. Interprets the OAI request and calls the correct metadataformats. Catches oai.errors and routes them to the error handler & template.
- Parameters
respond_with_requested_url (bool) – Configures how to create the OAI response URL. If True, the URL is generated based on the incoming OAI request. Otherwise, the configured base_url is used. See also oai.protocol.Response.
mdformats (list) – Loaded metadataformats
stylesheet_url (str) – Optional XML stylesheet URL that is added to templates. Defaults to an empty string.
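The routing behaviour described for Controller can be pictured with a small sketch. Everything in it is hypothetical (OAIErrorStub, BadVerbStub, HANDLERS, handle); it only shows the general idea of dispatching a verb to an async handler and turning a protocol error into an error-template context, not how Controller is actually written.

```python
import asyncio


class OAIErrorStub(Exception):
    """Hypothetical stand-in for the base OAI error."""
    code = 'error'


class BadVerbStub(OAIErrorStub):
    """Hypothetical stand-in for a bad-verb error."""
    code = 'badVerb'


async def identify():
    # A real handler would build a response context from the request.
    return {'template': 'identify.xml'}


# Verb name to async handler; a single entry keeps the sketch short.
HANDLERS = {'Identify': identify}


async def handle(verb):
    """Dispatch verb to its handler and route OAI errors to an error context."""
    try:
        handler = HANDLERS.get(verb)
        if handler is None:
            raise BadVerbStub('Unsupported verb: %s' % verb)
        return await handler()
    except OAIErrorStub as exc:
        return {'template': 'error.xml', 'code': exc.code, 'msg': str(exc)}
```

Only protocol-level errors are caught here; any other exception propagates to the caller.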
- async static oai_error(ctx, *args, **kwargs)
Handle OAI errors
- Parameters
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request.
- Returns
Response context
- async static identify(ctx, *args, **kwargs)
Handle Identify OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- async static listsets(ctx, *args, **kwargs)
Handle ListSets OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- async static listmetadataformats(ctx, *args, **kwargs)
Handle ListMetadataFormats OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- async static listidentifiers(ctx, *args, **kwargs)
Handle ListIdentifiers OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- async static listrecords(mdformats, oai_protocol)[source]
Handle ListRecords OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- async static getrecord(mdformats, oai_protocol)[source]
Handle GetRecord OAI request
- Parameters
mdformats (Controller._mdfwrapperclass) – Wrapped & loaded metadataformats
oai_protocol (oai.Protocol) – OAI protocol instantiated with the current request
- Returns
Response context
- kuha_oai_pmh_repo_handler.controller.from_settings(settings, mdformats)[source]
Get Controller from settings.
- Parameters
settings (argparse.Namespace) – Loaded settings.
mdformats (list) – Loaded metadataformats.
- Returns
Initialized controller.
- Return type
Controller
genshi_loader.py
Load genshi templates.
Usage:
from genshi_loader import add_template_folders, GenPlate

add_template_folders('/path/to/genshi/templates')

@GenPlate('identify.xml')
async def handler_identify(genplate_instance):
    return {}
- kuha_oai_pmh_repo_handler.genshi_loader.FOLDERS = []
Template folders
- kuha_oai_pmh_repo_handler.genshi_loader.add_template_folders(*folders)[source]
Add folders to the template lookup.
- Parameters
folders (str) – absolute paths to folders containing genshi templates.
- kuha_oai_pmh_repo_handler.genshi_loader.get_template_folders()[source]
Get template folders.
- Returns
template folders.
- Return type
- class kuha_oai_pmh_repo_handler.genshi_loader.KuhaXMLSerializer(**kw)[source]
Subclass XMLSerializer to add a custom filter.
- class kuha_oai_pmh_repo_handler.genshi_loader.GenPlate(template_file, **kw)[source]
Genshi template decorator.
Decorate functions that should write output to genshi-templates. The decorated function must be an asynchronous function and it must return a dictionary.
Example:
from genshi_loader import GenPlate

class Handler:

    @GenPlate('error.xml')
    async def build_error_message(self, genplate_instance):
        ...
        return {'msg': 'there was an error'}
- Parameters
- Raises
ValueError – if the decorated function returns an invalid type.
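The contract described for GenPlate (an async decorated function that returns a dictionary, with ValueError for any other return type) can be sketched without Genshi. The decorator render_with below is a hypothetical stand-in: it returns the template name and context instead of rendering anything, and it does not pass a genplate_instance argument the way the real decorator's examples do.

```python
import asyncio
import functools
import inspect


def render_with(template_file):
    """Hypothetical decorator that enforces an async, dict-returning function."""
    def decorator(func):
        if not inspect.iscoroutinefunction(func):
            raise TypeError('decorated function must be asynchronous')

        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            context = await func(*args, **kwargs)
            if not isinstance(context, dict):
                raise ValueError('decorated function must return a dict')
            # A real implementation would render the template here.
            return template_file, context
        return wrapper
    return decorator


@render_with('error.xml')
async def build_error_message():
    return {'msg': 'there was an error'}


@render_with('error.xml')
async def broken_handler():
    return 'not a dictionary'
```

Awaiting build_error_message() yields the template name together with the context, while awaiting broken_handler() raises ValueError.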
http_api.py
Declare HTTP API
This module is responsible for interpreting the HTTP request parameters and headers. All OAI-PMH specific parameters are interpreted in controller logic. The controller must be linked to HTTP API via Tornado WebApplication initialization (see get_app()) and is used in OAIRouteHandler via self.application.settings['controller'].
- class kuha_oai_pmh_repo_handler.http_api.OAIRouteHandler(*args, **kwargs)[source]
Declares the OAI-PMH HTTP API.
Takes in HTTP verbs GET and POST. Gathers their parameters and dispatches the parameters, correlation_id HTTP headers and a tornado request object to the controller.
The dispatch is done by calling the controller's oai_request async method with parameters: args, correlation_id_header, tornado_request
args is a list of two-tuples containing request parameters that were submitted via GET or POST.
correlation_id_header is a dictionary that can be used as is for further requests using Tornado's HTTP clients: {'X-REQUEST-ID': <corr_id>}.
tornado_request is a tornado request object that wraps the current request. It should be used as a read-only object.
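To make the shape of the dispatched values concrete, here is a small sketch built on the standard library only. gather_args and correlation_header are hypothetical helper names; the real handler gathers arguments from the Tornado request rather than from a raw query string.

```python
from urllib.parse import parse_qsl


def gather_args(query_string):
    """Return request parameters as a list of (name, value) two-tuples."""
    # keep_blank_values retains arguments that were submitted without a value.
    return parse_qsl(query_string, keep_blank_values=True)


def correlation_header(corr_id):
    """Build a header dictionary usable as is for further HTTP requests."""
    return {'X-REQUEST-ID': corr_id}


args = gather_args('verb=ListRecords&metadataPrefix=oai_dc&set=')
repeated = gather_args('verb=Identify&verb=ListSets')
```

A list of two-tuples, unlike a dictionary, keeps repeated arguments visible, so later validation can still detect them.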
- async write_output(iterable)[source]
Write output by iterating the parameter.
- Parameters
iterable – output to be written as HTTP response.
- async get()[source]
HTTP-GET handler
Gathers request arguments. Calls dispatch. Finishes the response.
“URLs for GET requests have keyword arguments appended to the base URL”
—http://www.openarchives.org/OAI/openarchivesprotocol.html#ProtocolFeatures
- async post()[source]
HTTP-POST handler
Validates request content type. Gathers request arguments. Calls dispatch. Finishes the response.
“Keyword arguments are carried in the message body of the HTTP POST. The Content-Type of the request must be application/x-www-form-urlencoded.”
—http://www.openarchives.org/OAI/openarchivesprotocol.html#ProtocolFeatures
- kuha_oai_pmh_repo_handler.http_api.get_app(api_version, controller, app_class=None)[source]
Setup routes and return initialized Tornado web application.
- Parameters
api_version (str) – HTTP Api version gets prepended to routes.
controller – Controller logic for HTTP API.
app_class – Use custom WebApplication class. Defaults to kuha_common.server.WebApplication.
- Returns
Tornado web application.
- Return type
tornado.web.Application
list_records.py
Run list records sequence on-demand against an OAI-PMH Repo Handler.
Helper script runs through the entire list records sequence with a given metadataPrefix and conditions. Can be used to ensure that all records within a repository are good to serve by catching timeouts from Document Store Client and non-serializable Document Store records.
Logs out the time it takes to complete the full sequence. Prints out all identifiers found by the requested conditions.
If any error conditions are encountered, the best place to look for the cause is the Kuha OAI-PMH Repo Handler log output and Kuha Document Store log output.
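The list records sequence the script runs through follows the usual OAI-PMH pattern: request ListRecords, read the resumptionToken, and repeat until the token is empty or missing. The sketch below shows that loop under stated assumptions: harvest_identifiers, fake_fetch and _page are hypothetical names, and the fetch callable stands in for a real HTTP client so the example stays self-contained.

```python
import xml.etree.ElementTree as ET

OAI_NS = '{http://www.openarchives.org/OAI/2.0/}'


def harvest_identifiers(fetch, metadata_prefix):
    """Walk a ListRecords sequence and collect record identifiers.

    fetch takes a dict of OAI arguments and returns the response body
    as a string (hypothetical; swap in a real HTTP client).
    """
    params = {'verb': 'ListRecords', 'metadataPrefix': metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(params))
        error = root.find(OAI_NS + 'error')
        if error is not None:
            raise RuntimeError('%s: %s' % (error.get('code'), error.text))
        for header in root.iter(OAI_NS + 'header'):
            identifiers.append(header.findtext(OAI_NS + 'identifier'))
        token = root.findtext(
            '%sListRecords/%sresumptionToken' % (OAI_NS, OAI_NS))
        if not token:
            # A missing or empty token ends the sequence.
            return identifiers
        # Follow-up requests carry only the verb and the resumption token.
        params = {'verb': 'ListRecords', 'resumptionToken': token}


def _page(identifier, token):
    """Build a tiny fake ListRecords response with one record header."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        '<record><header><identifier>%s</identifier></header></record>'
        '<resumptionToken>%s</resumptionToken>'
        '</ListRecords></OAI-PMH>' % (identifier, token))


def fake_fetch(params):
    """Serve two pages: the first one points to the second via a token."""
    if params.get('resumptionToken') == 'page2':
        return _page('oai:example.org:2', '')
    return _page('oai:example.org:1', 'page2')


found = harvest_identifiers(fake_fetch, 'oai_dc')
```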
metadataformats
Define metadata formats.
Metadataformats create contexts by calling oai_response object and declare templates if needed. Metadataformats raise oai_errors if needed.
- exception kuha_oai_pmh_repo_handler.metadataformats.DuplicateSetSpec[source]
Every OAI set must have a unique spec value
- class kuha_oai_pmh_repo_handler.metadataformats.MDFormat(oai, corr_id_header)[source]
Base class for metadata formats.
Defines common attributes and methods. Subclass to define metadataformats.
- overridable = False
overridable controls how plugin discovery handles metadataformats with same mdprefix. Built-in metadataformats could be overridable, those developed as a plugin should not.
- study_class
- variable_class
- question_class
- class MDSet(mdformat)
Subclass to define OAI-Sets
- classmethod add_cli_args(parser)
Add command line arguments to parser.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)
Configure set using settings.
Consult settings and configure set. Called via MDFormat on server startup. Return False if this set should not be loaded.
- Parameters
settings (argparse.Namespace) – Loaded settings
- Returns
False to bypass loading of this set.
- async fields()
Return list of fields to include when querying for record headers.
This is used when gathering all docstore fields that are needed to construct oai headers.
- Returns
list of fields
- Return type
- async filter(value)
Return a query filter that includes all studies matching ‘value’.
This is used when constructing docstore query that will include all records in this OAI-set group. In other words, in selective harvesting.
- async get(study)
Get values from record used in setspec: ‘<key>:<value>’. A None item ([None]) will leave out the <value> part: ‘<key>’
This is used when constructing setspecs for a specific record.
- Parameters
study – study record to get set values from
- Returns
List of values
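The setspec rule described for get() ('<key>:<value>', with a None item leaving out the value part) can be sketched in one function. build_setspecs is a hypothetical helper name, and the key strings in the usage are illustrative only.

```python
def build_setspecs(key, values):
    """Combine a set key with each value; None yields the bare key."""
    return [key if value is None else '%s:%s' % (key, value)
            for value in values]


language_specs = build_setspecs('language', ['en', 'fi'])
flat_specs = build_setspecs('openaire_data', [None])
no_specs = build_setspecs('openaire_data', [])
```

An empty list of values produces no setspecs at all, which is how a record can be left out of a set.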
- async query(on_set_cb)
Query and add distinct values for setspecs
This is used when constructing ListSets OAI response.
- Parameters
on_set_cb – Async callback with signature (spec, name=None, description=None), where spec is the setSpec value.
- Returns
None
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure_sets(settings)[source]
Configure & load sets using settings.
Calls configure() of each MDSet class stored in class variable 'sets'. The configure() will be called with the 'settings' parameter. If configure() returns False, the set will not be loaded, but will be discarded instead. Otherwise, the configured set will be stored in a module-level variable and used to serve OAI requests.
- Parameters
settings (argparse.Namespace) – Loaded settings
- Raises
DuplicateSetSpec – if two configured sets have a duplicate value in the 'spec' class-level variable.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- classmethod get_set(setspec)[source]
Get set matching ‘setspec’ value.
- Parameters
setspec (str) – Set to lookup.
- Returns
Found set, which is a subclass of MDSet
- Raises
exc.NoSuchSet – if a set is not found.
- async get_earliest_datestamp()[source]
Get earliest datestamp as python datetime object.
- Returns
earliest datestamp for this metadataformat.
- Return type
- async list_sets()[source]
Outputs all sets from all records in the whole repository.
If overridden, this should be overridden in all subclasses. It should also have the same behaviour in all subclasses:
async def _list_sets(self):
    ...

class MyMetadataFormat(MDFormat):
    list_sets = _list_sets
- async list_identifiers()[source]
Query record identifiers from backend.
Queries records and raises NoRecordsMatch oai error if the request is selective and no records were found.
- async list_metadata_formats()[source]
Adds information regarding this metadataformat to response.
If the request contains an identifier, first makes sure the record exists in backend, then adds the metadataformat information to response.
- async get_record()[source]
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records()[source]
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
- class kuha_oai_pmh_repo_handler.metadataformats.DCMetadataFormat(oai, corr_id_header)[source]
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- async get_record(*args, **kwargs)
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records(*args, **kwargs)
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
- class kuha_oai_pmh_repo_handler.metadataformats.EAD3MetadataFormat(oai, corr_id_header)[source]
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- async get_record(*args, **kwargs)
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records(*args, **kwargs)
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
- class kuha_oai_pmh_repo_handler.metadataformats.DDICMetadataFormat(oai, corr_id_header)[source]
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- async get_record(*args, **kwargs)
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records(*args, **kwargs)
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
- class kuha_oai_pmh_repo_handler.metadataformats.OAIDDI25MetadataFormat(oai, corr_id_header)[source]
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- async get_record(*args, **kwargs)
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records(*args, **kwargs)
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
- class kuha_oai_pmh_repo_handler.metadataformats.OAIDataciteMetadataFormat(oai, corr_id_header)[source]
Metadataformat for OpenAIRE DataCite
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
Adds required command line arguments regarding metadataformats & sets.
This should be called on program startup along with other command line argument definitions if the program is allowing configuration of metadataformats & sets.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure metadataformats & sets using settings.
- Parameters
settings (argparse.Namespace) – Loaded settings
- async classmethod get_preferred_identifier(study)[source]
OpenAIRE datacite requires a certain type of ID.
Identifier type must be one of the following, listed in lookup order:
DOI
ARK
Handle
PURL
URN
URL
- Parameters
study (kuha_common.document_store.records.Study) – Currently serialized study.
- Returns
(<str:type>, <str:id>)
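The lookup order listed for get_preferred_identifier can be sketched as a priority search. The helper pick_preferred and its input shape (an iterable of (type, id) two-tuples) are assumptions made for illustration; the real method reads identifiers from a Study record, and what it returns when nothing matches is not stated in this reference.

```python
# Identifier types in the documented lookup order.
LOOKUP_ORDER = ('DOI', 'ARK', 'Handle', 'PURL', 'URN', 'URL')


def pick_preferred(identifiers):
    """Return the first (type, id) pair found when scanning LOOKUP_ORDER.

    Returns None when no identifier has an accepted type (an assumption
    of this sketch, not documented behaviour).
    """
    by_type = {}
    for id_type, id_value in identifiers:
        # Keep the first identifier seen for each type.
        by_type.setdefault(id_type, id_value)
    for id_type in LOOKUP_ORDER:
        if id_type in by_type:
            return id_type, by_type[id_type]
    return None


preferred = pick_preferred([
    ('URL', 'https://example.org/study/1'),
    ('DOI', '10.1234/example'),
])
```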
OpenAIRE Datacite requires a certain type of relatedIdentifier.
- Parameters
study (kuha_common.document_store.records.Study) – Currently serialized study.
- Returns
List of two-tuples containing type & id [(<str:type>, <str:id>)]
- async static get_publisher_lang_value_pair(study)[source]
Get publisher language & value pair as tuple.
- Parameters
study (kuha_common.document_store.records.Study) – Currently serialized study.
- Returns
(<str:language>, <str:publisher>)
- async static get_funders(study)[source]
Get OpenAIRE Datacite funders.
OpenAIRE Datacite requires a certain nameIdentifier for Contributor. The syntax is described at https://guidelines.openaire.eu/en/latest/data/field_contributor.html#nameidentifier-ma-o
This method keeps only the study.grant_number values that conform to the syntax.
- Parameters
study (kuha_common.document_store.records.Study) – Currently serialized study.
- Returns
list of three-tuples [(<str:language>, <str:nameidentifier>, <str:agency>)]
- async get_record(*args, **kwargs)
Adds record to response.
This is an abstract method that must be implemented in subclass. Note that the correct templates must also be defined in the subclass via decoration.
The implementation must query the backend for the requested record, raise OAI errors if needed and return the correct oai.response.context.
- Raises
- async list_records(*args, **kwargs)
Adds records to response.
This is an abstract method that must be implemented in subclass. The subclass must also define the correct template via decoration.
The implementation must query the backend for the requested records, raise OAI errors when needed and return the correct oai.response.context.
- Raises
metadataformats/const.py
metadataformats/exc.py
metadataformats/_mdsets.py
- class kuha_oai_pmh_repo_handler.metadataformats._mdsets.MDSet(mdformat)[source]
Subclass to define OAI-Sets
- classmethod add_cli_args(parser)[source]
Add command line arguments to parser.
- Parameters
parser (configargparse.ArgumentParser) – Active command line parser.
- classmethod configure(settings)[source]
Configure set using settings.
Consult settings and configure set. Called via MDFormat on server startup. Return False if this set should not be loaded.
- Parameters
settings (argparse.Namespace) – Loaded settings
- Returns
False to bypass loading of this set.
- async fields()[source]
Return list of fields to include when querying for record headers.
This is used when gathering all docstore fields that are needed to construct oai headers.
- Returns
list of fields
- Return type
- async query(on_set_cb)[source]
Query and add distinct values for setspecs
This is used when constructing ListSets OAI response.
- Parameters
on_set_cb – Async callback with signature (spec, name=None, description=None), where spec is the setSpec value.
- Returns
None
- async get(study)[source]
Get values from record used in setspec: ‘<key>:<value>’. A None item ([None]) will leave out the <value> part: ‘<key>’
This is used when constructing setspecs for a specific record.
- Parameters
study – study record to get set values from
- Returns
List of values
- class kuha_oai_pmh_repo_handler.metadataformats._mdsets.LanguageSet(mdformat)[source]
OAI-Set Language
- async fields()[source]
Return a list of fields to include in query for header fields.
These fields are used to build record headers for language OAI-set.
- Returns
list of fields
- async query(on_set_cb)[source]
Query and add distinct values for languages.
- Parameters
on_set_cb – Async callback called for each setspec
- Returns
None
- class kuha_oai_pmh_repo_handler.metadataformats._mdsets.StudyGroupsSet(mdformat)[source]
OAI-Set Study groups
- async fields()[source]
Return a list of fields to include in query for header fields.
- Returns
list of fields
- Return type
- async query(on_set_cb)[source]
Query and add distinct values for study groups.
- Parameters
on_set_cb – Async callable called for each setspec
- Returns
None
- class kuha_oai_pmh_repo_handler.metadataformats._mdsets.DataKindSet(mdformat)[source]
OAI-Set Data kind
- async fields()[source]
Return list of fields to include when querying for header fields.
- Returns
list of fields
- Return type
- async query(on_set_cb)[source]
Query and add distinct values for Data kinds.
- Parameters
on_set_cb – Async callable called for each setspec.
- Returns
None
- class kuha_oai_pmh_repo_handler.metadataformats._mdsets.OpenAIREDataSet(mdformat)[source]
OAI-Set OpenAIRE data
- async fields()[source]
Return a list of fields to include in query for header fields
- Returns
list of fields
- Return type
- async query(on_set_cb)[source]
Query and add distinct values for OpenAIRE data.
The openaire_data setspec is non-hierarchical and contains only a single setspec: '<setSpec>openaire_data</setSpec>'
- Parameters
on_set_cb – Async callable, called for each setspec.
- Returns
None
- async get(study)[source]
Get values from study used to construct setspecs for openaire_data.
The OpenAIRE set does not have a hierarchy of values. Instead, a study belongs to the set if the study has a suitable identifier.
- Parameters
study – Document store study
- Returns
[None] if study belongs to the set, [] if not.
- Return type
oai
Defines OAI-PMH protocol.
Provides classes for handling requests and responses supported by the protocol.
oai/errors.py
Errors for OAI-protocol
- exception kuha_oai_pmh_repo_handler.oai.errors.OAIError(msg=None, context=None)[source]
Base for OAI errors
- exception kuha_oai_pmh_repo_handler.oai.errors.MissingVerb(msg=None, context=None)[source]
OAIError for missing verb
- exception kuha_oai_pmh_repo_handler.oai.errors.BadVerb(msg=None, context=None)[source]
OAIError for bad verb
- exception kuha_oai_pmh_repo_handler.oai.errors.NoMetadataFormats(msg=None, context=None)[source]
OAIError for no metadata formats
- exception kuha_oai_pmh_repo_handler.oai.errors.IdDoesNotExist(msg=None, context=None)[source]
OAIError for no such id
- exception kuha_oai_pmh_repo_handler.oai.errors.BadArgument(msg=None, context=None)[source]
OAIError for bad argument
- exception kuha_oai_pmh_repo_handler.oai.errors.CannotDisseminateFormat(msg=None, context=None)[source]
OAIError for cannot disseminate format
- exception kuha_oai_pmh_repo_handler.oai.errors.NoRecordsMatch(msg=None, context=None)[source]
OAIError for no records match
oai/constants.py
OAI constants
- kuha_oai_pmh_repo_handler.oai.constants.REGEX_OAI_IDENTIFIER = "oai:[a-zA-Z][a-zA-Z0-9\\-]*(\\.[a-zA-Z][a-zA-Z0-9\\-]*)+:[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+"
Regex to validate oai-identifier. http://www.openarchives.org/OAI/2.0/guidelines-oai-identifier.htm
- kuha_oai_pmh_repo_handler.oai.constants.REGEX_SETSPEC = "([A-Za-z0-9\\-_\\.!~\\*'\\(\\)])+(:[A-Za-z0-9\\-_\\.!~\\*'\\(\\)]+)*"
Sets not complying with this regular expression are invalid according to OAI-PMH schema: see: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd
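The two patterns above can be tried out directly with the re module. The sketch copies both regular expressions from this page (with the doubled backslashes of the repr collapsed into raw strings) and uses re.fullmatch; whether the handler itself uses fullmatch, match or search is an assumption, and matches is a hypothetical helper.

```python
import re

# Copied from the constants above, written as raw strings.
REGEX_OAI_IDENTIFIER = (
    r"oai:[a-zA-Z][a-zA-Z0-9\-]*(\.[a-zA-Z][a-zA-Z0-9\-]*)+"
    r":[a-zA-Z0-9\-_\.!~\*'\(\);/\?:@&=\+$,%]+")
REGEX_SETSPEC = r"([A-Za-z0-9\-_\.!~\*'\(\)])+(:[A-Za-z0-9\-_\.!~\*'\(\)]+)*"


def matches(pattern, candidate):
    """Return True when the whole candidate string matches the pattern."""
    return re.fullmatch(pattern, candidate) is not None


valid_identifier = matches(REGEX_OAI_IDENTIFIER, 'oai:example.org:study-1')
# The namespace part needs at least one dot-separated label after the first.
dotless_namespace = matches(REGEX_OAI_IDENTIFIER, 'oai:localhost:1')
valid_setspec = matches(REGEX_SETSPEC, 'language:en')
# Whitespace is not among the allowed setSpec characters.
spaced_setspec = matches(REGEX_SETSPEC, 'bad spec')
```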
oai/protocol.py
Defines the protocol
- kuha_oai_pmh_repo_handler.oai.protocol.REGEX_VALID_SETSPEC = re.compile("([A-Za-z0-9\\-_\\.!~\\*'\\(\\)])+(:[A-Za-z0-9\\-_\\.!~\\*'\\(\\)]+)*")
Validation regex for setspec
- kuha_oai_pmh_repo_handler.oai.protocol.is_valid_setspec(candidate)[source]
Validates setSpec value.
- kuha_oai_pmh_repo_handler.oai.protocol.as_supported_datetime(datetime_str, raise_oai_exc=True)[source]
Convert string representation of datetime to datetime.
- Note
If the datetime_str does not come from HTTP-Request, set raise_oai_exc to False.
- Note
The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ.
- Returns
converted datetime.
- Raises
kuha_oai_pmh_repo_handler.oai.errors.BadArgument – for invalid format if raise_oai_exc is True.
- kuha_oai_pmh_repo_handler.oai.protocol.as_supported_datestring(datetime_obj, fmt='%Y-%m-%dT%H:%M:%SZ')[source]
Convert datetime to string representation. The target format is YYYY-MM-DDThh:mm:ssZ
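The two conversions described above (string to datetime in either legitimate format, and datetime back to a string) can be sketched with the standard library. parse_oai_datetime and format_oai_datetime are hypothetical names, and the sketch raises a plain ValueError where the documented function raises BadArgument.

```python
from datetime import datetime

# The two legitimate formats: YYYY-MM-DDThh:mm:ssZ and YYYY-MM-DD.
SUPPORTED_FORMATS = ('%Y-%m-%dT%H:%M:%SZ', '%Y-%m-%d')


def parse_oai_datetime(value):
    """Convert a datestamp string to a naive datetime object."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError('unsupported datestamp: %r' % (value,))


def format_oai_datetime(value, fmt='%Y-%m-%dT%H:%M:%SZ'):
    """Convert a datetime object to its string representation."""
    return value.strftime(fmt)


day = parse_oai_datetime('2020-01-31')
moment = parse_oai_datetime('2020-01-31T12:30:00Z')
text = format_oai_datetime(moment)
```

Parsing a day-granularity datestamp yields midnight of that day, so formatting it back produces the second-granularity form.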
- class kuha_oai_pmh_repo_handler.oai.protocol.Response(request_url=None)[source]
Represents the response.
The response is stored in a dictionary which then gets submitted to XML-templates. Thus it is required that the dictionary built within this class is supported by the templates.
- Parameters
request_url (str) – Requested url.
- classmethod set_repository_name(name)[source]
Set repository name.
- Parameters
name (str) – repository name.
- classmethod set_admin_email(email)[source]
Set admin email address.
- Parameters
email (list) – Admin email(s)
- classmethod set_protocol_version(version)[source]
Set protocol version
- Parameters
version (float) – OAI-PMH protocol version.
- async identify_response(earliest_datestamp=None, deleted_records='no', granularity='YYYY-MM-DDThh:mm:ssZ')[source]
Prepare and return context for OAI verb Identify.
- Parameters
earliest_datestamp (datetime.datetime) – Repository earliest datestamp. None if docstore contains no records.
- Returns
Response context
- async add_available_metadata_format(prefix, schema, namespace)[source]
Set supported metadata format.
- set_error(oai_error)[source]
Set OAI-PMH error.
- Note
These are the errors that are defined in the OAI-protocol. Programming errors are handled separately in higher levels.
- Parameters
oai_error (Subclass of kuha_oai_pmh_repo_handler.oai.errors.OAIError) – OAI error.
- class kuha_oai_pmh_repo_handler.oai.protocol.ResumptionToken(cursor=0, from_=None, until=None, complete_list_size=None, metadata_prefix=None, set_=None, from_req=False)[source]
Class representing OAI-PMH Resumption Token.
Holds attributes of the resumption token. Creates a new resumption token with initial values or takes a dictionary of resumption token arguments. Validates the token based on records list size. If the list size has changed between requests, the token is treated as invalid and a kuha_oai_pmh_repo_handler.oai.errors.BadResumptionToken exception is raised.
- Note
Since OAIArgument.set_ is not supported by the resumption token, changing the requested set may result in a falsely valid resumption token. But changing the requested set in the middle of a list request sequence should be seen as bad behaviour by the requester/harvester.
- Parameters
cursor (int) – Optional parameter for the current position in list.
from_ (str) – Optional parameter for from datestamp. Converted to datetime.datetime on init.
until (str) – Optional parameter for until datestamp. Converted to datetime.datetime on init.
complete_list_size (int) – Optional parameter for the number of records in the complete list.
metadata_prefix (str) – Optional parameter for the requested metadata prefix.
set_ (str) – Optional parameter containing requested set information.
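The validation idea described for ResumptionToken (a token is rejected when the complete list size has changed between requests) can be pictured with a reduced sketch. Token and BadToken are hypothetical; the sketch keeps only cursor and complete_list_size and says nothing about how the real class encodes or decodes its token string.

```python
from dataclasses import dataclass
from typing import Optional


class BadToken(Exception):
    """Stand-in for the BadResumptionToken OAI error."""


@dataclass
class Token:
    """Reduced resumption token: list position and expected list size."""
    cursor: int = 0
    complete_list_size: Optional[int] = None

    def validate(self, current_list_size):
        """Raise BadToken if the list size changed since the token was issued."""
        if (self.complete_list_size is not None
                and self.complete_list_size != current_list_size):
            raise BadToken('list size changed during the request sequence')

    def next_page(self, page_size, current_list_size):
        """Return the token for the next page, or None when the list is done."""
        self.validate(current_list_size)
        next_cursor = self.cursor + page_size
        if next_cursor >= current_list_size:
            return None
        return Token(next_cursor, current_list_size)


first = Token()
second = first.next_page(page_size=2, current_list_size=5)
```

A freshly created token has no recorded list size, so its first validation always passes.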
- class Attribute(key, value)
Store ResumptionToken attribute keys and values.
- key
Alias for field number 0
- value
Alias for field number 1
- classmethod load_arg(argument)[source]
Create new resumption token from request arguments.
Use to load resumption token from OAI request.
- Parameters
argument (str) – Resumption token argument. This comes from HTTP-request.
- Returns
New ResumptionToken
- class kuha_oai_pmh_repo_handler.oai.protocol.Arguments(verb, resumption_token=None, identifier=None, metadata_prefix=None, set_=None, from_=None, until=None)[source]
Arguments of OAI-protocol.
Store arguments. Convert datestamps string to datetime objects. Validate arguments for each verb.
- Parameters
verb (str) – requested OAI verb.
resumption_token (str) – requested resumption token.
identifier (str) – requested identifier.
metadata_prefix (str) – requested metadata prefix.
set_ (str) – requested set.
from_ (str) – requested datestamp for from attribute.
until (str) – requested datestamp for until attribute.
- Raises
kuha_oai_pmh_repo_handler.oai.errors.OAIError – for OAI errors.
- supported_verbs = ['Identify', 'ListSets', 'ListMetadataFormats', 'ListIdentifiers', 'ListRecords', 'GetRecord']
Define supported verbs
- resumable_verbs = ['ListSets', 'ListIdentifiers', 'ListRecords']
Define resumption token verbs
- is_verb_resumable()[source]
Is the requested verb a resumable list request?
- Returns
True if the verb is resumable, False otherwise.
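The supported_verbs and resumable_verbs lists lend themselves to a short sketch of verb checking. check_verb and the two local exception classes are hypothetical; the real class validates all arguments per verb, which this sketch does not attempt.

```python
# Values copied from the class attributes documented above.
SUPPORTED_VERBS = ['Identify', 'ListSets', 'ListMetadataFormats',
                   'ListIdentifiers', 'ListRecords', 'GetRecord']
RESUMABLE_VERBS = ['ListSets', 'ListIdentifiers', 'ListRecords']


class MissingVerbError(Exception):
    """Stand-in for the MissingVerb OAI error."""


class BadVerbError(Exception):
    """Stand-in for the BadVerb OAI error."""


def check_verb(verb):
    """Validate verb and report whether it is a resumable list request."""
    if not verb:
        raise MissingVerbError('verb argument is required')
    if verb not in SUPPORTED_VERBS:
        # The comparison is case-sensitive, so 'listrecords' is rejected.
        raise BadVerbError(verb)
    return verb in RESUMABLE_VERBS


records_resumable = check_verb('ListRecords')
identify_resumable = check_verb('Identify')
```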
- get_local_identifier()[source]
Get requested local identifier.
Local identifier does not have prefixes for oai and namespace. It is used to identify records locally.
- Returns
Local identifier if applicable for the request.
- Raises
kuha_oai_pmh_repo_handler.oai.errors.IdDoesNotExist – for invalid identifier.
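Stripping the oai and namespace prefixes, as described for get_local_identifier, can be sketched as follows. local_identifier is a hypothetical function that takes the namespace as an explicit argument and raises ValueError where the documented method raises IdDoesNotExist; how the real method learns the namespace is not covered here.

```python
def local_identifier(oai_identifier, namespace):
    """Return the part of an OAI identifier after 'oai:<namespace>:'."""
    prefix = 'oai:%s:' % namespace
    if not oai_identifier.startswith(prefix):
        raise ValueError('identifier does not belong to %s' % namespace)
    local = oai_identifier[len(prefix):]
    if not local:
        raise ValueError('identifier has no local part')
    return local


study_id = local_identifier('oai:example.org:study:1', 'example.org')
```

Slicing by prefix length, rather than splitting on colons, keeps any colons inside the local part intact.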