kuha_oai_pmh_repo_handler

Kuha OAI-PMH Repo Handler application.

Serve records from Kuha Document Store through the OAI-PMH protocol.

serve.py

Main entry point for starting OAI-PMH Repo Handler.

kuha_oai_pmh_repo_handler.serve.get_app(api_version, app_settings=None)[source]

Set up routes and return an initialized Tornado web application.

Parameters:
  • api_version (str) – HTTP API version that gets prepended to routes.
  • app_settings (dict or None) – Settings to store to application.
Returns:

Tornado web application.

Return type:

tornado.web.Application

kuha_oai_pmh_repo_handler.serve.main()[source]

Application main function.

Parse the command line for settings. Set up and serve the web app. Exit on exceptions propagated to this level.

Returns:exit code, 1 on error, 0 on success.
Return type:int

configure.py

Configure OAI-PMH Repo Handler

kuha_oai_pmh_repo_handler.configure.configure(settings=None)[source]

Get settings and configure application.

Declares application-specific configuration options and some common options declared in kuha_common.cli_setup.

Configure application with arguments specified in configuration file, environment variables and command line arguments.

Note:Calling this function multiple times will not initiate new settings to be parsed, but will return previously parsed settings instead.
Parameters:settings – Optional settings to use for configuration. If settings are already loaded this parameter will be silently ignored.
Returns:settings
Return type:argparse.Namespace

genshi_loader.py

Load genshi templates.

Configure:

from genshi_loader import add_template_folder, set_template_writer
add_template_folder(settings.oai_pmh_template_folder)
set_template_writer(handler.template_writer)

Use as decorator:

from genshi_loader import OAITemplate

class Handler:
    ...
    @OAITemplate('error.xml')
    async def build_error_message(self):
        return {'msg': 'there was an error'}
kuha_oai_pmh_repo_handler.genshi_loader.FOLDERS = []

Template folders. There can be multiple.

kuha_oai_pmh_repo_handler.genshi_loader.add_template_folder(folder)[source]

Add folder to lookup for templates.

Parameters:folder (str) – absolute path to folder containing genshi templates.
kuha_oai_pmh_repo_handler.genshi_loader.get_template_folder()[source]

Get template folder.

Returns:template folders.
Return type:list
kuha_oai_pmh_repo_handler.genshi_loader.WRITER = []

Template writer. Function which accepts an iterator as parameter.

kuha_oai_pmh_repo_handler.genshi_loader.set_template_writer(writer)[source]

Set template writer.

Note:Supports only one template writer.
Parameters:writer – Function that writes the template. Must accept an iterator as a parameter.
kuha_oai_pmh_repo_handler.genshi_loader.get_template_writer()[source]

Get template writer.

Returns:template writer
Return type:function
class kuha_oai_pmh_repo_handler.genshi_loader.OAITemplate(template_file)[source]

OAITemplate class.

Decorate functions that should write output to genshi-templates. The decorated function must be an asynchronous function and it must return a dictionary.

Example:

from genshi_loader import OAITemplate
class Handler:
    @OAITemplate('error.xml')
    async def build_error_message(self):
        ...
        return {'msg': 'there was an error'}
Parameters:
  • template_file (str) – filename of the template to use.
  • template_folder (str) – optional parameter to use a different template folder to lookup for given template_file.
Raises:

ValueError if decorated function returns invalid type.
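
The contract described above (an asynchronous function that returns a dictionary, with ValueError raised otherwise) can be illustrated with a small stand-in decorator. This is a sketch only, not the library's implementation: the DictTemplate class is hypothetical, and it renders with str.format instead of loading a Genshi template and passing the result to the configured template writer.

```python
import asyncio
import functools


class DictTemplate:
    """Hypothetical stand-in for OAITemplate; not the library's code."""

    def __init__(self, template_text):
        # The real decorator takes a template filename. A format string
        # keeps this sketch free of Genshi.
        self.template_text = template_text

    def __call__(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            context = await func(*args, **kwargs)
            if not isinstance(context, dict):
                raise ValueError("decorated function must return a dict")
            return self.template_text.format(**context)
        return wrapper


class Handler:
    @DictTemplate("<error>{msg}</error>")
    async def build_error_message(self):
        return {"msg": "there was an error"}

    @DictTemplate("<error>{msg}</error>")
    async def build_invalid(self):
        # Returning a non-dict triggers ValueError in the wrapper.
        return "not a dict"


rendered = asyncio.run(Handler().build_error_message())
```

Awaiting build_invalid() in this sketch raises ValueError, mirroring the documented behaviour for invalid return types.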

handlers.py

Define handlers for responding to HTTP-requests.

class kuha_oai_pmh_repo_handler.handlers.OAIRouteHandler(*args, **kwargs)[source]

Handle requests to OAI endpoint.

OAIRouteHandler extends kuha_common.server.RequestHandler.

Input and output go through this class. It is responsible for accepting requests via HTTP and routing the requests to OAI-protocol and to the correct verb-handler. Verb-handlers are defined in this class.

Verb-handlers are responsible for calling the kuha_common.query.QueryController and again routing the records to OAI-protocol.

Verb-handlers also define the templates used to serialize XML, which is then sent as HTTP-response via template_writer().

The oai protocol is defined in kuha_oai_pmh_repo_handler.oai.protocol.

prepare()[source]

Prepare each response.

Initialize response. Load query controller. Set output content type.

template_writer(generator)[source]

Writes the output from genshi template.

Parameters:generator (generator) – generator object containing the XML-serialization.
get()[source]

HTTP-GET handler

Gathers request arguments. Calls router. Finishes the response.

“URLs for GET requests have keyword arguments appended to the base URL”

http://www.openarchives.org/OAI/openarchivesprotocol.html#ProtocolFeatures

post()[source]

HTTP-POST handler

Validates request content type. Gathers request arguments. Calls router. Finishes the response.

“Keyword arguments are carried in the message body of the HTTP POST. The Content-Type of the request must be application/x-www-form-urlencoded.”

http://www.openarchives.org/OAI/openarchivesprotocol.html#ProtocolFeatures
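
As a client-side illustration of the two request styles quoted above, the following sketch builds the same OAI-PMH request once as a GET URL and once as a POST body. The base URL is a made-up placeholder; the actual route depends on the deployment and on the api_version passed to get_app().

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.org/v0/oai"

arguments = {
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",
    "from": "2020-01-01",
}

# GET: keyword arguments are appended to the base URL as a query string.
get_url = BASE_URL + "?" + urlencode(arguments)

# POST: the same arguments are carried in the message body, and the
# request must declare this content type.
post_headers = {"Content-Type": "application/x-www-form-urlencoded"}
post_body = urlencode(arguments).encode("ascii")
```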

list_records.py

Run list records sequence on-demand against an OAI-PMH Repo Handler.

Helper script runs through the entire list records sequence with a given metadataPrefix and conditions. Can be used to ensure that all records within a repository are good to serve by catching timeouts from Document Store Client and non-serializable Document Store records.

Logs out the time it takes to complete the full sequence. Prints out all identifiers found by the requested conditions.

If any error conditions are encountered, the best place to look for the cause is the Kuha OAI-PMH Repo Handler log output and Kuha Document Store log output.

exception kuha_oai_pmh_repo_handler.list_records.InvalidOAIResponse[source]

The response was not expected.

Raised when:

  • HTTP response code is invalid
  • Result cannot be parsed as XML
  • OAI response has an <error> element
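
The three conditions listed above could be checked roughly as follows. This is a sketch under assumptions, not the script's code: InvalidResponse is a hypothetical stand-in for InvalidOAIResponse, treating every status other than 200 as invalid is a guess, and the sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


class InvalidResponse(Exception):
    """Hypothetical stand-in for InvalidOAIResponse."""


def parse_list_records(status_code, body):
    """Return (identifiers, resumption_token) or raise InvalidResponse."""
    if status_code != 200:  # assumption: only 200 counts as valid
        raise InvalidResponse("unexpected HTTP status %s" % status_code)
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        raise InvalidResponse("response is not XML") from exc
    error = root.find("oai:error", OAI_NS)
    if error is not None:
        raise InvalidResponse("OAI error: %s" % error.get("code"))
    identifiers = [
        el.text
        for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=OAI_NS)
    # An empty or missing resumptionToken ends the sequence.
    return identifiers, token or None


SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:study_1</identifier></header></record>
    <resumptionToken>abc</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_list_records(200, SAMPLE)
```

A harvesting loop would repeat the request with the returned token until the token is None.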
kuha_oai_pmh_repo_handler.list_records.main()[source]

Command line interface entry point.

Gather configuration. Setup application. Run sequence and report encountered identifiers.

Returns:0 on success
Return type:int

oai

Defines OAI-PMH protocol.

Provides classes for handling requests and responses supported by the protocol. Builds records from kuha_common.document_store.records.

oai/errors.py

Errors for OAI-protocol

exception kuha_oai_pmh_repo_handler.oai.errors.OAIError(msg=None, context=None)[source]

Base for OAI errors

get_code()[source]

Get OAI error code

get_msg()[source]

Get OAI error message

get_context()[source]

Get error context

get_contextual_message()[source]

Get error message with possible context.

Returns:message with context.
Return type:str
exception kuha_oai_pmh_repo_handler.oai.errors.MissingVerb(msg=None, context=None)[source]

OAIError for missing verb

exception kuha_oai_pmh_repo_handler.oai.errors.BadVerb(msg=None, context=None)[source]

OAIError for bad verb

exception kuha_oai_pmh_repo_handler.oai.errors.NoMetadataFormats(msg=None, context=None)[source]

OAIError for no metadata formats

exception kuha_oai_pmh_repo_handler.oai.errors.IdDoesNotExist(msg=None, context=None)[source]

OAIError for no such id

exception kuha_oai_pmh_repo_handler.oai.errors.BadArgument(msg=None, context=None)[source]

OAIError for bad argument

exception kuha_oai_pmh_repo_handler.oai.errors.CannotDisseminateFormat(msg=None, context=None)[source]

OAIError for cannot disseminate format

exception kuha_oai_pmh_repo_handler.oai.errors.NoRecordsMatch(msg=None, context=None)[source]

OAIError for no records match

exception kuha_oai_pmh_repo_handler.oai.errors.BadResumptionToken(msg=None, context=None)[source]

OAIError for bad resumption token

oai/constants.py

OAI constants

kuha_oai_pmh_repo_handler.oai.constants.REGEX_OAI_IDENTIFIER = "oai:[a-zA-Z][a-zA-Z0-9\\-]*(\\.[a-zA-Z][a-zA-Z0-9\\-]*)+:[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+"

Regex to validate oai-identifier. http://www.openarchives.org/OAI/2.0/guidelines-oai-identifier.htm
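
To show what the pattern accepts, the sketch below compiles the string exactly as documented and applies it to a few invented identifiers. Using fullmatch is a choice made for this example; how the package itself applies the pattern is not stated here.

```python
import re

# Pattern copied from REGEX_OAI_IDENTIFIER above.
REGEX_OAI_IDENTIFIER = (
    "oai:[a-zA-Z][a-zA-Z0-9\\-]*(\\.[a-zA-Z][a-zA-Z0-9\\-]*)+"
    ":[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+"
)
PATTERN = re.compile(REGEX_OAI_IDENTIFIER)


def is_oai_identifier(candidate):
    # fullmatch rejects identifiers with trailing characters outside
    # the allowed set, such as whitespace.
    return PATTERN.fullmatch(candidate) is not None
```

The namespace part needs at least one dot, so "oai:example.org:study_1" passes while "oai:localhost:study_1" does not.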

kuha_oai_pmh_repo_handler.oai.constants.REGEX_SETSPEC = "([A-Za-z0-9\\-_\\.!~\\*'\\(\\)])+(:[A-Za-z0-9\\-_\\.!~\\*'\\(\\)]+)*"

Sets not complying with this regular expression are invalid according to OAI-PMH schema: see: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd

oai/metadata_formats.py

Define supported metadata formats.

class kuha_oai_pmh_repo_handler.oai.metadata_formats.MetadataFormatBase[source]

Base class for metadata formats.

Defines common attributes and methods.

Note:This class must be subclassed and the class attributes overridden.
prefix = None

Prefix for metadata format. Override in subclass.

schema = None

Schema URL for metadata format. Override in subclass.

namespace = None

Namespace for metadata format. Override in subclass.

record_fields = None

Record fields. Override in subclass

relative_records = []

Relative records. Override in subclass. Set empty list if no relative records.

get_prefix()[source]

Get metadata prefix.

Returns:metadata prefix.
Return type:str
get_schema()[source]

Get metadata schema URL.

Returns:URL to metadata schema.
Return type:str
get_namespace()[source]

Get metadata namespace.

Returns:Metadata namespace.
Return type:str
get_relative_records()[source]

Get document store records required by this schema.

These fields are required to represent the record in this metadata schema.

Returns:list of relative records.
Return type:list
get_record_fields(record=<class 'kuha_common.document_store.records.Study'>)[source]

Get fields for querying Document Store.

These fields are required to represent the record in this metadata schema.

Parameters:record (kuha_common.document_store.records.Study or kuha_common.document_store.records.Variable or kuha_common.document_store.records.Question or kuha_common.document_store.records.StudyGroup) – Get fields for this Document Store record. Defaults to kuha_common.document_store.records.Study
Returns:document store record fields
Return type:list
Raises:KeyError if record is not defined in record_fields
as_dict()[source]

Return metadata attributes in dictionary representation.

Returns:metadata attributes.
Return type:dict
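
The subclass-and-override pattern described for this base class can be sketched with a miniature stand-in. FormatBase below is hypothetical and much smaller than MetadataFormatBase, and the key names returned by as_dict() are chosen for this example only; the real keys may differ.

```python
class FormatBase:
    """Hypothetical miniature of MetadataFormatBase."""

    prefix = None
    schema = None
    namespace = None

    def get_prefix(self):
        return self.prefix

    def as_dict(self):
        # Key names are this sketch's choice, not the library's.
        return {
            "prefix": self.prefix,
            "schema": self.schema,
            "namespace": self.namespace,
        }


class DublinCore(FormatBase):
    # A subclass only overrides the class attributes.
    prefix = "oai_dc"
    schema = "http://www.openarchives.org/OAI/2.0/oai_dc.xsd"
    namespace = "http://www.openarchives.org/OAI/2.0/oai_dc/"


dc = DublinCore().as_dict()
```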
class kuha_oai_pmh_repo_handler.oai.metadata_formats.DCMetadataFormat[source]

Metadata format for OAI-DC.

prefix = 'oai_dc'

Metadata prefix for OAI-DC

schema = 'http://www.openarchives.org/OAI/2.0/oai_dc.xsd'

Metadata schema url for OAI-DC

namespace = 'http://www.openarchives.org/OAI/2.0/oai_dc/'

Namespace for OAI-DC

class kuha_oai_pmh_repo_handler.oai.metadata_formats.DDIMetadataFormat[source]

Metadata format for DDI-C.

prefix = 'ddi_c'

Metadata prefix for DDI-C

schema = 'http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd'

Metadata schema url for DDI-C

namespace = 'ddi:codebook:2_5'

Namespace for DDI-C

class kuha_oai_pmh_repo_handler.oai.metadata_formats.CDCDDI25MetadataFormat[source]

Metadata format for Cessda Data Catalogue DDI 2.5

prefix = 'oai_ddi25'

Metadata prefix for CESSDA Data Catalogue

schema = 'http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd'

Metadata schema url for DDI-C

namespace = 'ddi:codebook:2_5'

Namespace for DDI-C

class kuha_oai_pmh_repo_handler.oai.metadata_formats.EAD3MetadataFormat[source]

Metadata format for EAD3

oai/protocol.py

Defines the protocol

kuha_oai_pmh_repo_handler.oai.protocol.as_supported_datetime(datetime_str, raise_oai_exc=True)[source]

Convert string representation of datetime to datetime.

Note:

If the datetime_str does not come from HTTP-Request, set raise_oai_exc to False.

Note:

The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ.

Parameters:
  • datetime_str (str) – datetime to convert
  • raise_oai_exc (bool) – Catch datetime.strptime errors and reraise as oai-error.
Returns:

converted datetime.

Return type:

datetime

Raises:

kuha_oai_pmh_repo_handler.oai.errors.BadArgument for invalid format if raise_oai_exc is True.

kuha_oai_pmh_repo_handler.oai.protocol.as_supported_datestring(datetime_obj)[source]

Convert datetime to string representation.

The target format is YYYY-MM-DDThh:mm:ssZ

Parameters:datetime_obj (datetime) – datetime to convert.
Returns:string representation of datetime_obj.
Return type:str
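
The two conversions above can be approximated with datetime.strptime and strftime. The sketch below is an assumption about how such helpers might look, not the package's code: BadArgumentSketch is a hypothetical stand-in for BadArgument, and raising a plain ValueError when raise_oai_exc is False is a guess.

```python
from datetime import datetime

# The two legitimate input formats named above.
FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%SZ")


class BadArgumentSketch(Exception):
    """Hypothetical stand-in for oai.errors.BadArgument."""


def parse_datestamp(datetime_str, raise_oai_exc=True):
    for fmt in FORMATS:
        try:
            return datetime.strptime(datetime_str, fmt)
        except ValueError:
            continue  # try the next supported format
    if raise_oai_exc:
        raise BadArgumentSketch("invalid datestamp: %r" % datetime_str)
    raise ValueError("invalid datestamp: %r" % datetime_str)


def format_datestamp(datetime_obj):
    # Target format: YYYY-MM-DDThh:mm:ssZ
    return datetime_obj.strftime("%Y-%m-%dT%H:%M:%SZ")
```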
kuha_oai_pmh_repo_handler.oai.protocol.encode_uri(string)[source]

Encode uri string.

Replace special characters in string using urllib.parse.quote(). Return resulting string.

Parameters:string (str) – value to encode.
Returns:encoded value
Return type:str
kuha_oai_pmh_repo_handler.oai.protocol.decode_uri(uri)[source]

Decode uri string.

Replace uri encoded special characters in string using urllib.parse.unquote(). Return resulting string.

Parameters:uri (str) – value to decode.
Returns:decoded value
Return type:str
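
A round trip through urllib.parse.quote() and urllib.parse.unquote() looks like this. The input string is invented for the example, and calling both functions with their default arguments is an assumption about how encode_uri() and decode_uri() use them.

```python
from urllib.parse import quote, unquote

# Made-up value containing characters that need escaping.
raw = "cursor=100&from=2020-01-01T00:00:00Z"

encoded = quote(raw)       # special characters become %XX escapes
decoded = unquote(encoded)  # and are restored on decoding
```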
kuha_oai_pmh_repo_handler.oai.protocol.min_increment_step(datetime_str)[source]

Count smallest increment step from datetime string.

Parameters:datetime_str (str) – string representation of a datetime. Datetime must be represented either by day’s precision or by second’s precision.
Returns:smallest increment step.
Return type:datetime.timedelta
Raises:ValueError if string length is invalid.
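
One plausible reading of this function is that the string length selects the precision: ten characters for YYYY-MM-DD, twenty for YYYY-MM-DDThh:mm:ssZ. The sketch below follows that reading. It is an assumption rather than the package's code, and smallest_step is a hypothetical name.

```python
from datetime import datetime, timedelta


def smallest_step(datetime_str):
    """Return the smallest increment for a day- or second-precision string."""
    if len(datetime_str) == len("YYYY-MM-DD"):
        return timedelta(days=1)
    if len(datetime_str) == len("YYYY-MM-DDThh:mm:ssZ"):
        return timedelta(seconds=1)
    raise ValueError("unsupported datestamp length: %d" % len(datetime_str))


# Adding the step to an inclusive "until" value gives an exclusive
# upper bound that is convenient for range queries.
until = datetime(2020, 1, 31)
exclusive_until = until + smallest_step("2020-01-31")
```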
class kuha_oai_pmh_repo_handler.oai.protocol.ResumptionToken(cursor=0, from_=None, until=None, complete_list_size=None, metadata_prefix=None, set_=None)[source]

Class representing OAI-PMH Resumption Token.

Holds attributes of the resumption token. Creates a new resumption token with initial values or takes a dictionary of resumption token arguments. Validates the token based on records list size. If the list size has been changed between requests asserts that the token is invalid by raising a kuha_oai_pmh_repo_handler.oai.errors.BadResumptionToken exception.

Note:

Since OAIArgument.set_ is not supported by resumption token, changing the requested set may result in falsely valid resumption token. But changing the requested set in the middle of a list request sequence should be seen as bad behaviour by the requester/harvester.

Parameters:
  • cursor (int) – Optional parameter for the current position in list.
  • from (str) – Optional parameter for from datestamp. Converted to datetime.datetime on init.
  • until (str) – Optional parameter for until datestamp. Converted to datetime.datetime on init.
  • complete_list_size (int) – Optional parameter for the number of records in the complete list.
  • metadata_prefix (str) – Optional parameter for the requested metadata prefix.
  • set – Optional parameter containing requested set information.
class Attribute(key, value)

Store ResumptionToken attribute keys and values.

key

Alias for field number 0

value

Alias for field number 1

response_list_size = 100

Configurable value for the size of the list response.

classmethod set_response_list_size(size)[source]

Configure response list size.

Parameters:size (int) – Number of records in list response.
classmethod load_resumption_token_argument(argument)[source]

Create new resumption token from arguments.

Use to load resumption token from OAI request.

Parameters:argument (str) – Resumption token argument. This comes from HTTP-request.
Returns:New ResumptionToken
set_complete_list_size(size)[source]

Set the number of records in the complete query response.

Note:Resumption token is invalid if the number of records for the complete query response has been changed between requests.
Parameters:size (int) – Number of records for the complete query response.
Raises:kuha_oai_pmh_repo_handler.oai.errors.BadResumptionToken if list sizes don’t match.
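
The list-size check can be pictured with a reduced token class. TokenSketch, BadToken and next_cursor() are hypothetical names introduced for this illustration; the real ResumptionToken carries more state and its internals are not shown in this document.

```python
class BadToken(Exception):
    """Hypothetical stand-in for oai.errors.BadResumptionToken."""


class TokenSketch:
    response_list_size = 100

    def __init__(self, cursor=0, complete_list_size=None):
        self.cursor = cursor
        self.complete_list_size = complete_list_size

    def set_complete_list_size(self, size):
        # A size carried over from an earlier request must still match.
        if self.complete_list_size is not None and self.complete_list_size != size:
            raise BadToken("list size changed between requests")
        self.complete_list_size = size

    def next_cursor(self):
        """Return the cursor for the next request, or None when done."""
        following = self.cursor + self.response_list_size
        return following if following < self.complete_list_size else None


first = TokenSketch()
first.set_complete_list_size(250)
```

In this sketch a fresh token accepts any size, while a token loaded from a request rejects a size that differs from the one it was issued with.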
get_encoded()[source]

Get encoded Resumption Token.

Returns uri-encoded representation of the resumption token if the list request sequence is ongoing. If the list request sequence is over, returns None.

Returns:uri-encoded representation of the token, or None
Return type:str or None
class kuha_oai_pmh_repo_handler.oai.protocol.OAIResponse(request_url=None)[source]

Represents the response.

The response is stored in a dictionary which then gets submitted to XML-templates. Thus it is required that the dictionary built within this class is supported by the templates.

Parameters:request_url (str or None) – Optional requested url. Leave empty to use base url.
classmethod set_repository_name(name)[source]

Set repository name.

Parameters:name (str) – repository name.
classmethod set_base_url(url)[source]

Set base url

Parameters:url (str) – url.
classmethod set_admin_email(email)[source]

Set admin email address.

Parameters:email (list) – Admin email(s)
classmethod set_protocol_version(version)[source]

Set protocol version

Parameters:version (float) – OAI-PMH protocol version.
add_record(record)[source]

Add record to response

Parameters:record (kuha_oai_pmh_repo_handler.oai.records.OAIRecord) – OAIRecord to add.
has_records()[source]

Return True if response has records.

Return type:bool
assert_single_record()[source]

Assert the response has a single record.

Raises:AssertionError if there is more or less than a single record.
set_earliest_datestamp(datestamp)[source]

Set earliest datestamp.

Parameters:datestamp (str) – datestamp in finest granularity ISO8601
set_deleted_records_declaration(declaration)[source]

Set deleted records declaration.

Parameters:declaration (str) – declare support for deleted records
set_granularity(granularity)[source]

Set datestamp granularity.

Parameters:granularity (str) – datestamp format for finest granularity supported by this repository.
set_metadata_formats(metadata_formats)[source]

Set supported metadata formats.

Parameters:metadata_formats (list) – supported metadata formats
set_resumption_token(token)[source]

Set resumption token.

Parameters:token (ResumptionToken) – resumption token.
set_error(oai_error)[source]

Set OAI-PMH error.

Note:These are the errors that are defined in the OAI-protocol. Programming errors are handled separately in higher levels.
Parameters:oai_error (Subclass of kuha_oai_pmh_repo_handler.oai.errors.OAIError) – OAI error.
add_sets_element(spec, name)[source]

Add new sets element.

Parameters:
  • spec (str) – setSpec subelement value.
  • name – setName subelement value.
extend_sets_element(sets_list)[source]

Add multiple sets elements

Note:Parameter may come directly from kuha_oai_pmh_repo_handler.oai.records.Sets.get_sets_list_from_records()
Parameters:sets_list (list) – list of sets-elements.
set_request_params(oai_request)[source]

Gather response parameters from request.

Note:These are common response parameters that can be added to each response.
Parameters:oai_request (OAIRequest) – Current OAI-request.
get_response()[source]

Get dictionary representation of the response.

The response attributes are gathered in a dictionary that is to be parsed in the templates.

Note:The dictionary will contain python objects, so it is not serializable to JSON or arbitrary formats as is.
Returns:Response ready to pass to templates.
Return type:dict
class kuha_oai_pmh_repo_handler.oai.protocol.OAIArguments(verb, resumption_token=None, identifier=None, metadata_prefix=None, set_=None, from_=None, until=None)[source]

Arguments of OAI-protocol.

Store arguments. Convert datestamps string to datetime objects. Validate arguments for each verb.

Parameters:
  • verb (str) – requested OAI verb.
  • resumption_token (str) – requested resumption token.
  • identifier (str) – requested identifier.
  • metadata_prefix (str) – requested metadata prefix.
  • set (str) – requested set.
  • from (str) – requested datestamp for from attribute.
  • until (str) – requested datestamp for until attribute.
Raises:

kuha_oai_pmh_repo_handler.oai.errors.OAIError for OAI errors.

supported_verbs = ['Identify', 'ListSets', 'ListMetadataFormats', 'ListIdentifiers', 'ListRecords', 'GetRecord']

Define supported verbs

resumable_verbs = ['ListSets', 'ListIdentifiers', 'ListRecords']

Define resumption token verbs

supported_metadata_formats = [<class 'kuha_oai_pmh_repo_handler.oai.metadata_formats.DCMetadataFormat'>, <class 'kuha_oai_pmh_repo_handler.oai.metadata_formats.DDIMetadataFormat'>, <class 'kuha_oai_pmh_repo_handler.oai.metadata_formats.CDCDDI25MetadataFormat'>, <class 'kuha_oai_pmh_repo_handler.oai.metadata_formats.EAD3MetadataFormat'>]

Define supported metadata formats

is_verb_resumable()[source]

Is the requested verb a resumable list request?

Returns:True if verb is resumable, False otherwise.
Return type:bool
get_verb()[source]

Get requested OAI-verb.

Returns:requested OAI-verb.
Return type:str
get_resumption_token()[source]

Get resumption token for request.

The resumption token is either submitted in the request or created automatically.

Returns:resumption token.
Return type:ResumptionToken
get_cursor()[source]

Get resumptionToken cursor

Returns:cursor value
Return type:str or None
get_from()[source]

Get from argument.

Returns:from argument
Return type:str or None
get_until()[source]

Get until argument.

Returns:until argument.
Return type:str or None
get_query_param_until()[source]

Get until datestamp for querying.

Note:This is until + smallest increment step.
Returns:datestamp of query_param_until attribute.
Return type:datetime.datetime
get_identifier()[source]

Get requested identifier.

Returns:requested identifier if any.
Return type:str or None
get_local_identifier()[source]

Get requested local identifier.

Local identifier does not have prefixes for oai and namespace. It is used to identify records locally.

Returns:Local identifier if applicable for the request.
Return type:str
Raises:kuha_oai_pmh_repo_handler.oai.errors.IdDoesNotExist for invalid identifier.
get_metadata_format()[source]

Get requested metadata format.

This is one of the supported metadata formats defined in OAIArguments.supported_metadata_formats

Returns:requested metadata format if any.
Return type:Subclass of kuha_oai_pmh_repo_handler.oai.metadata_formats.MetadataFormatBase or None
get_set()[source]

Get requested set.

Returns:requested set.
Return type:str
is_selective()[source]

Return True if request is selective.

Selective refers to selective harvesting supported by OAI-PMH.

Returns:True if selective, False if not.
Return type:bool
has_set()[source]

Return True if the request contained set.

Return type:bool
iterate_supported_metadata_formats()[source]

Generator for iterating through supported metadata formats.

Returns:Generator object for iterating supported metadata formats.
class kuha_oai_pmh_repo_handler.oai.protocol.OAIRequest(args)[source]

Represents the OAI request.

Subclass of OAIArguments. Defines keys for OAI arguments.

request_attrs = None

Request attributes untouched.

oai/records.py

Define OAI records.

Note:This module has a strict dependency on kuha_common.document_store.records

Contains information for querying records from document store and appending them to responses with OAIHeaders, OAIRecord and SETS.

kuha_oai_pmh_repo_handler.oai.records.SetAttribute

Attribute to store set configuration

alias of Set

kuha_oai_pmh_repo_handler.oai.records.SET_STUDY_GROUP = Set(setname='Study group', setspec='study_groups', record_field_setname=<kuha_common.document_store.field_types.FieldAttribute object>, record_field_setspec='study_group', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldTypeFactory object>)

Configuration for study_group set

kuha_oai_pmh_repo_handler.oai.records.SET_LANGUAGE = Set(setname='Language', setspec='language', record_field_setname=None, record_field_setspec='language', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldAttribute object>)

Configuration for language set

kuha_oai_pmh_repo_handler.oai.records.SET_DATAKIND = Set(setname='Kind of data', setspec='data_kind', record_field_setname=None, record_field_setspec='data_kind', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldAttribute object>)

Configuration for datakind set

kuha_oai_pmh_repo_handler.oai.records.SETS = [Set(setname='Study group', setspec='study_groups', record_field_setname=<kuha_common.document_store.field_types.FieldAttribute object>, record_field_setspec='study_group', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldTypeFactory object>), Set(setname='Language', setspec='language', record_field_setname=None, record_field_setspec='language', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldAttribute object>), Set(setname='Kind of data', setspec='data_kind', record_field_setname=None, record_field_setspec='data_kind', record_query_field=<kuha_common.document_store.field_types.FieldAttribute object>, set_values_query_field=<kuha_common.document_store.field_types.FieldAttribute object>)]

Supported sets

kuha_oai_pmh_repo_handler.oai.records.REGEX_VALID_SETSPEC = re.compile("([A-Za-z0-9\\-_\\.!~\\*'\\(\\)])+(:[A-Za-z0-9\\-_\\.!~\\*'\\(\\)]+)*")

Validation regex for setspec

kuha_oai_pmh_repo_handler.oai.records.is_valid_setspec(candidate)[source]

Validates setSpec value.

Parameters:candidate (str) – setSpec value to validate.
Returns:True if valid, False if not.
Return type:bool
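
A validator along these lines could be built on the compiled pattern shown above. The helper name looks_like_setspec is hypothetical, and the use of fullmatch is an assumption made so that the whole candidate has to conform.

```python
import re

# Pattern copied from REGEX_VALID_SETSPEC above.
REGEX_VALID_SETSPEC = re.compile(
    "([A-Za-z0-9\\-_\\.!~\\*'\\(\\)])+(:[A-Za-z0-9\\-_\\.!~\\*'\\(\\)]+)*"
)


def looks_like_setspec(candidate):
    # Colon-separated segments, each with at least one allowed character.
    return REGEX_VALID_SETSPEC.fullmatch(candidate) is not None
```

With this sketch, "language:en" and "study_groups" are accepted, while "language:" and "bad set" are rejected.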
kuha_oai_pmh_repo_handler.oai.records.get_record_query_field_by_setspec(setspec)[source]

Get document store field to query for set value.

Parameters:setspec (str) – setSpec field of the requested set.
Returns:document store field or None
Return type:kuha_common.document_store.field_types.FieldAttribute or None
kuha_oai_pmh_repo_handler.oai.records.get_set_specs_from_ds_record(ds_record)[source]

Get set specs from document store record.

Parameters:ds_record (Record object from kuha_common.document_store.records) – One of the document store records. Currently only Study is supported.
Returns:set specs for use in oai-headers.
Return type:dict
kuha_oai_pmh_repo_handler.oai.records.get_sets_list_from_query_result(set_, query_result)[source]

Get sets list from query results.

Query is built on the basis of set attributes defined in this class. It is a distinct type of query, so the returned object is not a document store record. This function accepts the results and builds a sets list with each cell containing setName and setSpec keys with their values.

Parameters:
  • set (SetAttribute) – set-attribute used for the query.
  • query_result (dict) – results from the query.
Returns:

list of sets to be used in list sets response.

Return type:

list

kuha_oai_pmh_repo_handler.oai.records.get_query_filter_for_set(set_request)[source]

Get filter to use for querying document store.

Returns a dictionary to use for querying document store and filtering by requested set. Returns None if requested set does not exist or is unsupported.

Parameters:set_request (str) – requested set
Returns:Query filter or None
Return type:dict or None
class kuha_oai_pmh_repo_handler.oai.records.OAIHeaders(identifier, datestamp, **set_specs)[source]

Represents OAI-PMH record headers.

Store information of a single record’s headers and document store fields to include in query. Provides methods to validate OAI-Identifiers and to iterate set specs list.

Parameters:
  • identifier (str) – local identifier of a record.
  • datestamp (str) – last modified/updated datestamp.
  • **set_specs – key-value pairs of set specs for the record.
namespace_identifier = None

Namespace identifier used to construct an OAI-Identifier. Use None to use local identifiers in OAI-responses.

identifier_oai_prefix = 'oai'

Prefix for all identifiers when constructing an OAI-Identifier.

valid_oai_identifier = re.compile("oai:[a-zA-Z][a-zA-Z0-9\\-]*(\\.[a-zA-Z][a-zA-Z0-9\\-]*)+:[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+")

Validation regex for OAI-Identifier

valid_identifier = re.compile("[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+")

Validation regex for local identifier (a subset of oai-identifier)

classmethod from_ds_record(ds_record)[source]

Return OAIHeaders constructed from document store record.

Note:Currently supports only Study
Parameters:ds_record (Record object defined in kuha_common.document_store.records) – Document Store record.
Returns:headers constructed from Document Store record.
Return type:OAIHeaders
classmethod set_namespace_identifier(ns_id)[source]

Set namespace identifier for all instances.

Note:this will be validated afterwards in set_identifier()
Parameters:ns_id (str) – namespace identifier
classmethod as_local_id(identifier)[source]

Get local identifier part of OAI-Identifier.

Parameters:identifier (str) – records identifier.
Returns:local identifier or None for invalid identifier.
Return type:str or None
static get_header_fields()[source]

Get header fields to query.

These are the fields required to construct the OAI-HEADER in templates. Check that each OAI-SET field is found here.

Note:currently supports only Study.
Returns:list of fields to contain in query.
Return type:list
set_identifier(identifier)[source]

Set identifier.

If namespace_identifier is not None, will build an OAI-Identifier. The identifier will be validated and ValueError will be raised if the validation fails.

Parameters:identifier (str) – Record’s local identifier.
Raises:ValueError if validation fails.
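
The relation between local identifiers and OAI-Identifiers can be sketched as follows. The function names are hypothetical and the logic is an approximation of what set_identifier() and as_local_id() are documented to do; only the local-identifier pattern is copied from valid_identifier above.

```python
import re

# Pattern copied from valid_identifier above.
VALID_LOCAL_ID = re.compile("[a-zA-Z0-9\\-_\\.!~\\*'\\(\\);/\\?:@&=\\+$,%]+")


def build_identifier(local_id, namespace_identifier=None, prefix="oai"):
    """Validate local_id and prepend 'oai:<namespace>:' when configured."""
    if VALID_LOCAL_ID.fullmatch(local_id) is None:
        raise ValueError("invalid local identifier: %r" % local_id)
    if namespace_identifier is None:
        return local_id  # local identifiers are used as-is
    return "%s:%s:%s" % (prefix, namespace_identifier, local_id)


def local_part(identifier, namespace_identifier, prefix="oai"):
    """Return the local part of an OAI-Identifier, or None if it does not fit."""
    head = "%s:%s:" % (prefix, namespace_identifier)
    if identifier.startswith(head) and len(identifier) > len(head):
        return identifier[len(head):]
    return None
```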
get_identifier()[source]

Get identifier

Returns:record’s identifier.
Return type:str
get_datestamp()[source]

Get record's datestamp

Returns:record’s datestamp
Return type:str
iterate_set_specs()[source]

Iterate over setSpec key-value pairs.

Returns:Generator object for iterating over setSpec key-value pairs.
Return type:Generator
class kuha_oai_pmh_repo_handler.oai.records.OAIRecord(study)[source]

Class stores record and headers.

Parameters:study (kuha_common.document_store.records.Study) – Document Store study record.
add_variable(variable)[source]

Add variable to OAIRecord.

Parameters:variable (kuha_common.document_store.records.Variable) – Document Store variable.
add_question(question)[source]

Add question to OAIRecord.

Question lookup is done by variable name. Therefore it makes sense to use a dictionary with variable_name as key. The key content will be a list, since a variable may refer to multiple questions.

Note:questions without variable_name will be discarded and a warning will be logged.
Parameters:question (kuha_common.document_store.records.Question) – Document Store question.
get_questions_by_variable(variable)[source]

Get questions for OAIRecord by variable.

Lookup questions by variable’s variable_name.

Parameters:variable (kuha_common.document_store.records.Variable) – Document Store variable.
Returns:List of kuha_common.document_store.records.Question
Return type:list
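
The dictionary-of-lists lookup described for add_question() and get_questions_by_variable() can be sketched like this. QuestionIndex and the Question tuple are hypothetical stand-ins, not the package's classes, and the sample questions are invented.

```python
import logging
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a Document Store question.
Question = namedtuple("Question", ["variable_name", "text"])


class QuestionIndex:
    def __init__(self):
        # variable_name -> list of questions referring to that variable
        self._by_variable = defaultdict(list)

    def add_question(self, question):
        if not question.variable_name:
            logging.warning("discarding question without variable_name")
            return
        self._by_variable[question.variable_name].append(question)

    def get_questions_by_variable(self, variable_name):
        # .get avoids creating empty entries for unknown variables.
        return list(self._by_variable.get(variable_name, []))


index = QuestionIndex()
index.add_question(Question("age", "How old are you?"))
index.add_question(Question("age", "What is your year of birth?"))
index.add_question(Question(None, "Question without a variable"))
```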
iter_relpubls()[source]

Iterates related publications by distinct description and lang.

Generator yields two-tuples (‘lang_desc’, ‘relpubls’): ‘lang_desc’ is a two-tuple with first item being the related publication description and the second item being the language of the relpubl element. ‘relpubls’ is a list containing all bibliographic citation contents of the related publication.

Returns:generator that yields tuples (lang_desc, relpubls)
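
The grouping described above can be illustrated with plain dictionaries. The field names "description", "lang" and "citation" and the sample entries are made up for this sketch; the real method reads these values from the study record.

```python
def iter_by_description_and_lang(relpubls):
    """Yield ((description, lang), [citations]) for each distinct pair."""
    grouped = {}
    for relpubl in relpubls:
        lang_desc = (relpubl["description"], relpubl["lang"])
        grouped.setdefault(lang_desc, []).append(relpubl["citation"])
    # Dictionaries keep insertion order, so groups appear in first-seen order.
    yield from grouped.items()


sample = [
    {"description": "Report", "lang": "en", "citation": "Doe 2019"},
    {"description": "Report", "lang": "en", "citation": "Roe 2020"},
    {"description": "Raportti", "lang": "fi", "citation": "Doe 2019"},
]
result = list(iter_by_description_and_lang(sample))
```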