Kuha OAI-PMH Repo Handler
Kuha OAI-PMH Repo Handler is an HTTP API written in Python for serving Kuha Document Store records through OAI-PMH.
Kuha OAI-PMH Repo Handler is part of the open source software bundle Kuha2.
Features
OAI-PMH features:
Selective harvesting with Sets & Datestamps.
List request sequence with ResumptionTokens.
OAI-Identifiers.
Deleted records.
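As an illustration of selective harvesting, the following minimal sketch builds a ListRecords request URL restricted by set and datestamp range. The argument names (verb, metadataPrefix, set, from, until) come from the OAI-PMH protocol. The helper build_list_records_url, the base URL and the set name are hypothetical and only serve the example.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix, set_spec=None,
                           from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords URL for selective harvesting.

    Hypothetical helper for illustration; it is not part of Kuha.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting arguments defined by OAI-PMH.
    if set_spec is not None:
        params["set"] = set_spec
    if from_date is not None:
        params["from"] = from_date
    if until_date is not None:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


# Assumed base URL and set name; replace with values from your repository.
url = build_list_records_url(
    "http://example.org/oai", "oai_dc",
    set_spec="example_set", from_date="2020-01-01", until_date="2020-12-31")
print(url)
```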
Supported metadata standards:
DDI-C 2.5
EAD3
OAI-DC
DataCite
Dependencies & requirements
Python 3.8 or newer
The software is continuously tested against supported Python versions.
Python packages
The following can be obtained from the Python Package Index.
tornado (License: Apache License 2.0)
Genshi (License: BSD)
Kuha Common is a library used with Kuha2 software. It can be obtained from https://gitlab.tuni.fi/fsd/kuha_common
kuha_common (License: EUPL)
License
Kuha OAI-PMH Repo Handler is available under the EUPL. See LICENSE.txt for the full license.
Configuration
The application can be configured with a configuration file, via command line arguments or by environment variables. If a configuration option is specified in more than one place, then command line values override environment variables which override configuration file values which override defaults.
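The precedence described above can be sketched as a lookup over the configuration sources in priority order. This is a minimal illustration and not Kuha's actual implementation: the function resolve_option and the dictionary-based sources are hypothetical, while the port option, its default 6003 and the variable KUHA_OPRH_PORT are taken from the option list in this document.

```python
def resolve_option(name, env_name, cli_args, environ, config_file, default):
    """Return the effective value of one option.

    Sources are checked in priority order:
    command line > environment variable > configuration file > default.
    Hypothetical helper; all sources are plain dictionaries here.
    """
    for source, key in ((cli_args, name),
                        (environ, env_name),
                        (config_file, name)):
        if key in source:
            return source[key]
    return default


# Example: the port is set in the environment and in the configuration file,
# but not on the command line, so the environment value is used.
port = resolve_option(
    "port", "KUHA_OPRH_PORT",
    cli_args={},
    environ={"KUHA_OPRH_PORT": "6010"},
    config_file={"port": "6005"},
    default="6003")
print(port)
```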
Note
The configuration options --oai-pmh-base-url and --oai-pmh-admin-email are required.
Some of the configuration options configure the OAI-PMH repository. Refer to the OAI-PMH protocol description for more information.
The following lists some of the available configuration options. Use --help to list all available options.
- -h, --help
Show help message and exit.
- --print-configuration
Print active configuration and exit.
- --port <port>
Port for serving OAI-PMH Repo Handler. Defaults to 6003. May also be controlled by setting environment variable: KUHA_OPRH_PORT.
- --oai-pmh-base-url <base_url>
OAI-PMH base URL. Required configuration value. May also be controlled by setting environment variable: KUHA_OPRH_OP_BASE_URL.
- --oai-pmh-admin-email <email>
OAI-PMH administrator email address. Required configuration value. Repeat to give multiple addresses. May also be controlled by setting environment variable: KUHA_OPRH_OP_EMAIL_ADMIN.
- --oai-pmh-repo-name <repo_name>
OAI-PMH repository name. Defaults to Kuha2 oai-pmh repository. May also be controlled by setting environment variable: KUHA_OPRH_OP_REPO_NAME.
- --oai-pmh-namespace-identifier <namespace_id>
Namespace identifier to use with OAI-Identifiers. Set None to disable use of OAI-Identifiers. Defaults to None. May also be controlled by setting environment variable: KUHA_OPRH_OP_NAMESPACE_ID.
- --document-store-url <url>
Full URL to Kuha document store database. Defaults to http://localhost/v0. May also be controlled by setting environment variable: KUHA_DS_URL.
- --loglevel <loglevel>
Lowest logging level of log messages that get output. Valid values are logging levels supported by Python’s logging [CRITICAL,ERROR,WARNING,INFO,DEBUG]. Defaults to INFO. May also be controlled by setting environment variable: KUHA_LOGLEVEL.
Configuration file
Arguments that start with '--' (e.g. --document-store-port) can also be set
in a configuration file. The configuration file is looked up in the
current working directory and in the package directory.
The name of the configuration file is kuha_oai_pmh_repo_handler.ini.
Note
Invoke with --help to print out config file lookup paths.
Environment variables
If the program is run using the scripts provided in the scripts
subdirectory, the runtime environment can be controlled via scripts/runtime_env,
which is created by copying scripts/runtime_env.dist at
installation time by scripts/install_kuha_oai_pmh_repo_handler_virtualenv.sh.
Running the Server
This guide uses the convenience scripts from the scripts subdirectory.
It is assumed that the program was installed by using
scripts/install_kuha_oai_pmh_repo_handler_virtualenv.sh.
Run the OAI-PMH Repo Handler server:
./scripts/run_kuha_oai_pmh_repo_handler.sh --oai-pmh-base-url=<base-url> --oai-pmh-admin-email=<admin-email>
The script sources scripts/runtime_env and activates the
installed virtualenv. Finally, it calls kuha_oai_serve with the given
command line arguments.
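Once the server is running, one way to check that the required options took effect is to send an OAI-PMH Identify request and inspect the response. The sketch below is a minimal example under stated assumptions: the Identify verb, the response element names and the XML namespace come from the OAI-PMH protocol, while the helper names, the sample response and the example.org addresses are hypothetical. The fetch_identify function expects the same base URL that was given to --oai-pmh-base-url.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_identify(xml_data):
    """Extract repository name, base URL and admin emails from an Identify response."""
    root = ET.fromstring(xml_data)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("Response does not contain an Identify element")
    return {
        "repositoryName": identify.findtext("oai:repositoryName", namespaces=OAI_NS),
        "baseURL": identify.findtext("oai:baseURL", namespaces=OAI_NS),
        # adminEmail may occur more than once.
        "adminEmail": [e.text for e in identify.findall("oai:adminEmail", OAI_NS)],
    }


def fetch_identify(base_url):
    """Send an Identify request to a running server and parse the response."""
    with urlopen(base_url + "?" + urlencode({"verb": "Identify"})) as response:
        return parse_identify(response.read())


# Hypothetical, abbreviated Identify response used to demonstrate parsing
# without a running server.
SAMPLE_IDENTIFY = b"""<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai</request>
  <Identify>
    <repositoryName>Kuha2 oai-pmh repository</repositoryName>
    <baseURL>http://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
  </Identify>
</OAI-PMH>"""

info = parse_identify(SAMPLE_IDENTIFY)
print(info["repositoryName"], info["baseURL"], info["adminEmail"])
```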
Ensuring OAI-PMH serves correct records
The program includes a helper script that runs through all records from the OAI-PMH Repo Handler using the OAI-PMH verb ListRecords. The script prints all identifiers it encounters and logs the time it took to complete the full ListRecords sequence. Note that the OAI-PMH Repo Handler server must be running and accessible in order to get correct results.
If any error conditions are encountered, the best place to determine the cause is the Kuha OAI-PMH Repo Handler server log.
Run through all records using the oai_dc metadata prefix:
./scripts/list_records.sh oai_dc
See help for more information and configuration options:
./scripts/list_records.sh --help
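For readers who want to perform a similar check from their own code, the following minimal sketch walks a full ListRecords sequence by following resumption tokens and yields every record identifier. It is not the implementation of the helper script. The request arguments, element names and XML namespace come from the OAI-PMH protocol; the function names and the injectable fetch_page parameter are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def fetch(base_url, params):
    """Send one OAI-PMH request and return the raw response body."""
    with urlopen(base_url + "?" + urlencode(params)) as response:
        return response.read()


def iter_identifiers(base_url, metadata_prefix, fetch_page=fetch):
    """Yield the identifier of every record in a full ListRecords sequence."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch_page(base_url, params))
        error = root.find("oai:error", OAI_NS)
        if error is not None:
            raise RuntimeError(
                "OAI-PMH error %s: %s" % (error.get("code"), error.text))
        for header in root.iterfind(
                "oai:ListRecords/oai:record/oai:header", OAI_NS):
            yield header.findtext("oai:identifier", namespaces=OAI_NS)
        token = root.findtext(
            "oai:ListRecords/oai:resumptionToken", namespaces=OAI_NS)
        # A missing or empty resumptionToken ends the sequence.
        if not token or not token.strip():
            break
        # Follow-up requests carry only the verb and the resumption token.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
```

To use it against a running server, iterate over iter_identifiers(<base-url>, "oai_dc") and print each identifier, where <base-url> is the value given to --oai-pmh-base-url.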