Kuha Client
Kuha Client is used to submit records to Kuha Document Store. Kuha Client is written in Python and uses HTTP to communicate with Document Store.
Features
- Support for DDI 3.1, DDI 2.5 and DDI 1.2.2 metadata standards.
- Import records to Document Store.
- Update records already stored in Document Store.
- Delete records in Document Store.
- Batch process DDI files by recursing into directories:
  - Option to remove records from Document Store not found in the current batch.
  - Option to keep track of previously processed files and bypass processing if modification times have not changed.
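The batch options above can be pictured with a minimal sketch. It is not Kuha Client's own code: the function names and the file-log layout (a plain dict mapping path to modification time) are assumptions made for illustration only.

```python
import os


def find_xml_files(root):
    """Yield paths of .xml files under root, recursing into subdirectories."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".xml"):
                yield os.path.join(dirpath, name)


def files_to_process(root, file_log):
    """Return files whose modification time differs from the logged one.

    file_log is a dict {path: mtime}. This layout is hypothetical and is
    not the format Kuha Client actually writes to its file log.
    """
    pending = []
    for path in find_xml_files(root):
        mtime = os.path.getmtime(path)
        if file_log.get(path) != mtime:
            pending.append(path)
            file_log[path] = mtime  # remember the file for the next run
    return pending
```

Calling files_to_process twice on an unchanged directory with the same dict returns every .xml file the first time and an empty list the second time, which is the bypass behaviour described in the last feature.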
Dependencies & requirements
- Python 3.5 or newer
- Recommended: python3-venv 3.5.1 or newer
The software is continuously tested against Python versions 3.5, 3.6, 3.7, and 3.8.
Python packages
Kuha Common is a library used with Kuha2 software. It can be obtained from https://bitbucket.org/tietoarkisto/kuha_common
- kuha_common (License: EUPL)
License
Kuha Client is available under the EUPL. See LICENSE.txt for the full license.
Configuration
Most common configuration options are described here. Use --help to print all available options.
- paths
  Required positional argument. Absolute path to file or directory. Repeat to process multiple paths.
- -h, --help
  Show help and exit.
- --collection <collection>
  Only for upsert and import runs. Limits the import to a specific document type. Valid values are [studies,variables,questions,study_groups]. Set None to import all document types. Defaults to None.
- --document-store-url <document_store_url>
  Required. Full URL to Document Store, for example http://localhost:6001/v0. May also be controlled by setting the environment variable KUHA_DS_URL.
- --file-log-path <path>
  Only for upsert and import runs. Store processed files in the file log and compare modification times on subsequent runs. Files whose modification times have not changed are bypassed.
- --remove-absent
  Only for upsert runs. Remove records from Document Store that are not present in the current batch.
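How a command-line option can fall back to an environment variable, as described for --document-store-url and KUHA_DS_URL, may be easier to see in code. The sketch below uses Python's standard argparse module and covers only a few of the options listed above. It is an illustration under that assumption, not Kuha Client's actual option parser, and build_parser is a hypothetical name.

```python
import argparse
import os


def build_parser(environ=None):
    """Build an illustrative parser; not Kuha Client's real one."""
    environ = os.environ if environ is None else environ
    default_url = environ.get("KUHA_DS_URL")
    parser = argparse.ArgumentParser(description="Illustrative option parser")
    # One or more paths to files or directories.
    parser.add_argument("paths", nargs="+")
    # The URL is only required on the command line when the
    # environment variable does not already provide it.
    parser.add_argument(
        "--document-store-url",
        default=default_url,
        required=default_url is None,
    )
    parser.add_argument("--remove-absent", action="store_true")
    return parser
```

With KUHA_DS_URL set, the URL argument may be omitted. An explicit --document-store-url on the command line takes precedence over the environment value in this sketch.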
Running the program
If installed to a Python virtual environment, the environment must be activated before running the program.
Import records to Document Store by scanning a directory tree for .xml files to submit and create a file-log to keep track of processed files:
python -m kuha_client.kuha_import --file-log-path=file_log /path/to/directory
Upsert records (insert and update) to Document Store by scanning a directory tree for .xml files and comparing the found files to the ones stored in the file-log. If a file's modification time is newer than the one stored in the file-log, the file gets processed. When using the --remove-absent flag, any ID found in Document Store, but not in the current batch, gets removed:
python -m kuha_client.kuha_upsert --file-log-path=file_log --remove-absent /path/to/directory
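The effect of --remove-absent can be thought of as a set difference between the IDs held in Document Store and the IDs seen in the current batch. The helper below is a hypothetical illustration of that idea. How Kuha Client actually collects and compares IDs is not described here, and the ID values are made up.

```python
def ids_to_remove(stored_ids, batch_ids):
    """Return IDs present in the store but missing from the current batch."""
    return sorted(set(stored_ids) - set(batch_ids))


# Illustrative values only: "study-1" is stored but absent from the batch.
stored = ["study-1", "study-2", "study-3"]
batch = ["study-2", "study-3", "study-4"]
absent = ids_to_remove(stored, batch)
```

Records that appear only in the batch ("study-4" here) are not part of the result. They would be inserted by the upsert, not removed.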
Delete a record from a collection:
python -m kuha_client.kuha_delete studies 5af94ff06fb71d7646160bd4
Delete all records from a collection:
python -m kuha_client.kuha_delete studies ALL
Delete all records from all collections:
python -m kuha_client.kuha_delete ALL ALL
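For scripted use, the delete commands above can be assembled as argument lists and passed to Python's subprocess module. This is a minimal sketch: build_delete_command and run_delete are hypothetical helper names, the positional arguments simply mirror the commands shown above, and the Document Store URL is assumed to be supplied through the KUHA_DS_URL environment variable.

```python
import subprocess
import sys


def build_delete_command(collection, record_id):
    """Return the argument list for one kuha_delete invocation.

    Both arguments are passed through unchanged; "ALL" is used the same
    way as in the commands shown above.
    """
    return [sys.executable, "-m", "kuha_client.kuha_delete",
            collection, record_id]


def run_delete(collection, record_id, dry_run=True):
    """Run the delete command, or only return it when dry_run is True."""
    command = build_delete_command(collection, record_id)
    if not dry_run:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
    return command
```

Defaulting to a dry run is a design choice of this sketch, so that a call such as run_delete("ALL", "ALL") does not remove anything until dry_run=False is passed explicitly.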