Kuha Client
Kuha Client is used to submit records to Kuha Document Store. Kuha Client is written in Python and uses HTTP to communicate with Document Store.
Features
Support for DDI 3.3, DDI 3.1, DDI 2.5 and DDI 1.2.2 metadata standards.
Synchronize records from filesystem to Document Store.
Delete records in Document Store. Supports logical and physical deletions.
Batch process DDI files by recursing into directories:
Option to not remove records in Document Store that cannot be found in the current batch.
Option to keep track of previously processed files and bypass processing if modification times have not changed.
Dependencies & requirements
Python 3.8 or newer
The software is continuously tested against supported Python versions.
Python packages
Kuha Common is a library used with Kuha2 software. It can be obtained from https://gitlab.tuni.fi/fsd/kuha_common
kuha_common (License: EUPL)
License
Kuha Client is available under the EUPL. See LICENSE.txt for the full license.
Configuration
Some common configuration options are described here. Use --help to print all available options.
- paths
Only for kuha_sync. Required positional argument. Absolute path to file or directory. Repeat to process multiple paths.
- --document-store-url <document_store_url>
Required. Full URL to Document Store, for example http://localhost:6001/v0. May also be controlled by setting the environment variable KUHA_DS_URL.
- --collection <collection>
Only for kuha_sync. Limits the import to a specific document type. Valid values are [studies, variables, questions, study_groups]. Set None to import all document types. Defaults to None.
- --file-cache <path>
Only for kuha_sync. Path to a cache file. Will be created if not present. Leave unset (default) to not use file caching.
- --no-remove
Only for kuha_sync. Do not remove records that were not found in this batch.
- --delete-type <type>
Only for kuha_delete. Select delete type: soft or hard. Soft is for logical deletions, hard is for physical deletions. Defaults to soft.
- -h, --help
Show help and exit.
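The relationship between --document-store-url and KUHA_DS_URL described above can be illustrated with a minimal argparse sketch. This is not Kuha Client's actual parser: the program name example_sync and the function build_parser are hypothetical, and the assumption that a command line value takes precedence over the environment variable is not stated in this document.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose --document-store-url default comes from KUHA_DS_URL."""
    env = os.environ if env is None else env
    env_url = env.get("KUHA_DS_URL")
    # "example_sync" is a hypothetical program name used for illustration only.
    parser = argparse.ArgumentParser(prog="example_sync")
    parser.add_argument(
        "--document-store-url",
        default=env_url,
        # Assumption: the option is required only when the environment
        # does not already supply a value.
        required=env_url is None,
    )
    parser.add_argument("--file-cache", default=None)
    parser.add_argument("--no-remove", action="store_true")
    parser.add_argument("paths", nargs="+")
    return parser


# The URL is taken from the (simulated) environment because the option is omitted.
args = build_parser({"KUHA_DS_URL": "http://localhost:6001/v0"}).parse_args(
    ["--no-remove", "/path/to/directory"]
)
print(args.document_store_url)
```

Passing the environment as a plain dictionary keeps the sketch easy to try without changing the real process environment.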
Running the program
If installed to a Python virtual environment, the environment must be activated before running the program.
Synchronize records (insert, update & remove) to Document Store by scanning a directory tree for .xml files and comparing found files to the ones stored in file-cache. If a file’s modification time is newer than the one stored in file-cache, the file gets processed:
kuha_sync --file-cache=file_cache /path/to/directory
Note
The file-cache is not invalidated automatically. It must be removed manually if you have removed records using kuha_delete, or you have upgraded Kuha Client, or you have altered the records in Document Store using some other mechanism than kuha_sync.
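The modification-time comparison that kuha_sync performs against the file-cache can be sketched roughly as follows. This is an illustration only: the JSON cache format and the function names load_cache, find_changed_files and sync are assumptions, and the real cache format used by Kuha Client is not described in this document.

```python
import json
import os


def load_cache(cache_path):
    """Return {file path: modification time}, or an empty dict if no cache exists."""
    if not os.path.exists(cache_path):
        return {}
    with open(cache_path, encoding="utf-8") as handle:
        return json.load(handle)


def find_changed_files(root, cache):
    """Yield (path, mtime) for .xml files under root that are new or modified."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".xml"):
                continue
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            # Process when the file is unknown or newer than the cached time.
            if path not in cache or mtime > cache[path]:
                yield path, mtime


def sync(root, cache_path, process):
    """Call process(path) for each changed file, then persist the updated cache."""
    cache = load_cache(cache_path)
    processed = []
    for path, mtime in find_changed_files(root, cache):
        process(path)
        cache[path] = mtime
        processed.append(path)
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(cache, handle)
    return processed
```

The sketch also shows why the cache goes stale: it only records what was seen on disk, so a change made directly in Document Store is invisible to it and unchanged files would be skipped on the next run.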
Delete record 5af94ff06fb71d7646160bd4 from studies-collection:
kuha_delete studies 5af94ff06fb71d7646160bd4
Delete study by study_number:
kuha_delete studies study_number=study_3
Delete all records from studies-collection:
kuha_delete studies ALL
Delete all records from all collections:
kuha_delete ALL ALL
Note that when deleting records with kuha_delete, the file-cache will become invalid and should be deleted. You can simply delete it:
rm file_cache
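If deletions are scripted, the two steps above (delete, then remove the file-cache) can be combined. The following is a minimal sketch, not part of Kuha Client: the helper names build_delete_command and delete_and_invalidate are hypothetical, and it assumes kuha_delete is on PATH and that the Document Store URL is supplied through KUHA_DS_URL.

```python
import os
import subprocess


def build_delete_command(collection, selector, delete_type="soft"):
    """Return the argv list for a kuha_delete call, using only documented options."""
    if delete_type not in ("soft", "hard"):
        raise ValueError("delete_type must be 'soft' or 'hard'")
    return ["kuha_delete", "--delete-type", delete_type, collection, selector]


def delete_and_invalidate(collection, selector, cache_path,
                          delete_type="soft", run=subprocess.run):
    """Run kuha_delete, then remove the now-invalid file-cache if it exists."""
    # check=True raises CalledProcessError if kuha_delete exits with an error,
    # so the cache is left untouched when the deletion fails.
    run(build_delete_command(collection, selector, delete_type), check=True)
    if os.path.exists(cache_path):
        os.remove(cache_path)
```

For example, delete_and_invalidate("studies", "study_number=study_3", "file_cache") would run the same command as shown earlier with the default soft delete type, and then remove file_cache.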