1. Log in to the system
After logging in, the user can:
- get a quick overview of the whole ingestion procedure;
- check the tasks related to their packages;
- choose one of the main functions. This part contains "links" to the profile, the metadata sources and the available projects;
- look at the notifications related to their packages.
2. New harvest
Create a new metadata source
The user has to fill in the fields:
- type, selecting OAI-PMH Repository;
- project, one of the projects in the list. In our case, the project is LoCloud;
- label, the label of the new metadata source.
The user then clicks the "Next" button to fill in some more fields.
These fields are the following:
- schema, one of the acceptable schemas of the project;
- url, only the OAI-PMH URL, without the "verb" commands. After entering the URL, the user should validate it;
- set, select the appropriate set from the list;
- format, select the OAI metadata format which corresponds to the set;
- record, the OAI record element, which takes the value 'record'.
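To illustrate why the url field must not contain a verb, the following is a minimal sketch of how a base URL, a set and a metadata format combine into OAI-PMH requests. The repository URL, the set name and the helper `build_oai_request` are hypothetical and chosen only for illustration; the verbs and the argument names `verb`, `set` and `metadataPrefix` come from the OAI-PMH protocol, not from the LoCloud system itself.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb, **arguments):
    """Append an OAI-PMH verb and its arguments to a base URL.

    The base URL must not already carry a query string, which is why the
    form asks for the URL without the "verb" commands.
    """
    if "?" in base_url:
        raise ValueError("base URL must not contain a verb or other query parameters")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical repository and set, used for illustration only.
base = "https://example.org/oai"

identify_url = build_oai_request(base, "Identify")            # basic check of the repository
sets_url = build_oai_request(base, "ListSets")                # candidates for the "set" field
formats_url = build_oai_request(base, "ListMetadataFormats")  # candidates for the "format" field
records_url = build_oai_request(
    base, "ListRecords", set="museum", metadataPrefix="oai_dc"
)

print(records_url)
```

Keeping the base URL free of verbs lets the same value be reused for every request type, with only the verb and its arguments changing.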
The metadata sources appear in a list, which shows:
- the label of the metadata source,
- the project that it refers to,
- the type,
- the url and
- some more details about the source.
Edit a metadata source
Create a new harvest
The user is now ready to create a new harvesting job. A harvest can be started in two ways:
- After creating the metadata source, the user can click the "New harvest" button.
- From the "Projects" tab in the left menu, the user chooses the project for which the dataset is to be published, clicks the "New harvest" button, selects the appropriate metadata source and submits the form.
After the form is submitted, a new package id is created. The package is then received and validated, which completes the harvest.
The user can see all the packages in a list, as shown below. This list shows the package id, the metadata source, the schema used in the source, the number of items the package contains, the status and the progress.
After ingestion, the package should be transformed to EDM, provided that the schema used in the package is one of the intermediate schemas. If the native schema is EDM, this step is omitted. Apart from the mapping, the user has to choose a "Right statement" for all the records, if needed.
After transforming the package, the user can enrich the records using a number of services.
The next step is to add a service (Add Service) to the created Enrichment Plan. All the microservices are available for use with the EDM schema. Some microservices require parameters. More specifically:
- Geo Normalization service, which normalizes the coordinates in the class edm:Place, takes as parameters the delimiter (, /, -, _) used in the records and the invert status (x instead of y, y instead of x).
- Vocabulary service, which allows users to create collections of thesaurus terms and insert them automatically into all items of their aggregated packages, takes as its parameter the subject collection.
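The two parameters of the Geo Normalization service can be pictured with a minimal sketch. The function `normalize_coordinates`, its default value order (latitude first) and its range check are assumptions made for illustration; they are not the service's actual implementation.

```python
def normalize_coordinates(raw, delimiter, invert=False):
    """Split a raw coordinate string and return (latitude, longitude) as floats.

    raw       -- coordinate pair as found in a record, e.g. "37.98/23.73"
    delimiter -- the character separating the two values in the record
    invert    -- True when the record stores the two values in the opposite order
    """
    parts = [p.strip() for p in raw.split(delimiter)]
    if len(parts) != 2:
        raise ValueError(f"expected two values separated by {delimiter!r}: {raw!r}")
    first, second = float(parts[0]), float(parts[1])
    # Assumption: without inversion the first value is the latitude.
    lat, lon = (second, first) if invert else (first, second)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    return lat, lon


print(normalize_coordinates("37.98/23.73", "/"))               # (37.98, 23.73)
print(normalize_coordinates("23.73_37.98", "_", invert=True))  # (37.98, 23.73)
```

In this sketch a "-" delimiter only works for non-negative values, because a minus sign would be read as an extra separator and rejected.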
For the "Vocabulary service" in particular, the user has to create a subject collection through the corresponding tab in the menu.
The user can choose the appropriate concepts from among 27 vocabularies (Vocabulary microservice) to enrich their records.
In general, all the microservices are integrated into the LoCloud aggregation environment through their APIs. The result of each microservice after the enrichment phase is:
- The GeoLocation service adds coordinates and/or place names in the EDM record. More specifically, in the class edm:Place, the elements wgs84_pos:lat, wgs84_pos:long and skos:prefLabel are added.
- The Vocabulary service adds the selected subject to the records, through the element dc:subject in the class edm:ProvidedCHO.
- The Vocabulary matching service matches the subject of the item with the appropriate thesauri from the list of vocabularies. This information is not yet included in the EDM record, but the user can see it in the “Enrichment Details”.
- The Background link service matches the subject of the item with a subject in DBpedia. This information is not yet included in the EDM record, but the user can see it in the “Enrichment Details”.
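As a rough picture of the elements named above, the following sketch builds a small EDM fragment with Python's standard `xml.etree.ElementTree`. The coordinate values, place name and subject are made up, and the helpers `q`, `add_place` and `add_subject` are hypothetical. The namespace URIs are the published ones for RDF, EDM, Dublin Core, SKOS and WGS84, and the class is spelled edm:ProvidedCHO as in the EDM definition. The exact output of the LoCloud microservices may differ.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "wgs84_pos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)  # keep readable prefixes when serializing


def q(prefixed_name):
    """Turn 'edm:Place' into the '{uri}Place' form that ElementTree expects."""
    prefix, local = prefixed_name.split(":")
    return f"{{{NS[prefix]}}}{local}"


def add_place(root, lat, lon, label):
    """Add an edm:Place with the elements a geolocation enrichment would supply."""
    place = ET.SubElement(root, q("edm:Place"))
    ET.SubElement(place, q("wgs84_pos:lat")).text = str(lat)
    ET.SubElement(place, q("wgs84_pos:long")).text = str(lon)
    ET.SubElement(place, q("skos:prefLabel")).text = label
    return place


def add_subject(cho, subject):
    """Add a dc:subject to an edm:ProvidedCHO, as a vocabulary enrichment would."""
    ET.SubElement(cho, q("dc:subject")).text = subject


# Illustrative values only.
root = ET.Element(q("rdf:RDF"))
cho = ET.SubElement(root, q("edm:ProvidedCHO"))
add_subject(cho, "Archaeology")
add_place(root, 37.98, 23.73, "Athens")

print(ET.tostring(root, encoding="unicode"))
```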
The package now contains the native.xml (CARARE, OAI-DC, etc.), the edm.xml and the enriched edm.xml. The user has to select the schema (EDM or enriched EDM) in which to publish the records.
Once again, the task manager assigns this job to the user.
After publication, there are 3 possible actions:
1. The package is delivered to Europeana.
2. The package is rejected, with a short description of the reason for the rejection.
3. The package is withdrawn.
The following information is available for each package:
- the total number of items of the package;
- the possible statuses of the package (harvested, validated, ingested, transformed or enriched, published) and its current status;
- the metadata schema, including all the available metadata schemas of the package;
- the metadata source, used for the harvesting;
- the history of all the tasks that have been completed;
- the notifications shown to the user during the whole workflow;
- statistics about the mandatory elements per schema (native, EDM, enrichedEDM);
- package details, which refer to the harvested, the validated, the transformed and the enriched items;
- completeness of the mandatory and the strongly recommended elements per schema (native, EDM, enrichedEDM).
Additionally, the user can be informed about:
- enrichment details including details about the microservices and more specifically about background links and vocabulary matching;
- thematic information, including all the subjects that appear in the records;
- spatial information, showing on the map the points that the records refer to.
Moreover, the user can see a list of the items of the package by clicking the "View items" button.
This list contains information about the name of the item (label), the native id, the native schema, the EDM and the enriched EDM.