MORE ingestion through OAI-PMH

1. Log in to the system

The user has to fill in the username and password.
After logging in, the main page of LoCloud MORE appears. The user is able to:

  • get a quick overview of the whole ingestion procedure;
  • check the tasks related to their packages;
  • choose one of the main functions. This part contains links to the profile, the metadata sources and the available projects;
  • look at the notifications related to their packages.


2. New harvest

Create a new metadata source

The first step of the harvesting task is to create a metadata source from the "Metadata sources" tab in the left menu, by clicking the "New Data Source" button.

The user has to fill in the fields:

  • type, selecting OAI-PMH Repository;
  • project, one of the projects in the list. In our case, the project refers to LoCloud;
  • label, the label of the new metadata source.

The user then clicks the "Next" button to fill in some more fields.


These fields are the following:

  • schema, one of the schemas accepted by the project;
  • url, the OAI-PMH base URL only, without any "verb" arguments. After entering the URL, the user should validate it using the validation icon;
  • set, the appropriate set, selected from the list;
  • format, the OAI metadata format that corresponds to the set;
  • record, the OAI record element, which takes the value 'record'.
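To illustrate what "without the verb commands" means, the sketch below shows how an OAI-PMH harvester typically appends the verb and its arguments to the base URL. The repository address, the set name "photos" and the helper name oai_request_url are hypothetical; the verbs (ListSets, ListMetadataFormats, ListRecords) and the arguments (metadataPrefix, set) come from the OAI-PMH protocol itself. This is not MORE's own code, only an outline of the requests a harvester is expected to build.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a base URL, a verb and its arguments."""
    # Skip arguments that were not supplied (for example an optional set).
    query = {"verb": verb}
    query.update({key: value for key, value in params.items() if value is not None})
    return f"{base_url}?{urlencode(query)}"


# Hypothetical base URL: this is the only part entered in the "url" field.
BASE_URL = "https://repository.example.org/oai"

# Requests that would fill the "set" and "format" lists.
print(oai_request_url(BASE_URL, "ListSets"))
print(oai_request_url(BASE_URL, "ListMetadataFormats"))

# Request that harvests the records of one set in one metadata format.
print(oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc", set="photos"))
```

Entering the base URL alone lets the system add whichever verb it needs for each step, which is why a URL that already contains "?verb=..." is not suitable for the url field.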

The metadata sources appear in a list, which shows:

  • the label of the metadata source,
  • the project that it refers to,
  • the type,
  • the url and
  • some more details about the source. 


Edit a metadata source

The user is able to edit a metadata source by selecting it from the list, changing it and saving the form.

Create a new harvest

The user is now ready to create a new harvesting job. Harvesting can be started in two ways:

  • After creating the metadata source, the user clicks the "New harvest" button.

  • From the "Projects" tab in the left menu, the user chooses the project under which the dataset will be published, clicks the "New harvest" button, selects the appropriate metadata source and submits the form.


After the form is submitted, a new package id is created. The package is then received and validated, which completes the harvesting.
The user is able to see all the packages in a list. For each package, the list shows the package id, the metadata source, the schema used in the source, the number of items that it contains, the status and the progress.

3. Ingest

Once the harvest is completed, the package can be ingested. The user is prompted to ingest the package through a task assignment with a relevant message.


4. Transform

After ingestion, the package should be transformed to EDM if the schema used in the package is one of the intermediate schemas. If the native schema is EDM, this step is omitted. Apart from the mapping, the user has to choose a "Right statement" for all the records, if needed.

The task manager informs the user about the details of the transformation, which creates the EDM records of the package.

5. Enrich

After transforming the package, the user is able to enrich the records, using a number of services. 

First of all, the user has to create an enrichment plan. Under the "Enrichment" tab in the left menu, the "Enrichment Plan" option lets the user create a new plan or edit an existing one.

The next step is to add a service ("Add Service") to the enrichment plan. All the microservices can be used with the EDM schema. Some microservices require parameters. More specifically:

  • Geo Normalization service, which normalizes the coordinates in the class edm:Place, takes as parameters the delimiter (",", "/", "-" or "_") used in the records and the invert status (x instead of y, y instead of x).
  • Vocabulary service, which allows users to create collections of thesaurus terms and insert them automatically into all items of their aggregated packages, takes as parameter the subject collection.
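The meaning of the two Geo Normalization parameters can be sketched as follows. The source does not describe the service's actual algorithm, so the function name normalize_coordinates, the input format (a single string holding both values) and the output (a latitude/longitude pair) are assumptions made only for illustration.

```python
def normalize_coordinates(raw, delimiter=",", invert=False):
    """Split a raw coordinate string and return (lat, long) as floats.

    Illustrative only: 'delimiter' is the character separating the two values
    in the record, and 'invert' swaps them when the record stores them in the
    opposite order.
    """
    # Split once on the delimiter and strip surrounding whitespace.
    first, second = (part.strip() for part in raw.split(delimiter, 1))
    lat, long_ = float(first), float(second)
    if invert:
        lat, long_ = long_, lat
    return lat, long_


# Values separated by "/" and already in latitude, longitude order.
print(normalize_coordinates("37.97/23.72", delimiter="/"))

# Values separated by "_" and stored as longitude, latitude.
print(normalize_coordinates("23.72_37.97", delimiter="_", invert=True))
```

One limitation of this naive split is that a "-" delimiter cannot be told apart from the sign of a negative coordinate; a real implementation would need to handle that case explicitly.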

For the "Vocabulary service" in particular, the user has to create a subject collection through the corresponding tab in the menu.
The user is able to choose the appropriate concepts from 27 vocabularies (Vocabulary microservice) to enrich their records.

In general, all the microservices are integrated into the LoCloud aggregation environment through their APIs. After the enrichment phase, each microservice produces the following result:

  • The GeoLocation service adds coordinates and/or place names to the EDM record. More specifically, the elements wgs84_pos:lat, wgs84_pos:long and skos:prefLabel are added to the class edm:Place.
  • The Vocabulary service adds the selected subject to the records, through the element dc:subject in the class edm:ProvidedCHO.
  • The Vocabulary matching service matches the subject of the item with the appropriate thesauri from the list of vocabularies. This information does not appear in the EDM record yet, but the user can see it in the “Enrichment Details”.
  • The Background link service matches the subject of the item with a subject in DBpedia. This information does not appear in the EDM record yet, but the user can see it in the “Enrichment Details”.
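As a rough picture of what the first two results look like inside an EDM record, the sketch below builds a minimal RDF/XML document with Python's standard library and adds the elements named above. The identifiers (#item-1, #place-1), the sample values and the helper names are hypothetical, and the record is far from a complete, valid EDM record; only the element names and namespaces are taken from EDM and the vocabularies it reuses.

```python
import xml.etree.ElementTree as ET

# Namespaces of EDM and of the vocabularies it reuses.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "wgs84_pos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return the qualified name ElementTree expects, e.g. '{uri}Place'."""
    return f"{{{NS[prefix]}}}{tag}"


def add_place(rdf_root, about, lat, long_, label):
    """Add an edm:Place with coordinates and a preferred label."""
    place = ET.SubElement(rdf_root, q("edm", "Place"), {q("rdf", "about"): about})
    ET.SubElement(place, q("wgs84_pos", "lat")).text = str(lat)
    ET.SubElement(place, q("wgs84_pos", "long")).text = str(long_)
    ET.SubElement(place, q("skos", "prefLabel")).text = label
    return place


def add_subject(rdf_root, subject):
    """Add a dc:subject to the edm:ProvidedCHO of the record."""
    cho = rdf_root.find("edm:ProvidedCHO", NS)
    ET.SubElement(cho, q("dc", "subject")).text = subject


# Hypothetical minimal record with a single provided cultural heritage object.
root = ET.Element(q("rdf", "RDF"))
ET.SubElement(root, q("edm", "ProvidedCHO"), {q("rdf", "about"): "#item-1"})

add_place(root, "#place-1", 37.97, 23.72, "Athens")
add_subject(root, "Archaeology")

print(ET.tostring(root, encoding="unicode"))
```

The results of the Vocabulary matching and Background link services are left out of the sketch because, as stated above, they are not written into the EDM record yet.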

The user then returns to the package and starts the enrichment by selecting the appropriate enrichment plan.


6. Publish

The package now contains the native.xml (CARARE, OAI-DC, etc.), the edm.xml and the enriched edm.xml. The user has to select the schema (EDM or enriched EDM) in which to publish the records.

Once again, the task manager assigns this job to the user.


After publication, there are 3 possible actions:

1. The package is delivered to Europeana.

2. The package is rejected, with a short description of the reason for the rejection.

3. The package is withdrawn.


Package information

The user is able to follow the status of a package by selecting it from the list, which also gives access to some more information about the package.

This information is:

  • the total number of items of the package;
  • the statuses the package passes through (harvested, validated, ingested, transformed or enriched, published) and its current status;
  • the metadata schema, including all the available metadata schemas of the package;
  • the metadata source, used for the harvesting;
  • the history of all the tasks that have been completed;
  • the notifications shown to the user during the whole workflow;
  • statistics about the mandatory elements per schema (native, EDM, enrichedEDM);
  • package details, which refer to the harvested, the validated, the transformed and the enriched items;
  • completeness about the mandatory and the strongly recommended elements per schema (native, EDM, enrichedEDM).
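The source does not say how MORE computes completeness, so the sketch below only illustrates one plausible definition: for each required element, the share of items in the package that provide a non-empty value. The element list, the sample items and the function name completeness are hypothetical.

```python
def completeness(records, elements):
    """Return, per element, the share of records with a non-empty value."""
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in elements}
    return {
        name: sum(1 for record in records if record.get(name)) / total
        for name in elements
    }


# Hypothetical list of mandatory elements for one schema.
MANDATORY = ["dc:title", "dc:type", "edm:rights"]

# Hypothetical items, reduced to a flat element-to-value mapping.
records = [
    {"dc:title": "Church of St. Mary", "dc:type": "building", "edm:rights": "rights-a"},
    {"dc:title": "Old mill", "dc:type": "", "edm:rights": "rights-a"},
    {"dc:title": "Stone bridge", "dc:type": "bridge", "edm:rights": "rights-b"},
    {"dc:title": "Harbour photo"},
]

report = completeness(records, MANDATORY)
for name, share in report.items():
    print(f"{name}: {share:.0%}")
```

Running the same calculation once per schema (native, EDM, enriched EDM) would give one possible per-schema view of how complete the package is.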

Additionally, the user can be informed about:

  • enrichment details including details about the microservices and more specifically about background links and vocabulary matching;
  • thematic information, including all the subjects that appear in the records;
  • spatial information, indicating on the map the points that the records refer to.


Moreover, the user is able to see a list of the items of the package by clicking the "View items" button.
This list contains information about the name of the item (label), the native id, the native schema, the EDM and the enriched EDM.

Furthermore, the user is able to follow the history of the package (when it has been harvested, transformed, enriched or published).

Last but not least, the user is able to check the status of the packages from the "Notifications" tab.

Notifications inform the user about the status of each package, while a history report is logged to keep track of the whole procedure.