[DD-1360] Solr XML update format exporter for on-the-fly export @ Task execution endpoint Created: 29/Apr/16  Updated: 24/Nov/16  Resolved: 24/Nov/16

Status: Done
Project: D:SWARM
Component(s): None
Affects Version(s): None
Fix Version(s): None
Security Level: Default

Type: Story Priority: Major
Reporter: Gängler, Thomas Assignee: Unassigned
Resolution: Done Votes: 1
Labels: Solr, XML, export, finc, on-the-fly, task_execution, xml_export, xml_export_on-the-fly

Sprint: sprint 62, sprint 63, sprint 64, sprint 65, sprint 66, sprint 67


In addition to the existing approach for producing an export format that can be used to ingest data into a Solr index, it would be very useful to implement this variant as well (Nauber, Jens can elaborate on the advantages of this variant and the disadvantages of the current solution).

For details on the Solr XML update format see https://wiki.apache.org/solr/UpdateXmlMessages
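
To illustrate the target format, here is a minimal Python sketch (not the D:SWARM implementation) that streams records as a Solr XML update message, i.e. an `<add>` element containing one `<doc>` per record and one `<field>` per value. The function name `solr_add_xml` and the field names `title` and `author` are assumptions made for this example only; `record_id` is taken from the DIH log template quoted in the comments.

```python
from xml.sax.saxutils import escape, quoteattr


def solr_add_xml(records):
    """Yield the chunks of a Solr XML update message, one piece at a time.

    `records` is an iterable of dicts that map a field name to a single
    value or to a list of values. Multi-valued fields are written as
    repeated <field> elements with the same name.
    """
    yield "<add>\n"
    for record in records:
        yield "  <doc>\n"
        for name, values in record.items():
            if not isinstance(values, (list, tuple)):
                values = [values]
            for value in values:
                # quoteattr() quotes and escapes the attribute value,
                # escape() escapes &, < and > in the element text.
                yield "    <field name=%s>%s</field>\n" % (
                    quoteattr(name),
                    escape(str(value)),
                )
        yield "  </doc>\n"
    yield "</add>\n"


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    sample_records = [
        {"record_id": "1", "title": "Tom & Jerry", "author": ["A", "B"]},
    ]
    print("".join(solr_add_xml(sample_records)), end="")
```

Because the function is a generator, the message can be written to the response chunk by chunk instead of being built in memory first, which is the point of an on-the-fly export.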

(Note: this was ticket DD-961 in the past, which has been deleted.)

Comment by Nauber, Jens [ 29/Apr/16 ]

The Solr XML update format works well for file-based Solr DataImportHandler (DIH) tasks, because the input data does not need to be transformed.

Example of Solr DIH configuration:

<entity name="dswarm-swb" processor="FileListEntityProcessor" baseDir="/data/raw/dswarm_schema/swb/latest" fileName=".*\.xml" dataSource="null" recursive="true" rootEntity="false">
	<entity name="dswarm-swb-record" processor="XPathEntityProcessor" datasource="null" stream="true" url="${dswarm-swb.fileAbsolutePath}" useSolrAddSchema="true" transformer="LogTransformer" logTemplate="SWB Record with record_id ${dswarm-swb-record.record_id} is processed: ${dswarm-swb-record}" logLevel="trace"/>
</entity>

Comment by Gängler, Thomas [ 12/Sep/16 ]

see https://github.com/zazi/dswarm/tree/dd-1360 (+ https://github.com/zazi/dswarm-commons/tree/dd-1360)

Comment by Gängler, Thomas [ 12/Sep/16 ]

see https://github.com/dswarm/dswarm-commons/pull/21

Comment by Gängler, Thomas [ 14/Sep/16 ]

see https://github.com/dswarm/dswarm/pull/143

Generated at Wed Aug 15 03:06:48 CEST 2018 using JIRA 7.8.0#78000-sha1:4568b9d484113d74dfb6f152fb925b5fa1be2ef7.