01_Getting_Metafacture.md
+5 -3, lines changed: 5 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -15,8 +15,10 @@ installation is needed. The Playground is a web interface that helps you getting
15
15
It is useful to test, share and export Metafacture workflows.
16
16
17
17
Starting with [Chapter 6](https://github.com/metafacture/metafacture-tutorial/blob/main/06_MetafactureCLI.md)
18
-
we switch from using Playground to running Metafacture on our own Hardware.
19
-
At this point, to be able to follow the examples, you need a Linux/Unix Bash Shell (part of every Linux, MacOS and Windows >=10)
20
-
with Metafacture Core and Metafacture Fix installed.
18
+
we can switch from using the Playground to running Metafacture on our own hardware.
19
+
However, the examples are still provided in the Playground.
20
+
21
+
To run Metafacture on your local machine you need a Linux/Unix Bash shell (part of every Linux, macOS and Windows >= 10) with Metafacture Core installed. This course does not teach you how to use the command line. For that see:
22
+
21
23
22
24
**Next lesson**: [02 Introduction into Metafacture Flux](./02_Introduction_into_Metafacture-Flux.md)
There are two extra path structures that need to be explained:
123
124
124
-
* repeatable fields
125
+
* repeated fields
125
126
* arrays
126
127
128
+
In general, repeated fields as well as arrays are both handled as arrays. You can also call these internal arrays lists.
129
+
Both names (list and array) are reflected in some Fix functions (e.g. `add_array` or the `list` bind).
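As a rough illustration of the statement above, here is a small Python sketch. Python is used only as a neutral modeling language; this is not Metafacture code or its API. The helper `to_record` is hypothetical: it collects flat name/value pairs, as they might come from repeated XML elements, into one list per name, so that a repeated field ends up in the same shape as a JSON/YAML array.

```python
from collections import defaultdict


def to_record(pairs):
    """Collect (name, value) pairs into a dict; repeated names become lists.

    Hypothetical helper for illustration only, not part of Metafacture.
    """
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name].append(value)
    # Single values stay plain strings, repeated names become a list.
    return {name: values[0] if len(values) == 1 else values
            for name, values in grouped.items()}


# A field that was repeated in the source ...
repeated = to_record([
    ("title", "Example"),
    ("subject", "Metadata"),
    ("subject", "Datatransformation"),
])

# ... and a field that was an array from the start.
array = {"title": "Example", "subject": ["Metadata", "Datatransformation"]}

print(repeated == array)  # True: both end up as the same list structure
```

The sketch only models the resulting data shape; how Metafacture names and addresses such paths is described in the text that follows.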
130
+
127
131
In a data set an element can sometimes have multiple instances. Different data models handle this differently. In XML records any element can occur multiple times; element repetition is possible and many schemas (partly) allow it. E.g. the subject element exists three times:
128
132
133
+
### Working with repeated fields
134
+
129
135
```XML
130
136
<subject>Metadata</subject>
131
137
<subject>Datatransformation</subject>
@@ -152,6 +158,8 @@ If you want to refer to all creators then you can use the array wildcard `*` whi
In JSON or YAML element repetition is possible but unusual. Instead of repeating elements, repetition is expressed as a list so that an element can have more than one value. This is called an array and looks like this in YAML:
156
164
157
165
In our book example, for instance, we have the following array:
@@ -236,7 +244,7 @@ e.g.:
236
244
237
245
[Here is a way to collect and count all paths in all records by using the `list-fix-paths`-command.](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-lines%0A%7C+decode-pica%0A%7C+list-fix-paths%0A%7C+print%0A%3B&data=001@+%1Fa5%1F01-2%1E001A+%1F01100%3A15-10-94%1E001B+%1F09999%3A12-06-06%1Ft16%3A10%3A17.000%1E001D+%1F09999%3A99-99-99%1E001U+%1F0utf8%1E001X+%1F00%1E002@+%1F0Aag%1E003@+%1F0482147350%1E006U+%1F094%2CP05%1E007E+%1F0U+70.16407%1E007I+%1FSo%1F074057548%1E011@+%1Fa1970%1E017A+%1Farh%1E021A+%1FaDie+@Berufsfreiheit+der+Arbeitnehmer+und+ihre+Ausgestaltung+in+vo%CC%88lkerrechtlichen+Vertra%CC%88gen%1FdEine+Grundrechtsbetrachtg%1E028A+%1F9106884905%1F7Tn3%1FAgnd%1F0106884905%1FaProjahn%1FdHorst+D.%1E033A+%1FpWu%CC%88rzburg%1E034D+%1FaXXXVIII%2C+165+S.%1E034I+%1Fa8%1E037C+%1FaWu%CC%88rzburg%2C+Jur.+F.%2C+Diss.+v.+7.+Aug.+1970%1E%0A001@+%1F01%1Fa5%1E001A+%1F01140%3A08-12-99%1E001B+%1F09999%3A05-01-08%1Ft22%3A57%3A29.000%1E001D+%1F09999%3A99-99-99%1E001U+%1F0utf8%1E001X+%1F00%1E002@+%1F0Aa%1E003@+%1F0958090564%1E004A+%1Ffkart.+%3A+DM+9.70%2C+EUR+4.94%2C+sfr+8.00%2C+S+68.00%1E006U+%1F000%2CB05%2C0285%1E007I+%1FSo%1F076088278%1E011@+%1Fa1999%1E017A+%1Farb%1Fasi%1E019@+%1FaXA-AT%1E021A+%1FaZukunft+Bildung%1FhPolitische+Akademie.+%5BHrsg.+von+Gu%CC%88nther+R.+Burkert-Dottolo+und+Bernhard+Moser%5D%1E028C+%1F9130681849%1F7Tp1%1FVpiz%1FAgnd%1F0130681849%1FE1952%1FaBurkert%1FdGu%CC%88nther+R.%1FBHrsg.%1E033A+%1FpWien%1FnPolit.+Akad.%1E034D+%1Fa79+S.%1E034I+%1Fa24+cm%1E036F+%1Fx299+12%1F9551720077%1FgAdn%1F7Tb1%1FAgnd%1F01040469-7%1FaPolitische+Akademie%1FgWien%1FYPA-Information%1FhPolitische+Akademie%2C+WB%1FpWien%1FJPolitische+Akad.%2C+WB%1Fl99%2C2%1E036F/01+%1Fx12%1F9025841467%1FgAdvz%1Fi2142105-5%1FYAktuelle+Fragen+der+Politik%1FhPolitische+Akademie%1FpWien%1FJPolitische+Akad.+der+O%CC%88VP%1FlBd.+2%1E045E+%1Fa22%1Fd18%1Fm370%1E047A+%1FSFE%1Fata%1E%0A001@+%1Fa5%1F01%1E001A+%1F01140%3A19-02-03%1E001B+%1F09999%3A19-06-11%1Ft01%3A20%3A13.000%1E
001D+%1F09999%3A26-04-03%1E001U+%1F0utf8%1E001X+%1F00%1E002@+%1F0Aal%1E003@+%1F0361809549%1E004A+%1FfHlw.%1E006U+%1F000%2CL01%1E006U+%1F004%2CP01-s-41%1E006U+%1F004%2CP01-f-21%1E007G+%1FaDNB%1F0361809549%1E007I+%1FSo%1F072658383%1E007M+%1F04413/0275%1E011@+%1Fa1925%1E019@+%1FaXA-DXDE%1FaXA-DE%1E021A+%1FaHundert+Jahre+Buchdrucker-Innung+Hamburg%1FdWesen+u.+Werden+d.+Vereinigungen+Hamburger+Buchdruckereibesitzer+1825-1925+%3B+Gedenkschrift+zur+100.+Wiederkehr+d.+Gru%CC%88ndungstages%2C+verf.+im+Auftr.+d.+Vorstandes+d.+Buchdrucker-Innung+%28Freie+Innung%29+zu+Hamburg%1FhFriedrich+Voeltzer%1E028A+%1F9101386281%1F7Tp1%1FVpiz%1FAgnd%1F0101386281%1FE1895%1FaVo%CC%88ltzer%1FdFriedrich%1E033A+%1FpHamburg%1FnBuchdrucker-Innung+%28Freie+Innung%29%1E033A+%1FpHamburg%1Fn%5BVerlagsbuchh.+Broschek+%26+Co.%5D%1E034D+%1Fa44+S.%1E034I+%1Fa4%1E%0A001@+%1Fa5%1F01-3%1E001A+%1F01240%3A01-08-95%1E001B+%1F09999%3A24-09-10%1Ft17%3A42%3A20.000%1E001D+%1F09999%3A99-99-99%1E001U+%1F0utf8%1E001X+%1F00%1E002@+%1F0Af%1E003@+%1F0945184085%1E004A+%1F03-89007-044-2%1FfGewebe+%3A+DM+198.00%2C+sfr+198.00%2C+S+1386.00%1E006T+%1F095%2CN35%2C0856%1E006U+%1F095%2CA48%2C1186%1E006U+%1F010%2CP01%1E007I+%1FSo%1F061975997%1E011@+%1Fa1995%1E017A+%1Fara%1E021A+%1Fx213%1F9550711899%1FYNeues+Handbuch+der+Musikwissenschaft%1Fhhrsg.+von+Carl+Dahlhaus.+Fortgef.+von+Hermann+Danuser%1FpLaaber%1FJLaaber-Verl.%1FS48%1F03-89007-030-2%1FgAc%1E021B+%1FlBd.+13.%1FaRegister%1Fhzsgest.+von+Hans-Joachim+Hinrichsen%1E028C+%1F9121445453%1F7Tp3%1FVpiz%1FAgnd%1F0121445453%1FE1952%1FaHinrichsen%1FdHans-Joachim%1E034D+%1FaVIII%2C+408+S.%1E045V+%1F9090001001%1E047A+%1FSFE%1Fagb/fm%1E%0A001@+%1F01-2%1Fa5%1E001A+%1F01239%3A18-08-11%1E001B+%1F09999%3A05-09-11%1Ft23%3A31%3A44.000%1E001D+%1F01240%3A30-08-11%1E001U+%1F0utf8%1E001X+%1F00%1E002@+%1F0Af%1E003@+%1F01014417392%1E004A+%1Ffkart.%1E006U+%1F011%2CA37%1E007G+%1FaDNB%1F01014417392%1E007I+%1FSo%1F0752937239%1E010@+%1Fager%1E011@+%1Fa2011%1E017A+%1Fara%1Fasf%1E021A+%1Fxtr%1F91014809657
%1F7Tp3%1FVpiz%1FAgnd%1F01034622773%1FE1958%1FaLu%CC%88beck%1FdMonika%1FYPersonalwirtschaft+mit+DATEV%1FhMonika+Lu%CC%88beck+%3B+Helmut+Lu%CC%88beck%1FpBodenheim%1FpWien%1FJHerdt%1FRXA-DE%1FS650%1FgAc%1E021B+%1FlTrainerbd.%1E032@+%1Fg11%1Fa1.+Ausg.%1E034D+%1Fa129+S.%1E034M+%1FaIll.%1E047A+%1FSFE%1Famar%1E047A+%1FSERW%1Fasal%1E047I+%1Fu%24%1Fc04%1FdDNB%1Fe1%1E)
238
246
239
-
Other ways are also possible too.
247
+
Other ways are possible, too.
240
248
241
249
## Bonus: XML in MF and their paths
242
250
@@ -258,5 +266,4 @@ title.lang
258
266
259
267
If you want to create XML with attributes, you need to map to this structure, too. We will come back to working with XML in lesson 10.
260
268
261
-
262
269
Next lesson: [05 More Fix Concepts](./05-More-Fix-Concepts.md)
05-More-Fix-Concepts.md
+1 -1, lines changed: 1 addition & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -160,7 +160,7 @@ end
160
160
161
161
Metafacture supports lots of conditionals; find a list of all of them [here](https://github.com/metafacture/metafacture-documentation/blob/master/Fix-function-and-Cookbook.md#conditionals).
162
162
163
-
Hint: Some conditionals have variations with `all_`or `any_` while they behave in the same way if you process them on simple string-elements. They also can be used with arrays/lists then the conditional has different out-come depending on the fact that all (`all_`) or at least one (`any_`) value of an array matches the requierement.
163
+
Hint: Some conditionals have variants prefixed with `all_`, `any_` or `none_`. The `all_` and `any_` variants behave in the same way if you apply them to simple string elements. They can also be used with arrays/lists; then the outcome of the conditional depends on whether all values (`all_`) or at least one value (`any_`) of the array match the requirement. `none_` checks that the condition does not match.
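The difference between the variants can be modeled in a few lines of plain Python. This is only a sketch of the logic described in the hint, not Metafacture Fix: the function names are chosen to echo the prefixes, and edge cases of the real conditionals (for example missing fields or empty arrays) are not modeled.

```python
def as_values(field):
    """Treat a single string like a one-element list."""
    return field if isinstance(field, list) else [field]


def all_equal(field, expected):
    # True only if every value equals the expected string.
    return all(value == expected for value in as_values(field))


def any_equal(field, expected):
    # True if at least one value equals the expected string.
    return any(value == expected for value in as_values(field))


def none_equal(field, expected):
    # True if no value equals the expected string.
    return not any_equal(field, expected)


single = "Metadata"
subjects = ["Metadata", "Datatransformation"]

# On a simple string, all_ and any_ give the same answer.
print(all_equal(single, "Metadata"), any_equal(single, "Metadata"))

# On a list they differ: not all subjects match, but one does.
print(all_equal(subjects, "Metadata"), any_equal(subjects, "Metadata"))

# none_ is true when nothing in the list matches.
print(none_equal(subjects, "Cataloging"))
```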
Exercise: Download the following folder with three test examples and run them. Adjust them if needed:
155
156
156
-
TODO: Give homework:
157
-
- Provide a file or a file-folder.
158
-
- Give a homework.
159
-
- Give the solution.
160
-
157
+
- Run the example script locally.
158
+
- Adjust the example script so that all JSON files in the folder, but no other files, are read. Get inspired by https://github.com/metafacture/metafacture-core/blob/master/metafacture-runner/src/main/dist/examples/misc/reading-dirs/read-dirs.flux.
159
+
- Change the Flux script so that the output is written to a local file instead of stdout.
160
+
- Add a Fix file with `nothing()` as content and add the fix module to the Flux.
161
+
- Add some transformations to the Fix, e.g. add fields.
161
162
162
163
Next lesson: [07 Processing MARC](./07_Processing_MARC.md)
07_Processing_MARC.md
+6 -5, lines changed: 6 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -304,6 +304,8 @@ join_field(isbn,",")
304
304
retain("id","title","isbn")
305
305
```
306
306
307
+
HINT: Sometimes it makes sense to create an empty array with `add_array` or an empty hash/object with `add_hash` before adding content to it. This depends on the use case. In our case we need empty values for the CSV if no field is mapped.
308
+
307
309
Step 2: create the Flux workflow and execute it either with the CLI or the Playground:
308
310
309
311
```default
@@ -337,15 +339,14 @@ You will see this as output:
337
339
"1080278184","Renfro Valley Kentucky Rainer H. Schmeissner",""
338
340
```
339
341
340
-
In the fix above we mapped the 245-field to the title, and iterated over every subfield with the help of the list-bind and the `?`- wildcard.
341
-
. The ISBN is in the 020-field. Because MARC records can contain one or more 020 fields we created an isbn array with add_arrayy and added the values using the isbn.$append syntax. Next we turned the isbn array back into a comma separated string using the join_field fix. As last step we deleted all the fields we didn’t need in the output with the `retain` syntax.
342
+
In the Fix above we mapped the 245 field to the title and iterated over every subfield with the help of the `list` bind and the `?` wildcard. The ISBN is in the 020 field. Because MARC records can contain one or more 020 fields, we created an isbn array with `add_array` and added the values using the `isbn.$append` syntax. Next we turned the isbn array back into a comma-separated string using the `join_field` fix. As a last step we deleted all the fields we didn't need in the output with the `retain` function.
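The sequence of steps described above (collect, append, join, retain) can be sketched in plain Python. This is a model of the logic only, not the Fix itself: the simplified record layout, the function name `to_csv_fields`, the subfield codes and the choice of subfield `a` for the ISBN are assumptions made for this illustration.

```python
def to_csv_fields(record_id, fields):
    """Model the described steps on a simplified MARC record.

    `fields` is a list of (tag, [(subfield_code, value), ...]) tuples.
    This layout is an assumption made for the sketch.
    """
    record = {"id": record_id, "raw": fields}

    title_parts = []
    isbn = []  # start with an empty list so the column exists without any 020
    for tag, subfields in fields:
        for code, value in subfields:
            if tag == "245":
                title_parts.append(value)       # every subfield of 245
            elif tag == "020" and code == "a":  # assumption: ISBN in subfield a
                isbn.append(value)              # append one value per 020 field

    record["title"] = " ".join(title_parts)
    record["isbn"] = ",".join(isbn)  # list back to a comma-separated string

    # Keep only the fields wanted in the output.
    wanted = ("id", "title", "isbn")
    return {key: record[key] for key in wanted}


without_isbn = to_csv_fields("1080278184", [
    ("245", [("a", "Renfro Valley Kentucky"), ("c", "Rainer H. Schmeissner")]),
])
with_two_isbns = to_csv_fields("42", [
    ("245", [("a", "Example title")]),
    ("020", [("a", "111")]),
    ("020", [("a", "222")]),
])

print(without_isbn)
print(with_two_isbns)
```

Starting from an empty list is what keeps the `isbn` column present, as an empty string, for records without any 020 field.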
343
+
344
+
Different MARC serializations need different workflows, e.g. [see here an example of Aseq-MARC files that are transformed to MARCXML](https://test.metafacture.org/playground/?flux=%22https%3A//raw.githubusercontent.com/LibreCat/Catmandu-MARC/dev/t/rug01.aleph%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-aseq%0A%7C+merge-same-ids%0A%7C+encode-marcxml%0A%7C+print%0A%3B).
342
345
343
346
In this post we demonstrated how to process MARC data. In the next post we will show some examples of how Catmandu can typically be used to process library data.
344
347
345
348
## Exercise
346
349
347
-
348
-
349
-
# TODO_ Add example that transforms aleph sequential. Also open ticket, that enables the transformation.
350
+
TODO: Add some examples for MARC, e.g. the examples from our last workshop.
350
351
351
352
Next lesson: [08 Harvest data with OAI-PMH](./08_Harvest_data_with_OAI-PMH.md)
08_Harvest_data_with_OAI-PMH.md
+5 -7, lines changed: 5 additions & 7 deletions
Original file line number
Diff line number
Diff line change
@@ -31,28 +31,25 @@ To get some Dublin Core records from the collection of Ghent University Library
31
31
32
32
But if you just want to use the metadata records themselves and not the OAI-PMH-specific metadata wrappers, specify the XML handler like this: `| handle-generic-xml(recordtagname="dc")`
33
33
34
-
You can also harvest MARC data and store it in a file:
34
+
You can also harvest MARC data, serialize it to binary MARC and store it in a file:
> TODO: Revisit this example when https://github.com/metafacture/metafacture-core/issues/454 is fixed.
47
-
48
46
You can also transform incoming data and immediately store/index it with MongoDB or Elasticsearch. For the transformation you need to create a fix (see Lesson 3) in the playground or in a text editor:
49
47
50
48
Add the following fixes to the file:
51
49
52
50
```perl
53
51
copy_field("001","_id")
54
52
copy_field("245??.a","title")
55
-
add_arrayy("creator[]")
56
53
copy_field("100??.a","creator[].$append")
57
54
copy_field("260??.c","date")
58
55
retain("_id","title","creator[]","date")
@@ -66,12 +63,13 @@ Now you can run an ETL process (extract, transform, load) with this worklflow:
XML elements often come with namespaces. By default namespaces are not emitted, only the element names are provided.
102
+
When elements have the same name but belong to different namespaces, or when you want to emit the incoming namespaces, you can use
103
+
the option `emitnamespace="true"` for the `handle-generic-xml` command.
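Why the namespace can matter is easy to see in a small Python sketch. It uses Python's standard `xml.etree.ElementTree` module rather than Metafacture, and the XML snippet with its `example.org` namespaces is made up for this illustration. It shows that two elements with the same local name are only distinguishable as long as the namespace is kept.

```python
import xml.etree.ElementTree as ET

# Made-up record: two elements named "term" in different namespaces.
XML = """<record xmlns:a="http://example.org/a" xmlns:b="http://example.org/b">
  <a:term>first</a:term>
  <b:term>second</b:term>
</record>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(XML)
qualified = [child.tag for child in root]           # namespace kept
local = [local_name(child.tag) for child in root]   # namespace dropped

print(qualified)  # ['{http://example.org/a}term', '{http://example.org/b}term']
print(local)      # ['term', 'term']
```

The sketch does not reproduce how Metafacture writes namespaced element names in its output; it only illustrates the name clash that appears once namespaces are dropped.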
104
+
105
+
Add this option to the previous example and see that there are elements belonging to lido as well as skos.
106
+
107
+
See this in the Playground [here](https://metafacture.org/playground/?flux=%22http%3A//www.lido-schema.org/documents/examples/LIDO-v1.1-Example_FMobj00154983-LaPrimavera.xml%22%0A%7C+open-http%0A%7C+decode-xml%0A%7C+handle-generic-xml%28recordtagname%3D%22lido%22%2C+emitnamespace%3D%22true%22%29%0A%7C+encode-yaml%0A%7C+print%0A%3B).
108
+
109
+
If you want to add the namespace definitions to the output, Metafacture does not know them by itself; you have to tell Metafacture
110
+
the new namespaces when encoding XML with `encode-xml`, either through a file with the option `namespacefile` or in the Flux with the option `namespaces`.
111
+
112
+
See here an example for adding namespaces in the flux: