01_Getting_Metafacture.md
+5 -6 (5 additions & 6 deletions)
@@ -8,17 +8,16 @@ It was initially developed by DNB starting in 2011 and is maintained since 2019
8
8
Metafacture can be used as a stand-alone application or as a Java library in other applications.
9
9
The name Metafacture is a portmanteau of the words metadata and manufacture.
10
10
11
-
In this tutorial we are going to teach how to use Metafacture to peform simple and advanced data processing tasks.
11
+
In this tutorial we are going to teach how to use Metafacture to perform simple and advanced data processing tasks.
12
12
13
13
At the beginning we will use the web application [Metafacture Playground](https://metafacture.org/playground/). So no
14
14
installation is needed. The Playground is a web interface that helps you getting started.
15
-
It is useful to test, share and export metafacture workflows.
15
+
It is useful to test, share and export Metafacture workflows.
16
16
17
-
Starting with [Chapter 6](https://github.com/metafacture/metafacture-tutorial/blob/main/06_MetafactureCLI.md)
18
-
we can switch from using Playground to running Metafacture on our own Hardware.
19
-
But the examples are still provided in the playground.
17
+
Starting with [Chapter 6](./06_MetafactureCLI.md) we can switch from using Playground to running Metafacture on our own hardware.
18
+
But the examples are still provided in the Playground.
20
19
21
-
To run Metafacture on your local maschine you need you need a Linux/Unix Bash Shell (part of every Linux, MacOS and Windows >=10) with Metafacture Core installed. In this course we are not teaching you how to use the command line. For that see:
20
+
To run Metafacture on your local machine you need a Linux/Unix Bash Shell (part of every Linux, MacOS and Windows >=10) with Metafacture Core installed. In this course we are not teaching you how to use the command line. For that see: [Chapter 6](./06_MetafactureCLI.md)
22
21
23
22
24
23
**Next lesson**: [02 Introduction into Metafacture Flux](./02_Introduction_into_Metafacture-Flux.md)
03_Introduction_into_Metafacture-Fix.md
+38 -33 (38 additions & 33 deletions)
@@ -1,19 +1,19 @@
1
1
# Lesson 3: Introduction into Metafacture Fix
2
2
3
-
In the last session we learned about Flux-Moduls.
4
-
Flux-Moduls can do a lot of things. They configure the the "high-level" transformation pipeline.
3
+
In the last session we learned about Flux modules.
4
+
Flux modules can do a lot of things. They configure the "high-level" transformation pipeline.
5
5
6
-
But the main transformation of incoming data at record-, elemenet- and value-level is usually done by the transformation moduls: `fix` or `morph` as one step in the pipeline.
6
+
But the main transformation of incoming data at record, element and value level is usually done by the transformation modules [Fix](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#fix) or [Morph](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#morph) as one step in the pipeline.
7
7
8
-
What do we mean when we talk about transformation, e.g.:
8
+
By transformation we mean things like:
9
9
10
-
* Manipulating element-names and element-values
10
+
* Manipulating element names and element values
11
11
* Change hierachies and structures of records
12
-
* Lookup values in concordance list.
12
+
* Lookup values in concordance list
13
13
14
14
But not changing serialization that is part of encoding and decoding.
15
15
16
-
In this tutorial we focus on Fix. If you want to learn about Morph have a look at https://slides.lobid.org/metafacture-2020/#/
16
+
In this tutorial we focus on [Fix](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#fix). If you want to learn about Morph have a look [at this presentation](https://slides.lobid.org/metafacture-2020/#/) and the [great documentation by Swiss Bib](https://sschuepbach.github.io/metamorph-hacks/).
17
17
18
18
19
19
## Metafacture Fix and Fix Functions
@@ -22,7 +22,7 @@ So let's dive into Metafacture Fix and get back to the [Playground](https://meta
22
22
23
23
Clear it if needed and paste the following Flux in the Flux-File area.
24
24
25
-
```
25
+
```default
26
26
"https://openlibrary.org/books/OL2838758M.json"
27
27
| open-http
28
28
| as-lines
@@ -45,14 +45,14 @@ The `fix` module in Metafacture is used to manipulate the input data filtering f
45
45
HINT: As long as you embedd the fix functions in the Flux Workflow, you have to use double quotes to fence the fix functions,
46
46
and single quotes in the fix functions. As we did here: `fix ("retain('title')")`
47
47
48
-
Now let us additionally keep the info that is given in the element `"publish_date"` and in the subfield `"key"` as well as the subfield `"key"` in `'type'` by adding `'publish_date', 'type.key'` to `retain`:
48
+
Now let us additionally keep the info that is given in the element `"publish_date"` and the subfield `"key"` in `'type'` by adding `'publish_date', 'type.key'` to `retain`:
@@ -64,15 +64,15 @@ You should now see something like this:
64
64
---
65
65
title: "Ordinary vices"
66
66
publish_date: "1984"
67
-
type:
68
-
key: "/type/edition"
67
+
notes:
68
+
value: "Bibliography: p. 251-260.\nIncludes index."
69
69
70
70
```
71
71
72
-
When manipulating data you often need to create many fixes to process a data file in the format and structure you need. With a text editor you can write all fix functions in a singe separate fix-file.
72
+
When manipulating data you often need to create many fixes to process a data file in the format and structure you need. With a text editor you can write all fix functions in a single separate Fix file.
73
73
74
-
The playground has an transformationFile-content area that can be used as if the fix is in a separate file.
75
-
In the playground we use the variable `transformationFile` to adress the fix file in the playground.
74
+
The playground has a transformationFile-content area that can be used as if the Fix is in a separate file.
75
+
In the playground we use the variable `transformationFile` to address the Fix file.
@@ -107,40 +107,45 @@ The output should be something like this:
107
107
title: "Ordinary vices"
108
108
publish_date: "1984"
109
109
pub_type: "/type/edition"
110
+
notes:
111
+
value: "Bibliography: p. 251-260.\nIncludes index."
110
112
```
111
113
112
-
So with`move_field` we moved and renamed an existing element.
114
+
With `move_field` we moved and renamed an existing element.
113
115
As next step add the following function before the `retain` function.
114
116
115
117
```
116
118
replace_all("pub_type","/type/","")
117
119
```
118
120
119
-
If you execute your last workflow with the Process-Button again, you should now see as ouput:
121
+
If you execute your last workflow with the "Process" button again, you should now see as output:
120
122
121
123
```YAML
122
124
---
123
125
title: "Ordinary vices"
124
126
publish_date: "1984"
125
127
pub_type: "edition"
128
+
notes:
129
+
value: "Bibliography: p. 251-260.\nIncludes index."
126
130
```
127
131
128
-
We cleaned up the `"pub_type"` element, so that we can better read it.
132
+
We cleaned up the value of the `"pub_type"` element for better readability.
129
133
130
134
[See the example in the playground.](https://metafacture.org/playground/?flux=%22https%3A//openlibrary.org/books/OL2838758M.json%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-json%0A%7C+fix+%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=move_field%28%22type.key%22%2C%22pub_type%22%29%0Areplace_all%28%22pub_type%22%2C%22/type/%22%2C%22%22%29%0Aretain%28%22title%22%2C+%22publish_date%22%2C+%22pub_type%22%29)
131
135
132
-
Metafacture contains many fix function to manipulate data. Also there are many flux commands/modules that can be used.
136
+
Metafacture contains many Fix functions to manipulate data. Also there are many Flux commands/modules that can be used.
133
137
134
-
Check the documentation to get a complete list of [flux command](https://github.com/metafacture/metafacture-documentation/blob/master/flux-commands.md) and [fix functions](https://github.com/metafacture/metafacture-documentation/blob/master/Fix-function-and-Cookbook.md#functions). This post only presented a short introduction into Metafacture. In the next posts we will go deeper into its capabilities.
138
+
Check the documentation to get a complete list of [Flux commands](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html) and [Fix functions](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html). This post only presented a short introduction into Metafacture. In the next posts we will go deeper into its capabilities.
135
139
136
-
Besides fix functions you can also add as many comments and linebreaks as you want to a fix.
140
+
Besides Fix functions you can also add as many comments and linebreaks as you want to a Fix.
137
141
138
-
Comments are good if you want to add descriptions to you transformation. Like the following.
139
-
Comments in Fix start with a hashtag `#`, while in Flux they start with `//`
142
+
Adding comments will save you a lot of time and effort when you look at your code in the future.
140
143
141
-
e.g.:
144
+
Comments in Fix start with a hash mark `#`, while in Flux they start with `//`.
142
145
143
-
```
146
+
Example:
147
+
148
+
```PERL
144
149
# Make type.key a top level element.
145
150
move_field("type.key","pub_type")
146
151
@@ -162,9 +167,9 @@ Have a look at the fix functions: https://metafacture.org/metafacture-documentat
or [use timestamp](https://metafacture.org/playground/?flux=%22https%3A//openlibrary.org/books/OL2838758M.json%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-json%0A%7C+fix+%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=move_field%28%22type.key%22%2C%22pub_type%22%29%0Areplace_all%28%22pub_type%22%2C%22/type/%22%2C%22%22%29%0Atimestamp%28%22mape_date%22%2Cformat%3A%22yyyy-MM-dd%27T%27HH%3Amm%3Ass%22%2C+timezone%3A%22Europe/Berlin%22%29%0Aretain%28%22title%22%2C+%22publish_date%22%2C+%22by_statement%22%2C+%22pub_type%22%29)
172
+
or [use timestamp](https://metafacture.org/playground/?flux=%22https%3A//openlibrary.org/books/OL2838758M.json%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-json%0A%7C+fix+%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=move_field%28%22type.key%22%2C%22pub_type%22%29%0Areplace_all%28%22pub_type%22%2C%22/type/%22%2C%22%22%29%0Atimestamp%28%22map_date%22%2Cformat%3A%22yyyy-MM-dd%27T%27HH%3Amm%3Ass%22%2C+timezone%3A%22Europe/Berlin%22%29%0Aretain%28%22title%22%2C+%22publish_date%22%2C+%22by_statement%22%2C+%22pub_type%22%2C+%22map_date%22%29)
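The change in the link above is hard to read in URL-encoded form. As a reading aid, this is the `transformation` parameter of the updated link, decoded by hand. The `#` comments are added here for explanation and are not part of the link:

```PERL
# Make type.key a top level element and strip its "/type/" prefix.
move_field("type.key","pub_type")
replace_all("pub_type","/type/","")
# The field name was corrected from "mape_date" to "map_date".
timestamp("map_date",format:"yyyy-MM-dd'T'HH:mm:ss", timezone:"Europe/Berlin")
# "map_date" was added to retain.
retain("title", "publish_date", "by_statement", "pub_type", "map_date")
```

The `flux` parameter of the link is the same in the old and new versions (`open-http`, `as-lines`, `decode-json`, `fix (transformationFile)`, `encode-yaml`, `print`).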