Commit 6488b28

Adjustments after phus review
1 parent 2d383b1 commit 6488b28

4 files changed: +33 -30 lines changed
02_Introduction_into_Metafacture-Flux.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
# Lesson 2: Introduction into Metafacture Flux
22

3-
To perform data processing with Metafacture transformation workflows are configured with **Metafacture Flux**, a domain-specific scripting language (DSL).
3+
To perform data processing with Metafacture transformation workflows are configured with **Metafacture Flux**, a [domain-specific scripting language (DSL)](https://en.wikipedia.org/wiki/Domain-specific_language).
44
With Metafacture Flux we combine different modules for reading, opening, transforming, and writing data sets.
55

66
In this lesson we will learn about Metafacture Flux, what Flux workflows are and how to combine different Flux modules to create a workflow in order to process datasets.
@@ -9,9 +9,9 @@ In this lesson we will learn about Metafacture Flux, what Flux workflows are and
99

1010
To process data Metafacture can be used with the command line, as JAVA library or you can use the Metafacture Playground.
1111

12-
For this introduction we will start with the Playground since it allows a quick start without additional installing. The [Metafacture Playground](https://metafacture.org/playground) is a web interface to test and share Metafacture workflows. The commandline handling will be subject in [lesson 6](./06_MetafactureCLI.md)
12+
For this introduction we will start with the Playground since it allows a quick start without additional installing. The [Metafacture Playground](https://metafacture.org/playground) is a web interface to test and share Metafacture workflows. The commandline handling will be subject in [lesson 6](./06_MetafactureCLI.md)
1313

14-
In this tutorial we are going to process *structured information*. We call data structured when it organised in such a way that is easy processable by computers. Literary text documents like *War and Peace* are structured only in words and sentences, but a computer doesn’t know which words are part of the title or which words contain names. We had to tell the computer that. Today we will download a weather report in a structured format called JSON and inspect it with Metafacture.
14+
In this tutorial we are going to process *structured information*. We call data structured when it organised in such a way that is easy processable by computers. Literary text documents like *War and Peace* are structured only in words and sentences, but a computer doesn’t know which words are part of the title or which words contain names. We had to tell the computer that. Today we will download a book record in a structured format called JSON and inspect it with Metafacture.
1515

1616
## Flux Workflows
1717

@@ -56,7 +56,7 @@ But the result is the same if you process the flux.
5656

5757
Often you want to process data stored in a file.
5858

59-
The playground has an input area called `ìnputFile-content`. In this text area you can insert data that you have usually stored in a file. The variable `inputFile` can be used at the beginning of the workflow and it refers to the input file.
59+
The playground has an input area called `ìnputFile-content`. In this text area you can insert data that you have usually stored in a file. The variable `inputFile` can be used at the beginning of the workflow and it refers to the input file represented by the `ìnputFile-content`-area.
6060

6161
e.g.
6262

@@ -65,7 +65,7 @@ e.g.
6565

6666
So lets use `inputFile` instead of `INPUT` and copy the value of the text string in the Data field above the Flux.
6767

68-
Data:
68+
Data for `inputFile-content`:
6969

7070
`Hello, friend. I'am Metafacture!`
7171

@@ -81,7 +81,7 @@ Oops... There seems to be unusual output. Its a file path. Why?
8181
Because the variable `inputFile` refers to a file (path).
8282
To read the content of the file we need to handle the incoming file path differently.
8383

84-
(You will learn how to process files on your computer in lesson 06 when we show how to run metafacture on the command line.)
84+
(You will learn how to process files on your computer in lesson 06 when we show how to run metafacture on the command line on your computer.)
8585

8686
We need to add two additional Metafacture commands: `open-file` and `as-lines`
8787

@@ -215,13 +215,13 @@ last_modified:
215215
216216
This is better readable, right?
217217
218-
But we cannot only open the data we have in our `ìnputFile-content` field, we also can open stuff on the web:
218+
But we cannot only open the data we have in our `inputFile-content` field, we also can open stuff on the web:
219219

220220
Instead of using `inputFile` lets read the book data which is provided by the URL from above:
221221

222222
Clear your playground and copy the following Flux workflow:
223223

224-
```
224+
```default
225225
"https://openlibrary.org/books/OL2838758M.json"
226226
| open-http
227227
| as-lines
@@ -259,7 +259,7 @@ Now take some time and play around a little bit more and use some other modules.
259259

260260
<summary>Click to see the new workflow</summary>
261261

262-
```
262+
```default
263263
"https://openlibrary.org/books/OL2838758M.json"
264264
| open-http
265265
| as-lines

03_Introduction_into_Metafacture-Fix.md

Lines changed: 14 additions & 11 deletions
@@ -3,7 +3,7 @@
33
In the last session we learned about Flux moduls.
44
Flux moduls can do a lot of things. They configure the "high-level" transformation pipeline.
55

6-
But the main transformation of incoming data at record, elemenet and value level is usually done by the transformation moduls Fix or Morph as one step in the pipeline.
6+
But the main transformation of incoming data at record, elemenet and value level is usually done by the transformation moduls [Fix](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#fix) or [Morph](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#morph) as one step in the pipeline.
77

88
By transformation we mean things like:
99

@@ -13,7 +13,7 @@ By transformation we mean things like:
1313

1414
But not changing serialization that is part of encoding and decoding.
1515

16-
In this tutorial we focus on Fix. If you want to learn about Morph have a look at https://slides.lobid.org/metafacture-2020/#/
16+
In this tutorial we focus on [Fix](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html#fix). If you want to learn about Morph have a look [at this presentation](https://slides.lobid.org/metafacture-2020/#/) and the [great documentation by Swiss Bib](https://sschuepbach.github.io/metamorph-hacks/).
1717

1818

1919
## Metafacture Fix and Fix Functions
@@ -22,7 +22,7 @@ So let's dive into Metafacture Fix and get back to the [Playground](https://meta
2222

2323
Clear it if needed and paste the following Flux in the Flux-File area.
2424

25-
```
25+
```default
2626
"https://openlibrary.org/books/OL2838758M.json"
2727
| open-http
2828
| as-lines
@@ -47,12 +47,12 @@ and single quotes in the fix functions. As we did here: `fix ("retain('title')")
4747

4848
Now let us additionally keep the info that is given in the element `"publish_date"` and the subfield `"key"` in `'type'` by adding `'publish_date', 'type.key'` to `retain`:
4949

50-
```
50+
```default
5151
"https://openlibrary.org/books/OL2838758M.json"
5252
| open-http
5353
| as-lines
5454
| decode-json
55-
| fix ("retain('title', 'publish_date', 'type.key')")
55+
| fix ("retain('title', 'publish_date', 'notes.value', 'type.key')")
5656
| encode-yaml
5757
| print
5858
;
@@ -64,11 +64,10 @@ You should now see something like this:
6464
---
6565
title: "Ordinary vices"
6666
publish_date: "1984"
67-
type:
68-
key: "/type/edition"
67+
notes:
68+
value: "Bibliography: p. 251-260.\nIncludes index."
6969
7070
```
71-
**TODO**: In this example it makes no difference to the result whether you address `type` or `type.key`. Maybe choose a different example such as `identifiers`, where several subfields are present?
7271

7372
When manipulating data you often need to create many fixes to process a data file in the format and structure you need. With a text editor you can write all fix functions in a singe separate Fix file.
7473

@@ -82,7 +81,7 @@ Like this.
8281
Fix:
8382

8483
```PERL
85-
retain("title", "publish_date", "type.key")
84+
retain("title", "publish_date", "notes.value", "type.key")
8685
```
8786

8887
Using a separate Fix file is recommended if you need to write many Fix functions. It will keep the Flux workflow clear and legible.
@@ -98,7 +97,7 @@ Also change the `retain` function so that you keep the new element `"pub_type"`
9897

9998
```
10099
move_field("type.key","pub_type")
101-
retain("title", "publish_date", "pub_type")
100+
retain("title", "publish_date", "notes.value", "pub_type")
102101
```
103102
104103
The output should be something like this:
@@ -108,6 +107,8 @@ The output should be something like this:
108107
title: "Ordinary vices"
109108
publish_date: "1984"
110109
pub_type: "/type/edition"
110+
notes:
111+
value: "Bibliography: p. 251-260.\nIncludes index."
111112
```
112113

113114
With `move_field` we moved and renamed an existing element.
@@ -124,6 +125,8 @@ If you execute your last workflow with the "Process" button again, you should no
124125
title: "Ordinary vices"
125126
publish_date: "1984"
126127
pub_type: "edition"
128+
notes:
129+
value: "Bibliography: p. 251-260.\nIncludes index."
127130
```
128131
129132
We cleaned up the value of `"pub_type"` element for better readability.
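
A minimal sketch of the clean-up step described above, assuming the `/type/` prefix is stripped with the Fix function `replace_all`. The function choice and the pattern are assumptions, since this hunk does not show the Fix line that produces `"edition"`:

```PERL
# Make type.key a top level element.
move_field("type.key","pub_type")
# Assumption: remove the leading "/type/" so that only "edition" remains.
replace_all("pub_type","/type/","")
retain("title", "publish_date", "notes.value", "pub_type")
```

The order matters here: `replace_all` addresses `pub_type`, so it has to come after `move_field` has created that element.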
@@ -142,7 +145,7 @@ Comments in Fix start with a hash mark `#`, while in Flux they start with `//`.
142145

143146
Example:
144147

145-
```
148+
```PERL
146149
# Make type.key a top level element.
147150
move_field("type.key","pub_type")
148151

04_Fix-Path.md

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
11
# Lesson 4: FixPath and more complex transformations in Fix
22

3-
Over the last lessons we learned how to construct a Metafacture workflow, how to use the Playground and how Metafacture Flux and Fix can be used to parse structured information. We saw how you can use Flux to transform the JSON format into the YAML format which is easier to read and contains the same information. We also learned how to retrieve information out of the JSON file using a Fix function like `retain("title", "publish_date", "type.key")`.
3+
Over the last lessons we learned how to construct a Metafacture workflow, how to use the Playground and how Metafacture Flux and Fix can be used to parse structured information. We saw how you can use Flux to transform the JSON format into the YAML format which is easier to read and contains the same information. We also learned how to retrieve information out of the JSON file using a Fix function like `retain("title", "publish_date", "notes.value", "type.key")`.
44

55
In this lesson we will go deeper into Metafacture Fix and describe how to pluck data out of structured information.
66

77
First, let's fetch of a new book with the Metafacture Playground:
88

9-
```
9+
```default
1010
"https://openlibrary.org/books/OL27333998M.json"
1111
| open-http
1212
| as-lines
@@ -115,7 +115,7 @@ If you want to refer to all creators then you can use the `*` sign as a wildcard
115115
[See here](https://metafacture.org/playground/?flux=inputFile%0A%7Copen-file%0A%7Cas-records%0A%7Cdecode-yaml%0A%7Cfix%28transformationFile%29%0A%7Cencode-json%28prettyPrinting%3D%22true%22%29%0A%7Cprint%0A%3B&transformation=append%28%22creator.1%22%2C%22+Jonas%22%29%0Aappend%28%22creator.2%22%2C%22+Shaw%22%29%0Aappend%28%22creator.3%22%2C%22+Andrews%22%29%0Aprepend%28%22creator.%2A%22%2C%22Investigator+%22%29&data=---%0Acreator%3A+Justus%0Acreator%3A+Peter%0Acreator%3A+Bob%0A)
116116
</details>
117117

118-
### Working with arrays
118+
### Working with JSON and YAML arrays
119119

120120
In JSON or YAML element repetion is possible but unusual. Instead of repeating elements an element can have a list or array of values.
121121

@@ -166,7 +166,7 @@ So, the path of the `red` would be: `my.colors[].2`
166166

167167
And the path for `Peter` would be `characters[].2.name`
168168

169-
Also if you want to generate an array in the target format, then you need to add `[]` at the end of an list element like `newArray[]`.
169+
Also if you want to generate an array in the target format JSON or YAML, then you need to add `[]` at the end of an list element like `newArray[]`.
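
As a hedged sketch of the `[]` convention, the following Fix creates an array and appends two values to it. `add_array` and the `$append` path marker both appear in the playground example linked in lesson 5; using `add_field` with `$append`, the element name `newArray[]` and the values are illustrative assumptions:

```PERL
# Create an empty array; the trailing [] marks the element as an array.
add_array("newArray[]")
# Append illustrative values to the end of the array.
add_field("newArray[].$append","red")
add_field("newArray[].$append","green")
```

Without the trailing `[]`, an encoder for JSON or YAML would have no hint that the element should be written as an array rather than as a nested element.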
170170

171171
## Excercise:
172172

05-More-Fix-Concepts.md

Lines changed: 6 additions & 6 deletions
@@ -127,19 +127,19 @@ end
127127

128128
### if...else
129129

130-
[You can also use conditionals `if` they meet the requirements and handle all others differently with `else`:](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=if+all_equal%28%22medium%22%2C%22Book%22%29%0A++++add_field%28%22type%22%2C%22BibliographicResource%22%29%0Aelse%0A++++add_field%28%22type%22%2C%22Other%22%29%0Aend&data=---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22eBook%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22Die+13%C2%BD+Leben+des+K%C3%A4pt%E2%80%99n+Blaub%C3%A4r%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22ger%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Audio+Book%22%0Aauthor%3A+%22Walter+Moers%22%0Anarrator%3A+%22Bronson+Pinchot%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22K%C3%A4pt%27n+Blaub%C3%A4r+-+Der+Film%22%0Amedium%3A+%22Movie%22%0Aauthor%3A+%22Walter+Moers%22%0Adirector%3A+%22Hayo+Freitag%22%0Alanguage%3A+%22ger%22)
130+
[You can add an `else`-block to any `if` conditional if you want to process fixes only if the contition is `falsy`:](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=if+all_equal%28%22medium%22%2C%22Book%22%29%0A++++add_field%28%22type%22%2C%22BibliographicResource%22%29%0Aelse%0A++++add_field%28%22type%22%2C%22Other%22%29%0Aend&data=---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22eBook%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22Die+13%C2%BD+Leben+des+K%C3%A4pt%E2%80%99n+Blaub%C3%A4r%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22ger%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Audio+Book%22%0Aauthor%3A+%22Walter+Moers%22%0Anarrator%3A+%22Bronson+Pinchot%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22K%C3%A4pt%27n+Blaub%C3%A4r+-+Der+Film%22%0Amedium%3A+%22Movie%22%0Aauthor%3A+%22Walter+Moers%22%0Adirector%3A+%22Hayo+Freitag%22%0Alanguage%3A+%22ger%22)
131131

132132
```PERL
133133
if all_equal("medium","Book")
134134
add_field("type","BibliographicResource")
135-
else
135+
else # if previous condition is false.
136136
add_field("type","Other")
137137
end
138138
```
139139

140-
### if...elseif...else
140+
### if...elsif(...else)
141141

142-
[You can also use conditionals `if` they meet the requirements and handle all others differently with an additional conditionals with `elsif` and the rest with `else`:](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=if+all_equal%28%22medium%22%2C%22Book%22%29%0A++++add_field%28%22type%22%2C%22BibliographicResource%22%29%0Aelsif+all_contain%28%22medium%22%2C%22Audio%22%29%0A++++add_field%28%22type%22%2C%22AudioResource%22%29%0Aelsif+all_match%28%22medium%22%2C%22.%2AMovie.%2A%22%29%0A++++add_field%28%22type%22%2C%22AudioVisualResource%22%29%0Aelse%0A++++add_field%28%22type%22%2C%22Other%22%29%0Aend&data=---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22eBook%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22Die+13%C2%BD+Leben+des+K%C3%A4pt%E2%80%99n+Blaub%C3%A4r%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22ger%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Audio+Book%22%0Aauthor%3A+%22Walter+Moers%22%0Anarrator%3A+%22Bronson+Pinchot%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22K%C3%A4pt%27n+Blaub%C3%A4r+-+Der+Film%22%0Amedium%3A+%22Movie%22%0Aauthor%3A+%22Walter+Moers%22%0Adirector%3A+%22Hayo+Freitag%22%0Alanguage%3A+%22ger%22)
142+
[You can also use additional `elsif`-blocks in as part of an `if`-conditional if you want to process data if the previous contitional is `falsy` but add a condition when the defined transformations should be processed:](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=if+all_equal%28%22medium%22%2C%22Book%22%29%0A++++add_field%28%22type%22%2C%22BibliographicResource%22%29%0Aelsif+all_contain%28%22medium%22%2C%22Audio%22%29%0A++++add_field%28%22type%22%2C%22AudioResource%22%29%0Aelsif+all_match%28%22medium%22%2C%22.%2AMovie.%2A%22%29%0A++++add_field%28%22type%22%2C%22AudioVisualResource%22%29%0Aelse%0A++++add_field%28%22type%22%2C%22Other%22%29%0Aend&data=---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22eBook%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22Die+13%C2%BD+Leben+des+K%C3%A4pt%E2%80%99n+Blaub%C3%A4r%22%0Amedium%3A+%22Book%22%0Aauthor%3A+%22Walter+Moers%22%0Alanguage%3A+%22ger%22%0A%0A---%0Aname%3A+%22The+13+1/2+lives+of+Captain+Bluebear%22%0Amedium%3A+%22Audio+Book%22%0Aauthor%3A+%22Walter+Moers%22%0Anarrator%3A+%22Bronson+Pinchot%22%0Alanguage%3A+%22eng%22%0A%0A---%0Aname%3A+%22K%C3%A4pt%27n+Blaub%C3%A4r+-+Der+Film%22%0Amedium%3A+%22Movie%22%0Aauthor%3A+%22Walter+Moers%22%0Adirector%3A+%22Hayo+Freitag%22%0Alanguage%3A+%22ger%22)
143143

144144
```PERL
145145
if all_equal("medium","Book")
@@ -244,8 +244,8 @@ end
244244

245245
[See this example here in the playground.](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=do+list%28path%3A%22colours%5B%5D%22%2C%22var%22%3A%22%24i%22%29%0A++++if+any_equal%28%22%24i%22%2C%22green%22%29%0A++++++++add_array%28%22result%5B%5D%22%29+%23+To+create+a+new+array+named+result%0A++++++++upcase%28%22%24i%22%29%0A++++++++append%28%22%24i%22%2C%22+is+a+nice+color%22%29%0A++++++++copy_field%28%22%24i%22%2C%22result%5B%5D.%24append%22%29%0A++++end%0Aend&data=---%0Acolours%3A%0A+-+red%0A+-+yellow%0A+-+green)
246246

247-
TO BE CONTINUED ...
247+
For the supported binds see: https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#selectors
248248

249-
TODO: Add more Excercises.
249+
TODO: Add excercises.
250250

251251
Next lesson: [06 Metafacture CLI](./06_MetafactureCLI.md)
