README.md: 4 additions & 128 deletions
@@ -1,132 +1,8 @@
-# What's Embulk?
+# embulk-standards: Embulk's standard plugins
 
-Embulk is a parallel bulk data loader that **helps data transfer between various storages, databases, NoSQL and cloud services**.
+These are Embulk's "standard" plugins which are embedded in Embulk's executable binary distributions.
 
-**Embulk supports plugins** to add functions. You can [share the plugins](https://plugins.embulk.org/) to keep your custom scripts readable, maintainable, and reusable.
-[Embulk, an open-source plugin-based parallel bulk data loader](http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed) at Slideshare
-
-# Document
-
-Embulk documents: https://www.embulk.org/
-
-### Using plugins
-
-You can use plugins to load data from/to various systems and file formats. Here is the list of publicly released plugins: [list of plugins by category](https://plugins.embulk.org/).
-
-An example is [embulk-output-command](https://github.com/embulk/embulk-output-command) plugin. It executes an external command to output the records.
-
-To install plugins, you can use `embulk gem install <name>` command:
-
-```
-embulk gem install embulk-output-command
-embulk gem list
-```
-
-Embulk bundles some built-in plugins such as `embulk-encoder-gzip` or `embulk-formatter-csv`. You can use those plugins with following configuration file:
-
-```yaml
-in:
-  type: file
-  path_prefix: "./try1/csv/sample_"
-  ...
-out:
-  type: command
-  command: "cat - > task.$INDEX.$SEQID.csv.gz"
-  encoders:
-    - {type: gzip}
-  formatter:
-    type: csv
-```
-
-### Resuming a failed transaction
-
-Embulk supports resuming failed transactions.
-To enable resuming, you need to start transaction with `-r PATH` option:
-
-```
-embulk run config.yml -r resume-state.yml
-```
-
-If the transaction fails, embulk stores state some states to the yaml file. You can retry the transaction using exactly same command:
-
-```
-embulk run config.yml -r resume-state.yml
-```
-
-If you give up on resuming the transaction, you can use `embulk cleanup` subcommand to delete intermediate data:
-
-```
-embulk cleanup config.yml -r resume-state.yml
-```
-
-### Using plugin bundle
-
-`embulk mkbundle` subcommand creates a isolated bundle of plugins. You can install plugins (gems) to the bundle directory instead of ~/.embulk directory. This makes it easy to manage versions of plugins.
-To use the bundle, add `-b <bundle_dir>` option to `guess`, `preview`, or `run` subcommand. `embulk mkbundle` also generates some example plugins to \<bundle_dir>/embulk/\*.rb directory.
-
-See the generated \<bundle_dir>/Gemfile file how to plugin bundles work.
-
-```
-embulk mkbundle ./embulk_bundle # please edit ./embulk_bundle/Gemfile to add plugins. Detailed usage is written in the Gemfile
-embulk guess -b ./embulk_bundle ...
-embulk run -b ./embulk_bundle ...
-```
-
-## Use cases
-
-* [Scheduled bulk data loading to Elasticsearch + Kibana 5 from CSV files](https://www.embulk.org/recipes/scheduled-csv-load-to-elasticsearch-kibana5.html)
-
-For further details, visit [Embulk documentation](https://www.embulk.org/).
-
-## Upgrading to the latest version
-
-Following command updates embulk itself to the specific released version.
-
-```sh
-embulk selfupdate x.y.z
-```
-
-## Embulk Development
-
-### Build
-
-```
-./gradlew cli # creates pkg/embulk-VERSION.jar
-```
-
-You can see JaCoCo's test coverage report at `${project}/build/reports/tests/index.html`
-You can see Findbug's report at `${project}/build/reports/findbug/main.html` # FIXME coverage information is not included somehow
-
-You can use `classpath` task to use `bundle exec ./bin/embulk` for development:
-
-```
-./gradlew -t classpath # -x test: skip test
-./bin/embulk
-```
-
-To deploy artifacts to your local maven repository at ~/.m2/repository/:
-
-```
-./gradlew install
-```
-
-To compile the source code of embulk-core project only:
-
-```
-./gradlew :embulk-core:compileJava
-```
-
-Task `dependencies` shows dependency tree of embulk-core project:
-
-```
-./gradlew :embulk-core:dependencies
-```
-
-### Update JRuby
-
-Modify `jrubyVersion` in `build.gradle` to update JRuby of Embulk.
+Their source code had been managed in the same [main repository of Embulk](https://github.com/embulk/embulk) until [`v0.10.33`](https://github.com/embulk/embulk/tree/v0.10.33). They have been split from the main repository since `v0.10.34`.
 
 ### Release
 
@@ -151,7 +27,7 @@ signing.secretKeyRingFile=(the absolute path to the secret key ring file contain
 
 #### Release
 
-Modify `version` in `build.gradle` at a detached commit to bump Embulk version up.
+Modify `version` in `build.gradle` at a detached commit to bump up the versions of Embulk standard plugins.