Commit e998a2d

Merge pull request #38 from strongbox/SB-561
SB-561 maven indexer
2 parents defc932 + b6b90a9 commit e998a2d

3 files changed

+90
-94
lines changed

docs/developer-guide/layout-providers/maven-2-layout-provider.md

Lines changed: 1 addition & 4 deletions
@@ -42,10 +42,7 @@ This layout provider also indexes the artifacts using the Lucene-based [Maven In
4242

4343
## Maven 2 Search Providers
4444

45-
Please, be aware that the Maven 2 layout provider (unlike most of the other layout providers) supports two search providers:
46-
47-
* [OrientDB (default)](../search-providers.md#orientdbsearchprovider)
48-
* [Maven Indexer](../search-providers.md#mavenindexersearchprovider) (search provider)
45+
The Maven 2 layout provider uses the default [OrientDB](../search-providers#orientdbsearchprovider) search provider.
4946

5047
## Classes of Interest
5148

docs/developer-guide/maven-indexer.md

Lines changed: 85 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -43,65 +43,114 @@ except a record for the POM.
4343
The Maven Indexer may or may not store a `.pom` file as an artifact. However, it first tries to find a matching _real_
4444
artifact file in the file system and, if one exists, indexes that instead of the `.pom` file.
4545

46-
### What's not indexable
46+
### What's not indexable?
4747

4848
The following file types are not indexable:
4949

5050
* `maven-metadata.xml` files
5151
* `.properties` files
5252
* checksum and signature files `.asc`, `.md5`, `.sha1`
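The exclusion rules above can be sketched as a simple file-name predicate. This is only an illustration under stated assumptions: `IndexableFileFilter` and `isIndexable` are hypothetical names that do not come from the Strongbox or Maven Indexer code bases, and the check looks at the file name only, not the full path.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of Strongbox): decides whether a file name
// belongs to one of the non-indexable file types listed above.
final class IndexableFileFilter
{
    // Suffixes of properties, signature and checksum files.
    private static final List<String> EXCLUDED_SUFFIXES =
            List.of(".properties", ".asc", ".md5", ".sha1");

    private IndexableFileFilter()
    {
    }

    static boolean isIndexable(String fileName)
    {
        String name = fileName.toLowerCase(Locale.ROOT);

        // Repository metadata is never indexed.
        if (name.equals("maven-metadata.xml"))
        {
            return false;
        }

        // Anything ending in an excluded suffix is skipped as well.
        return EXCLUDED_SUFFIXES.stream().noneMatch(name::endsWith);
    }

    public static void main(String[] args)
    {
        for (String fileName : List.of("foo-1.0.jar", "foo-1.0.jar.sha1", "maven-metadata.xml"))
        {
            System.out.println(fileName + " indexable: " + isIndexable(fileName));
        }
    }
}
```

Because checksum files of metadata (for example `maven-metadata.xml.sha1`) also end in an excluded suffix, the suffix check covers them without a separate rule.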
5353

54+
### What Are Packed Indexes?
55+
56+
Packed indexes are either a complete compressed index, or a compressed subset of data which can be applied to an
57+
existing index incrementally.
58+
59+
### What Is The Goal Of Packed Indexes?
60+
61+
Packed indexes are used for transferring indexes from the remote to the proxy/tool.
62+
5463
## What Is The Maven Indexer Used For In The Strongbox Project?
5564

5665
The Maven Indexer is used for integration with IDE-s.
5766

58-
The Maven indexes produced by most public repository managers (such as Maven Central), are usually rebuilt once a week,
59-
as it can take quite a while to scan large repositories with countless small artifacts. Hence, these indexes have proven
60-
to not be quite as up-to-date, as the real server's contents. For this reason, we are using OrientDB to keep more
61-
accurate information.
67+
## How Does The Maven Indexer Work In Strongbox?
68+
69+
Strongbox allows you to download the packed Maven index of a repository. Every Maven repository with indexing enabled serves its packed Maven index.
70+
Depending on the repository type, the index is prepared as follows:
71+
72+
* [Hosted][hosted-repositories-link] repositories:
73+
74+
Strongbox stores information about the artifacts uploaded to hosted repositories in the OrientDB database. This information is used
75+
to create the hosted repository's Maven index. Strongbox serves the following [Maven Indexer Fields][maven-indexer-fields-link] in the index:
76+
77+
* artifactId
78+
* version
79+
* classifier
80+
* packaging/extension
81+
* classnames
82+
* lastModified
83+
* size
84+
* signatureExists
85+
* sha1
86+
* sourcesExists
87+
* javadocExists
88+
89+
For each hosted Maven repository defined in Strongbox, a scheduled task should be configured to rebuild the index (unless you do not
90+
want to serve a Maven index for that repository). The [Rebuild Maven Indexes Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/RebuildMavenIndexesCronJob.java)
91+
is scheduled at Strongbox startup to run at the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
92+
configuration file, for each `repository` whose `type` is `hosted`.
93+
94+
Rebuilding the hosted repository's Maven index purges the previous index and recreates it from scratch from the artifact information stored in
95+
OrientDB. As a result, the index is as up to date as the server's actual contents.
96+
97+
* [Proxy][proxy-repositories-link] repositories:
98+
99+
Strongbox fetches the proxy repository's Maven index from the remote host, stores it locally, and serves it. For each proxy Maven repository
100+
defined in Strongbox, a scheduled task should be configured to re-fetch the index (unless you do not want to serve a Maven index
101+
for that repository). The [Download Remote Maven Index Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/DownloadRemoteMavenIndexCronJob.java)
102+
is scheduled at Strongbox startup to run at the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
103+
configuration file, for each `repository` whose `type` is `proxy`.
104+
105+
Strongbox supports incremental updates of the proxy repository's Maven index: it updates the index by downloading only the Maven
106+
index parts that were not downloaded before, which saves bandwidth. Once the missing parts are downloaded,
107+
they are merged with the locally existing index, and the result is packed.
108+
109+
The Maven indexes produced by most public repository managers (such as Maven Central) are usually rebuilt once a week.
110+
111+
* [Group][group-repositories-link] repositories:
112+
113+
Strongbox creates the group repository's Maven index by merging the Maven indexes of its underlying repositories. This process is recursive, meaning that
114+
the Maven index of the root group repository contains all the information stored in the Maven index of every nested group and member repository beneath it.
115+
The [Merge Maven Group Repository Index Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/MergeMavenGroupRepositoryIndexCronJob.java)
116+
is scheduled at Strongbox startup to run at the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
117+
configuration file, for each `repository` whose `type` is `group`.
118+
119+
Rebuilding the group repository's Maven index purges the previous index and recreates it from scratch, so that its information stays accurate.
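The recursive merge described for group repositories can be pictured with a small, self-contained sketch. Everything here is an assumption made for illustration: `Repo` is a hypothetical in-memory model, index entries are plain strings, and group membership is assumed to be acyclic. The real cron job works on Maven Indexer (Lucene) indexes rather than on sets of strings.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Conceptual sketch only: models a repository tree and merges index entries
// bottom-up, the way a group index aggregates its members' indexes.
final class GroupIndexMergeSketch
{
    // Hypothetical model: a repository with its own entries and, for groups, members.
    static final class Repo
    {
        final String id;
        final Set<String> entries;
        final List<Repo> members;

        Repo(String id, Set<String> entries, List<Repo> members)
        {
            this.id = id;
            this.entries = entries;
            this.members = members;
        }
    }

    private GroupIndexMergeSketch()
    {
    }

    // Returns the entries of this repository plus those of all repositories
    // nested beneath it (assumes the membership graph has no cycles).
    static Set<String> mergedIndex(Repo repo)
    {
        Set<String> merged = new LinkedHashSet<>(repo.entries);
        for (Repo member : repo.members)
        {
            merged.addAll(mergedIndex(member));
        }
        return merged;
    }
}
```

Using a set means an artifact that is visible through several members appears only once in the merged result.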
120+
121+
## Where Are The Maven Indexes Located In Strongbox?
62122

63123
There are two types of Maven Indexer indexes:
64124

65125
* Local
66126
* For hosted repositories, this contains the artifacts that have been deployed to this repository.
67-
* For proxy repositories, this contains the artifacts which have been requested and cached from the remote repository.
127+
* For group repositories, this contains the merged index from the underlying repositories.
68128
* Remote
69129
* This is downloaded from the remote repository and contains a complete index of what is available on the remote.
70130

71-
## Where Are The Maven Indexes Located?
72-
73-
Every repository has an index under the `strongbox-vault/storages/${storageId}/${repositoryId}/.index` directory
131+
Every repository with indexing enabled has an index under the `strongbox-vault/storages/${storageId}/${repositoryId}/.index` directory
74132
where the index is located.
75133

76-
* [Hosted](../knowledge-base/repositories.md#hosted) repositories have:
77-
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
78-
* [Proxy](../knowledge-base/repositories.md#proxy) repositories have:
134+
* [Hosted][hosted-repositories-link] repositories have:
79135
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
136+
* [Proxy][proxy-repositories-link] repositories have:
80137
* Remote: `strongbox-vault/storages/${storageId}/${repositoryId}/remote/.index`
138+
* [Group][group-repositories-link] repositories have:
139+
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
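As a minimal sketch of the directory layout listed above, the index location can be assembled from the storage ID, the repository ID, and the index type. `IndexLocationSketch`, `IndexType`, and `indexDirectory` are hypothetical names introduced for this example; Strongbox resolves these paths through its own classes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that mirrors the layout
// strongbox-vault/storages/${storageId}/${repositoryId}/{local|remote}/.index
final class IndexLocationSketch
{
    enum IndexType
    {
        LOCAL("local"),
        REMOTE("remote");

        final String directoryName;

        IndexType(String directoryName)
        {
            this.directoryName = directoryName;
        }
    }

    private IndexLocationSketch()
    {
    }

    static Path indexDirectory(Path vault, String storageId, String repositoryId, IndexType type)
    {
        return vault.resolve("storages")
                    .resolve(storageId)
                    .resolve(repositoryId)
                    .resolve(type.directoryName)
                    .resolve(".index");
    }

    public static void main(String[] args)
    {
        // "storage0" and "releases" are example IDs, not values taken from this document.
        System.out.println(indexDirectory(Paths.get("strongbox-vault"), "storage0", "releases", IndexType.LOCAL));
    }
}
```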
81140

82-
## Do Maven Indexes Break And How To Repair Them?
83-
84-
Usually, you don't need to rebuild the index, because all artifact operations should be handled via the REST API.
85-
86-
However, there are cases like for example:
87-
- Some artifacts have gone missing (hdd error, or somebody removed them and you need to restore one, or a whole batch of
88-
them manually directly on the file system without not using the REST API)
89-
- You have added/removed some artifact(s) manually on the file system and would like to update the index
90-
91-
## Packed Indexes
141+
## How To Force A Rebuild Of The Repository Index?
92142

93-
In contrast to unpacked indexes (which are used for searching and browsing the remote), packed indexes are used for
94-
transferring indexes from the remote to the proxy/tool.
143+
Use the REST API endpoint:
95144

96-
### What Are Packed Indexes?
145+
* `POST` `/api/maven/index/{storageId}/{repositoryId}`
146+
* see also [MavenIndexController](https://github.com/strongbox/strongbox/blob/master/strongbox-web-core/src/main/java/org/carlspring/strongbox/controllers/layout/maven/MavenIndexController.java)
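A call to the rebuild endpoint could look roughly like the following sketch, which uses the JDK's `java.net.http` client (Java 11 or later). The base URL, the storage and repository IDs in the usage note, and the absence of authentication are all assumptions; a real Strongbox instance will most likely require credentials, which are not shown here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: builds the POST request for /api/maven/index/{storageId}/{repositoryId}.
// The class and method names are hypothetical.
final class RebuildIndexRequestSketch
{
    private RebuildIndexRequestSketch()
    {
    }

    // baseUrl is assumed to have no trailing slash, e.g. "http://localhost:48080".
    static HttpRequest rebuildRequest(String baseUrl, String storageId, String repositoryId)
    {
        URI uri = URI.create(baseUrl + "/api/maven/index/" + storageId + "/" + repositoryId);

        return HttpRequest.newBuilder(uri)
                          .POST(HttpRequest.BodyPublishers.noBody())
                          .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())`, for example for the (assumed) storage `storage0` and repository `releases`.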
97147

98-
Packed indexes are either a complete compressed index, or a compressed subset of data which can be applied to an
99-
existing index incrementally.
148+
## How To Download The Packed Repository Index?
100149

101-
### When Are Packed Indexes Generated?
150+
Use the REST API endpoint:
102151

103-
Packed indexes are generated when the index for a repository is rebuilt. They are not generated when a re-indexing
104-
request for a path in the repository is executed.
152+
* `GET` `/storages/{storageId}/{repositoryId}/.index/nexus-maven-repository-index.gz`
153+
* see also [MavenArtifactController](https://github.com/strongbox/strongbox/blob/master/strongbox-web-core/src/main/java/org/carlspring/strongbox/controllers/layout/maven/MavenArtifactController.java)
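Downloading the packed index might be sketched as follows, again with the JDK's `java.net.http` client (Java 11 or later). The class and method names are hypothetical, the base URL is an assumption, and authentication is left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

// Sketch: fetches /storages/{storageId}/{repositoryId}/.index/nexus-maven-repository-index.gz
final class PackedIndexDownloadSketch
{
    static final String PACKED_INDEX_FILE = "nexus-maven-repository-index.gz";

    private PackedIndexDownloadSketch()
    {
    }

    // baseUrl is assumed to have no trailing slash.
    static URI packedIndexUri(String baseUrl, String storageId, String repositoryId)
    {
        return URI.create(baseUrl + "/storages/" + storageId + "/" + repositoryId + "/.index/" + PACKED_INDEX_FILE);
    }

    // Streams the response body into the target file. Note that the body is written
    // to the file before the status is checked, so a failed request may leave an
    // error payload behind in the target file.
    static Path download(HttpClient client, URI uri, Path target)
            throws IOException, InterruptedException
    {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<Path> response = client.send(request, HttpResponse.BodyHandlers.ofFile(target));

        if (response.statusCode() != 200)
        {
            throw new IOException("Unexpected HTTP status " + response.statusCode() + " for " + uri);
        }

        return response.body();
    }
}
```

Keeping the URI construction separate from the download makes it easy to check the path without a running server.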
105154

106155
## Information For Developers
107156

@@ -110,7 +159,7 @@ The code for the Maven indexing is located under the [strongbox-storage-maven-la
110159
## See Also
111160
* [Maven Indexer: Github](https://github.com/apache/maven-indexer/)
112161
* [Maven Indexer: About](http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/index.html)
113-
* [Maven Indexer: Fields](http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html)
162+
* [Maven Indexer: Fields][maven-indexer-fields-link]
114163
* [Maven Indexer: Core (Notes)](https://github.com/apache/maven-indexer/tree/master/indexer-core)
115164
* [Maven Indexer: Examples](https://github.com/apache/maven-indexer/tree/master/indexer-examples)
116165
* [Maven Indexer: Incremental Downloading](http://blog.sonatype.com/2009/05/nexus-indexer-20-incremental-downloading/)
@@ -121,4 +170,9 @@ The code for the Maven indexing is located under the [strongbox-storage-maven-la
121170
* [Stackoverflow: [maven-indexer]](http://stackoverflow.com/questions/tagged/maven-indexer)
122171

123172

173+
[strongbox-yaml-link]: https://github.com/strongbox/strongbox/blob/master/strongbox-resources/strongbox-storage-api-resources/src/main/resources/etc/conf/strongbox.yaml
174+
[maven-indexer-fields-link]: http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html
175+
[hosted-repositories-link]: ../knowledge-base/repositories.md#hosted
176+
[proxy-repositories-link]: ../knowledge-base/repositories.md#proxy
177+
[group-repositories-link]: ../knowledge-base/repositories.md#group
124178
[strongbox-storage-maven-layout-provider]: https://github.com/strongbox/strongbox/tree/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider

docs/developer-guide/search-providers.md

Lines changed: 4 additions & 59 deletions
Original file line numberDiff line numberDiff line change
@@ -3,66 +3,16 @@
33
## Introduction
44

55
Search providers offer a way to execute searches against different search engines. By default, searches are executed
6-
against OrientDB, unless a search provider has been specified.
7-
8-
The idea behind search providers is that certain layout providers could require their own search engine, as is the
9-
case with Maven. Information about Maven artifacts is stored both in OrientDB and in the [Maven Indexer].
10-
As the [Maven Indexer] can actually be consumed by various clients and tools (such as other repository managers,
11-
IDE-s and so on), we provided a way to further extend the searches for both existing and future layout providers which
12-
might also need to have their own search engine implementations, apart from the built-in one (OrientDB).
13-
6+
against OrientDB.
147

158
## Implemented Search Providers
169

1710
### OrientDbSearchProvider
1811

19-
The [OrientDbSearchProvider] is the default search provider which uses OrientDB.
20-
21-
### MavenIndexerSearchProvider
22-
23-
The [MavenIndexerSearchProvider] is the search provider for Maven artifacts when the [Maven Indexer] Lucene indexes
24-
should be queried.
25-
26-
## Implementing a Search Provider
27-
28-
Custom search providers should implement [SearchProvider] and register with [SearchProviderRegistry].
12+
Currently, [OrientDbSearchProvider] is the only supported search provider; it uses OrientDB.
2913

3014
## Executing A Search Programmatically
3115

32-
### MavenIndexerSearchProvider Example
33-
34-
```java
35-
@Inject
36-
private ArtifactIndexesService artifactIndexesService;
37-
38-
// Run a search against the index and get a list of
39-
// all the artifacts matching this exact GAV
40-
SearchRequest request = new SearchRequest(storageId,
41-
repositoryId,
42-
"+g:" + groupId + " " +
43-
"+a:" + artifactId + " " +
44-
"+v:" + version,
45-
MavenIndexerSearchProvider.ALIAS);
46-
47-
try
48-
{
49-
SearchResults results = artifactSearchService.search(request);
50-
51-
for (SearchResult result : results.getResults())
52-
{
53-
String artifactPath = result.getArtifactCoordinates().toPath();
54-
55-
logger.debug("Artifact path " + artifactPath);
56-
57-
// Do something else here that is more meaningful
58-
}
59-
}
60-
catch (SearchException e)
61-
{
62-
logger.error(e.getMessage(), e);
63-
}
64-
```
65-
6616
### OrientDbSearchProvider Example
6717

6818
```java
@@ -76,8 +26,7 @@ String query = "groupId=org.carlspring.strongbox.searches;" +
7626

7727
SearchRequest request = new SearchRequest(storageId,
7828
repositoryId,
79-
query,
80-
OrientDbSearchProvider.ALIAS);
29+
query);
8130

8231
try
8332
{
@@ -99,13 +48,9 @@ catch (SearchException e)
9948
```
10049

10150
## See Also
102-
* [Maven Indexer]
10351
* [REST-API]
10452

10553

10654
[SearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/SearchProvider.java
107-
[SearchProviderRegistry]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/SearchProviderRegistry.java
10855
[OrientDbSearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/OrientDbSearchProvider.java
109-
[MavenIndexerSearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/providers/search/MavenIndexerSearchProvider.java
110-
[REST-API]: ../user-guide/rest-api.md
111-
[Maven Indexer]: ./maven-indexer.md
56+
[REST-API]: ../user-guide/rest-api.md
