FCREPO-3834 Enable Camel toolbox to send xml records to Solr indexing service #191

Merged
merged 5 commits into from
Feb 15, 2024
45 changes: 1 addition & 44 deletions README.md
Original file line number Diff line number Diff line change
@@ -128,12 +128,11 @@ indexes objects into an external Solr server.
| :--- | :---| :---- |
| solr.indexing.enabled | Enables/disables the SOLR indexing service. Disabled by default | false |
| solr.fcrepo.checkHasIndexingTransformation | When true, check for an indexing transform in the resource metadata. | true |
| solr.fcrepo.defaultTransform | The solr default ldpath transform when none is provided in resource metadata. | null |
| solr.fcrepo.defaultTransform | The solr default XSL transform when none is provided in resource metadata. | null |
| solr.input.stream | The JMS topic or queue serving as the message source | broker:topic:fedora |
| solr.reindex.stream | The JMS topic or queue serving as the reindex message source | broker:queue:solr.reindex |
| solr.commitWithin | Milliseconds within which commits should occur | 10000 |
| solr.indexing.predicate | When true, check that resource is of type http://fedora.info/definitions/v4/indexing#Indexable; otherwise do not index it. | false |
| solr.ldpath.service.baseUrl | The LDPath service base url | http://localhost:9085/ldpath |
| solr.filter.containers | A comma-separated list of containers that should be ignored by the indexer | http://localhost:8080/fcrepo/rest/audit |
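
As a rough illustration of how two of the properties in this table might be consumed, the sketch below reads `solr.filter.containers` and `solr.commitWithin` from a `java.util.Properties` object, using the defaults listed above. The class and method names are hypothetical, and the prefix-matching rule for ignored containers is an assumption made for the example, not something this diff confirms.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

public class SolrSettingsSketch {

    // Split the comma-separated property into trimmed, non-empty container URIs.
    static List<String> filteredContainers(Properties props) {
        String raw = props.getProperty("solr.filter.containers",
                "http://localhost:8080/fcrepo/rest/audit");
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Assumption: a resource is ignored when it is a listed container
    // or sits anywhere below one.
    static boolean isIgnored(String uri, List<String> containers) {
        return containers.stream()
                .anyMatch(c -> uri.equals(c) || uri.startsWith(c + "/"));
    }

    // Build the query string that carries solr.commitWithin (milliseconds).
    static String commitQuery(Properties props) {
        long millis = Long.parseLong(props.getProperty("solr.commitWithin", "10000"));
        return "commitWithin=" + millis;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        List<String> containers = filteredContainers(props);
        System.out.println(isIgnored("http://localhost:8080/fcrepo/rest/audit/1", containers));
        System.out.println(commitQuery(props));
    }
}
```

Matching on `container + "/"` rather than on the bare prefix keeps a sibling path such as `.../rest/auditlog` from being ignored by accident.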


@@ -157,48 +156,6 @@ indexes objects into an external triplestore.
| triplestore.prefer.include | A list of [valid prefer values](https://fedora.info/2021/05/01/spec/#additional-prefer-values) defining predicates to be included | null |
| triplestore.prefer.omit | A list of [valid prefer values](https://fedora.info/2021/05/01/spec/#additional-prefer-values) defining predicates to be omitted. | http://www.w3.org/ns/ldp#PreferContainment |

### LDPath Service

This application implements an LDPath service on repository
resources. This allows users to dereference and follow URI
links to arbitrary lengths. Retrieved triples are cached locally
for a specified period of time.

More information about LDPath can be found at the [Marmotta website](http://marmotta.apache.org/ldpath/language.html).

Note: The LDPath service requires an LDCache backend, such as `fcrepo-service-ldcache-file`.

#### Usage
The LDPath service responds to `GET` and `POST` requests using any accessible resources as a context.

For example, a request to
`http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object`
will apply the appropriate ldpath program to the specified resource. Note: it is possible to
identify non-Fedora resources in the context parameter.

A `GET` request can include a `ldpath` parameter, pointing to the URL location of an LDPath program:

`curl "http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object&ldpath=http://example.org/ldpath"`

Otherwise, it will use a simple default ldpath program.

A `POST` request can also be accepted by this endpoint. The body of a `POST` request should contain
the entire `LDPath` program. The `Content-Type` of the request should be either `text/plain` or
`application/ldpath`.

`curl -XPOST -H"Content-Type: application/ldpath" -d @program.txt "http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object"`

#### Properties
| Name | Description| Default Value |
| :--- | :---| :---- |
| ldpath.fcrepo.cache.timeout | The timeout in seconds for the ldpath cache | 0 |
| ldpath.rest.prefix | The LDPath rest endpoint prefix | no | /ldpath|
| ldpath.rest.port| The LDPath rest endpoint port | no | 9085 |
| ldpath.rest.host| The LDPath rest endpoint host | no | localhost |
| ldpath.cache.timeout | LDCache timeout in seconds | no | 86400 |
| ldpath.ldcache.directory | LDCache directory | no | ldcache/ |
| ldpath.transform.path | The LDPath transform file path | classpath:org/fcrepo/camel/ldpath/default.ldpath |

### Reindexing Service

This application implements a reindexing service so that
3 changes: 2 additions & 1 deletion docker-compose/camel-toolbox-config/configuration.properties
@@ -1,10 +1,11 @@
fcrepo.baseUrl=http://fcrepo:8080/fcrepo/rest
fcrepo.authHost=fcrepo
fcrepo.authHost=localhost

jms.brokerUrl=tcp://fcrepo:61616

solr.indexing.enabled=true
solr.baseUrl=http://solr:8983/solr/fcrepo
solr.fcrepo.defaultTransform=org/fcrepo/camel/indexing/solr/default_transform.xsl

triplestore.indexing.enabled=true
triplestore.baseUrl=http://fuseki:3030/fcrepo
6 changes: 0 additions & 6 deletions fcrepo-camel-toolbox-app/pom.xml
@@ -48,12 +48,6 @@
<version>${project.parent.version}</version>
</dependency>

<dependency>
<groupId>${project.parent.groupId}</groupId>
<artifactId>fcrepo-ldpath</artifactId>
<version>${project.parent.version}</version>
</dependency>

<dependency>
<groupId>${project.parent.groupId}</groupId>
<artifactId>fcrepo-fixity</artifactId>
Original file line number Diff line number Diff line change
@@ -52,9 +52,6 @@ static class SolrIndexingEnabled extends ConditionOnPropertyTrue {
@Value("${solr.indexing.predicate:false}")
private boolean indexingPredicate;

@Value("${solr.ldpath.service.baseUrl:http://localhost:9085/ldpath}")
private String ldpathServiceBaseUrl;

@Value("${solr.filter.containers:http://localhost:8080/fcrepo/rest/audit}")
private String filterContainers;

@@ -85,10 +82,6 @@ public boolean isIndexingPredicate() {
return indexingPredicate;
}

public String getLdpathServiceBaseUrl() {
return ldpathServiceBaseUrl;
}

public String getFilterContainers() {
return filterContainers;
}
Original file line number Diff line number Diff line change
@@ -16,7 +16,6 @@
import static org.apache.camel.Exchange.CONTENT_TYPE;
import static org.apache.camel.Exchange.HTTP_METHOD;
import static org.apache.camel.Exchange.HTTP_QUERY;
import static org.apache.camel.Exchange.HTTP_URI;
import static org.apache.camel.builder.PredicateBuilder.and;
import static org.apache.camel.builder.PredicateBuilder.in;
import static org.apache.camel.builder.PredicateBuilder.not;
@@ -104,6 +103,7 @@ public void configure() throws Exception {
.when(and(simple(config.isIndexingPredicate() + " != 'true'"),
simple(config.isCheckHasIndexingTransformation() + " != 'true'")))
.setHeader(INDEXING_TRANSFORMATION).simple(config.getDefaultTransform())
.log(LoggingLevel.INFO, "sending to update_solr")
.to("direct:update.solr")
.otherwise()
.to(
@@ -136,48 +136,34 @@ public void configure() throws Exception {
.setHeader(HTTP_QUERY).simple("commitWithin=" + config.getCommitWithin())
.to(config.getSolrBaseUrl() + "/update?useSystemProperties=true");

from("direct:external.ldpath").routeId("FcrepoSolrLdpathFetch")
.removeHeaders("CamelHttp*")
.setHeader(HTTP_URI).header(INDEXING_TRANSFORMATION)
.setHeader(HTTP_METHOD).constant("GET")
.to("http://localhost/ldpath");

from("direct:transform.ldpath").routeId("FcrepoSolrTransform")
.removeHeaders("CamelHttp*")
.setHeader(HTTP_URI).simple(config.getLdpathServiceBaseUrl())
.setHeader(HTTP_QUERY).simple("context=${headers.CamelFcrepoUri}")
.to("http://localhost/ldpath");

/*
* Handle update operations
*/
from("direct:update.solr").routeId("FcrepoSolrUpdater")
.log(LoggingLevel.INFO, logger, "Indexing Solr Object ${header.CamelFcrepoUri}")
.setBody(constant(null))
.setHeader(INDEXING_URI).simple("${header.CamelFcrepoUri}")
// Don't index the transformation itself
.filter().simple("${header.CamelIndexingTransformation} != ${header.CamelIndexingUri}")
.choice()
.when(header(INDEXING_TRANSFORMATION).startsWith("http"))
.log(LoggingLevel.INFO, logger,
"Fetching external LDPath program from ${header.CamelIndexingTransformation}")
.to("direct:external.ldpath")
.setHeader(HTTP_METHOD).constant("POST")
.to("direct:transform.ldpath")
.to("direct:send.to.solr")
.when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
.setHeader(HTTP_METHOD).constant("GET")
.to("direct:transform.ldpath")
.to("direct:send.to.solr")
.otherwise()
.log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");
.when(header(INDEXING_TRANSFORMATION).isNotNull())
.log(LoggingLevel.INFO, logger,
"Sending RDF for Transform with XSLT from ${header.CamelIndexingTransformation}")
.toD("xslt:${header.CamelIndexingTransformation}")
.to("direct:send.to.solr")
.when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
.log(LoggingLevel.INFO, logger, "No Transform supplied")
.to("direct:send.to.solr")
.otherwise()
.log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");

/*
* Send the transformed resource to Solr
*/
from("direct:send.to.solr").routeId("FcrepoSolrSend")
.log(LoggingLevel.INFO, logger, "sending to solr...")
.removeHeaders("CamelHttp*")
.setHeader(CONTENT_TYPE).constant("text/xml")
.setHeader(HTTP_METHOD).constant("POST")
.setHeader(HTTP_QUERY).simple("commitWithin=" + config.getCommitWithin())
.to(config.getSolrBaseUrl() + "/update?useSystemProperties=true");
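
The branching in the `FcrepoSolrUpdater` route can be hard to follow in a flattened diff, so here is a plain-Java sketch of the decision it appears to encode: skip a resource that is its own transformation, run the XSLT when a transformation is named, and otherwise send the body unchanged. The class, enum and method names are hypothetical and this is not the PR's code. One deliberate difference: the sketch tests for null or empty first, whereas the added route lines test `isNotNull()` first, and Camel evaluates `when` clauses in order, so an empty-string header there would match the XSLT branch before the null-or-empty branch is reached.

```java
public class UpdateRouteSketch {

    // Hypothetical labels for the three outcomes of the update route.
    enum Action { SKIP, XSLT_THEN_SEND, SEND_UNCHANGED }

    /**
     * Decide what to do with a resource.
     *
     * @param transformation value of the CamelIndexingTransformation header, may be null
     * @param resourceUri    value of the CamelFcrepoUri header
     */
    static Action decide(String transformation, String resourceUri) {
        // Mirrors the filter: do not index the transformation itself.
        if (transformation != null && transformation.equals(resourceUri)) {
            return Action.SKIP;
        }
        // Null or empty is checked first so that "" never reaches the XSLT step.
        if (transformation == null || transformation.isEmpty()) {
            return Action.SEND_UNCHANGED;
        }
        return Action.XSLT_THEN_SEND;
    }

    public static void main(String[] args) {
        String uri = "http://localhost:8080/fcrepo/rest/obj1";
        System.out.println(decide("org/fcrepo/camel/indexing/solr/default_transform.xsl", uri));
        System.out.println(decide("", uri));
        System.out.println(decide(uri, uri));
    }
}
```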
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:fedora="http://fedora.info/definitions/v4/repository#"
xmlns:ldp="http://www.w3.org/ns/ldp#">

<xsl:template match="/">
<add>
<doc>
<field name="id"><xsl:value-of select="rdf:RDF/rdf:Description/@rdf:about" /></field>
<xsl:for-each select="rdf:RDF/rdf:Description/rdf:type">
<field name="rdftype"><xsl:value-of select="@rdf:resource" /></field>
</xsl:for-each>
<field name="contains"><xsl:value-of select="rdf:RDF/rdf:Description/ldp:contains/@rdf:resource" /></field>
<field name="lastmodified"><xsl:value-of select="rdf:RDF/rdf:Description/fedora:lastModified" /></field>
<field name="created"><xsl:value-of select="rdf:RDF/rdf:Description/fedora:created" /></field>
</doc>
</add>
</xsl:template>

</xsl:stylesheet>
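
To see what this stylesheet produces without running Camel or Solr, the following sketch applies it to a small RDF/XML document using the JDK's built-in `javax.xml.transform` API. The stylesheet text is copied from the diff (with single-quoted attributes so it fits in Java string literals); the sample RDF, the resource URIs and the class name `DefaultTransformSketch` are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DefaultTransformSketch {

    // The stylesheet from the diff, rewritten with single-quoted attributes.
    static final String XSL = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'",
        "    xmlns:fedora='http://fedora.info/definitions/v4/repository#'",
        "    xmlns:ldp='http://www.w3.org/ns/ldp#'>",
        "  <xsl:template match='/'>",
        "    <add><doc>",
        "      <field name='id'><xsl:value-of select='rdf:RDF/rdf:Description/@rdf:about'/></field>",
        "      <xsl:for-each select='rdf:RDF/rdf:Description/rdf:type'>",
        "        <field name='rdftype'><xsl:value-of select='@rdf:resource'/></field>",
        "      </xsl:for-each>",
        "      <field name='contains'><xsl:value-of select='rdf:RDF/rdf:Description/ldp:contains/@rdf:resource'/></field>",
        "      <field name='lastmodified'><xsl:value-of select='rdf:RDF/rdf:Description/fedora:lastModified'/></field>",
        "      <field name='created'><xsl:value-of select='rdf:RDF/rdf:Description/fedora:created'/></field>",
        "    </doc></add>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Hypothetical sample resource; not taken from the repository's test data.
    static final String SAMPLE_RDF = String.join("\n",
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'",
        "    xmlns:fedora='http://fedora.info/definitions/v4/repository#'",
        "    xmlns:ldp='http://www.w3.org/ns/ldp#'>",
        "  <rdf:Description rdf:about='http://localhost:8080/fcrepo/rest/obj1'>",
        "    <rdf:type rdf:resource='http://fedora.info/definitions/v4/repository#Container'/>",
        "    <rdf:type rdf:resource='http://fedora.info/definitions/v4/repository#Resource'/>",
        "    <ldp:contains rdf:resource='http://localhost:8080/fcrepo/rest/obj1/child1'/>",
        "    <ldp:contains rdf:resource='http://localhost:8080/fcrepo/rest/obj1/child2'/>",
        "    <fedora:created>2024-02-15T00:00:00Z</fedora:created>",
        "    <fedora:lastModified>2024-02-15T00:00:00Z</fedora:lastModified>",
        "  </rdf:Description>",
        "</rdf:RDF>");

    // Apply an XSLT stylesheet to an XML string and return the serialized result.
    static String transform(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSL, SAMPLE_RDF));
    }
}
```

The `rdftype` field is emitted once per `rdf:type` because of the `xsl:for-each`. The `contains` field uses a bare `xsl:value-of`, which in XSLT 1.0 outputs only the first node of the selected set, so a container with several children would have just its first `ldp:contains` indexed by this stylesheet.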
Original file line number Diff line number Diff line change
@@ -77,7 +77,7 @@ public static void beforeClass() {
System.setProperty("solr.reindex.stream", "seda:bar");
System.setProperty("error.maxRedeliveries", "10");
System.setProperty("fcrepo.baseUrl", baseURL);
System.setProperty("solr.fcrepo.defaultTransform", "http://localhost/ldpath/program");
System.setProperty("solr.fcrepo.defaultTransform", "org/fcrepo/camel/indexing/solr/default_transform.xsl");
System.setProperty("solr.baseUrl", solrURL);
System.setProperty("solr.reindex.stream", "seda:reindex");
System.setProperty("solr.fcrepo.checkHasIndexingTransformation", "true");
@@ -214,7 +214,7 @@ public void testPrepareRouterIndexable() throws Exception {
deleteEndpoint.setAssertPeriod(ASSERT_PERIOD_MS);
updateEndpoint.expectedMessageCount(1);
updateEndpoint.expectedHeaderReceived("CamelIndexingTransformation",
"http://localhost/ldpath/default");
"org/fcrepo/camel/indexing/solr/default_transform.xsl");

template.sendBodyAndHeaders(
IOUtils.toString(loadResourceAsStream("indexable.rdf"), "UTF-8"),
@@ -245,7 +245,7 @@ public void testPrepareRouterContainer() throws Exception {
updateEndpoint.setAssertPeriod(ASSERT_PERIOD_MS);
deleteEndpoint.expectedMessageCount(1);
deleteEndpoint.expectedHeaderReceived("CamelIndexingTransformation",
"http://localhost/ldpath/program");
"org/fcrepo/camel/indexing/solr/default_transform.xsl");

template.sendBodyAndHeaders(
IOUtils.toString(loadResourceAsStream("container.rdf"), "UTF-8"),
@@ -271,10 +271,6 @@ public void testUpdateRouter() throws Exception {
a.mockEndpointsAndSkip("http*");
});

AdviceWith.adviceWith(context, "FcrepoSolrTransform", a -> {
a.mockEndpointsAndSkip("http*");
});

final var solrUpdateEndPoint = MockEndpoint.resolve(context, "mock:" + solrURL + "/update");
solrUpdateEndPoint.expectedMessageCount(1);
solrUpdateEndPoint.expectedHeaderReceived(Exchange.HTTP_METHOD, "POST");
2 changes: 1 addition & 1 deletion fcrepo-indexing-solr/src/test/resources/indexable.rdf
@@ -11,6 +11,6 @@
<rdf:type rdf:resource="http://fedora.info/definitions/v4/repository#Container"/>
<rdf:type rdf:resource="http://fedora.info/definitions/v4/repository#Resource"/>
<rdf:type rdf:resource="http://fedora.info/definitions/v4/indexing#Indexable"/>
<indexing:hasIndexingTransformation rdf:about="http://localhost/ldpath/default"/>
<indexing:hasIndexingTransformation rdf:about="org/fcrepo/camel/indexing/solr/default_transform.xsl"/>
</rdf:Description>
</rdf:RDF>
130 changes: 0 additions & 130 deletions fcrepo-ldpath/my.ldpath

This file was deleted.
