FCREPO-3834 Enable Camel toolbox to send xml records to Solr indexing service (#191)

* replacing ldpath service with XSLT processing solr indexer

* no longer need HTTP_URI

* added properties for new solr xsl transforms to docker-compose properties file

* returning config to original state rather than my dev stack

* adding fields to solr xsl

---------

Co-authored-by: Dan Field <[email protected]>
Surfrdan authored Feb 15, 2024
1 parent b250e28 commit e71b893
Showing 25 changed files with 46 additions and 1,680 deletions.
45 changes: 1 addition & 44 deletions README.md
@@ -128,12 +128,11 @@ indexes objects into an external Solr server.
| :--- | :---| :---- |
| solr.indexing.enabled | Enables/disables the SOLR indexing service. Disabled by default | false |
| solr.fcrepo.checkHasIndexingTransformation | When true, check for an indexing transform in the resource matadata. | true |
| solr.fcrepo.defaultTransform | The solr default ldpath transform when none is provide in resource metadata. | null |
| solr.fcrepo.defaultTransform | The solr default XSL transform when none is provide in resource metadata. | null |
| solr.input.stream | The JMS topic or queue serving as the message source | broker:topic:fedora |
| solr.reindex.stream | The JMS topic or queue serving as the reindex message source | broker:queue:solr.reindex |
| solr.commitWithin | Milliseconds within which commits should occur | 10000 |
| solr.indexing.predicate | When true, check that resource is of type http://fedora.info/definitions/v4/indexing#Indexable; otherwise do not index it. | false |
| solr.ldpath.service.baseUrl | The LDPath service base url | http://localhost:9085/ldpath |
| solr.filter.containers | A comma-separate list of containers that should be ignored by the indexer | http://localhost:8080/fcrepo/rest/audit |


@@ -157,48 +156,6 @@ indexes objects into an external triplestore.
| triplestore.prefer.include | A list of [valid prefer values](https://fedora.info/2021/05/01/spec/#additional-prefer-values) defining predicates to be included | null |
| triplestore.prefer.omit | A list of [valid prefer values](https://fedora.info/2021/05/01/spec/#additional-prefer-values) defining predicates to be omitted. | http://www.w3.org/ns/ldp#PreferContainment |

### LDPath Service

This application implements an LDPath service on repository
resources. This allows users to dereference and follow URI
links to arbitrary lengths. Retrieved triples are cached locally
for a specified period of time.

More information about LDPath can be found at the [Marmotta website](http://marmotta.apache.org/ldpath/language.html).

Note: The LDPath service requires an LDCache backend, such as `fcrepo-service-ldcache-file`.

#### Usage
The LDPath service responds to `GET` and `POST` requests using any accessible resources as a context.

For example, a request to
`http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object`
will apply the appropriate ldpath program to the specified resource. Note: it is possible to
identify non-Fedora resources in the context parameter.

A `GET` request can include a `ldpath` parameter, pointing to the URL location of an LDPath program:

`curl http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object&ldpath=http://example.org/ldpath`

Otherwise, it will use a simple default ldpath program.

A `POST` request can also be accepted by this endpoint. The body of a `POST` request should contain
the entire `LDPath` program. The `Content-Type` of the request should be either `text/plain` or
`application/ldpath`.

`curl -XPOST -H"Content-Type: application/ldpath" -d @program.txt http://localhost:9086/ldpath/?context=http://localhost/rest/path/to/fedora/object

#### Properties
| Name | Description| Default Value |
| :--- | :---| :---- |
| ldpath.fcrepo.cache.timeout | The timeout in seconds for the ldpath cache | 0 |
| ldpath.rest.prefix | The LDPath rest endpoint prefix | no | /ldpath|
| ldpath.rest.port| The LDPath rest endpoint port | no | 9085 |
| ldpath.rest.host| The LDPath rest endpoint host | no | localhost |
| ldpath.cache.timeout | LDCache timeout in seconds | no | 86400 |
| ldpath.ldcache.directory | LDCache directory | no | ldcache/ |
| ldpath.transform.path | The LDPath transform file path | classpath:org/fcrepo/camel/ldpath/default.ldpath |

### Reindexing Service

This application implements a reindexing service so that
3 changes: 2 additions & 1 deletion docker-compose/camel-toolbox-config/configuration.properties
@@ -1,10 +1,11 @@
fcrepo.baseUrl=http://fcrepo:8080/fcrepo/rest
fcrepo.authHost=fcrepo
fcrepo.authHost=localhost

jms.brokerUrl=tcp://fcrepo:61616

solr.indexing.enabled=true
solr.baseUrl=http://solr:8983/solr/fcrepo
solr.fcrepo.defaultTransform=org/fcrepo/camel/indexing/solr/default_transform.xsl

triplestore.indexing.enabled=true
triplestore.baseUrl=http://fuseki:3030/fcrepo
6 changes: 0 additions & 6 deletions fcrepo-camel-toolbox-app/pom.xml
@@ -48,12 +48,6 @@
<version>${project.parent.version}</version>
</dependency>

<dependency>
<groupId>${project.parent.groupId}</groupId>
<artifactId>fcrepo-ldpath</artifactId>
<version>${project.parent.version}</version>
</dependency>

<dependency>
<groupId>${project.parent.groupId}</groupId>
<artifactId>fcrepo-fixity</artifactId>
@@ -52,9 +52,6 @@ static class SolrIndexingEnabled extends ConditionOnPropertyTrue {
@Value("${solr.indexing.predicate:false}")
private boolean indexingPredicate;

@Value("${solr.ldpath.service.baseUrl:http://localhost:9085/ldpath}")
private String ldpathServiceBaseUrl;

@Value("${solr.filter.containers:http://localhost:8080/fcrepo/rest/audit}")
private String filterContainers;

@@ -85,10 +82,6 @@ public boolean isIndexingPredicate() {
return indexingPredicate;
}

public String getLdpathServiceBaseUrl() {
return ldpathServiceBaseUrl;
}

public String getFilterContainers() {
return filterContainers;
}
@@ -16,7 +16,6 @@
import static org.apache.camel.Exchange.CONTENT_TYPE;
import static org.apache.camel.Exchange.HTTP_METHOD;
import static org.apache.camel.Exchange.HTTP_QUERY;
import static org.apache.camel.Exchange.HTTP_URI;
import static org.apache.camel.builder.PredicateBuilder.and;
import static org.apache.camel.builder.PredicateBuilder.in;
import static org.apache.camel.builder.PredicateBuilder.not;
@@ -104,6 +103,7 @@ public void configure() throws Exception {
.when(and(simple(config.isIndexingPredicate() + " != 'true'"),
simple(config.isCheckHasIndexingTransformation() + " != 'true'")))
.setHeader(INDEXING_TRANSFORMATION).simple(config.getDefaultTransform())
.log(LoggingLevel.INFO, "sending to update_solr")
.to("direct:update.solr")
.otherwise()
.to(
@@ -136,48 +136,34 @@ public void configure() throws Exception {
.setHeader(HTTP_QUERY).simple("commitWithin=" + config.getCommitWithin())
.to(config.getSolrBaseUrl() + "/update?useSystemProperties=true");

from("direct:external.ldpath").routeId("FcrepoSolrLdpathFetch")
.removeHeaders("CamelHttp*")
.setHeader(HTTP_URI).header(INDEXING_TRANSFORMATION)
.setHeader(HTTP_METHOD).constant("GET")
.to("http://localhost/ldpath");

from("direct:transform.ldpath").routeId("FcrepoSolrTransform")
.removeHeaders("CamelHttp*")
.setHeader(HTTP_URI).simple(config.getLdpathServiceBaseUrl())
.setHeader(HTTP_QUERY).simple("context=${headers.CamelFcrepoUri}")
.to("http://localhost/ldpath");

/*
* Handle update operations
*/
from("direct:update.solr").routeId("FcrepoSolrUpdater")
.log(LoggingLevel.INFO, logger, "Indexing Solr Object ${header.CamelFcrepoUri}")
.setBody(constant(null))
.setHeader(INDEXING_URI).simple("${header.CamelFcrepoUri}")
// Don't index the transformation itself
.filter().simple("${header.CamelIndexingTransformation} != ${header.CamelIndexingUri}")
.choice()
.when(header(INDEXING_TRANSFORMATION).startsWith("http"))
.log(LoggingLevel.INFO, logger,
"Fetching external LDPath program from ${header.CamelIndexingTransformation}")
.to("direct:external.ldpath")
.setHeader(HTTP_METHOD).constant("POST")
.to("direct:transform.ldpath")
.to("direct:send.to.solr")
.when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
.setHeader(HTTP_METHOD).constant("GET")
.to("direct:transform.ldpath")
.to("direct:send.to.solr")
.otherwise()
.log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");
.when(header(INDEXING_TRANSFORMATION).isNotNull())
.log(LoggingLevel.INFO, logger,
"Sending RDF for Transform with with XSLT from ${header.CamelIndexingTransformation}")
.toD("xslt:${header.CamelIndexingTransformation}")
.to("direct:send.to.solr")
.when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
.log(LoggingLevel.INFO, logger,"No Transform supplied")
.to("direct:send.to.solr")
.otherwise()
.log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");
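One reading of the new `choice()` block, offered as a sketch rather than a statement about the merged behaviour: Camel evaluates `when` clauses in order and takes the first one that matches, so a header that is present but empty satisfies `isNotNull()` and would reach the XSLT branch before the null-or-empty clause is consulted. The plain-Java model below mirrors that ordering. The class `TransformBranchSketch`, its enum, and both method names are hypothetical and are not part of the toolbox.

```java
/** Plain-Java model of first-match branch selection; no Camel dependency. */
class TransformBranchSketch {

    /** The two outcomes that matter for this illustration. */
    enum Branch { XSLT, NO_TRANSFORM }

    /** Mirrors the clause order shown in the diff: the not-null test comes first. */
    static Branch asWritten(String transformHeader) {
        if (transformHeader != null) {
            return Branch.XSLT;         // also taken when the header is ""
        }
        return Branch.NO_TRANSFORM;     // only reached when the header is null
    }

    /** Hypothetical alternative ordering: test null-or-empty before anything else. */
    static Branch emptyFirst(String transformHeader) {
        if (transformHeader == null || transformHeader.isEmpty()) {
            return Branch.NO_TRANSFORM;
        }
        return Branch.XSLT;
    }

    public static void main(String[] args) {
        System.out.println("as written, empty header:  " + asWritten(""));
        System.out.println("empty first, empty header: " + emptyFirst(""));
    }
}
```

If an empty `CamelIndexingTransformation` header can occur in practice, reordering the clauses (or guarding on a non-empty value) may be worth considering; whether it can occur is not something this diff shows.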

/*
* Send the transformed resource to Solr
*/
from("direct:send.to.solr").routeId("FcrepoSolrSend")
.log(LoggingLevel.INFO, logger, "sending to solr...")
.removeHeaders("CamelHttp*")
.setHeader(CONTENT_TYPE).constant("text/xml")
.setHeader(HTTP_METHOD).constant("POST")
.setHeader(HTTP_QUERY).simple("commitWithin=" + config.getCommitWithin())
.to(config.getSolrBaseUrl() + "/update?useSystemProperties=true");
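For readers less familiar with Camel's HTTP headers, the `FcrepoSolrSend` route amounts to an XML `POST` against Solr's `/update` handler with a `commitWithin` query parameter. The following is a minimal sketch of an equivalent request built with the JDK's `java.net.http` API. It only constructs the request and does not send it. The class and method names are hypothetical, the base URL and the 10000 ms value are taken from the configuration and README defaults shown in this commit, and `useSystemProperties=true` is left out on the assumption that it is a Camel endpoint option rather than a Solr parameter.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Builds (but does not send) a Solr XML update request; names are illustrative. */
class SolrUpdateRequestSketch {

    static HttpRequest build(String solrBaseUrl, long commitWithinMs, String addXml) {
        // commitWithin asks Solr to commit the update within the given number of milliseconds.
        URI uri = URI.create(solrBaseUrl + "/update?commitWithin=" + commitWithinMs);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(addXml, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                "http://solr:8983/solr/fcrepo",
                10000,
                "<add><doc><field name=\"id\">example</field></doc></add>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```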
@@ -0,0 +1,22 @@
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:fedora="http://fedora.info/definitions/v4/repository#"
xmlns:ldp="http://www.w3.org/ns/ldp#">

<xsl:template match="/">
<add>
<doc>
<field name="id"><xsl:value-of select="rdf:RDF/rdf:Description/@rdf:about" /></field>
<xsl:for-each select="rdf:RDF/rdf:Description/rdf:type">
<field name="rdftype"><xsl:value-of select="@rdf:resource" /></field>
</xsl:for-each>
<field name="contains"><xsl:value-of select="rdf:RDF/rdf:Description/ldp:contains/@rdf:resource" /></field>
<field name="lastmodified"><xsl:value-of select="rdf:RDF/rdf:Description/fedora:lastModified" /></field>
<field name="created"><xsl:value-of select="rdf:RDF/rdf:Description/fedora:created" /></field>
</doc>
</add>
</xsl:template>

</xsl:stylesheet>
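The stylesheet above can be tried outside Camel with the JDK's built-in XSLT 1.0 processor. The sketch below is an illustration under stated assumptions, not the toolbox's own code: the class name `SolrXsltSketch`, the trimmed inline copy of the stylesheet (only the `id` and `rdftype` fields), and the sample RDF/XML resource are all made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies a trimmed copy of the default transform to a sample RDF/XML document. */
class SolrXsltSketch {

    // Trimmed, illustrative copy of the stylesheet: id and rdftype fields only.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
        + "<xsl:template match=\"/\">"
        + "<add><doc>"
        + "<field name=\"id\"><xsl:value-of select=\"rdf:RDF/rdf:Description/@rdf:about\"/></field>"
        + "<xsl:for-each select=\"rdf:RDF/rdf:Description/rdf:type\">"
        + "<field name=\"rdftype\"><xsl:value-of select=\"@rdf:resource\"/></field>"
        + "</xsl:for-each>"
        + "</doc></add>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical input resembling what Fedora returns for a container.
    static final String SAMPLE_RDF =
          "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
        + "<rdf:Description rdf:about=\"http://localhost:8080/fcrepo/rest/obj1\">"
        + "<rdf:type rdf:resource=\"http://fedora.info/definitions/v4/repository#Container\"/>"
        + "</rdf:Description>"
        + "</rdf:RDF>";

    /** Runs the stylesheet over the given RDF/XML and returns the Solr add document. */
    static String transform(String rdfXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(rdfXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE_RDF));
    }
}
```

Two details may be worth checking against real repository output. The XPath expressions assume the `rdf:Description` serialization of RDF/XML, so a typed-node abbreviation would not match. In XSLT 1.0, `xsl:value-of` emits only the first node of a node-set, so the `contains` field in the full stylesheet would carry at most one child URI unless it is wrapped in an `xsl:for-each` like `rdftype`.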
@@ -77,7 +77,7 @@ public static void beforeClass() {
System.setProperty("solr.reindex.stream", "seda:bar");
System.setProperty("error.maxRedeliveries", "10");
System.setProperty("fcrepo.baseUrl", baseURL);
System.setProperty("solr.fcrepo.defaultTransform", "http://localhost/ldpath/program");
System.setProperty("solr.fcrepo.defaultTransform", "org/fcrepo/camel/indexing/solr/default_transform.xsl");
System.setProperty("solr.baseUrl", solrURL);
System.setProperty("solr.reindex.stream", "seda:reindex");
System.setProperty("solr.fcrepo.checkHasIndexingTransformation", "true");
@@ -214,7 +214,7 @@ public void testPrepareRouterIndexable() throws Exception {
deleteEndpoint.setAssertPeriod(ASSERT_PERIOD_MS);
updateEndpoint.expectedMessageCount(1);
updateEndpoint.expectedHeaderReceived("CamelIndexingTransformation",
"http://localhost/ldpath/default");
"org/fcrepo/camel/indexing/solr/default_transform.xsl");

template.sendBodyAndHeaders(
IOUtils.toString(loadResourceAsStream("indexable.rdf"), "UTF-8"),
@@ -245,7 +245,7 @@ public void testPrepareRouterContainer() throws Exception {
updateEndpoint.setAssertPeriod(ASSERT_PERIOD_MS);
deleteEndpoint.expectedMessageCount(1);
deleteEndpoint.expectedHeaderReceived("CamelIndexingTransformation",
"http://localhost/ldpath/program");
"org/fcrepo/camel/indexing/solr/default_transform.xsl");

template.sendBodyAndHeaders(
IOUtils.toString(loadResourceAsStream("container.rdf"), "UTF-8"),
@@ -271,10 +271,6 @@ public void testUpdateRouter() throws Exception {
a.mockEndpointsAndSkip("http*");
});

AdviceWith.adviceWith(context, "FcrepoSolrTransform", a -> {
a.mockEndpointsAndSkip("http*");
});

final var solrUpdateEndPoint = MockEndpoint.resolve(context, "mock:" + solrURL + "/update");
solrUpdateEndPoint.expectedMessageCount(1);
solrUpdateEndPoint.expectedHeaderReceived(Exchange.HTTP_METHOD, "POST");
2 changes: 1 addition & 1 deletion fcrepo-indexing-solr/src/test/resources/indexable.rdf
@@ -11,6 +11,6 @@
<rdf:type rdf:resource="http://fedora.info/definitions/v4/repository#Container"/>
<rdf:type rdf:resource="http://fedora.info/definitions/v4/repository#Resource"/>
<rdf:type rdf:resource="http://fedora.info/definitions/v4/indexing#Indexable"/>
<indexing:hasIndexingTransformation rdf:about="http://localhost/ldpath/default"/>
<indexing:hasIndexingTransformation rdf:about="org/fcrepo/camel/indexing/solr/default_transform.xsl"/>
</rdf:Description>
</rdf:RDF>
130 changes: 0 additions & 130 deletions fcrepo-ldpath/my.ldpath

This file was deleted.
