Solr Cloud Support
Solr Cloud is supported by io.ino.solrs.CloudSolrServers, which is a SolrServers implementation (can be passed to RoundRobinLB/FastestServerLB).
Solr Cloud is supported with the following properties / restrictions:
- Standard collection aliases are supported; (time) routed aliases are not (see also the Collections API / CreateAlias docs).
- A default collection can be configured. If none is provided, the SolrQuery of each request must specify the collection via the “collection” parameter.
- New Solr servers, or Solr servers that changed their state from inactive (e.g. down) to active, can be tested with warmup queries before they’re used for load balancing queries. To enable this, set a WarmupQueries instance.
- Querying Solr is possible when ZooKeeper is temporarily unavailable.
- Construction of CloudSolrServers is possible while ZooKeeper is not available. When ZK becomes available, CloudSolrServers will connect to it. The CloudSolrServers.zkConnectTimeout property is (re)used as the interval between connection attempts (10 seconds by default). Connection is tried “forever”, but of course this does not prevent CloudSolrServers or AsyncSolrClient from being shut down.
- Construction of CloudSolrServers is possible when no Solr instances are known to ZK. When Solr servers have registered at ZK, CloudSolrServers will notice this. The CloudSolrServers.clusterStateUpdateInterval property is (re)used as the retry interval (1 second by default).
- ZK cluster state updates are read at the interval given by CloudSolrServers.clusterStateUpdateInterval.
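The "retry forever at a fixed interval, but stop on shutdown" behaviour described for the ZooKeeper connection can be pictured with a small, self-contained sketch. This is not how solrs implements it; `RetryingConnector`, its methods, and the simulated connect attempt are hypothetical names used only for illustration, and the interval stands in for a value such as `zkConnectTimeout`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; this is not solrs code.
class RetryingConnector {
    // Daemon thread, so a forgotten shutdown() does not keep the JVM alive.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "connect-retry");
            t.setDaemon(true);
            return t;
        });
    private final AtomicBoolean connected = new AtomicBoolean(false);

    // Runs the first attempt right away; failed attempts are repeated after `interval`.
    void start(BooleanSupplier tryConnect, long interval, TimeUnit unit) {
        scheduler.execute(() -> attempt(tryConnect, interval, unit));
    }

    private void attempt(BooleanSupplier tryConnect, long interval, TimeUnit unit) {
        boolean ok;
        try {
            ok = tryConnect.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // treat a failing attempt like "not yet available"
        }
        if (ok) {
            connected.set(true);
            return; // connected: stop retrying
        }
        try {
            scheduler.schedule(() -> attempt(tryConnect, interval, unit), interval, unit);
        } catch (RejectedExecutionException e) {
            // shutdown() was called in the meantime: give up silently
        }
    }

    boolean isConnected() { return connected.get(); }

    // Cancels any pending retry, so "forever" never blocks a shutdown.
    void shutdown() { scheduler.shutdownNow(); }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger attempts = new AtomicInteger();
        RetryingConnector connector = new RetryingConnector();
        // Simulated connect that only succeeds on the third attempt.
        connector.start(() -> attempts.incrementAndGet() >= 3, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(1000);
        System.out.println("connected=" + connector.isConnected() + ", attempts=" + attempts.get());
        connector.shutdown();
    }
}
```

Rescheduling from inside the task, instead of using a fixed-rate schedule, means that no further attempts are queued once the connection succeeds or the scheduler has been shut down.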
To run solrs connected to SolrCloud / ZooKeeper, you pass an instance of CloudSolrServers to RoundRobinLB/FastestServerLB. The simplest case looks like this:
- Java

      // Discover Solr servers via ZooKeeper and balance requests round-robin.
      CloudSolrServers<?> servers = CloudSolrServers.builder("localhost:2181").build();
      JavaAsyncSolrClient solr = JavaAsyncSolrClient.builder(new RoundRobinLB(servers)).build();

- Scala

      // Discover Solr servers via ZooKeeper and balance requests round-robin.
      val servers = new CloudSolrServers("localhost:2181")
      val solr = AsyncSolrClient.Builder(RoundRobinLB(servers)).build
Here’s an example that shows all configuration properties in use:
- Java

      import static java.util.concurrent.TimeUnit.SECONDS;
      import java.util.Collections;

      CloudSolrServers<?> servers = CloudSolrServers.builder("host1:2181,host2:2181")
          .withZkClientTimeout(15, SECONDS)
          .withZkConnectTimeout(10, SECONDS)
          .withClusterStateUpdateInterval(1, SECONDS)
          .withDefaultCollection("collection1")
          // Run the given queries 10 times against a server before it receives regular traffic.
          .withWarmupQueries((collection) -> Collections.singletonList(new SolrQuery("*:*")), 10)
          .build();
      JavaAsyncSolrClient solr = JavaAsyncSolrClient.builder(new RoundRobinLB(servers)).build();

- Scala

      import scala.concurrent.duration._

      val servers = new CloudSolrServers(
        zkHost = "host1:2181,host2:2181",
        zkClientTimeout = 15.seconds,
        zkConnectTimeout = 10.seconds,
        clusterStateUpdateInterval = 1.second,
        defaultCollection = Some("collection1"),
        // Run the given queries 10 times against a server before it receives regular traffic.
        warmupQueries = WarmupQueries(collection => Seq(new SolrQuery("*:*")), count = 10))
      val solr = AsyncSolrClient.Builder(RoundRobinLB(servers)).build
Remember to either specify a default collection (as shown above) or set the collection to use per query (via new SolrQuery("scala").setParam("collection", "collection1")).
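The rule can be read as a simple lookup: a request needs a collection either from its own “collection” parameter or from the configured default. The sketch below illustrates that reading with plain Java types. `CollectionResolver` is a hypothetical name, not part of solrs, and the precedence shown (the per-request parameter wins over the default) is an assumption, not something the text above states.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; this is not solrs code.
class CollectionResolver {
    // Assumption: a per-request "collection" parameter takes precedence over the default.
    static String resolve(Map<String, String> requestParams, Optional<String> defaultCollection) {
        String perRequest = requestParams.get("collection");
        if (perRequest != null && !perRequest.isEmpty()) {
            return perRequest;
        }
        return defaultCollection.orElseThrow(() -> new IllegalArgumentException(
            "No collection: set a default collection or the \"collection\" request parameter"));
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("collection", "collection2");
        System.out.println(resolve(params, Optional.of("collection1")));                 // prints "collection2"
        System.out.println(resolve(Collections.emptyMap(), Optional.of("collection1"))); // prints "collection1"
    }
}
```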
When running SolrCloud you should also configure a retry policy (specifically RetryPolicy.TryAvailableServers). ZooKeeper, and therefore also CloudSolrServers, does not register restarts of Solr nodes immediately, so for a short period of time queries may fail because a Solr node has just become unavailable.
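To see why retrying on another server helps during that window, consider the following self-contained sketch of the general idea: when a request fails on one server, it is sent to the next server that has not been tried yet. This is not the solrs implementation of RetryPolicy.TryAvailableServers; `TryOtherServers`, `queryWithRetry`, and the server URLs are hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of "retry on another server"; this is not solrs code.
class TryOtherServers {
    // Sends the request to each server in turn and returns the first successful response.
    static <R> R queryWithRetry(List<String> servers, Function<String, R> sendTo) {
        List<RuntimeException> failures = new ArrayList<>();
        for (String server : servers) {
            try {
                return sendTo.apply(server);
            } catch (RuntimeException e) {
                // e.g. the node was just restarted but is still listed as active
                failures.add(e);
            }
        }
        IllegalStateException allFailed =
            new IllegalStateException("All " + servers.size() + " servers failed");
        failures.forEach(allFailed::addSuppressed);
        throw allFailed;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("http://host1:8983/solr", "http://host2:8983/solr");
        // Simulated request: host1 has just gone down, host2 answers.
        String response = queryWithRetry(servers, server -> {
            if (server.equals("http://host1:8983/solr")) {
                throw new IllegalStateException("connection refused");
            }
            return "response from " + server;
        });
        System.out.println(response); // prints "response from http://host2:8983/solr"
    }
}
```

The failed attempts are kept as suppressed exceptions, so that when every server fails the caller can still see what went wrong on each of them.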