This post assumes Solr 6.6 is already running in SolrCloud mode with ZooKeeper. Solr 6.6 does not provide autoscaling functionality; instead, it provides the Collections API, which can be used to scale SolrCloud when needed.
We created a SolrCloud cluster with three micro machines forming the ZooKeeper ensemble and three nodes for Solr. We created one collection with two shards and a replication factor of 3, which gives a total of six replicas distributed evenly, two replicas on each machine.
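The replica count follows from shards times copies per shard, so six replicas across two shards means three copies of each shard. Below is a minimal sketch of that arithmetic, together with how such a collection might be created through the Collections API CREATE action. The post does not show the original create call, so the host `solrhost`, the collection name `products` (taken from the later steps) and the `maxShardsPerNode` value are assumptions.

```shell
#!/bin/sh
# Sketch only: host, collection name and maxShardsPerNode are assumptions.

# Total replicas = number of shards x replication factor.
total_replicas() {
  echo $(( $1 * $2 ))
}

# Print a Collections API CREATE URL for the given layout.
# Args: solr host, collection, numShards, replicationFactor, maxShardsPerNode
create_collection_url() {
  printf 'http://%s:8983/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s&maxShardsPerNode=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example (not executed here):
#   total_replicas 2 3
#   curl -s "$(create_collection_url solrhost products 2 3 2)"
```

With three nodes and six replicas, `maxShardsPerNode` has to be at least 2, since the default of 1 would not leave room for two replicas per machine.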
If load grows beyond what the cluster can handle, it will start dropping the excess requests, so past a certain threshold we need to add extra nodes. Previously we were using Solr 4.1, where adding a new node to the cloud was very easy: launch a new machine from the existing AWS AMI with the current ZooKeeper configuration, and the cloud would recognise the new node, bring it up to date, and start serving from it. Solr 6.6 scaling does not work this way. Whatever replication factor was provided when the collection was created stays the same, and adding a new node does not place any replica on it.
Below are the steps to scale Solr 6.6.
1. We have to keep an AMI which includes a clean Solr 6.6 installation, set up as follows.
## Install Java ##
sudo apt update
sudo apt install openjdk-8-jdk openjdk-8-jre
## Install Solr ##
wget https://archive.apache.org/dist/lucene/solr/6.6.0/solr-6.6.0.tgz
tar xzf solr-6.6.0.tgz solr-6.6.0/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh solr-6.6.0.tgz
## Check Solr status ##
sudo /etc/init.d/solr status
## Edit Solr configuration ##
sudo vim /etc/default/solr.in.sh
## Add the lines below to /etc/default/solr.in.sh ##
ZK_HOST="<zkhost1>:2181,<zkhost2>:2181,<zkhost3>:2181"
SOLR_JAVA_MEM="-Xms12g -Xmx12g" ## As required
## Restart Solr ##
sudo /etc/init.d/solr restart
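After the restart, it can be useful to confirm that the new machine has registered with ZooKeeper before moving on. The sketch below checks the `cluster.live_nodes` list in the CLUSTERSTATUS response. It assumes `jq` is installed (the post does not mention it) and that Solr listens on port 8983; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes jq is installed and Solr listens on port 8983.

# Solr's node name for a machine, as listed under cluster.live_nodes.
node_name() {
  printf '%s:8983_solr\n' "$1"
}

# Reads a CLUSTERSTATUS JSON response on stdin; succeeds if the
# given machine is among the live nodes.
is_live_node() {
  jq -e --arg n "$(node_name "$1")" \
    'any(.cluster.live_nodes[]; . == $n)' >/dev/null
}

# Example (not executed here):
#   curl -s 'http://solrhost:8983/solr/admin/collections?action=clusterstatus&wt=json' \
#     | is_live_node '<NEW_MACHINE_IP>' && echo "node joined"
```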
2. Now we have to find the current replication factor of the collection using the Collections API and increase it. To add one more node, we increase the replication factor by one (+1).
Open the URL below, replacing solrhost with a hostname from the current cluster:
http://solrhost:8983/solr/admin/collections?action=clusterstatus&wt=json
In the JSON response from the above API, check the value of the key below:
cluster.collections.products.replicationFactor
The new replication factor value will be <CurrentReplicationFactor> + 1 = <NewReplicationFactor>.
Call the URL below to set the new replication factor on the collection.
## Update the replication factor ##
http://solrhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=products&replicationFactor=<NewReplicationFactor>
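The lookup and update in this step could be scripted roughly as follows. This is a sketch, not a tested production script: it assumes `jq` is available for reading the JSON response, and the helper names are hypothetical. `solrhost` and `products` are the placeholders used in the URLs above.

```shell
#!/bin/sh
# Sketch only: assumes jq is installed; host and collection are placeholders.

# Reads CLUSTERSTATUS JSON on stdin and prints the collection's
# replicationFactor (the cluster.collections.<name>.replicationFactor key).
current_rf() {
  jq -r --arg c "$1" '.cluster.collections[$c].replicationFactor'
}

# New replication factor = current + 1.
next_rf() {
  echo $(( $1 + 1 ))
}

# Print the MODIFYCOLLECTION URL that sets the new replication factor.
# Args: solr host, collection, new replication factor
modify_rf_url() {
  printf 'http://%s:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=%s&replicationFactor=%s\n' \
    "$1" "$2" "$3"
}

# Example (not executed here):
#   rf=$(curl -s 'http://solrhost:8983/solr/admin/collections?action=clusterstatus&wt=json' \
#     | current_rf products)
#   curl -s "$(modify_rf_url solrhost products "$(next_rf "$rf")")"
```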
3. As we have 2 shards, we need to add a replica of each shard to the new node.
## Add replica for shard1 ##
http://solrhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard1&node=<NEW_MACHINE_IP>:8983_solr
## Add replica for shard2 ##
http://solrhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard2&node=<NEW_MACHINE_IP>:8983_solr
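With more shards, the ADDREPLICA calls are easier to generate in a loop than to write by hand. The sketch below prints one URL per shard, all targeting the new node. The function name is hypothetical, and the host, collection and shard names are the placeholders from the URLs above.

```shell
#!/bin/sh
# Sketch only: host, collection and shard names are placeholders.

# Print one ADDREPLICA URL per shard, all targeting the new node.
# Args: solr host, collection, new machine IP, then the shard names.
add_replica_urls() {
  host=$1; collection=$2; node_ip=$3
  shift 3
  for shard in "$@"; do
    printf 'http://%s:8983/solr/admin/collections?action=ADDREPLICA&collection=%s&shard=%s&node=%s:8983_solr\n' \
      "$host" "$collection" "$shard" "$node_ip"
  done
}

# Example (not executed here):
#   add_replica_urls solrhost products '<NEW_MACHINE_IP>' shard1 shard2 |
#     while read -r url; do curl -s "$url"; done
```

Printing the URLs first, instead of calling curl inside the function, makes it possible to review what will be sent before anything changes on the cluster.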
Following the steps above will add the new node to the Solr cloud. DevOps can put these steps in a script that is triggered when load reaches a threshold.
That's all!