Advanced setup guides¶
Here we provide some advanced setup guides:
Setting up Elasticsearch via docker¶
Setting up Elasticsearch (ES) via docker is straightforward. Simply run the following command:
docker run -d --name elasticsearch-for-rubrix -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2
This will create an ES docker container named “elasticsearch-for-rubrix” that will run in the background.
To see the logs of the container, you can run:
docker logs elasticsearch-for-rubrix
Or you can stop/start the container via:
docker stop elasticsearch-for-rubrix
docker start elasticsearch-for-rubrix
Warning
Keep in mind, if you remove your container with docker rm elasticsearch-for-rubrix
, you will loose all your datasets in Rubrix!
For more details about the ES installation with docker, see their official documentation. For MacOS and Windows, Elasticsearch also provides homebrew formulae and a msi package, respectively. We recommend ES version 7.10 to work with Rubrix.
Server configurations¶
By default, the Rubrix server will look for your ES endpoint at http://localhost:9200
.
But you can customize this by setting the ELASTICSEARCH
environment variable.
Have a look at the list of available environment variables to further configure the Rubrix server.
Since the Rubrix server is built on fastapi, you can launch it using uvicorn directly:
uvicorn rubrix:app
(for Rubrix versions below 0.9 you can launch the server via)
uvicorn rubrix.server.server:app
For more details about fastapi and uvicorn, see here
Fastapi also provides beautiful REST API docs that you can check at http://localhost:6900/api/docs.
Environment variables¶
You can set following environment variables to further configure your server and client.
Server¶
ELASTICSEARCH
: URL of the connection endpoint of the Elasticsearch instance (Default: http://localhost:9200 ).RUBRIX_ELASTICSEARCH_SSL_VERIFY
: If “False”, disables SSL certificate verification when connection to the Elasticsearch backend.RUBRIX_NAMESPACE
: A prefix used to manage Elasticsearch indices. You can use this namespace to use the same Elasticsearch instance for several independent Rubrix instances.RUBRIX_DEFAULT_ES_SEARCH_ANALYZER
: Default analyzer for textual fields excluding the metadata (Default: “standard”).RUBRIX_EXACT_ES_SEARCH_ANALYZER
: Default analyzer for*.exact
fields in textual information (Default: “whitespace”).METADATA_FIELDS_LIMIT
: Max number of fields in the metadata (Default: 50, max: 100).CORS_ORIGINS
: List of host patterns for CORS origin access.DOCS_ENABLED
: If False, disables openapi docs endpoint at /api/docs.
Client¶
RUBRIX_API_URL
: The default API URL when callingrubrix.init()
.RUBRIX_API_KEY
: The default API key when callingrubrix.init()
.RUBRIX_WORKSPACE
: The default workspace when callingrubrix.init()
.
Launching the web app via docker¶
You can use vanilla docker to run our image of the web app. First, pull the image from the Docker Hub:
docker pull recognai/rubrix
Then simply run it.
Keep in mind that you need a running Elasticsearch instance for Rubrix to work.
By default, the Rubrix server will look for your Elasticsearch endpoint at http://localhost:9200
.
But you can customize this by setting the ELASTICSEARCH
environment variable.
docker run -p 6900:6900 -e "ELASTICSEARCH=<your-elasticsearch-endpoint>" --name rubrix recognai/rubrix
To find running instances of the Rubrix server, you can list all the running containers on your machine:
docker ps
To stop the Rubrix server, just stop the container:
docker stop rubrix
If you want to deploy your own Elasticsearch cluster via docker, we refer you to the excellent guide on the Elasticsearch homepage
Launching the web app via docker-compose¶
For this method you first need to install Docker Compose.
Then, create a folder:
mkdir rubrix && cd rubrix
and launch the docker-contained web app with the following command:
wget -O docker-compose.yml https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml && docker-compose up -d
This is a convenient way because it automatically includes an Elasticsearch instance, Rubrix’s main persistent layer.
Warning
Keep in mind, if you execute docker-compose down
, you will loose all your datasets in Rubrix!
Configure Elasticsearch role/users¶
If you have an Elasticsearch instance and want to share resources with other applications, you can easily configure it for Rubrix.
All you need to take into account is:
Rubrix will create its ES indices with the following pattern
.rubrix*
. It’s recommended to create a new role (e.g., rubrix) and provide it with all privileges for this index pattern.Rubrix creates an index template for these indices, so you may provide related template privileges to this ES role.
Rubrix uses the ELASTICSEARCH
environment variable to set the ES connection.
You can provide the credentials using the following scheme:
http(s)://user:passwd@elastichost
Below you can see a screenshot for setting up a new rubrix Role and its permissions:
Change elasticsearch index analyzers¶
By default, for indexing text fields, Rubrix uses the standard analyzer for general search and the whitespace
analyzer for more exact queries (required by certain rules in the weak supervision module). If those analyzers
don’t fit your use case, you can change them using the following environment variables:
RUBRIX_DEFAULT_ES_SEARCH_ANALYZER
and RUBRIX_EXACT_ES_SEARCH_ANALYZER
.
Note that provided analyzers names should be defined as built-in ones. If you want to use a customized analyzer, you should create it inside an index_template matching Rubrix index names (`.rubrix*.records-v0), and then provide the analyzer name using the specific environment variable.
Deploy to aws instance using docker-machine¶
Setup an AWS profile¶
The aws
command cli must be installed. Then, type:
aws configure --profile rubrix
and follow command instructions. For more details, visit AWS official documentation
Once the profile is created (a new entry should be appear in file ~/.aws/config
), you can activate it via setting environment variable:
export AWS_PROFILE=rubrix
Create docker machine (aws)¶
docker-machine create --driver amazonec2 \
--amazonec2-root-size 60 \
--amazonec2-instance-type t2.large \
--amazonec2-open-port 80 \
--amazonec2-ami ami-0b541372 \
--amazonec2-region eu-west-1 \
rubrix-aws
Available ami depends on region. The provided ami is available for eu-west regions
Verify machine creation¶
$>docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
rubrix-aws - amazonec2 Running tcp://52.213.178.33:2376 v20.10.7
Save asigned machine ip¶
In our case, the assigned ip is 52.213.178.33
Connect to remote docker machine¶
To enable the connection between the local docker client and the remote daemon, we must type following command:
eval $(docker-machine env rubrix-aws)
Define a docker-compose.yaml¶
# docker-compose.yaml
version: "3"
services:
rubrix:
image: recognai/rubrix:v0.14.0
ports:
- "80:80"
environment:
ELASTICSEARCH: <elasticsearch-host_and_port>
restart: unless-stopped
Pull image¶
docker-compose pull
Launch docker container¶
docker-compose up -d
Accessing Rubrix¶
In our case http://52.213.178.33
Install from master¶
If you want the cutting-edge version of Rubrix with the latest changes and experimental features, follow the steps below in your terminal. Be aware that this version might be unstable!
First, you need to install the master version of our python client:
pip install -U git+https://github.com/recognai/rubrix.git
Then, the easiest way to get the master version of our web app up and running is via docker-compose:
Note
For now, we only provide the master version of our web app via docker. If you want to run the web app of the master branch without docker, we refer you to our Development setup.
# get the docker-compose yaml file
mkdir rubrix && cd rubrix
wget -O docker-compose.yml https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml
# use the master image of the rubrix container instead of the latest
sed -i 's/rubrix:latest/rubrix:master/' docker-compose.yml
# start all services
docker-compose up
If you want to use vanilla docker (and have your own Elasticsearch instance running), you can just use our master image:
docker run -p 6900:6900 -e "ELASTICSEARCH=<your-elasticsearch-endpoint>" --name rubrix recognai/rubrix:master