
The Metadata Storage is an external dependency of Druid. Druid uses it to store various metadata about the system, but not to store the actual data. There are a number of tables used for various purposes. Derby is the default metadata store for Druid, however, it is not suitable for production. MySQL and PostgreSQL are more production suitable metadata stores. In this post, I will show you how to configure druid to use MySQL as the metadata storage.
Setting up MySQL
- Install MySQL
- Connect to MySQL from the machine where it is installed.
mysql -u root -p Enter password:
- Create a druid database and user.
CREATE DATABASE druid DEFAULT CHARACTER SET utf8; CREATE USER 'druid' IDENTIFIED BY 'diurd'; GRANT ALL ON druid.* TO 'druid';
Configure mysql-metadata-storage
extension:
The mysql-metadata-storage
extension is not packaged within the main Druid tarball; it is packaged in a separate tarball that can be downloaded from here. You can also get it using pull-deps, or you can build it from source code. But if you are using the imply distribution, mysql-metadata-storage
extension is already packaged within the dist/druid/extensions
directory.
Now in conf/druid/_common/common.runtime.properties
, add mysql-metadata-storage
to druid.extensions.loadList
. If for example the list already contains “druid-parser-route” and “druid-s3-extensions” , the final property should look like:
druid.extensions.loadList=["mysql-metadata-storage", "druid-s3-extensions", "druid-parser-route"]
Comment out the configurations for Derby under “Metadata storage” section and add appropriate values for MySQL. After this, “Metadata storage” section should look like:
# # Metadata storage # # For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over): # druid.metadata.storage.type=derby # druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true # druid.metadata.storage.connector.host=localhost # druid.metadata.storage.connector.port=1527 # For MySQL: druid.metadata.storage.type=mysql druid.metadata.storage.connector.connectURI=jdbc:mysql://localhost:3306/druid?allowPublicKeyRetrieval=TRUE druid.metadata.storage.connector.user=druid druid.metadata.storage.connector.password=diurd # For PostgreSQL: # druid.metadata.storage.type=postgresql # druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid # druid.metadata.storage.connector.user=... # druid.metadata.storage.connector.password=...
Note: If you are running latest version of MySQL, then make sure to replace the mysql-connector-java
jar file with latest one inside extensions/mysql-metadata-storage
directory.
You’re done. Now restart the servers to take effect. To test if it is working, load the sample data in druid and see druid metadata created in MySQL.
mysql> use druid Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> show tables; +-----------------------+ | Tables_in_druid | +-----------------------+ | druid_audit | | druid_config | | druid_dataSource | | druid_pendingSegments | | druid_rules | | druid_segments | | druid_supervisors | | druid_tasklocks | | druid_tasklogs | | druid_tasks | +-----------------------+ 10 rows in set (0.01 sec)
Congratulations! You have successfully configured Druid to use MySQL as the metadata store. For highly-available metadata storage, it is recommended to use MySQL with replication and failover enabled. For more details see the Druid documentation.
If you’re interested to use PostgreSQL or Microsoft SQL Server as your Druid metadata storage, follow my below posts on how to do that.