• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Skip to footer
Clean Programmer

Clean Programmer

Programming & DevOps Resources

  • Home
  • About
  • Contact

How To Configure Druid To Use MySQL As Metadata Storage

June 22, 2018 Monzurul Haque Shimul

The Metadata Storage is an external dependency of Druid. Druid uses it to store various metadata about the system, but not to store the actual data. There are a number of tables used for various purposes. Derby is the default metadata store for Druid, however, it is not suitable for production. MySQL and PostgreSQL are more production suitable metadata stores. In this post, I will show you how to configure druid to use MySQL as the metadata storage.

Setting up MySQL

  1. Install MySQL
  2. Connect to MySQL from the machine where it is installed.
    mysql -u root -p 
    Enter password:
  3. Create a druid database and user.
    CREATE DATABASE druid DEFAULT CHARACTER SET utf8; 
    CREATE USER 'druid' IDENTIFIED BY 'diurd'; 
    GRANT ALL ON druid.* TO 'druid';
    

Configure mysql-metadata-storage extension:

The mysql-metadata-storage extension is not packaged within the main Druid tarball; it is packaged in a separate tarball that can be downloaded from here. You can also get it using pull-deps, or you can build it from source code. But if you are using the imply distribution, mysql-metadata-storage extension is already packaged within the dist/druid/extensions directory.

Now in conf/druid/_common/common.runtime.properties, add mysql-metadata-storage to druid.extensions.loadList. If for example the list already contains “druid-parser-route” and “druid-s3-extensions” , the final property should look like:

druid.extensions.loadList=["mysql-metadata-storage", "druid-s3-extensions", "druid-parser-route"]

Comment out the configurations for Derby under “Metadata storage” section and add appropriate values for MySQL. After this, “Metadata storage” section should look like:

#
# Metadata storage
#

# For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
# druid.metadata.storage.type=derby
# druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true
# druid.metadata.storage.connector.host=localhost
# druid.metadata.storage.connector.port=1527

# For MySQL:
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://localhost:3306/druid?allowPublicKeyRetrieval=TRUE
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd

# For PostgreSQL:
# druid.metadata.storage.type=postgresql
# druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
# druid.metadata.storage.connector.user=...
# druid.metadata.storage.connector.password=...

Note: If you are running latest version of MySQL, then make sure to replace the mysql-connector-java jar file with latest one inside extensions/mysql-metadata-storage directory.

You’re done. Now restart the servers to take effect. To test if it is working, load the sample data in druid and see druid metadata created in MySQL.

mysql> use druid
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+-----------------------+
| Tables_in_druid       |
+-----------------------+
| druid_audit           |
| druid_config          |
| druid_dataSource      |
| druid_pendingSegments |
| druid_rules           |
| druid_segments        |
| druid_supervisors     |
| druid_tasklocks       |
| druid_tasklogs        |
| druid_tasks           |
+-----------------------+
10 rows in set (0.01 sec)

Congratulations! You have successfully configured Druid to use MySQL as the metadata store. For highly-available metadata storage, it is recommended to use MySQL with replication and failover enabled. For more details see the Druid documentation.

If you’re interested to use PostgreSQL or Microsoft SQL Server as your Druid metadata storage, follow my below posts on how to do that.

  • How to Configure Druid to Use PostgreSQL as Metadata Storage
  • How to Configure Druid to Use Microsoft SQL Server as Metadata Storage

Related Posts

Count Unique in Druid Using DataSketches HLL Sketch
Count Unique in Druid Using DataSketches Theta Sketch
Getting Unique Counts from Druid Using HyperLogLog

Categories: Druid Tags: druid.io, imply.io, mysql

Primary Sidebar

Categories

  • Apache Kafka
  • Druid
  • Git
  • Java
  • Java EE
  • Redis
  • Spring
  • Uncategorized
  • Weblogic
  • Wildfly

Featured Posts

How to Configure Logging in Druid

Loading Json Data From Batch File Into Druid

Customizing the wsdl url for ejb web services for JBoss Wildfly and Oracle Weblogic servers

Setting Up A Single Node Redis Server

SDKMAN! Managing SDKs Were Never So Smart

Generating Fibonacci sequence using Java Stream API

Footer

Monzurul Haque Shimul

I’m a full-stack software engineer with 10 years of experience in design and development of large scaled Enterprise Software Systems built on Java and Java EE related tools and technologies. I’m also a contributor on GitHub, Stack Overflow, DZone. My core expertise lies in building JVM-based, scalable, reactive, data-driven applications.

Follow

  • Email
  • GitHub
  • LinkedIn
  • Twitter

© 2019 CLEAN PROGRAMMER

  • Home
  • Archive
  • About
  • Contact
  • Privacy Policy
We use cookies to ensure that we give you the best experience on our website. If you continue to use this site we will assume that you are happy with it.OKNoRead more
Revoke Cookies