This is a simple wrapper/enabler for running Apache Drill on Apache Mesos.
Dromedar (DRill On MEsos aDAptoR) is launched via Marathon. Whenever a query request comes in, it launches a number of Drillbits that depends on the size of the dataset being queried. The query-scale-factor (QSF) determines how many Drillbits are launched relative to the dataset size; it defaults to 1 Drillbit per 100MB (1:100), or qsf=100 for short.
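To make the QSF arithmetic concrete, here is a minimal sketch of how a Drillbit count could be derived from a dataset size. The function name `drillbits_for` is hypothetical, and two behaviours are assumptions not stated above: partial chunks are rounded up, and at least one Drillbit is always launched.

```python
import math

DEFAULT_QSF = 100  # default from the text: 1 Drillbit per 100MB


def drillbits_for(dataset_size_mb, qsf=DEFAULT_QSF):
    """Return how many Drillbits to launch for a dataset of the given size.

    Assumptions (not taken from Dromedar itself): partial chunks are
    rounded up, and at least one Drillbit is always launched.
    """
    if qsf <= 0:
        raise ValueError("qsf must be positive")
    if dataset_size_mb < 0:
        raise ValueError("dataset_size_mb must not be negative")
    return max(1, math.ceil(dataset_size_mb / qsf))


if __name__ == "__main__":
    for size_mb in (50, 100, 250, 1000):
        print(size_mb, "MB ->", drillbits_for(size_mb), "Drillbit(s)")
```

With the default qsf=100, a 250MB dataset maps to 3 Drillbits under these assumptions; raising the QSF to 500 would map a 1000MB dataset to 2.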
Dromedar's architecture is as follows:

    +----------------+     +----------------------------------------+
    |    Marathon    |     |           Mesos worker node            |
    |                |     |                                        |
    |                |     |   +-------------------------+          |
    |                |     |   |                         |          |
    |                |     |   |        Drillbit         <---------[3]-------> SQL client
    |                |     |   +------------+------------+          |
    |                |     |               [2]                      |
    |                |     |   +------------+------------+          |
    |                |     |   |                         |          |
    |                +----[2]-->    drillbit.sh start    |          |
    |                |     |   +-------------------------+          |
    |                |     |                                        |
    |                |     |                                        |
    |                |     |   +-------------------------+          |
    |                |     |   |                         |          |
    |HTTP API        <----[2]--+         qsf.py          <---------[1]-------- [QSF]
    |                |     |   |                         |          |
    |                |     |   +-------------------------+          |
    +----------------+     +----------------------------------------+

Dromedar's underlying long-running service is qsf.py, which is initially deployed through dromedar.py using Marathon. Once qsf.py is running as a Web service, the following steps take place:
1. As input, qsf.py takes a QSF via its HTTP interface on port 9876.
2. It uses the Marathon HTTP API to trigger on-demand Drillbit creation using the drillbit.sh start command.
3. The SQL client connects to (one of) the Drillbit(s) and executes the SQL query.
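
The second step can be sketched as follows. This is an illustration under stated assumptions, not Dromedar's actual code: Marathon is assumed to be reachable at http://localhost:8080, and the app id, resource sizes, and Drill install path are made-up values. It uses Marathon's `POST /v2/apps` endpoint through the Python standard library rather than the marathon-python package.

```python
import json
import urllib.request

MARATHON_URL = "http://localhost:8080"  # assumption: Marathon's address


def drillbit_app(instances, drill_home="/opt/drill"):
    """Build a Marathon app definition that runs Drillbits.

    The app id, cpus/mem values and drill_home are hypothetical.
    """
    return {
        "id": "/dromedar/drillbit",
        # Command named in the text above.
        "cmd": f"{drill_home}/bin/drillbit.sh start",
        "instances": instances,
        "cpus": 1.0,
        "mem": 4096,
    }


def launch_drillbits(instances, marathon_url=MARATHON_URL):
    """POST the app definition to Marathon's /v2/apps endpoint."""
    body = json.dumps(drillbit_app(instances)).encode("utf-8")
    request = urllib.request.Request(
        marathon_url + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `launch_drillbits(3)` against a running Marathon would request three instances of the app; the instance count would typically come from the QSF and the dataset size.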
Dromedar depends on the following:
- Apache Mesos 0.22.x
- Marathon 0.8.1
- Apache Drill 0.8.0
- marathon-python
Note that Apache Drill and the Marathon Python package are installed by Dromedar directly. The only two things assumed to be available are Mesos and Marathon themselves.
To launch Dromedar, run:

    $ ./launch.sh

Then go to the Marathon UI, where you should see the Dromedar app listed.
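
As an alternative to checking the UI by hand, the same information is available from Marathon's `GET /v2/apps` endpoint. The sketch below assumes Marathon listens on http://localhost:8080; the helper names and the sample app id are hypothetical.

```python
import json
import urllib.request


def app_ids(apps_response):
    """Extract app ids from the JSON body returned by GET /v2/apps."""
    return [app["id"] for app in apps_response.get("apps", [])]


def running_apps(marathon_url="http://localhost:8080"):
    """Fetch the ids of all apps known to Marathon (address is an assumption)."""
    with urllib.request.urlopen(marathon_url + "/v2/apps") as response:
        return app_ids(json.loads(response.read().decode("utf-8")))


if __name__ == "__main__":
    # Sample response shape; the app id is made up for illustration.
    sample = {"apps": [{"id": "/dromedar", "instances": 1}]}
    print(app_ids(sample))
```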
To do:
- Bootstrap (install Drill, launch Dromedar via Marathon)
- Implement QSF HTTP API
- Implement Drillbit launch/teardown based on requests
- Clarify relation/communication between QSF and SQL client (out of band??)
- Strata implementation cross-check
- Cluster deployment and testing
- HAProxy deployment?
- Examples and video walkthrough
