
themrityunjaypathak/armada

 
 


Armada

Scale Your Bots & Scrapers. Effortlessly.

Version 0.9.0 License: AGPL v3 Link to Full Documentation

Write once, deploy in seconds. Let the orchestrator handle the rest.
See the full documentation here

Kubernetes Deployment Flow

Demo

armada-video.mp4

First Try — reproduce the demo above on a Minikube Cluster

For a proper quick start on a real K8s cluster, see Quick Start - Installation Guide.

Requirements: Linux, Python, an Azure SQL Database instance, Docker, and Kubernetes on Minikube (local) with kubectl and Helm

1. Expose the testing first-try-website locally with Docker

cd first-try/first-try-website
docker build -t twittor .
docker run -p 5010:5010 twittor
cd ../..

The app is then accessible at http://localhost:5010 (or at http://host.minikube.internal:5010 from inside Minikube).

2. Complete .env with mandatory values

SQL_SERVER_USER=your_user
SQL_SERVER_PASSWORD=your_password
SQL_SERVER_DB=your_database
SQL_SERVER_NAME=your_server.database.windows.net
DOCKER_HUB_USERNAME=armadasvc
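Before bootstrapping, it can help to confirm that none of the mandatory values were left empty. The following is a minimal sketch, not part of Armada's tooling: the helper names `parse_env` and `missing_keys` are hypothetical, and the list of required keys is simply the five variables shown above.

```python
# Hypothetical pre-flight check for the .env file; not shipped with Armada.
REQUIRED_KEYS = [
    "SQL_SERVER_USER",
    "SQL_SERVER_PASSWORD",
    "SQL_SERVER_DB",
    "SQL_SERVER_NAME",
    "DOCKER_HUB_USERNAME",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in listed order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


# Example with an incomplete file: the password is empty, two keys are absent.
sample = "SQL_SERVER_USER=u\nSQL_SERVER_PASSWORD=\nDOCKER_HUB_USERNAME=armadasvc\n"
print(missing_keys(sample))
```

To check a real file, pass its contents instead of `sample`, for example `missing_keys(open(".env").read())`.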

See Environment Variables for the complete reference.

3. Bootstrap resources

cd bootstrap
pip install -r requirements.txt
python bootstrap_database.py
python bootstrap_secrets.py
python bootstrap_cluster_resources.py
cd ..

Select option 3. See Bootstrap Scripts for details.

4. Create your first project using first-try mode

bash create-project.sh

Choose option 2 (First Try project), then choose a directory in which to store the project folder. Note: the Python, JSON, and CSV files are ready to run; first-try mode needs no manual modification (configuration: 10 messages across 3 agents).

To set up a production project of your own, see Setting Up a Project for details on JSON, CSV, and Python file configuration.

5. Launch and monitor

Open the Armada Dashboard by forwarding the frontend service to your machine:

kubectl port-forward svc/armada-frontend 8080:8080

In the Launch tab, drag and drop your first-try-project folder and click Launch.

Then switch to the Monitor tab to watch your run in real time.

Note: without a pre-existing cache (for example, on the very first run of Armada), there is a cold start while the cluster spins up and provisions heavy resources such as browsers, so it is normal to see "No Jobs" for a while in the Jobs Monitoring tab.

Under the Hood

1. A Batteries-Included Service Ecosystem

Armada ships with a full suite of dedicated microservices that agents can leverage out of the box. A Proxy Service selects, rotates and curates proxies, while a Fingerprint Service fetches and repackages browser fingerprint tokens — both backed by a shared SQL Server. The Orchestrator sits at the center, coordinating everything, and a Monitoring Service feeds real-time job and event data down to a live Dashboard for full observability. See Architecture Overview and Services Reference.

Full System Architecture

2. A Simple Yet Powerful Input Pipeline

The Armada Orchestrator ingests three user-provided inputs: arbitrary Python code for automation logic, a central JSON configuration for infrastructure and agent/job defaults, and optional CSV files for fine-grained per-agent or per-job overrides. It then merges them into fully resolved, executable tasks that are distributed to Kubernetes pods. See Configuration Pipeline.
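The idea of layering CSV overrides on top of JSON defaults can be sketched as follows. This is an illustration only and not Armada's actual merge logic or schema: the `job_defaults` key, the `target`, `retries`, and `message` fields, and the `resolve_jobs` helper are all hypothetical.

```python
import csv
import io
import json


def resolve_jobs(config_json: str, overrides_csv: str) -> list[dict]:
    """Build one resolved job per CSV row: JSON defaults, then row overrides."""
    defaults = json.loads(config_json)["job_defaults"]  # hypothetical key
    jobs = []
    for row in csv.DictReader(io.StringIO(overrides_csv)):
        # An empty CSV cell means "keep the default" in this sketch.
        overrides = {k: v for k, v in row.items() if v not in (None, "")}
        jobs.append({**defaults, **overrides})
    return jobs


CONFIG = '{"job_defaults": {"target": "http://host.minikube.internal:5010", "retries": 3}}'
OVERRIDES = "message,retries\nhello,\nworld,5\n"

for job in resolve_jobs(CONFIG, OVERRIDES):
    print(job)
```

Note that CSV values arrive as strings (the second job gets `retries` as `"5"`), so a real pipeline would also need a type-coercion step.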

Configuration Pipeline

3. Built-In Scale-Out Distribution

Scaling from one to hundreds of workers requires zero changes to your code. The Orchestrator handles the heavy lifting: it first seeds each pod with its own tailored configuration via Redis, then floods a RabbitMQ queue with jobs that pods consume on demand — faster workers naturally pick up more tasks, ensuring optimal throughput. See Run Lifecycle.
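The "faster workers pick up more tasks" behaviour follows from the pull model itself, which the small simulation below illustrates. It uses plain Python rather than Redis or RabbitMQ, and the worker names and per-job durations are made-up values; it models only the scheduling effect, not Armada's implementation.

```python
import heapq
from collections import Counter


def simulate(job_count: int, seconds_per_job: dict[str, float]) -> Counter:
    """Simulate workers that each pull the next job as soon as they are free."""
    # Min-heap of (time_when_free, worker_name); every worker is idle at t=0.
    free_at = [(0.0, name) for name in sorted(seconds_per_job)]
    heapq.heapify(free_at)
    completed = Counter()
    for _ in range(job_count):
        now, worker = heapq.heappop(free_at)  # earliest available worker
        completed[worker] += 1
        heapq.heappush(free_at, (now + seconds_per_job[worker], worker))
    return completed


# 10 jobs across 3 workers of different speeds (hypothetical durations).
result = simulate(10, {"fast": 1.0, "medium": 2.0, "slow": 4.0})
print(result)  # fast: 5, medium: 3, slow: 2
```

No job is assigned in advance: the split emerges from each worker asking for work when it is ready, which is why adding workers requires no change to the job code.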

Redis & RabbitMQ Flow

Key Features


Supported Drivers & Introducing Fantomas

Supported Drivers: Playwright, Selenium, nodriver, Fantomas

Fantomas

Fantomas is Armada's in-house browser automation library, built on top of nodriver and purpose-built to run inside agent pods.

  • Human emulation out of the box. Every click follows a physics-based curved trajectory (WindMouse), and every keystroke is typed with randomized delays — no teleporting cursors or instant text injection.
  • Two interaction strategies. Native mode operates through Chrome's CDP protocol with emulation layers on top. XDO mode goes deeper, firing events at the OS level via xdotool, making them virtually indistinguishable from real user input.
  • Seamless Armada integration. Fantomas is designed around Armada's two-tier lifecycle: a browser instance is launched once at the agent level and reused across all jobs, avoiding the cost of spinning up Chrome for every single task. Move it to the job context instead if you need full isolation between tasks.
  • Full nodriver compatibility. Every native nodriver method (get, query_selector, tabs, ...) remains directly available — Fantomas extends the API without replacing it. See Fantomas Documentation and API Reference.
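To give a feel for what a physics-based curved trajectory means, here is a simplified sketch in the spirit of the publicly described WindMouse algorithm: a "gravity" term pulls the cursor toward the target while a random "wind" term perturbs it. This is not Fantomas's code or API; the function name, parameter names, and default values are assumptions chosen for illustration.

```python
import math
import random


def wind_mouse_path(start, dest, gravity=9.0, wind=3.0, max_step=15.0,
                    damp_distance=12.0, rng=None, max_iterations=10_000):
    """Return a list of integer (x, y) points curving from start to dest."""
    rng = rng or random.Random()
    sqrt3, sqrt5 = math.sqrt(3), math.sqrt(5)
    x, y = float(start[0]), float(start[1])
    dest_x, dest_y = dest
    vx = vy = wx = wy = 0.0
    path = [(round(x), round(y))]
    for _ in range(max_iterations):  # safety cap on the number of steps
        dist = math.hypot(dest_x - x, dest_y - y)
        if dist < 1:
            break
        wind_mag = min(wind, dist)
        if dist >= damp_distance:
            # Far from the target: let the random wind force fluctuate.
            wx = wx / sqrt3 + (2 * rng.random() - 1) * wind_mag / sqrt5
            wy = wy / sqrt3 + (2 * rng.random() - 1) * wind_mag / sqrt5
        else:
            # Near the target: damp the wind and shrink the step cap.
            wx /= sqrt3
            wy /= sqrt3
            if max_step < 3:
                max_step = rng.random() * 3 + 3
            else:
                max_step /= sqrt5
        # Gravity always pulls toward the destination.
        vx += wx + gravity * (dest_x - x) / dist
        vy += wy + gravity * (dest_y - y) / dist
        speed = math.hypot(vx, vy)
        if speed > max_step:
            # Clip the velocity to a random magnitude below the cap.
            clipped = max_step / 2 + rng.random() * max_step / 2
            vx, vy = vx / speed * clipped, vy / speed * clipped
        x, y = x + vx, y + vy
        point = (round(x), round(y))
        if point != path[-1]:
            path.append(point)
    if path[-1] != tuple(dest):
        path.append(tuple(dest))  # land exactly on the target
    return path


path = wind_mouse_path((0, 0), (300, 200), rng=random.Random(42))
print(len(path), path[0], path[-1])
```

Each returned point would be fed to whatever layer actually moves the cursor (CDP events or xdotool in the two modes described above), typically with a short delay between points.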

Fantomas Lib

Documentation

For detailed guides, API reference, and advanced configuration, visit the Full Documentation.

Initial Public Release

We are starting with a clean Git history for this first public release (0.9.0), aligning the project with a fresh SemVer strategy and avoiding exposure of legacy commits or releases that no longer reflect the current state of the codebase.

Contributing

Please read CONTRIBUTING.md before submitting a pull request.

Code of Conduct

This project follows the guidelines defined in CODE_OF_CONDUCT.md.

License

See LICENSE for details.

About

Kubernetes-native framework for bot and scraper orchestration

