anands282/README.md

💻 Anand Satheesh

Software Architect at IBM India
Big Data Enthusiast | Data Engineering Specialist | Microservices Architect


🚀 About Me

Hello! I'm Anand Satheesh, a seasoned software architect with over 12 years of experience in building robust big data and data engineering solutions. Currently, I am working at IBM India, where I leverage my expertise to architect and implement scalable data pipelines, data lakes, and microservices architectures.


🛠️ My Skill Set

Big Data Technologies:

  • Apache Spark: Proficient in both PySpark and Scala
  • Kafka: Real-time data streaming and processing
  • Hadoop: Distributed storage and processing

Data Engineering:

  • Data Lakes: Building and managing large-scale data lakes
  • Data Warehousing: Designing efficient and scalable data warehouses
  • ETL Pipelines: Developing end-to-end ETL workflows

Microservices:

  • Architecture: Designing and implementing microservices-based systems
  • Containerization: Docker and Kubernetes for deploying scalable applications
  • APIs: Creating and managing RESTful APIs

Programming Languages:

  • Python: Data processing, scripting, automation
  • Scala: Big data processing with Apache Spark
  • Java: Backend development

Cloud Platforms:

  • IBM Cloud: Cloud-native solutions and services
  • AWS: Data engineering and cloud services

📚 Latest Blog Posts


📫 Let's Connect


Pinned

  1. webscrapper Public

    The webscrapper repository contains code that uses Python packages such as Playwright to scrape data from Google Maps and return a list of businesses matching a search in a particular area. The r…

    Python

  2. billionrowsstreaming Public

    A program that uses Apache Spark and Kafka to process a stream of a billion records in real time. Inspired by the billion record challenge, this project attempts to use distributed pro…

    Python

  3. airflow_sensors Public

    A project exploring what Airflow sensors can do. It tests the file sensor, which listens for a file delivery at a specific location and triggers a workflow based on the conditi…

    Python

  4. StockMarketAnalysis Public

    This repository uses PySpark to process a stream of stock market data. Other areas include setting up local Kafka and ZooKeeper Docker containers, setting up a producer job to imitate a str…

    Python