Questions tagged [apache-hadoop]

Hadoop is an Apache open-source project that provides software for reliable, scalable distributed computing. The project also includes a variety of complementary components.

0 votes
1 answer
177 views

It seems that Hadoop, Spark, and various cloud platforms all offer facilities to store and analyze big data. There are some articles comparing Hadoop and Spark (for example, this article). There are also ...
Tara
  • 1
0 votes
1 answer
60 views

Is it possible to update the source of data found in a Data Lake or Data Blob? What about while using HDInsight or Azure Databricks?
JF0001
  • 101
0 votes
1 answer
524 views

I'm an infra person working on a storage product. I've been googling quite a bit to find an answer to the following question but have been unable to do so. Hence, I am attempting to ask the question here. I am ...
user855
  • 101
0 votes
1 answer
42 views

I am trying to learn Hadoop and would like to know whether a system with 1 GB of RAM is enough for a basic single-node installation, or whether more RAM is needed. It would be helpful if someone could share what other minimum ...
Gaurav Parek
5 votes
1 answer
1k views

I recently read the following about Hadoop vs. Spark: Insist upon in-memory columnar data querying. This was the killer-feature that let Apache Spark run in seconds the queries that would take Hadoop ...
Ironclad
0 votes
1 answer
35 views

I've been asked to lead a program to understand why our Hadoop storage is constantly near capacity. What questions should I ask? Data age? Data size? Housekeeping schedule? How do we identify the ...
vwdewaal
  • 101
1 vote
0 answers
49 views

I am trying to implement a simple C/C++ program for HDFS, such as word count. It takes a file from the input path, puts it into HDFS (where it gets split), and the file is then processed by my map-reduce function ...
n0unc3
  • 11
3 votes
1 answer
631 views

I've created a BERT model. What are the ways to deploy this model? Is it possible to use it with Spark, Hadoop, or Docker?
illuminato
