Questions tagged [apache-hadoop]
Hadoop is an Apache open-source project that provides software for reliable and scalable distributed computing. The project also includes a variety of complementary additions.
118 questions
0 votes
1 answer
177 views
Hadoop, Spark, and the Cloud
It seems Hadoop, Spark, and various cloud platforms offer facilities to store and analyze big data. There are some articles comparing Hadoop and Spark (for example, this article). There are also ...
0 votes
1 answer
60 views
Can I update the source of data found in a Data Lake or Data Blob
Is it possible to update the source of data found in a Data Lake or Data Blob? What about while using HDInsight or Azure Databricks?
0 votes
1 answer
524 views
Storage of N-dimensional matrices (tensors) as part of machine learning pipelines
I'm an infra person working on a storage product. I've been googling quite a bit to find an answer to the following question but have been unable to do so. Hence, I am attempting to ask the question here. I am ...
0 votes
1 answer
42 views
Can a single-node Hadoop cluster be installed on a system with 1 GB of RAM
I am trying to learn Hadoop and would like to know whether 1 GB of RAM would be enough for a basic single-node installation, or whether more is needed. It would be helpful if someone could share what other minimum ...
5 votes
1 answer
1k views
What is the main difference between Hadoop and Spark? [closed]
I recently read the following about Hadoop vs. Spark: Insist upon in-memory columnar data querying. This was the killer-feature that let Apache Spark run in seconds the queries that would take Hadoop ...
0 votes
1 answer
35 views
What are common problems around Hadoop storage?
I've been asked to lead a program to understand why our Hadoop storage is constantly near capacity. What questions should I ask? Data age, data size? Housekeeping schedule? How do we identify the ...
1 vote
0 answers
49 views
Loading files into and out of HDFS via system call/command line vs. using libhdfs
I am trying to implement a simple C/C++ program for the HDFS file system, like word count: it takes a file from the input path, puts it into HDFS (where it gets split), and processes it with my map-reduce function ...
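For readers skimming this question, the two approaches it contrasts look roughly like the sketch below: shelling out to the hdfs command-line client versus writing through the libhdfs C API. This is a minimal, illustrative sketch only, not the asker's actual program; the NameNode address ("default"), the file names, and the sample payload are hypothetical placeholders, and error handling is abbreviated.

    /* Minimal sketch: copying data into HDFS two ways.
     * Assumes $HADOOP_HOME is set and libhdfs is available; built with e.g.
     *   gcc put_example.c -I$HADOOP_HOME/include -L$HADOOP_HOME/lib/native -lhdfs -ljvm
     * Paths and payload below are placeholders for illustration.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include "hdfs.h"

    int main(void) {
        /* Approach 1: shell out to the hdfs CLI (simple, but forks a new JVM per call). */
        if (system("hdfs dfs -put -f input.txt /tmp/input.txt") != 0) {
            fprintf(stderr, "hdfs dfs -put failed\n");
        }

        /* Approach 2: write through libhdfs (stays in-process, reuses one connection). */
        hdfsFS fs = hdfsConnect("default", 0);  /* "default" = fs.defaultFS from core-site.xml */
        if (!fs) { fprintf(stderr, "hdfsConnect failed\n"); return 1; }

        hdfsFile out = hdfsOpenFile(fs, "/tmp/input_copy.txt", O_WRONLY | O_CREAT, 0, 0, 0);
        if (!out) { fprintf(stderr, "hdfsOpenFile failed\n"); hdfsDisconnect(fs); return 1; }

        const char *buffer = "hello hdfs\n";    /* placeholder payload */
        hdfsWrite(fs, out, (void *)buffer, (tSize)strlen(buffer));
        hdfsFlush(fs, out);
        hdfsCloseFile(fs, out);
        hdfsDisconnect(fs);
        return 0;
    }

The trade-off the question is getting at then comes down to overhead: each system()/hdfs dfs invocation starts a fresh JVM, while libhdfs (a JNI wrapper around the Java HDFS client) keeps one connection open inside the calling process.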
3 votes
1 answer
631 views
BERT in production
I've created a BERT model. What are the ways to deploy this model? Is it possible to use it with Spark, Hadoop, or Docker?