Question 1

I'm running a Flink job. On my local machine I don't see any issue streaming the data to Azure Blob, but when I deploy to the dev environment I see an error in the console like Caused by: org....

Question 2

I have a Flink job that streams data to Azure using Hadoop FS. Currently I'm able to push the data and create a new file, but I want to roll over to a new file when the date changes (like from 2025-03-...

Question 3

I wrote a Hadoop streaming job that uses Python code to transform the data, but the job runs into an error: when the input file is larger (e.g. 70M bytes), it hangs at the reduce stage. When I ...

Question 4

I need help with a school project. For the labs I've done, I wrote the mapper and reducer scripts in Python (version 3) and was able to run Hadoop streaming with no problems. Then I edited ...

Question 5

core-site.xml config: <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://Master:9000</value> </property> </...

Question 6

If I use NLineInputFormat in Hadoop streaming, how do I specify N? hadoop jar /home/Software/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \ -D stream.non.zero.exit.is.failure=false \ -D ...

Question 7

I am currently trying to use Hadoop streaming. I have a file called diamonds.txt that contains the carat of each diamond with its price beside it, separated by commas (CSV). An example of the first ...

Question 8

I'm trying to convert XML files through a MapReduce job and receive the error: 2023-04-04 09:41:52,515 INFO mapreduce.Job: map 0% reduce 0% 2023-04-04 09:42:12,676 INFO mapreduce.Job: Task Id : ...

Question 9

bash file code I formatted the mapper and the reducer to be the same so I can skip the mapping steps and just continue to reduce it. In this case I am only doing two reduce jobs. It works fine using ...

Question 10

I am fairly new to Hadoop and I've been getting these exceptions when I run a file on Hadoop. Please help. This is the command: hadoop jar /home/eeman/hadoop-3.2.4/share/hadoop/tools/lib/hadoop-...

Question 11

This is my first time using Hadoop for anything, so I started with a basic program: word count. On my local machine it works perfectly fine. The real issue is that I am unable to run it on ...

Question 12

Below are the runtime versions in PyCharm. Java Home /Library/Java/JavaVirtualMachines/jdk-11.0.16.1.jdk/Contents/Home Java Version 11.0.16.1 (Oracle Corporation) Scala Version version 2.12.15 ...

Question 13

I'm new to Hadoop and am trying to use the streaming option to develop some jobs using Python on Windows 10 locally. After double-checking the paths I gave, and even my program, I encounter an exception that ...

Question 14

I'm using Hadoop in Colab and I have two documents that I made in PyCharm, one with the mapper and another with the reducer. This is the code: !apt-get install -y openjdk-11-jdk-headless -qq >...

Question 15

I am trying to write code that calculates the average temperature (reducer.py) based on NCDC weather data. 0057011060999991928010112004+67500+012067FM-12+001199999V0202001N012319999999N0500001N9+00281+...

Unable to stream data to azure blob using flink job

How to change the file name with updated date in flink job

hadoop streaming job hanged at reduce side merge stage

Python - How to run Hadoop stream passing command line arguments

MapReduce Troubleshoot with python script as mapper and reducer using hadoop-streaming-3.3.6.jar

Specify N in hadoop streaming when use NLineInputFormat

Unable to process text file using mapreduce on linux

Hadoop mapreduce error : PipeMapRed.waitOutputThreads(): subprocess failed with code 1

How to execute multiple reduce jobs with one mapper using bash file in Hadoop using Python as the base?

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2

Word count application is not running on hadoop

How to fix "java.lang.ClassNotFoundException: org.apache.spark.internal.io.cloud.PathOutputCommitProtocol" Pyspark

Hadoop Streaming Exception (No FileSystem for Scheme "C")

Caused by: java.io.IOException: error=2, No such file or directory error in Colab Hadoop

Calculate average temperature in reducer
