mrsrinivas

You can parse a decimal token with decimal.Decimal().

Here we wrap the check in a UDF and then apply it with df.withColumn:

import decimal

from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

def is_valid_decimal(s):
    try:
        # 0 when the value is a whole number, 1 when it has a fractional part
        return 0 if decimal.Decimal(s) == int(decimal.Decimal(s)) else 1
    except decimal.InvalidOperation:
        # the string could not be parsed as a decimal at all
        return 0

# register the UDF so it can also be used from SQL
sqlContext.udf.register("is_valid_decimal", is_valid_decimal, IntegerType())

# wrap the function as a DataFrame UDF and apply it
is_valid_decimal_udf = udf(is_valid_decimal, IntegerType())
df = df.withColumn("result", is_valid_decimal_udf("test_column"))
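The helper itself is plain Python, so its behavior can be sanity-checked locally without a Spark session — a minimal sketch using a few sample strings (the inputs here are made up for illustration):

```python
import decimal

def is_valid_decimal(s):
    try:
        # 0 for whole numbers, 1 for values with a fractional part
        return 0 if decimal.Decimal(s) == int(decimal.Decimal(s)) else 1
    except decimal.InvalidOperation:
        # unparseable strings are treated the same as whole numbers
        return 0

print(is_valid_decimal("3.14"))  # 1: has a fractional part
print(is_valid_decimal("42"))    # 0: whole number
print(is_valid_decimal("3.0"))   # 0: equal to an integer
print(is_valid_decimal("abc"))   # 0: not a decimal at all
```

Note that an unparseable string and a whole number both map to 0; if the distinction matters, return a third sentinel value from the except branch instead.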
