pyspark.pandas.Series.transform
- Series.transform(func, axis=0, *args, **kwargs)
- Call func, producing the same type as self with transformed values and the same axis length as the input.

- Note: this API executes the function once to infer the type, which can be expensive, for instance when the dataset is created after aggregations or sorting.

  To avoid this, specify the return type in func, for instance as below:

  >>> def square(x) -> np.int32:
  ...     return x ** 2

  pandas-on-Spark uses the return type hint and does not try to infer the type.

- Parameters
- func : function or list
- A function or a list of functions to use for transforming the data. 
- axis : int, default 0 or 'index'
- Can only be set to 0 now. 
- *args
- Positional arguments to pass to func. 
- **kwargs
- Keyword arguments to pass to func (see the final example below).
 
- Returns
- An instance of the same type as self, with the same length as the input.
 
- See also
- Series.aggregate
- Only perform aggregating type operations. 
- Series.apply
- Invoke function on Series. 
- DataFrame.transform
- The equivalent function for DataFrame. 
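- As a rough illustration of that equivalence (a minimal sketch; the psdf name and the sqrt helper are illustrative and assume the same ps and np imports as the Examples below), DataFrame.transform applies the function column by column:

>>> def sqrt(x) -> float:
...     return np.sqrt(x)
>>> psdf = ps.DataFrame({"a": range(3)})
>>> psdf.transform(sqrt)
          a
0  0.000000
1  1.000000
2  1.414214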
- Examples

>>> s = ps.Series(range(3))
>>> s
0    0
1    1
2    2
dtype: int64

>>> def sqrt(x) -> float:
...     return np.sqrt(x)
>>> s.transform(sqrt)
0    0.000000
1    1.000000
2    1.414214
dtype: float64

- Even though the resulting instance must have the same length as the input, it is possible to provide several input functions:

>>> def exp(x) -> float:
...     return np.exp(x)
>>> s.transform([sqrt, exp])
       sqrt       exp
0  0.000000  1.000000
1  1.000000  2.718282
2  1.414214  7.389056

- You can omit the type hint and let pandas-on-Spark infer its type.

>>> s.transform([np.sqrt, np.exp])
       sqrt       exp
0  0.000000  1.000000
1  1.000000  2.718282
2  1.414214  7.389056
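
- The extra *args and **kwargs are forwarded to func. A minimal sketch of passing a keyword argument (the add_n helper is illustrative, not part of the API):

>>> def add_n(x, n) -> np.int64:
...     return x + n
>>> s.transform(add_n, n=2)
0    2
1    3
2    4
dtype: int64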