| DataFrame.__getattr__(name)
 | Returns the Column denoted by name. | 
| DataFrame.__getitem__(item)
 | Returns the column as a Column. | 
| DataFrame.agg(*exprs)
 | Aggregate on the entire DataFrame without groups (shorthand for df.groupBy().agg()). | 
| DataFrame.alias(alias)
 | Returns a new DataFrame with an alias set. | 
| DataFrame.approxQuantile(col, probabilities, ...)
 | Calculates the approximate quantiles of numerical columns of a DataFrame. | 
| DataFrame.cache()
 | Persists the DataFrame with the default storage level (MEMORY_AND_DISK_DESER). | 
| DataFrame.checkpoint([eager])
 | Returns a checkpointed version of this DataFrame. | 
| DataFrame.coalesce(numPartitions)
 | Returns a new DataFrame that has exactly numPartitions partitions. | 
| DataFrame.colRegex(colName)
 | Selects a column based on the column name specified as a regex and returns it as a Column. | 
| DataFrame.collect()
 | Returns all the records in the DataFrame as a list of Row. | 
| DataFrame.columns
 | Retrieves the names of all columns in the DataFrame as a list. | 
| DataFrame.corr(col1, col2[, method])
 | Calculates the correlation of two columns of a DataFrame as a double value. | 
| DataFrame.count()
 | Returns the number of rows in this DataFrame. | 
| DataFrame.cov(col1, col2)
 | Calculate the sample covariance for the given columns, specified by their names, as a double value. | 
| DataFrame.createGlobalTempView(name)
 | Creates a global temporary view with this DataFrame. | 
| DataFrame.createOrReplaceGlobalTempView(name)
 | Creates or replaces a global temporary view using the given name. | 
| DataFrame.createOrReplaceTempView(name)
 | Creates or replaces a local temporary view with this DataFrame. | 
| DataFrame.createTempView(name)
 | Creates a local temporary view with this DataFrame. | 
| DataFrame.crossJoin(other)
 | Returns the cartesian product with another DataFrame. | 
| DataFrame.crosstab(col1, col2)
 | Computes a pair-wise frequency table of the given columns. | 
| DataFrame.cube(*cols)
 | Create a multi-dimensional cube for the current DataFrame using the specified columns, allowing aggregations to be performed on them. | 
| DataFrame.describe(*cols)
 | Computes basic statistics for numeric and string columns. | 
| DataFrame.distinct()
 | Returns a new DataFrame containing the distinct rows in this DataFrame. | 
| DataFrame.drop(*cols)
 | Returns a new DataFrame without specified columns. | 
| DataFrame.dropDuplicates([subset])
 | Return a new DataFrame with duplicate rows removed, optionally only considering certain columns. | 
| DataFrame.dropDuplicatesWithinWatermark([subset])
 | Return a new DataFrame with duplicate rows removed, optionally only considering certain columns, within watermark. | 
| DataFrame.drop_duplicates([subset])
 | drop_duplicates() is an alias for dropDuplicates(). | 
| DataFrame.dropna([how, thresh, subset])
 | Returns a new DataFrame omitting rows with null or NaN values. | 
| DataFrame.dtypes
 | Returns all column names and their data types as a list. | 
| DataFrame.exceptAll(other)
 | Return a new DataFrame containing rows in this DataFrame but not in another DataFrame while preserving duplicates. | 
| DataFrame.executionInfo
 | Returns an ExecutionInfo object after the query was executed. | 
| DataFrame.explain([extended, mode])
 | Prints the (logical and physical) plans to the console for debugging purposes. | 
| DataFrame.fillna(value[, subset])
 | Returns a new DataFrame in which null values are filled with a new value. | 
| DataFrame.filter(condition)
 | Filters rows using the given condition. | 
| DataFrame.first()
 | Returns the first row as a Row. | 
| DataFrame.foreach(f)
 | Applies the f function to every Row of this DataFrame. | 
| DataFrame.foreachPartition(f)
 | Applies the f function to each partition of this DataFrame. | 
| DataFrame.freqItems(cols[, support])
 | Finds frequent items for columns, possibly with false positives. | 
| DataFrame.groupBy(*cols)
 | Groups the DataFrame by the specified columns so that aggregation can be performed on them. | 
| DataFrame.groupingSets(groupingSets, *cols)
 | Create a multi-dimensional aggregation for the current DataFrame using the specified grouping sets, so that aggregations can be run on them. | 
| DataFrame.head([n])
 | Returns the first n rows. | 
| DataFrame.hint(name, *parameters)
 | Specifies some hint on the current DataFrame. | 
| DataFrame.inputFiles()
 | Returns a best-effort snapshot of the files that compose this DataFrame. | 
| DataFrame.intersect(other)
 | Return a new DataFrame containing rows only in both this DataFrame and another DataFrame. | 
| DataFrame.intersectAll(other)
 | Return a new DataFrame containing rows in both this DataFrame and another DataFrame while preserving duplicates. | 
| DataFrame.isEmpty()
 | Checks if the DataFrame is empty and returns a boolean value. | 
| DataFrame.isLocal()
 | Returns True if the collect() and take() methods can be run locally (without any Spark executors). | 
| DataFrame.isStreaming
 | Returns True if this DataFrame contains one or more sources that continuously return data as it arrives. | 
| DataFrame.join(other[, on, how])
 | Joins with another DataFrame, using the given join expression. | 
| DataFrame.limit(num)
 | Limits the result count to the number specified. | 
| DataFrame.localCheckpoint([eager])
 | Returns a locally checkpointed version of this DataFrame. | 
| DataFrame.mapInPandas(func, schema[, ...])
 | Maps an iterator of batches in the current DataFrame using a Python native function that is performed on pandas DataFrames both as input and output, and returns the result as a DataFrame. | 
| DataFrame.mapInArrow(func, schema[, ...])
 | Maps an iterator of batches in the current DataFrame using a Python native function that is performed on pyarrow.RecordBatchs both as input and output, and returns the result as a DataFrame. | 
| DataFrame.melt(ids, values, ...)
 | Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set. | 
| DataFrame.na
 | Returns a DataFrameNaFunctions for handling missing values. | 
| DataFrame.observe(observation, *exprs)
 | Define (named) metrics to observe on the DataFrame. | 
| DataFrame.offset(num)
 | Returns a new DataFrame by skipping the first n rows. | 
| DataFrame.orderBy(*cols, **kwargs)
 | Returns a new DataFrame sorted by the specified column(s). | 
| DataFrame.persist([storageLevel])
 | Sets the storage level to persist the contents of the DataFrame across operations after the first time it is computed. | 
| DataFrame.printSchema([level])
 | Prints out the schema in the tree format. | 
| DataFrame.randomSplit(weights[, seed])
 | Randomly splits this DataFrame with the provided weights. | 
| DataFrame.rdd
 | Returns the content as a pyspark.RDD of Row. | 
| DataFrame.registerTempTable(name)
 | Registers this DataFrame as a temporary table using the given name. | 
| DataFrame.repartition(numPartitions, *cols)
 | Returns a new DataFrame partitioned by the given partitioning expressions. | 
| DataFrame.repartitionByRange(numPartitions, ...)
 | Returns a new DataFrame partitioned by the given partitioning expressions. | 
| DataFrame.replace(to_replace[, value, subset])
 | Returns a new DataFrame replacing a value with another value. | 
| DataFrame.rollup(*cols)
 | Create a multi-dimensional rollup for the current DataFrame using the specified columns, allowing for aggregation on them. | 
| DataFrame.sameSemantics(other)
 | Returns True when the logical query plans inside both DataFrames are equal and therefore return the same results. | 
| DataFrame.sample([withReplacement, ...])
 | Returns a sampled subset of this DataFrame. | 
| DataFrame.sampleBy(col, fractions[, seed])
 | Returns a stratified sample without replacement based on the fraction given on each stratum. | 
| DataFrame.schema
 | Returns the schema of this DataFrame as a pyspark.sql.types.StructType. | 
| DataFrame.select(*cols)
 | Projects a set of expressions and returns a new DataFrame. | 
| DataFrame.selectExpr(*expr)
 | Projects a set of SQL expressions and returns a new DataFrame. | 
| DataFrame.semanticHash()
 | Returns a hash code of the logical query plan against this DataFrame. | 
| DataFrame.show([n, truncate, vertical])
 | Prints the first n rows of the DataFrame to the console. | 
| DataFrame.sort(*cols, **kwargs)
 | Returns a new DataFrame sorted by the specified column(s). | 
| DataFrame.sortWithinPartitions(*cols, **kwargs)
 | Returns a new DataFrame with each partition sorted by the specified column(s). | 
| DataFrame.sparkSession
 | Returns the Spark session that created this DataFrame. | 
| DataFrame.stat
 | Returns a DataFrameStatFunctions for statistic functions. | 
| DataFrame.storageLevel
 | Get the DataFrame's current storage level. | 
| DataFrame.subtract(other)
 | Return a new DataFrame containing rows in this DataFrame but not in another DataFrame. | 
| DataFrame.summary(*statistics)
 | Computes specified statistics for numeric and string columns. | 
| DataFrame.tail(num)
 | Returns the last num rows as a list of Row. | 
| DataFrame.take(num)
 | Returns the first num rows as a list of Row. | 
| DataFrame.to(schema)
 | Returns a new DataFrame where each row is reconciled to match the specified schema. | 
| DataFrame.toArrow()
 | Returns the contents of this DataFrame as a PyArrow pyarrow.Table. | 
| DataFrame.toDF(*cols)
 | Returns a new DataFrame with the new specified column names. | 
| DataFrame.toJSON([use_unicode])
 | Converts a DataFrame into an RDD of string. | 
| DataFrame.toLocalIterator([prefetchPartitions])
 | Returns an iterator that contains all of the rows in this DataFrame. | 
| DataFrame.toPandas()
 | Returns the contents of this DataFrame as a Pandas pandas.DataFrame. | 
| DataFrame.transform(func, *args, **kwargs)
 | Returns a new DataFrame. | 
| DataFrame.union(other)
 | Return a new DataFrame containing the union of rows in this and another DataFrame. | 
| DataFrame.unionAll(other)
 | Return a new DataFrame containing the union of rows in this and another DataFrame. | 
| DataFrame.unionByName(other[, ...])
 | Returns a new DataFrame containing the union of rows in this and another DataFrame. | 
| DataFrame.unpersist([blocking])
 | Marks the DataFrame as non-persistent, and removes all blocks for it from memory and disk. | 
| DataFrame.unpivot(ids, values, ...)
 | Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set. | 
| DataFrame.where(condition)
 | where() is an alias for filter(). | 
| DataFrame.withColumn(colName, col)
 | Returns a new DataFrame by adding a column or replacing the existing column that has the same name. | 
| DataFrame.withColumns(*colsMap)
 | Returns a new DataFrame by adding multiple columns or replacing the existing columns that have the same names. | 
| DataFrame.withColumnRenamed(existing, new)
 | Returns a new DataFrame by renaming an existing column. | 
| DataFrame.withColumnsRenamed(colsMap)
 | Returns a new DataFrame by renaming multiple columns. | 
| DataFrame.withMetadata(columnName, metadata)
 | Returns a new DataFrame by updating an existing column with metadata. | 
| DataFrame.withWatermark(eventTime, ...)
 | Defines an event time watermark for this DataFrame. | 
| DataFrame.write
 | Interface for saving the content of the non-streaming DataFrame out into external storage. | 
| DataFrame.writeStream
 | Interface for saving the content of the streaming DataFrame out into external storage. | 
| DataFrame.writeTo(table)
 | Create a write configuration builder for v2 sources. | 
| DataFrame.mergeInto(table, condition)
 | Merges a set of updates, insertions, and deletions based on a source table into a target table. | 
| DataFrame.pandas_api([index_col])
 | Converts the existing DataFrame into a pandas-on-Spark DataFrame. | 
| DataFrameNaFunctions.drop([how, thresh, subset])
 | Returns a new DataFrame omitting rows with null or NaN values. | 
| DataFrameNaFunctions.fill(value[, subset])
 | Returns a new DataFrame in which null values are filled with a new value. | 
| DataFrameNaFunctions.replace(to_replace[, ...])
 | Returns a new DataFrame replacing a value with another value. | 
| DataFrameStatFunctions.approxQuantile(col, ...)
 | Calculates the approximate quantiles of numerical columns of a DataFrame. | 
| DataFrameStatFunctions.corr(col1, col2[, method])
 | Calculates the correlation of two columns of a DataFrame as a double value. | 
| DataFrameStatFunctions.cov(col1, col2)
 | Calculate the sample covariance for the given columns, specified by their names, as a double value. | 
| DataFrameStatFunctions.crosstab(col1, col2)
 | Computes a pair-wise frequency table of the given columns. | 
| DataFrameStatFunctions.freqItems(cols[, support])
 | Finds frequent items for columns, possibly with false positives. | 
| DataFrameStatFunctions.sampleBy(col, fractions)
 | Returns a stratified sample without replacement based on the fraction given on each stratum. |