pyspark.sql.DataFrame.orderBy
- DataFrame.orderBy(*cols, **kwargs)
- Returns a new DataFrame sorted by the specified column(s).
- New in version 1.3.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- Returns
- DataFrame
- Sorted DataFrame. 
 
- Other Parameters
- ascending: bool or list, optional, default True
- Boolean or list of booleans. Sort ascending vs. descending. Pass a list to give each column its own sort order; the list must have the same length as cols.
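The list form of ascending can be pictured with a small plain-Python model. This is only a sketch of the documented behaviour on a list of dicts: `sort_rows` is a hypothetical helper written for this illustration, not a PySpark function, and raising `ValueError` on a length mismatch is an assumption of the sketch rather than a statement about how PySpark reports the error.

```python
def sort_rows(rows, cols, ascending=True):
    """Hypothetical pure-Python model of orderBy(cols, ascending=...)."""
    # A single bool applies to every column; a list gives one flag per column.
    if isinstance(ascending, bool):
        flags = [ascending] * len(cols)
    else:
        flags = list(ascending)
    if len(flags) != len(cols):
        # Assumption of this sketch: reject mismatched lengths explicitly.
        raise ValueError("ascending must have one entry per sort column")
    result = list(rows)
    # Python's sort is stable, so sorting from the last key to the first
    # produces a multi-key ordering with an independent direction per key.
    for col, asc in reversed(list(zip(cols, flags))):
        result.sort(key=lambda row: row[col], reverse=not asc)
    return result


rows = [
    {"age": 2, "name": "Alice"},
    {"age": 2, "name": "Bob"},
    {"age": 5, "name": "Bob"},
]
# Both columns descending, like ascending=[False, False].
both_desc = sort_rows(rows, ["age", "name"], ascending=[False, False])
# age descending, name ascending, like ascending=[False, True].
mixed = sort_rows(rows, ["age", "name"], ascending=[False, True])
```

With the three sample rows, `both_desc` places (5, Bob) first, then (2, Bob), then (2, Alice), while `mixed` keeps (5, Bob) first but orders the two age-2 rows as Alice, then Bob.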
 
- Notes
  A column ordinal starts from 1, which is different from the 0-based __getitem__(). If a column ordinal is negative, it means sort descending.

- Examples

  >>> from pyspark.sql import functions as sf
  >>> df = spark.createDataFrame([
  ...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])

  Sort the DataFrame in ascending order.

  >>> df.sort(sf.asc("age")).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  2|Alice|
  |  5|  Bob|
  +---+-----+

  >>> df.sort(1).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  2|Alice|
  |  5|  Bob|
  +---+-----+

  Sort the DataFrame in descending order.

  >>> df.sort(df.age.desc()).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  +---+-----+

  >>> df.orderBy(df.age.desc()).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  +---+-----+

  >>> df.sort("age", ascending=False).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  +---+-----+

  >>> df.sort(-1).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  +---+-----+

  Specify multiple columns.

  >>> from pyspark.sql import functions as sf
  >>> df = spark.createDataFrame([
  ...     (2, "Alice"), (2, "Bob"), (5, "Bob")], schema=["age", "name"])
  >>> df.orderBy(sf.desc("age"), "name").show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  |  2|  Bob|
  +---+-----+

  >>> df.orderBy(-1, "name").show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  |  2|  Bob|
  +---+-----+

  >>> df.orderBy(-1, 2).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|Alice|
  |  2|  Bob|
  +---+-----+

  Specify the sort order of each column with ascending.

  >>> df.orderBy(["age", "name"], ascending=[False, False]).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|  Bob|
  |  2|Alice|
  +---+-----+

  >>> df.orderBy([1, "name"], ascending=[False, False]).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|  Bob|
  |  2|Alice|
  +---+-----+

  >>> df.orderBy([1, 2], ascending=[False, False]).show()
  +---+-----+
  |age| name|
  +---+-----+
  |  5|  Bob|
  |  2|  Bob|
  |  2|Alice|
  +---+-----+
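The ordinal rule from the Notes (1-based positions, a negative value meaning descending) can be sketched in plain Python as follows. `resolve_sort_key` is a hypothetical helper made up for this illustration and does not reflect PySpark's internal implementation; in particular, raising `IndexError` for 0 or an out-of-range ordinal is an assumption of the sketch, since that case is not described above.

```python
def resolve_sort_key(columns, key):
    """Hypothetical helper: map a sort key to (column_name, ascending)."""
    # A plain column name sorts ascending unless told otherwise.
    if isinstance(key, str):
        return key, True
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(key, bool) or not isinstance(key, int):
        raise TypeError("sort key must be a column name or an integer ordinal")
    # Assumption of this sketch: 0 and out-of-range ordinals are errors.
    if key == 0 or abs(key) > len(columns):
        raise IndexError("column ordinal out of range (ordinals are 1-based)")
    # Ordinal n refers to columns[n - 1]; a negative sign means descending.
    return columns[abs(key) - 1], key > 0


columns = ["age", "name"]
first_asc = resolve_sort_key(columns, 1)    # ("age", True)
first_desc = resolve_sort_key(columns, -1)  # ("age", False)
second_asc = resolve_sort_key(columns, 2)   # ("name", True)
```

Under this model, `df.orderBy(-1, 2)` from the examples reads as "age descending, then name ascending", which matches the output shown for `df.orderBy(sf.desc("age"), "name")`.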