Learning Swift - Top rank per group example

alx1 · June 12, 2021, 8:48pm

Hello everyone,

I'm learning Swift as my second programming language (I have basic knowledge of imperative and OOP part in Java).
Currently, I'm trying to understand "Top rank per group" example from Rosetta code https://www.rosettacode.org/wiki/Top_rank_per_group#Swift

I understand that we have struct Employee, array employees with the list of employees, id's and departments; we have also highestSalaries function, for loop and print function.

But I'm I'm stuck with the logic of highestSalaries function (I understand that we have reduce into function and accumulator, but the whole step-by step execution flow is unclear for me):

func highestSalaries(employees: [Employee], n: Int = 1) -> [String: [Employee]] {
  return employees.reduce(into: [:], {acc, employee in
    guard var cur = acc[employee.department] else {
      acc[employee.department] = [employee]
 
      return
    }
 
    if cur.count < n {
      cur.append(employee)
    } else if cur.last!.salary < employee.salary {
      cur[n - 1] = employee
    }
 
    acc[employee.department] = cur.sorted(by: { $0.salary > $1.salary })
  })

Could someone please explain to me step-by-step execution flow of that function? I think I've got lost...

Thanks a lot !

jflan · June 12, 2021, 9:19pm

Here's my breakdown of what's happening in their example:

func highestSalaries(employees: [Employee], n: Int = 1) -> [String: [Employee]] {
    return employees.reduce(into: [:], {acc, employee in
        // Try to assign the existing array of employees for this employees
        // department to `cur`.
        guard var cur = acc[employee.department] else {
            // If there isn't one create a new array with only this employee
            // and add it to the dictionary
            acc[employee.department] = [employee]
            
            // Nothing else to do
            return
        }
        
        if cur.count < n {
            // If the current list is shorter than `n`, append the new employee
            // since they must be in the top `n` salaries so far.
            cur.append(employee)
        } else if cur.last!.salary < employee.salary {
            // `cur` is kept sorted by the line after this block, therefore if
            // the new employee's salary is higher than the lowest paid
            // employee in `cur` the new employee belongs in the list instead.
            // Replace the lowest paid employee in `cur` with the new employee.
            cur[n - 1] = employee
        }
        
        // Re-sort `cur` to ensure the new employee is in the correct position
        // and assign it back into `acc`
        acc[employee.department] = cur.sorted(by: { $0.salary > $1.salary })
    })
}

If I were implementing a function to accomplish the same work I'd probably write it like this:

func highestSalaries(employees: [Employee], n: Int = 1) -> [String: [Employee]] {
    // Group the employees into a dictionary by department
    Dictionary(grouping: employees, by: \.department)
        .mapValues {
            // Sort each group of employees by salary and take the first `n`
            // employees. `.prefix` returns an `ArraySlice` which can must
            // be converted back into an `Array`.
            Array($0.sorted(by: { $0.salary > $1.salary }).prefix(n))
        }
}

These two implementations would have different performance and memory characteristics, but I find the second implementation a lot more readable so unless it came up as a performance bottleneck I'd prefer the more succinct implementation.

jflan · June 12, 2021, 9:28pm

Here's a de-sugared version of the second implementation that's doing effectively the same thing, but without using the special $0 and $1 parameters:

func highestSalaries(employees: [Employee], n: Int = 1) -> [String: [Employee]] {
    // Group the employees into a dictionary by department
    Dictionary(grouping: employees, by: \.department)
        .mapValues { departmentEmployees in
            // Sort each group of employees by salary.
            let sortedEmployees = departmentEmployees.sorted { employee1, employee2 in
                employee1.salary > employee2.salary
            }
            
            // Return the first `n` employees. `.prefix` returns an `ArraySlice`
            // which can must be converted back into an `Array`.
            return Array(sortedEmployees.prefix(n))
        }
}

toph42 · June 13, 2021, 1:54pm

I had to read your logic to find out what n meant. Consider naming your functions so that they read naturally to explain their inputs and outputs.

AlexanderM · June 13, 2021, 2:25pm

This algorithm is available in the Swift Algorithms package under the name max(count:). You should check that out! The best part is that it's open source, and you can see how they do it.

I think it's better to try to express your algorithm more generically (not in the type system generics sense, necessarily), and then applying that generic algorithm to your usecase. What you're doing here is effectively intertwining the generic algorithm (get the max n values out of a collection according to some comparison function) and its application (using it specifically for employee salaries).

This limits reuse. For example, even though the code to get the 5 oldest tenure employees would be almost identical to this code, you wouldn't be reuse your existing function for it.

alx1 · June 13, 2021, 8:09pm

Thanks a lot!

alx1 · June 15, 2021, 8:57am

Thank you very much!