Why it’s hard for programmers to write a program to flatten a list?


I take many programming interviews at my current organisation. These days one of my favourite interview question is to flatten a nested list structure. I give user an input [1,[2,3], [4, [5,6]]] and ask them to write a program that will give output [1,2,3,4,5,6]. So, far I have asked this question to at least 20 candidates and 18 out of them have failed to write this program. I find this question better than canonical FizzBuzz problem as flattening a list is comparatively more involved problem.

This is how interview goes:

  1. I ask candidate to write a program that takes [1,[2,3], [4, [5,6]]] as input and return [1,2,3,4,5,6] as output. I specifically say you have to flatten a list.
  2. They can choose any programming language they like. Usually they are Java programmers so they end up choosing that.
  3. I ask them to write test case as well.

The common patterns I have noticed in most interviews:

  1. Candidate fail to write proper method signature. They get confused about what type of list they should use. Some start with List of integers List<Integer> ints. They fail to see how they will store a List<Integer> to a List<Integer>.
    public List<Integer> flatten(List<Integer> numbers)
  2. Some candidates start with a primitive array. Very quickly they realize array is fixed size data structure so they should use a List.
  3. They give method bad name like getList or processList etc. While explaining the problem, I explicitly mention that they have to flatten a list still they don’t use flatten as name of the method.
  4. Most programmers fail to get that they can recursion to solve the problem. Candidates who solved this problem get that they have to use recursion in a few seconds.
  5. Some programmers have hard time storing result in a List using the recursion approach. They use recursion but forget to store result. Most don’t know difference between recursion and tail recursion.
  6. Some programmers have hard time thinking how they should check instance is of some type.
  7. Most of them make flatten instance method. Programmers don’t think whether they should make method static. Programmers for some reason think static is bad.
  8. No one starts with a test. Most jump straight away to code. You have to pause them and ask can you please write down in plain English what you are trying to achieve.
  9. I tend to give a lot of help and pointers to the candidate. I want to make sure that they solve this problem. Candidates end up taking close to 45-60 mins to finish the solution. Some fail to get it working even after an hour.
  10. No one thinks about generic program so that solution will work across all types.
  11. No one used Java 8. Candidates tell they have heard some features of Java 8 but they don’t use it. I doubt adoption of Java 8 has reached a critical mass.

I am not sure what makes this problem tough for candidates. May be pair programming with interviewer just does not work. What are your thoughts?

Advertisements

21 thoughts on “Why it’s hard for programmers to write a program to flatten a list?

  1. Nalle Rooth (@NalleRooth)

    I find this disturbing, as it is something that really shouldn’t be that hard to do. I think that anyone who is not a junior developer should be able to get something working very quickly – even though it might not be a beautiful/efficient solution at that point.

    Now, if the question was to find “the most efficient way to flatten a list” – I’d understand why people would have some issues providing a quick solution (whatever “efficient” means in this case is an excellent counter question).

    FWIW, here’s a five minute hack written in javascript.

    function flatten(arr) {
    if (!(arr instanceof Array)) { return []; }

    const ret = [];

    arr.forEach((el) => {
    if (el instanceof Array) {
    flatten(el).forEach((child) => {
    ret.push(child);
    });
    } else {
    ret.push(el);
    }
    });

    return ret;
    }

    console.log(flatten([1,[2,3], [4, [5,6]]]));

    Reply
  2. Christopher E. Stith

    I don’t understand the difficulty. If they have the freedom to choose a language and Java makes it difficult, then perhaps they should choose another language. Here’s a trivial Perl5 program.:

    use strict;
    use warnings;
    use Data::Dumper ();
    my $nested = [ 1, [ 2, 3 ], [ 4, [ 5, 6 ] ] ];

    sub flatten {
    my @f;
    push @f, (ref $_ ? flatten( $_ ) : $_ ) for @{ $_[0] };
    return @f;
    }

    print STDOUT (join ‘ ‘, flatten( $nested )) . “\n”;

    my @flat = flatten( $nested );
    print Data::Dumper::Dumper \@flat;

    Reply
  3. chairloaf

    Thanks for this article. That’s a nice programming interview question.

    You seemed to contrast “developer” with “programmer.” What’s the difference? I’m on the programmers’ side with regard to preferring static methods: instance variables are for storing state, which wouldn’t help for this data manipulation task.

    Reply
  4. D. A. Hosek (@dahosek)

    I’ve found that the vast majority of candidates I’ve interviewed (at multiple companies, many with years of experience as programmers), cannot do simple programming tasks. Fizzbuzz is beyond them. I’ve had candidates fresh out of computer science and software engineering MASTERS programs who can’t write simple recursive functions. It mystifies me.

    Reply
    1. Andrew Whitworth

      A problem that I find a lot with FizzBuzz is that many candidates become paralyzed because they are looking for some kind of elegant solution, when one doesn’t exist. They spend time looking for some kind of “trick” to it, that they never get started just writing the necessary loops and conditionals. The interview setting seems to create mental roadblocks in people that they just can’t recover from.

      Reply
  5. Jeremie M

    I think one problem is how to represent the nested lists in strongly typed languages.
    While in Python or Javascript is may be easy to think that way because the types can contain anything, candidates in C++ or Java will have to think way more to come up with a recursively defined data structure.

    In such language the structure of the nested list would usually look like the following recursively defined type:

    class NestedList {
    Vector list;
    int value;
    }

    It’s even less obvious in C++ since they will have to use dynamically allocated memory to be able to represent that only one of the field can be set, or have to deal with the awkwardness of default values and a flag to determine what to process.

    You talk about making things generic and it’s again making it harder for people using strongly typed languages. If everybody has to use such a language then the difficulty is set, but if not then Python and Javascript code will be the easier solution (Java generics and C++ templates are not inherently hard but they are not used by everybody very frequently).

    So you end-up with something like this:

    class NestedList {
    Vector list;
    T value;
    }

    Then you have programmers who may want to design an elegant solution. It’s pretty obvious to see that the flattening of such a nested list would be elegantly written as a method on the object:

    public class NestedList {
    public Vector list;
    public T value;

    public Vector flatten() {
    return flatten(new Vector());
    }
    private void flatten(Vector res) {
    if (list == null) res.add(value);
    else {
    for (T v : list) flatten(v);
    }
    }
    }

    The solution is quite short but not that simple to write for mediocre engineers who don’t have formal algorithm background because they will struggle to deal with: 1/ inherently recursive solution, 2/ designing a recursively defined type.

    If you were to expose the type for the beginning and show how to construct the list, it would make strongly typed languages on par with non strongly typed ones.

    Reply
  6. Teddy Knox

    You’re hitting an achilles heel. Many developers prefer typesafe languages for their familiarity but don’t feel comfortable reasoning about types. Your problem statement is phrased in dynamically-typed way but in order to solve it in java it must be reformulated around static types.

    > I ask candidate to write a program that takes `[1,[2,3], [4, [5,6]]]` as input and return `[1,2,3,4,5,6]` as output.

    This is accidentally a trick question because when naive candidates see `[1,[2,3], [4, [5,6]]]`, they think “this is a valid literal”. It is, *in python*. To implement the same Java (without resorting to List), it’s not just the literal that’s different, it’s that the list has a type parameter, and it’s for an wrapper class.

    So candidates never realize they’ve accidentally opted-in to a problem on generics until it’s tripped them up so much that they’re nervous and can’t think straight enough to finally realize the caliber of problem they’re dealing with.

    Reply
  7. Ash

    > Most programmers failed to get that they can recursion to solve the problem. Candidates who solved this problem get that they **have** to use recursion in a few seconds.

    imho the best solution at hacker news was this one by dllthomas:

    { echo [; sed s/[][]//g; echo ]; }

    Reply
  8. Colby Russell

    No one starts with a test. Most jump straight away to code. You have to pause them and ask can you please write down in plain English what they are trying to achieve.

    Please don’t interrupt after laying out the task and indicating that it’s okay to start. It might be okay if you suspect that they’re stuck, and then you hold off for a bit to confirm that it’s true and that you’re not just interrupting their thoughts. Even then, ask, “Do you mind if I make a suggestion?”, pause for a response, and then ask, “How about starting with some simple test cases?” if they answered in the affirmative.

    Even if it’s a personal preference of yours and/or a corporate policy that tests come first, it’s the kind of thing that can be brought up in discussion afterwards.

    Reply
  9. Andrew Whitworth

    What level are these candidates? Is this entry level or something higher?

    What format is the input in? If you’re giving the input list as a string, a trivial solution is to parse over the string and pull out all digit characters (maybe with a little bit of logic to get multi-digit numbers). This doesn’t work with generic types, but then again over-generalizing and over-engineering a piece of code isn’t exactly a virtue.

    Part of the problem, I think, is that the problem as stated is weird. The input list you describe isn’t strongly typed because any node in the tree can be either an int, a List of int or a List of other Lists, etc. Almost any solution for modern OO languages is going to require speculative casts and reflection, which isn’t a great place to be. Strongly typed languages expect you to be able to unambiguously describe the type of your data, and having to start out with a list of objects and check each one to see what type it is at runtime is pretty far removed from accepted best practice. It might be a different story if your input was [[[1]],[[2,3]], [[4], [5,6]]], because at least you could use the type system to model your data correctly. In this case, I would expect most candidates to at least get the method signature correct.

    Also, I’m not sure I would jump right to a recursive solution without knowing something about complexity bounds on the possible inputs. Are we going to overflow stack with a deeply nested structure? If that’s not a concern, recursion does seem like the best option. But then again, are you testing recursion specifically? I find that recursion isn’t applicable to many use-cases, so many candidates just might not have a lot of experience with that technique to begin with. I wouldn’t expect professional coders to be solving Fibonacci sequences and Tower of Hanoi problems all day. If the average coder is using recursion at all, it’s in the rare case where you need to traverse a nested data structure AND there isn’t already a Visitor pattern implementation available. If your use-case specifically requires somebody who is a master of recursion, this problem would seem ok, but otherwise I suspect you’re testing people on a subject that they don’t know and (mostly) won’t need to know. Many coders use Quicksort implementations but vanishingly few coders need to actually write one, etc.

    I worked with a company where one of the hiring managers insisted on grilling candidates on Big O notation and asymptotic complexity, when the positions we were trying to fill weren’t expected to use those skills at any point. I shudder to think how many good candidates were sent home because they didn’t know something that we didn’t need them to know in the first place. I would keep that in mind when thinking about your own interview questions, because you might be filtering based on a criteria which isn’t germane to the position at hand.

    Reply
  10. Dany Wilson

    Interesting that candidates can spend upwards to 45mins on this problem, in Python my solution is 10 lines of code…

    def flatten_list (list_to_be_fatten):
    result = []
    for elt in list_to_be_flatten:
    if isinstance(elt, int):
    result.append(elt)
    else
    temp_list = flatten_list(elt)
    result = result + temp_list
    return result

    Reply
    1. Christopher E. Stith

      If conciseness is your goal…

      sub flatten { map { ref ? flatten( @$_ ) : $_ } @_ }

      Of course, Perl doesn’t care about most of that whitespace, either, so it could be golfed rather readily.

      sub f{map{ref?f(@$_):$_}@_}

      Oh, and the comment box broke your Python because it’s leading whitespace sensitive. It’s a rare drawback with a good editor or IDE, but it is a drawback in some cases.

      Reply
    2. Josh Nam

      Why 10 when you can in 4?
      def flatten(L):
      “””Return the list L flattened”””
      if not L: return []
      if type(L) is not list: return [L]
      return flatten(L[0]) + flatten(L[1:])

      and some test cases:
      assert flatten([]) == []
      assert flatten([‘a’]) == [‘a’]
      assert flatten([1]) == [1]
      assert flatten([1,2]) == [1,2]
      assert flatten([1,2,3]) == [1,2,3]
      assert flatten([[]]) == []
      assert flatten([1,[2]]) == [1,2]
      assert flatten([1,[2],3]) == [1,2,3]
      assert flatten([1,[2],[3]]) == [1,2,3]
      assert flatten([1,[[2],[3]]]) == [1,2,3]
      assert flatten([[1],[2],[3]]) == [1,2,3]
      assert flatten([[1,[2]],[3]]) == [1,2,3]
      assert flatten([[[1,[2],[3]]]]) == [1,2,3]
      assert flatten([[[1,[2]],[[3]]]]) == [1,2,3]
      assert flatten([[[1,[2,3],[4]]]]) == [1,2,3,4]

      Reply
  11. Evan Zamir

    The Haskell version is beautifully concise, and of course, uses recursion (h/t to random dude on SO):

    {-# LANGUAGE MultiParamTypeClasses, OverlappingInstances, FlexibleInstances #-}

    module Flatten where

    class Flatten i o where
    flatten :: [i] -> [o]

    instance Flatten a a where
    flatten = id

    instance Flatten i o => Flatten [i] o where
    flatten = concatMap flatten

    Reply
  12. Gareth Rees

    I think that there are two difficulties to this problem. The first difficulty is that although the problem statement says “flatten a list”, “list” here actually means “tree”. (In Lisp, a list is a tree, but that’s not the case in most modern strongly typed typed languages with built-in lists.)

    The second is that hardly anyone uses trees any more. The main application of trees is the efficient search of ordered data, but for most applications hash tables (maps, dictionaries) are superior, and unless you work on something specialized like a parser or a database index or a file system you can spend a career in programming without having to implement a tree algorithm. (Who today is familiar with Sleator & Tarjan’s “splay” algorithm for optimizing future searches of a binary tree on the same key? Or Morris’s constant-space tree traversal via pointer reversal? Or Stout & Warren’s algorithm for balancing a binary tree in linear time and constant space? These kinds of technique were once the bread and butter of systems programmers.)

    Reply
  13. Ahmad Alhashemi

    Python 3 with reduce

    from functools import reduce
    def flatten(lst):
    return reduce(lambda xs, x: xs + (flatten(x) if isinstance(x, list) else [x]), lst, [])

    Python 3 without reduce

    def flatten(lst):
    flat = []
    for x in lst:
    flat += flatten(x) if isinstance(x, list) else [x]
    return flat

    Reply
  14. vorotato

    I think the root of the problem if you’re asking a Java programmer. They were born from OOP, breathe it and live it. OOP methodology is particularly bad at solving this exact kind of problem. By contrast a functional programmer would have no problem, recursion in our languages have TCO so they won’t overflow. For example your suggestion of a recursive Flatten will actually overflow in java for sufficiently high lists, instead you really should be using an imperative solution using loops and yes it is possible. I don’t think this is much of a condemnation of the java programmer, but you really shouldn’t be hiring a java programmer to solve functional style problems. Give them tests relative to the language at hand and the work you intend them to do, or at least give them a language which can adequately solve the tests provided.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s