- Have a very few basic data structures. This is the philosophy of Scheme.
- Have a rich type system that is organized into a hierarchy. Haskel is famous for this approach.
- Have a rich type system that is using union types. This is the Ceylon way.
- Have a rich type system with traits. This approach is taken by Rust, Swift, Python and Go.
But once you have many data types, you frequently find yourself in a need to write the same function but for multiple data types. An illustrative example would be a sum function that should be able to work on integer vector and floating point vector. But it can quickly become tiresome to maintain multiple copies of the same code. Languages with the rich data type system have to provide some way how to avoid plain duplication of the code. Haskell solves it by providing data types in a hierarchy - if you need to write a function that works on numbers, just write it for the common ancestor of all numbers.
The hierarchical type system is, however, quite restrictive. If there are types that fulfill requirement α and some types fulfill requirement β, then we can represent such types in a Venn diagram:

But if all 3 sections, A, B and A∩B are non-empty, we cannot say that α⊆β or β⊆α. In other words, with a hierarchical system we cannot use both, α and β characteristics, to describe the types. We can, at best, just pick one of these characteristics to describe the data types.
A concrete example of this dilemma are integers in databases. Should we treat them as additive or as "groupable"? Integers are both. But continuous numbers are, arguably, just additive. While characters are, arguably, only groupable. In a hierarchical system, we generally take the smaller of the evils and accept that we can test exact equality even on floating point numbers. Union type systems (systems that rely on union/sum types) and trait type systems (as in protocols in Swift or traits in Python or intersection types in algebraic data types) allow you to tackle this dilemma. Btw., if you are using tags to organize your bookmarks, files or photos, you are already using a trait system. A nice critique of hierarchical systems is given in Goodbye, Object Oriented Programming.
What is the difference between the union type system and the trait type system? Union type systems assume that we have a fixed set of types (e.g.: float, double, integer, ...). And each variable (e.g.: a function argument) specifies a set of data types that it accepts. On the other end, trait type systems assume that we have a fixed set of traits (e.g.: comparable, ordinal, additive, ...) and each variable defines traits that the data type must fulfill.
