In this post and its sequel, I’d like to explain a new perspective on matroid theory which was introduced in this recent preprint which I wrote with Nathan Bowler. The main observation is that by working with algebraic structures called hyperfields, in which addition is allowed to be multi-valued, linear subspaces, matroids, valuated matroids, and oriented matroids become special cases of a single general concept. In the process of explaining this observation, I also hope to advertise how natural hyperfields are, for example in the context of tropical geometry.
The notion of an algebraic structure in which addition is allowed to be multi-valued goes back to Frédéric Marty, who introduced hypergroups in 1934. Later on, in the mid-1950’s, Marc Krasner developed the theory of hyperrings and hyperfields in the context of approximating non-Archimedean fields, and in the 1990’s Murray Marshall explored connections to the theory of real spectra and spaces of orderings. For the most part, however, the theory of hyperstructures was largely ignored by the mathematical community at large until Connes and Consani started advocating its potential utility in connection with -geometry and the Riemann hypothesis. There now seems to be a reappraisal of sorts going on within the math community of the “bias” against multi-valued operations. As Oleg Viro puts it in this paper:
Krasner, Marshall, Connes and Consani and the author came to hyperfields for different reasons, motivated by different mathematical problems, but we came to the same conclusion: the hyperrings and hyperfields are great, very useful and very underdeveloped in the mathematical literature… Probably, the main obstacle for hyperfields to become a mainstream notion is that a multivalued operation does not fit to the tradition of set-theoretic terminology, which forces to avoid multivalued maps at any cost. I believe the taboo on multivalued maps has no real ground, and eventually will be removed. Hyperfields, as well as multigroups, hyperrings and multirings, are legitimate algebraic objects related in many ways to the classical core of mathematics… I believe hyperfields are to displace the tropical semifield in the tropical geometry. They suit the role better. In particular, with hyperfields the varieties are defined by equations, as in other branches of algebraic geometry.
I will attempt to provide my own justification for the naturalness and utility of hyperfields by explaining how they unify in a conceptually very satisfying way various flavors of matroids.
Examples of hyperfields
Roughly speaking, a hyperfield is an algebraic structure satisfying similar axioms to a field, but where addition is allowed to be multi-valued. Like fields, hyperfields come equipped with an additive identity element called zero and a negation map . However, one does not require that the hypersum of and is zero, only that zero is contained in the hypersum. Before giving a precise axiomatic definition, let us give some motivating examples.
- (Fields) Any field is in particular a hyperfield.
- (The Krasner hyperfield) The Krasner hyperfield records the arithmetic of “zero” versus “nonzero”. Specifically, define together with the binary operation and the binary hyperoperation . If we think of as representing “nonzero”, the hyperaddition rules just say that zero plus zero is zero and zero plus something nonzero is always nonzero, but the sum of two things which are nonzero could be either zero or nonzero. The negation operator is the identity map, since negative zero is zero and the negative of something nonzero is again nonzero.
- (The hyperfield of signs) The hyperfield of signs records the arithmetic of “zero”, “positive”, and “negative”, represented by the symbols , respectively. The product is defined in the obvious way for . Addition is also defined in the “obvious” way except we have since the sum of a positive number and a negative number could be either zero, positive, or negative. The negation operator takes to itself and interchanges and .
- (The tropical hyperfield) The tropical hyperfield records the arithmetic of valuations. More precisely, if is the valuation map on a valued field with value group , the hypersum consists of all possible values of where are elements of with and . (Note that the axioms for a valuation tell us that .) Concretely, this means that we should define (with the evident conventions when one of equals ), and we define if and . The multiplicative identity element is and the additive identity element is . The negation operator is the identity map, since the unique value of such that is .
All of these examples are special cases of the following construction due to Krasner. Let be a multiplicative semigroup, let be a field, and let be a surjective homomorphism of multiplicative semigroups such that . The induced hyperfield structure on is defined by the hyperaddition law
and the negation rule .
The above examples all have an additional property not shared by all hyperfields: they are doubly distributive (see below for the definition).
Definition of hyperrings and hyperfields
A hyperoperation on a set is a map from to the collection of non-empty subsets of .
If are non-empty subsets of , we define
and we say that is associative if for all .
A (commutative) hyperring is a set together with:
- A commutative and associative binary operation
- A commutative and associative binary hyperoperation
- Distinguished elements
- is a commutative monoid.
- and for all
- For every there is a unique element of (denoted ) such that
- for all
A hyperring is called a hyperfield if and every non-zero element of has a multiplicative inverse.
There are examples of hyperfields which do not arise from Krasner’s construction given above, but they rarely appear in practice.
A hyperfield is called doubly distributive if it satisfies for all .
Zero sets of polynomials
Given an associative hyperoperation on a set , we define the hypersum of for recursively by the formula
If is a hyperfield and is a polynomial in variables with coefficients , the zero set of is
For example, setting with , we have:
- If , then iff does not have exactly one element.
- If , then if and only if all or the nonzero ‘s are not all equal.
- If , then if and only if the minimum of the occurs (at least) twice.
The third of these examples is particularly interesting in the context of tropical algebra / geometry. Indeed, the zero set of a polynomial with coefficients in , in the above hyperfield sense, is what tropical geometers usually call the bend locus of : the set of such that the maximum of the terms is achieved at least twice. The bend locus appears completely naturally in the context of hyperfields, whereas it has to be introduced artificially in the context of “min-plus algebra”.
Grassmann-Plücker relations and functions
With an eye toward generalizing the notion of “linear subspace” from fields to hyperfields, we first recall how one can encode an -dimensional linear subspace of , where is a field, as a certain kind of alternating multilinear function.
Let be an -dimensional linear subspace of , and let be a basis for . Write the vectors as the rows of an matrix with coefficients in . Let , and for each let be the column of . For , let be the determinant of the matrix whose column is . The function is nonzero (since has rank ), alternating, and if we replace by another basis for and by the corresponding matrix , the corresponding functions satisfy for some nonzero . So the projective equivalence class of of the function depends only on the subspace . In addition to being alternating, the function satisfies the following somewhat less obvious collection of algebraic relations:
(Grassmann-Plücker relations) For any two subsets and of ,
Conversely, it is a well-known classical fact that any nonzero alternating function satisfying the Grassmann-Plücker relations is of the form for some matrix or rank , and two functions and are projectively equivalent if and only if the row spaces of and are equal. (This is usually phrased in terms of Plücker vectors and the Grassmannian variety , but the two formulations are easily seen to be equivalent.)
Grassmann-Plücker functions over hyperfields
Now let be a hyperfield. A function is called a Grassmann-Plücker function if it is nonzero, alternating (meaning that and if for some ), and it satisfies the following generalization of the Grassmann-Plücker relations:
(GP) For any two subsets and of ,
Note that the negation operation on is used both to make sense of what it means to be alternating and in writing down (GP).
As noted above, if is a field then a projective equivalence class of Grassmann-Plücker functions is the same thing as an -dimensional subspace of .
We now come to the first big punchline of our story:
Proposition 1: If is the Krasner hyperfield, a projective equivalence class of Grassmann-Plücker functions is the same thing as a rank matroid on .
Indeed, it follows from Example 1 in the section “Zero sets of polynomials” that if is a nonzero alternating function and denotes the set of -element subsets of such that , then satisfies the Grassmann-Plücker relations (GP) if and only if
(BE) Given and , there exists such that .
But condition (BE) is precisely the Basis Exchange property for matroids! In other words, is the set of bases of a rank matroid on . Conversely, given such a matroid , we can define by setting if is a basis of and if not. By the exchange property (BE), the function will satisfy (GP). Since two nonzero functions are projectively equivalent over if and only if they are equal, the Proposition now follows easily.
Motivated by Proposition 1, if is a doubly distributive hyperfield we define an -matroid (or matroid over ) of rank on to be a projective equivalence class of Grassmann-Plücker functions . Thus matroids over fields are the same as linear subspaces, and matroids over the Krasner hyperfield are the same as matroids in the usual sense.
We have the following additional interesting examples of matroids over hyperfields:
Proposition 2: If is the hyperfield of signs, a rank matroid over is the same thing as a rank oriented matroid on .
Proposition 3: If is the tropical hyperfield, a rank matroid over is the same thing as a rank valuated matroid on .
We previously discussed valuated matroids (also known as tropical linear spaces) in this blog post. For more on oriented matroids, see for example this AMS blog post. In the theory of oriented matroids, Grassmann-Plücker functions are called chirotopes.
In a sequel post, we will discuss duality theory for matroids over hyperfields as well as a cryptomorphic characterization of matroids over hyperfields in terms of circuits. We will also say a bit more about the algebra and geometry of hyperrings.
Note added January 2017: In a forthcoming update to my paper with Nathan Bowler, we will explain in detail why the double distributivity condition appears in the above definition of -matroid above.