Roughly, something is a monoid exactly when a parallel-reduce style of algorithm can be used on it. Associativity lets you break the problem into sub-problems, and the unit lets you insert padding where necessary to get same-sized blocks for parallel processors. It's also a useful concept to know for library design. For example, when there's a "combine" or "reduce" operation on some data type, it should occur to you that your users will probably want a neutral "do-nothing" element, and that your operation should form a monoid with it. APIs without one are usually annoying to work with and require extra if statements and special-casing.
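To make that concrete, here's a rough sketch in Python. The name `chunked_reduce` and its `block_size` parameter are made up for illustration, and the thread pool is only standing in for "parallel processors"; the point is just where associativity and the unit get used, not performance.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked_reduce(op, unit, items, block_size=4):
    """Reduce `items` with `op`, assuming (op, unit) forms a monoid.

    Hypothetical helper for illustration only.
    """
    items = list(items)

    # The unit lets us pad the last block to full size without
    # changing the result, since op(x, unit) == x.
    remainder = len(items) % block_size
    if remainder:
        items += [unit] * (block_size - remainder)

    blocks = [items[i:i + block_size] for i in range(0, len(items), block_size)]

    # Associativity lets each block be reduced independently.
    # pool.map returns results in block order, so op does not
    # need to be commutative.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda block: reduce(op, block, unit), blocks))

    # Combine the partial results; an empty input just yields the unit.
    return reduce(op, partials, unit)


total = chunked_reduce(operator.add, 0, range(10))
joined = chunked_reduce(operator.add, "", ["a", "b", "c", "d", "e"], block_size=2)
empty_product = chunked_reduce(operator.mul, 1, [])
```

Note that the empty input needs no special case: it falls out as the unit. Compare that with `reduce(max, [])`, which raises a `TypeError` because there's no initial value to fall back on, so the caller ends up writing the extra `if` themselves.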
More generally, named concepts like this give you a way to compress knowledge, which makes it easier to learn new things later. Once you're comfortable with the idea of a monoid, then when you meet a new one you immediately have some intuitive ground to build on for understanding how the new thing behaves.