The kernel of a filter is its impulse response, which is what you convolve the input with to get the filter's output. That's where the sloppy terminology comes from. A kernel, though, does not need to be a filter.
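To make the "kernel is the impulse response" point concrete, here is a minimal sketch in plain Python. The three-tap smoothing kernel and the `convolve` helper are made up for illustration; the point is only that feeding a unit impulse through the convolution hands back the kernel itself.

```python
def convolve(x, h):
    """Full linear convolution of two 1-D sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# Hypothetical smoothing filter, chosen only for illustration.
kernel = [0.25, 0.5, 0.25]

# A unit impulse: 1 at time zero, 0 everywhere else.
impulse = [1.0, 0.0, 0.0, 0.0]

# The filter's response to the impulse starts with the kernel,
# followed by zeros.
response = convolve(impulse, kernel)
print(response)
```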
A kernel is a function whose product maps a point in one domain onto another domain. For example, the Fourier transform has a kernel of e^(-jωt) under the usual sign convention (the inverse transform uses e^(+jωt)). The integral (or sum, if discrete) of these products over the function is the transform, because it maps the entire function into its new space. A filter is a function typically defined as having product behavior in the frequency (transformed) domain, which is equivalent to convolution in the time (original) domain. A window is a function that has product behavior in the time (original) domain, and thus convolution behavior in the frequency (transformed) domain.
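As a rough sketch of both ideas in the discrete case: the DFT below is literally a sum of products of the signal with the kernel e^(-j2πkt/N), and the check at the end shows that circular convolution in time matches a pointwise product in frequency. The signals `x` and `h` and the helper names are assumptions for illustration, and the circular (rather than linear) convolution is used because that is what the DFT product corresponds to exactly.

```python
import cmath

def dft(x):
    """DFT as a sum of products with the kernel exp(-2j*pi*k*t/N)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]

# Hypothetical signal and filter, chosen only for illustration.
x = [1.0, 2.0, 0.0, -1.0]
h = [0.5, 0.5, 0.0, 0.0]

# Convolve in time, then transform ...
lhs = dft(circular_convolve(x, h))
# ... versus transform each, then multiply in frequency.
rhs = [a * b for a, b in zip(dft(x), dft(h))]

# The two agree up to floating-point rounding.
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)
```

Swapping the roles (multiplying `x` by a window in time and comparing against a circular convolution of the spectra, scaled by 1/N) would illustrate the window case the same way.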
In linear algebra (matrix math) in particular, calling something a kernel carries specific mathematical implications. For example, the kernel of a matrix is its null space: the set of vectors it maps to zero.
Another confusing bit here is that the convolution they are performing to project the original function (the larger image matrix) onto the smaller one isn't a proper convolution. The way the operation is performed has a hidden window function that restricts the output to only the fully overlapped area of an otherwise linear 2D convolution. This is typically called a cropped convolution in image processing (many libraries call it the "valid" mode).
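Here is a small sketch of that cropping, in plain Python so nothing is hidden inside a library call. The 4x4 image, the 3x3 all-ones kernel, and the helper names are hypothetical. It computes the full linear 2D convolution and then keeps only the region where the kernel fully overlapped the image.

```python
def conv2d_full(image, kernel):
    """Full linear 2-D convolution; output is (ih+kh-1) x (iw+kw-1)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (iw + kw - 1) for _ in range(ih + kh - 1)]
    for r in range(ih):
        for c in range(iw):
            for i in range(kh):
                for j in range(kw):
                    out[r + i][c + j] += image[r][c] * kernel[i][j]
    return out

def crop_valid(full, kh, kw):
    """Keep only outputs where the kernel fully overlapped the image."""
    rows = full[kh - 1 : len(full) - (kh - 1)]
    return [row[kw - 1 : len(row) - (kw - 1)] for row in rows]

# Hypothetical 4x4 image and 3x3 kernel, chosen only for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

full = conv2d_full(image, kernel)   # 6 x 6
valid = crop_valid(full, 3, 3)      # 2 x 2, the "cropped" result
print(len(full), len(full[0]), len(valid), len(valid[0]))
```

The border rows and columns that get thrown away are exactly the outputs where part of the kernel hung off the edge of the image, which is why the result is smaller than the input.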